/home/wangjiarui/AIGV_2025/shell/train_qa.sh: 20: torchrun: not found
/home/wangjiarui/AIGV_2025/shell/train_qa.sh: 20: torchrun: not found
/home/wangjiarui/AIGV_2025/shell/train_qa.sh: 20: torchrun: not found
/home/wangjiarui/AIGV_2025/shell/train_qa.sh: line 21: torchrun: command not found
/home/wangjiarui/.conda/envs/intern25/bin/python: can't open file '/home/wangjiarui/train/train_qa.py': [Errno 2] No such file or directory
E0425 07:41:42.362114 2660716 .conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 2) local_rank: 0 (pid: 2660790) of binary: /home/wangjiarui/.conda/envs/intern25/bin/python
Traceback (most recent call last):
  File "/home/wangjiarui/.conda/envs/intern25/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train/train_qa.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-04-25_07:41:42
  host : amax
  rank : 0 (local_rank: 0)
  exitcode : 2
(pid: 2660790) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ [2025-04-25 07:42:22,677] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-25 07:42:26,002] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:42:26,002] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 04/25/2025 07:42:26 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:42:26 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, 
do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-42-26_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], 
restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:42:26 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:42:26,151 >> loading file chat_template.jinja [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:42:26,607 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:42:26 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:42:26,747 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:42:26,749 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:42:26 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:42:26,749 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:42:26,750 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:42:26,752 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:42:26,825 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
[INFO|configuration_utils.py:1140] 2025-04-25 07:42:26,825 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], 
requires_grad=True)
None False
Setting backbone: fragments_backbone
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1219, in <module>
[rank0]:     main()
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1064, in main
[rank0]:     model = InternVLChatModel.from_pretrained(
[rank0]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/modeling_utils.py", line 4111, in from_pretrained
[rank0]:     model = cls(config, *model_args, **model_kwargs)
[rank0]:   File "/home/wangjiarui/AIGV_2025/model/internvl_chat_st1_fast/modeling_internvl_chat.py", line 284, in __init__
[rank0]:     self.evaluator.load_state_dict(torch.load('/media/amax/e1efc3d3-8977-4b90-9121-3f956ab56974/huiyu/wjr/method/FAST-VQA-and-FasterVQA-dev/FAST_VQA_B_1_4.pth')["state_dict"], strict=False)
[rank0]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/serialization.py", line 1319, in load
[rank0]:     with _open_file_like(f, "rb") as opened_file:
[rank0]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/serialization.py", line 659, in _open_file_like
[rank0]:     return _open_file(name_or_buffer, mode)
[rank0]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/serialization.py", line 640, in __init__
[rank0]:     super().__init__(open(name, mode))
[rank0]: FileNotFoundError: [Errno 2] No such file or directory: '/media/amax/e1efc3d3-8977-4b90-9121-3f956ab56974/huiyu/wjr/method/FAST-VQA-and-FasterVQA-dev/FAST_VQA_B_1_4.pth'
[rank0]:[W425 07:42:29.164691144 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
E0425 07:42:29.724624 2660996 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 2661075) of binary: /home/wangjiarui/.conda/envs/intern25/bin/python
Traceback (most recent call last):
  File "/home/wangjiarui/.conda/envs/intern25/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train/train_qa.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-04-25_07:42:29
  host : amax
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 2661075)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
[2025-04-25 07:43:05,225] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-25 07:43:09,010] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:43:09,010] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 04/25/2025 07:43:09 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:43:09 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, 
fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-43-09_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, 
split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:43:09 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,154 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,154 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,154 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,154 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,154 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:09,155 >> loading file chat_template.jinja [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:43:09,613 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:43:09 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:43:09,759 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:43:09,761 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:43:09 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:43:09,762 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:43:09,762 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:43:09,764 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:43:09,837 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
[INFO|configuration_utils.py:1140] 2025-04-25 07:43:09,837 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], 
requires_grad=True) None False Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel. [WARNING|modeling_utils.py:4890] 2025-04-25 07:43:12,022 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
[INFO|configuration_utils.py:1093] 2025-04-25 07:43:12,031 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json
[INFO|configuration_utils.py:1140] 2025-04-25 07:43:12,032 >> Generate config GenerationConfig {}
04/25/2025 07:43:12 - INFO - __main__ - Finished
04/25/2025 07:43:12 - INFO - __main__ - model.config.force_image_size: 448
04/25/2025 07:43:12 - INFO - __main__ - data_args.force_image_size: 448
04/25/2025 07:43:12 - INFO - __main__ - model.config.vision_config.image_size: 448
04/25/2025 07:43:12 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:43:12 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:43:12 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:43:12 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:43:12 - INFO - __main__ - Formatting inputs...Skip in lazy mode
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1219, in <module>
[rank0]:     main()
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1136, in main
[rank0]:     train_dataset = build_datasets(
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 939, in build_datasets
[rank0]:     dataset = LazySupervisedDataset(
[rank0]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 454, in __init__
[rank0]:     with open(meta['annotation_train'], 'r') as f:
[rank0]: FileNotFoundError: [Errno 2] No such file or directory: 'data3/video_qa_train.jsonl'
[rank0]:[W425 07:43:13.472553291 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
E0425 07:43:14.157253 2661523 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 2661602) of binary: /home/wangjiarui/.conda/envs/intern25/bin/python
Traceback (most recent call last):
  File "/home/wangjiarui/.conda/envs/intern25/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train/train_qa.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-04-25_07:43:14
  host      : amax
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2661602)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
[2025-04-25 07:43:35,983] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-25 07:43:39,235] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:43:39,235] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 04/25/2025 07:43:39 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:43:39 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, 
fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-43-39_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, 
split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:43:39 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,381 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,381 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,382 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,382 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,382 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:43:39,382 >> loading file chat_template.jinja [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:43:39,813 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:43:39 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:43:39,947 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:43:39,949 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:43:39 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:43:39,950 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:43:39,950 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:43:39,952 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:43:40,021 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
[INFO|configuration_utils.py:1140] 2025-04-25 07:43:40,021 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } motion_mlp.weight1 Parameter containing: tensor([[ 3.5237e-19, 4.3141e-31, 0.0000e+00, ..., 1.7323e-07, -7.8496e-17, 1.3132e-07], [ 0.0000e+00, 6.0000e+00, -1.3173e-31, ..., 3.8184e-07, -1.6693e+09, 2.8871e-07], [ 0.0000e+00, 1.2000e+01, -1.3173e-31, ..., 5.8860e-07, -1.1731e+07, 4.4703e-07], ..., [-1.4412e+17, 2.4448e+04, 5.5460e+29, ..., 8.5449e-04, -4.8884e+20, 6.4468e-04], [-2.4179e+24, 2.4448e+04, 1.6161e+35, ..., 8.5449e-04, -1.1799e+27, 6.4468e-04], [-4.0565e+31, 2.4448e+04, -4.2906e-37, ..., 8.5449e-04, -2.9815e+33, 6.4468e-04]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], 
requires_grad=True) None False Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel. [WARNING|modeling_utils.py:4890] 2025-04-25 07:43:42,226 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
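Note on the warning above: every name in the "newly initialized" list falls under `evaluator.fragments_backbone.*` or `fast_mlp.*`, so those modules start from random weights, not from the InternVL3-9B checkpoint. A minimal sketch of how such a list could be condensed by prefix is given below. The helper name `summarize_new_weights` and the sample key lists are hypothetical; in a real run the two lists would presumably come from `model.state_dict().keys()` and from the `weight_map` of `model.safetensors.index.json`.

```python
from collections import Counter


def summarize_new_weights(model_keys, checkpoint_keys, depth=2):
    """Return the parameter names missing from the checkpoint, plus counts per name prefix.

    `depth` is the number of dot-separated components kept as the grouping prefix.
    """
    # Names the model defines but the checkpoint does not provide.
    missing = sorted(set(model_keys) - set(checkpoint_keys))
    # Count missing names per prefix, e.g. "evaluator.fragments_backbone".
    groups = Counter(".".join(name.split(".")[:depth]) for name in missing)
    return missing, dict(groups)


# Hypothetical sample data (a tiny subset in the style of the log above).
model_keys = [
    "language_model.output.weight",
    "evaluator.fragments_backbone.norm.weight",
    "evaluator.fragments_backbone.norm.bias",
    "fast_mlp.0.weight",
    "fast_mlp.0.bias",
]
checkpoint_keys = ["language_model.output.weight"]

missing, groups = summarize_new_weights(model_keys, checkpoint_keys)
for prefix, count in groups.items():
    print(f"{prefix}: {count} newly initialized tensors")
```

If a prefix that was expected to load from a pretrained file shows up in this summary, that may indicate a missing or mismatched checkpoint path for that submodule; a randomly initialized head such as `fast_mlp` is normally expected here.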
[INFO|configuration_utils.py:1093] 2025-04-25 07:43:42,235 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json [INFO|configuration_utils.py:1140] 2025-04-25 07:43:42,236 >> Generate config GenerationConfig {} 04/25/2025 07:43:42 - INFO - __main__ - Finished 04/25/2025 07:43:42 - INFO - __main__ - model.config.force_image_size: 448 04/25/2025 07:43:42 - INFO - __main__ - data_args.force_image_size: 448 04/25/2025 07:43:42 - INFO - __main__ - model.config.vision_config.image_size: 448 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] num_image_token: 256 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] use_thumbnail: True 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/25/2025 07:43:42 - INFO - __main__ - Formatting inputs...Skip in lazy mode 04/25/2025 07:43:42 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] num_image_token: 256 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] use_thumbnail: True 04/25/2025 07:43:42 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/25/2025 07:43:42 - INFO - __main__ - Formatting inputs...Skip in lazy mode 04/25/2025 07:43:43 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 9000 eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - 
__main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight 04/25/2025 
07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - 
__main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/25/2025 
07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ 
- language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ 
- language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - 
__main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - 
__main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 
- INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 
- INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - 
__main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - 
INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/25/2025 
07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 
04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - 
language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/25/2025 07:43:44 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, 
eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-43-39_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, 
save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [INFO|trainer.py:741] 2025-04-25 07:43:44,409 >> Using auto half precision backend [WARNING|trainer.py:803] 2025-04-25 07:43:44,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:43:44,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [2025-04-25 07:43:44,626] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-25 07:43:44,627] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 1 [2025-04-25 07:43:56,887] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Detected CUDA files, patching ldflags Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja... Building extension module fused_adam... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. Loading extension module fused_adam... 
Time to load fused_adam op: 0.6631324291229248 seconds [2025-04-25 07:43:57,554] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-25 07:43:57,555] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-25 07:43:57,647] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-25 07:43:57,647] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-25 07:43:57,647] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-25 07:43:57,647] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-25 07:43:57,647] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-25 07:43:57,647] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-25 07:43:57,647] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-25 07:43:58,084] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-25 07:43:58,085] [INFO] [utils.py:782:see_memory_usage] MA 18.18 GB Max_MA 18.28 GB CA 18.52 GB Max_CA 19 GB [2025-04-25 07:43:58,085] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 33.54 GB, percent = 13.3% [2025-04-25 07:43:58,253] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-25 07:43:58,254] [INFO] [utils.py:782:see_memory_usage] MA 18.18 GB Max_MA 18.37 GB CA 18.72 GB Max_CA 19 GB [2025-04-25 07:43:58,254] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 33.55 GB, percent = 13.3% [2025-04-25 07:43:58,254] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-25 07:43:58,426] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO optimizer [2025-04-25 07:43:58,426] [INFO] [utils.py:782:see_memory_usage] MA 18.18 GB Max_MA 
18.18 GB CA 18.72 GB Max_CA 19 GB [2025-04-25 07:43:58,427] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 33.56 GB, percent = 13.3% [2025-04-25 07:43:58,438] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-25 07:43:58,439] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-25 07:43:58,439] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-25 07:43:58,439] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-25 07:43:58,449] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] comms_config ................. [2025-04-25 07:43:58,450] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] dump_state ................... 
False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1 [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0 [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100 [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06 [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01 [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] elasticity_enabled ........... False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] flops_profiler_config ........ { "enabled": false, "recompute_fwd_factor": 0.0, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] fp16_auto_cast ............... None [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] fp16_enabled ................. False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False [2025-04-25 07:43:58,451] [INFO] [config.py:1003:print] global_rank .................. 0 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] grad_accum_dtype ............. None [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] graph_harvesting ............. 
False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] load_universal_checkpoint .... False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] loss_scale ................... 1.0 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] memory_breakdown ............. False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] mics_shard_size .............. -1 [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] optimizer_name ............... adamw [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01} [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] pipeline ..................... 
{'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True} [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] pld_enabled .................. False [2025-04-25 07:43:58,452] [INFO] [config.py:1003:print] pld_params ................... False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] prescale_gradients ........... False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] scheduler_name ............... None [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] scheduler_params ............. None [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32 [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] sparse_attention ............. None [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] steps_per_print .............. inf [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] train_batch_size ............. 4 [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4 [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] use_node_local_storage ....... False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] weight_quantization_config ... None [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] world_size ................... 1 [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] zero_config .................. 
stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] zero_enabled ................. True [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True [2025-04-25 07:43:58,453] [INFO] [config.py:1003:print] zero_optimization_stage ...... 
1 [2025-04-25 07:43:58,454] [INFO] [config.py:989:print_user_config] json = { "zero_optimization": { "stage": 1, "allgather_partitions": true, "allgather_bucket_size": 1.000000e+09, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 1.000000e+09, "contiguous_gradients": true }, "fp16": { "enabled": false, "auto_cast": true, "loss_scale": 0, "initial_scale_power": 32, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled": true }, "optimizer": { "type": "AdamW", "params": { "lr": 4e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.01 } }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "steps_per_print": inf, "train_batch_size": 4, "train_micro_batch_size_per_gpu": 4, "wall_clock_breakdown": true } [INFO|trainer.py:2369] 2025-04-25 07:43:58,455 >> ***** Running training ***** [INFO|trainer.py:2370] 2025-04-25 07:43:58,455 >> Num examples = 49,500 [INFO|trainer.py:2371] 2025-04-25 07:43:58,455 >> Num Epochs = 10 [INFO|trainer.py:2372] 2025-04-25 07:43:58,455 >> Instantaneous batch size per device = 4 [INFO|trainer.py:2375] 2025-04-25 07:43:58,455 >> Total train batch size (w. 
parallel, distributed & accumulation) = 4 [INFO|trainer.py:2376] 2025-04-25 07:43:58,455 >> Gradient Accumulation steps = 1 [INFO|trainer.py:2377] 2025-04-25 07:43:58,455 >> Total optimization steps = 123,750 [INFO|trainer.py:2378] 2025-04-25 07:43:58,462 >> Number of trainable parameters = 52,297,728 0%| | 0/123750 [00:00, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-47-50_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, 
torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:47:50 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:47:50,165 >> loading file chat_template.jinja 04/25/2025 07:47:50 - WARNING - __main__ - Process rank: 2, device: cuda:2, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:47:50 - WARNING - __main__ - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: False [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:47:50,615 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:47:50 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:47:50,758 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:47:50,760 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:47:50 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:47:50,761 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:47:50,761 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:47:50,763 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:47:50,833 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [INFO|configuration_utils.py:1140] 2025-04-25 07:47:50,833 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } this model this model
motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 
0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) Parameter containing: tensor([[-4.0625e-01, 4.9593e-08, 0.0000e+00, ..., 1.7323e-07, -7.8496e-17, 1.3132e-07], [ 0.0000e+00, 6.0000e+00, -1.3173e-31, ..., 3.8184e-07, -1.6693e+09, 2.8871e-07], [ 0.0000e+00, 1.2000e+01, -1.3173e-31, ..., 5.8860e-07, -1.1731e+07, 4.4703e-07], ..., [-1.4412e+17, 2.4448e+04, 5.5460e+29, ..., 8.5449e-04, -4.8884e+20, 6.4468e-04], [-2.4179e+24, 2.4448e+04, 1.6161e+35, ..., 8.5449e-04, -1.1799e+27, 6.4468e-04], [-4.0565e+31, 2.4448e+04, -4.2906e-37, ..., 8.5449e-04, -2.9815e+33, 6.4468e-04]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) 
motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False None False Setting backbone: fragments_backbone Setting backbone: fragments_backbone Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 
'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 
'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 
'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 
'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 
'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.76it/s] Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.37it/s] [WARNING|modeling_utils.py:4890] 2025-04-25 07:47:57,061 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.82it/s] Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.50it/s] [INFO|modeling_utils.py:4888] 2025-04-25 07:47:57,074 >> All model checkpoint weights were used when initializing InternVLChatModel. 
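Note on the "newly initialized" warning above: the key list is long, but every name visible in it falls under one of two top-level modules, `evaluator` (the `fragments_backbone` layers) and `fast_mlp`. A minimal sketch of how such a list could be condensed for reading, assuming the key names have already been copied out of the log into a Python list. The helper name `summarize_new_keys` and the short `sample` list are illustrative only and are not part of the training script.

```python
from collections import Counter


def summarize_new_keys(keys, depth=1):
    """Count parameter names by their first `depth` dotted components."""
    counts = Counter(".".join(key.split(".")[:depth]) for key in keys)
    return dict(counts)


# A handful of names copied from the warning; the real list is much longer.
sample = [
    "evaluator.fragments_backbone.norm.bias",
    "evaluator.fragments_backbone.norm.weight",
    "evaluator.fragments_backbone.patch_embed.proj.weight",
    "fast_mlp.0.bias",
    "fast_mlp.0.weight",
    "fast_mlp.3.weight",
]

summary = summarize_new_keys(sample)
for prefix, count in sorted(summary.items()):
    print(f"{prefix}: {count} newly initialized tensors")
```

If the full list reduces to only these two prefixes, the warning is consistent with modules that are absent from the InternVL3-9B checkpoint and therefore start from fresh initialization; whether that is intended depends on the training setup.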
[INFO|configuration_utils.py:1093] 2025-04-25 07:47:57,084 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json [INFO|configuration_utils.py:1140] 2025-04-25 07:47:57,085 >> Generate config GenerationConfig {} 04/25/2025 07:47:57 - INFO - __main__ - Finished 04/25/2025 07:47:57 - INFO - __main__ - model.config.force_image_size: 448 04/25/2025 07:47:57 - INFO - __main__ - data_args.force_image_size: 448 04/25/2025 07:47:57 - INFO - __main__ - model.config.vision_config.image_size: 448 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] num_image_token: 256 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] use_thumbnail: True 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/25/2025 07:47:57 - INFO - __main__ - Formatting inputs...Skip in lazy mode 04/25/2025 07:47:57 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] num_image_token: 256 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] use_thumbnail: True 04/25/2025 07:47:57 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/25/2025 07:47:57 - INFO - __main__ - Formatting inputs...Skip in lazy mode eval_dataset eval_dataset 04/25/2025 07:47:57 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 9000 eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': 
None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=2, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-47-50_amax, 
logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, 
dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=1, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-47-50_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, 
output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ 
- vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO 
- __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight 
04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight 
04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ 
- language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - 
INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 
04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO 
- __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO 
- __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 
04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - 
INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - 
INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/25/2025 
07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 
04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - 
language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ 
- language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ 
- language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/25/2025 07:47:58 - INFO - 
__main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/25/2025 07:47:58 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, 
hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-47-50_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 
[INFO|trainer.py:741] 2025-04-25 07:47:58,764 >> Using auto half precision backend [WARNING|trainer.py:803] 2025-04-25 07:47:58,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:47:58,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:47:58,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:47:58,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:47:58,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:47:58,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [2025-04-25 07:47:59,019] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-25 07:47:59,020] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 3 Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... [2025-04-25 07:48:05,449] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Detected CUDA files, patching ldflags Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja... Building extension module fused_adam... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. Loading extension module fused_adam... Time to load fused_adam op: 0.6127653121948242 seconds Loading extension module fused_adam... 
Time to load fused_adam op: 0.6027617454528809 seconds Loading extension module fused_adam... Time to load fused_adam op: 0.6032164096832275 seconds [2025-04-25 07:48:06,506] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-25 07:48:06,506] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-25 07:48:06,595] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-25 07:48:06,595] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-25 07:48:06,595] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-25 07:48:06,595] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-25 07:48:06,595] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-25 07:48:06,595] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-25 07:48:06,595] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-25 07:48:06,998] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-25 07:48:06,999] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.08 GB CA 18.33 GB Max_CA 18 GB [2025-04-25 07:48:06,999] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.41 GB, percent = 10.9% [2025-04-25 07:48:07,173] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-25 07:48:07,174] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.11 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:48:07,174] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.42 GB, percent = 10.9% [2025-04-25 07:48:07,174] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-25 07:48:07,348] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO 
optimizer [2025-04-25 07:48:07,349] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.05 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:48:07,349] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.56 GB, percent = 11.0% [2025-04-25 07:48:07,355] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-25 07:48:07,355] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-25 07:48:07,355] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-25 07:48:07,355] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-25 07:48:07,365] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] comms_config ................. [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-25 07:48:07,366] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] dump_state ................... 
False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] elasticity_enabled ........... False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] flops_profiler_config ........ { "enabled": false, "recompute_fwd_factor": 0.0, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] fp16_auto_cast ............... None [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] fp16_enabled ................. False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] global_rank .................. 0 [2025-04-25 07:48:07,367] [INFO] [config.py:1003:print] grad_accum_dtype ............. None [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] graph_harvesting ............. 
False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] load_universal_checkpoint .... False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] loss_scale ................... 1.0 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] memory_breakdown ............. False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] mics_shard_size .............. -1 [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] optimizer_name ............... adamw [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01} [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] pipeline ..................... 
{'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True} [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] pld_enabled .................. False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] pld_params ................... False [2025-04-25 07:48:07,368] [INFO] [config.py:1003:print] prescale_gradients ........... False [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] scheduler_name ............... None [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] scheduler_params ............. None [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32 [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] sparse_attention ............. None [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] steps_per_print .............. inf [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] train_batch_size ............. 12 [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4 [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] use_node_local_storage ....... False [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] weight_quantization_config ... None [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] world_size ................... 3 [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] zero_config .................. 
stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] zero_enabled ................. True [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True [2025-04-25 07:48:07,369] [INFO] [config.py:1003:print] zero_optimization_stage ...... 
1
[2025-04-25 07:48:07,370] [INFO] [config.py:989:print_user_config] json = {
    "zero_optimization": {
        "stage": 1,
        "allgather_partitions": true,
        "allgather_bucket_size": 1.000000e+09,
        "overlap_comm": true,
        "reduce_scatter": true,
        "reduce_bucket_size": 1.000000e+09,
        "contiguous_gradients": true
    },
    "fp16": {
        "enabled": false,
        "auto_cast": true,
        "loss_scale": 0,
        "initial_scale_power": 32,
        "loss_scale_window": 1000,
        "hysteresis": 2,
        "min_loss_scale": 1
    },
    "bf16": {
        "enabled": true
    },
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 4e-05,
            "betas": [0.9, 0.999],
            "eps": 1e-08,
            "weight_decay": 0.01
        }
    },
    "gradient_accumulation_steps": 1,
    "gradient_clipping": 1.0,
    "steps_per_print": inf,
    "train_batch_size": 12,
    "train_micro_batch_size_per_gpu": 4,
    "wall_clock_breakdown": true
}
[INFO|trainer.py:2369] 2025-04-25 07:48:07,371 >> ***** Running training *****
[INFO|trainer.py:2370] 2025-04-25 07:48:07,371 >> Num examples = 49,500
[INFO|trainer.py:2371] 2025-04-25 07:48:07,371 >> Num Epochs = 10
[INFO|trainer.py:2372] 2025-04-25 07:48:07,371 >> Instantaneous batch size per device = 4
[INFO|trainer.py:2375] 2025-04-25 07:48:07,371 >> Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:2376] 2025-04-25 07:48:07,371 >> Gradient Accumulation steps = 1
[INFO|trainer.py:2377] 2025-04-25 07:48:07,371 >> Total optimization steps = 41,250
[INFO|trainer.py:2378] 2025-04-25 07:48:07,378 >> Number of trainable parameters = 52,297,728
0%| | 0/41250 [00:00> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:No
[WARNING|trainer.py:803] 2025-04-25 07:49:19,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:No
[WARNING|trainer.py:803] 2025-04-25 07:49:19,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:No
2 2 2
[WARNING|trainer.py:803] 2025-04-25 07:49:21,970 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:49:22,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:49:22,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3 3 3 [WARNING|trainer.py:803] 2025-04-25 07:49:24,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-25 07:49:24,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-25 07:49:24,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4 4 4 [WARNING|trainer.py:803] 2025-04-25 07:49:27,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-25 07:49:27,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-25 07:49:27,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5 5 5 [WARNING|trainer.py:803] 2025-04-25 07:49:29,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:49:29,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:49:29,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6 6 6 [WARNING|trainer.py:803] 2025-04-25 07:49:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-25 07:49:32,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No [WARNING|trainer.py:803] 2025-04-25 07:49:32,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7 7 7 [WARNING|trainer.py:803] 2025-04-25 07:49:34,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. .Yes [WARNING|trainer.py:803] 2025-04-25 07:49:34,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. .Yes [WARNING|trainer.py:803] 2025-04-25 07:49:35,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. .Yes 8 8 [WARNING|trainer.py:803] 2025-04-25 07:49:37,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-25 07:49:37,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8 W0425 07:49:41.769487 2673708 site-packages/torch/distributed/run.py:793] W0425 07:49:41.769487 2673708 site-packages/torch/distributed/run.py:793] ***************************************** W0425 07:49:41.769487 2673708 site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0425 07:49:41.769487 2673708 site-packages/torch/distributed/run.py:793] ***************************************** [2025-04-25 07:49:43,797] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 07:49:43,800] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 07:49:43,801] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-25 07:49:47,009] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:49:47,009] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. 
Using PIL to load images. 04/25/2025 07:49:47 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:49:47 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, 
label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-49-47_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:49:47 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading 
file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:49:47,255 >> loading file chat_template.jinja [2025-04-25 07:49:47,302] [INFO] [comm.py:652:init_distributed] cdb=None Replace train sampler!! petrel_client is not installed. Using PIL to load images. 04/25/2025 07:49:47 - WARNING - __main__ - Process rank: 2, device: cuda:2, n_gpu: 1distributed training: True, 16-bits training: False [2025-04-25 07:49:47,461] [INFO] [comm.py:652:init_distributed] cdb=None 04/25/2025 07:49:47 - WARNING - __main__ - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: False [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:49:47,734 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:49:47 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:49:47,875 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:49:47,877 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:49:47 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:49:47,877 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:49:47,878 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:49:47,880 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:49:47,928 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [INFO|configuration_utils.py:1140] 2025-04-25 07:49:47,929 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } this model [WARNING|logging.py:328] 2025-04-25 07:49:48,146 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. 
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. this model [WARNING|logging.py:328] 2025-04-25 07:49:48,253 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 
0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) None False motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 
0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 None False Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False Setting backbone: fragments_backbone Setting backbone: fragments_backbone Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel. 
[WARNING|modeling_utils.py:4890] 2025-04-25 07:49:53,901 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
[INFO|configuration_utils.py:1093] 2025-04-25 07:49:53,911 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json
[INFO|configuration_utils.py:1140] 2025-04-25 07:49:53,912 >> Generate config GenerationConfig {}
04/25/2025 07:49:53 - INFO - __main__ - Finished
04/25/2025 07:49:53 - INFO - __main__ - model.config.force_image_size: 448
04/25/2025 07:49:53 - INFO - __main__ - data_args.force_image_size: 448
04/25/2025 07:49:53 - INFO - __main__ - model.config.vision_config.image_size: 448
04/25/2025 07:49:53 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:49:53 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:49:53 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:49:53 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:49:53 - INFO - __main__ - Formatting inputs...Skip in lazy mode
Loading checkpoint shards: 50%|█████ | 2/4 [00:00<00:00, 3.99it/s]
Loading checkpoint shards: 75%|███████▌ | 3/4 [00:00<00:00, 4.28it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.69it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.31it/s]
[WARNING|modeling_utils.py:4890] 2025-04-25 07:49:54,231 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index',
'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 
'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 
'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 
'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 
'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 
'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Loading checkpoint shards: 75%|███████▌ | 3/4 [00:00<00:00, 4.41it/s]
04/25/2025 07:49:54 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500
04/25/2025 07:49:54 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:49:54 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:49:54 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:49:54 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:49:54 - INFO - __main__ - Formatting inputs...Skip in lazy mode
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.71it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.40it/s]
04/25/2025 07:49:54 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 3
eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight
04/25/2025 07:49:55 - INFO - __main__
- vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - 
__main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO 
- __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight 
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight 
04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - 
__main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO 
- __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO 
- __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 
04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - 
__main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - 
INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - 
INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/25/2025 
07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 
04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - 
language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ 
- language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ 
- language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - 
__main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - 
__main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - 
INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 
- INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 
- INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/25/2025 07:49:55 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': 
False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-49-47_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, 
max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [INFO|trainer.py:741] 2025-04-25 07:49:55,566 >> Using auto half precision backend [WARNING|trainer.py:803] 2025-04-25 07:49:55,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:49:55,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
[2025-04-25 07:49:55,769] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-25 07:49:55,770] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 3 trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], 
include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=2, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-49-47_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, 
accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=1, log_level=passive, log_level_replica=warning, log_on_each_node=True, 
logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-49-47_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [WARNING|trainer.py:803] 2025-04-25 07:49:56,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:49:56,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:49:56,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
[WARNING|trainer.py:803] 2025-04-25 07:49:56,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
[2025-04-25 07:50:02,030] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root...
Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root...
Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja...
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module fused_adam...
Time to load fused_adam op: 0.640312910079956 seconds
Loading extension module fused_adam...
Time to load fused_adam op: 0.7031729221343994 seconds
Loading extension module fused_adam...
Time to load fused_adam op: 0.7033886909484863 seconds [2025-04-25 07:50:03,095] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-25 07:50:03,096] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-25 07:50:03,184] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-25 07:50:03,184] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-25 07:50:03,184] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-25 07:50:03,184] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-25 07:50:03,184] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-25 07:50:03,184] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-25 07:50:03,185] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-25 07:50:03,608] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-25 07:50:03,609] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.08 GB CA 18.33 GB Max_CA 18 GB [2025-04-25 07:50:03,609] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 26.36 GB, percent = 10.5% [2025-04-25 07:50:03,781] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-25 07:50:03,782] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.11 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:50:03,782] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 26.3 GB, percent = 10.4% [2025-04-25 07:50:03,782] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-25 07:50:03,957] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO optimizer [2025-04-25 07:50:03,958] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 
18.05 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:50:03,958] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 26.45 GB, percent = 10.5% [2025-04-25 07:50:03,964] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-25 07:50:03,964] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-25 07:50:03,964] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-25 07:50:03,964] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-25 07:50:03,975] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-25 07:50:03,975] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-25 07:50:03,975] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-25 07:50:03,975] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-25 07:50:03,975] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-25 07:50:03,975] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] comms_config ................. [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] dump_state ................... 
False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1 [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0 [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100 [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06 [2025-04-25 07:50:03,976] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] elasticity_enabled ........... False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] flops_profiler_config ........ { "enabled": false, "recompute_fwd_factor": 0.0, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] fp16_auto_cast ............... None [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] fp16_enabled ................. False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] global_rank .................. 0 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] grad_accum_dtype ............. None [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] graph_harvesting ............. 
False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] load_universal_checkpoint .... False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] loss_scale ................... 1.0 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] memory_breakdown ............. False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] mics_shard_size .............. -1 [2025-04-25 07:50:03,977] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] optimizer_name ............... adamw [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01} [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] pipeline ..................... 
{'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True} [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] pld_enabled .................. False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] pld_params ................... False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] prescale_gradients ........... False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] scheduler_name ............... None [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] scheduler_params ............. None [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32 [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] sparse_attention ............. None [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] steps_per_print .............. inf [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] train_batch_size ............. 12 [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4 [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] use_node_local_storage ....... False [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] weight_quantization_config ... None [2025-04-25 07:50:03,978] [INFO] [config.py:1003:print] world_size ................... 3 [2025-04-25 07:50:03,979] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False [2025-04-25 07:50:03,979] [INFO] [config.py:1003:print] zero_config .................. 
stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True [2025-04-25 07:50:03,979] [INFO] [config.py:1003:print] zero_enabled ................. True [2025-04-25 07:50:03,979] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True [2025-04-25 07:50:03,979] [INFO] [config.py:1003:print] zero_optimization_stage ...... 
1 [2025-04-25 07:50:03,979] [INFO] [config.py:989:print_user_config] json = { "zero_optimization": { "stage": 1, "allgather_partitions": true, "allgather_bucket_size": 1.000000e+09, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 1.000000e+09, "contiguous_gradients": true }, "fp16": { "enabled": false, "auto_cast": true, "loss_scale": 0, "initial_scale_power": 32, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled": true }, "optimizer": { "type": "AdamW", "params": { "lr": 4e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.01 } }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "steps_per_print": inf, "train_batch_size": 12, "train_micro_batch_size_per_gpu": 4, "wall_clock_breakdown": true } [INFO|trainer.py:2369] 2025-04-25 07:50:03,980 >> ***** Running training ***** [INFO|trainer.py:2370] 2025-04-25 07:50:03,980 >> Num examples = 49,500 [INFO|trainer.py:2371] 2025-04-25 07:50:03,980 >> Num Epochs = 10 [INFO|trainer.py:2372] 2025-04-25 07:50:03,980 >> Instantaneous batch size per device = 4 [INFO|trainer.py:2375] 2025-04-25 07:50:03,980 >> Total train batch size (w. parallel, distributed & accumulation) = 12 [INFO|trainer.py:2376] 2025-04-25 07:50:03,980 >> Gradient Accumulation steps = 1 [INFO|trainer.py:2377] 2025-04-25 07:50:03,980 >> Total optimization steps = 41,250 [INFO|trainer.py:2378] 2025-04-25 07:50:03,987 >> Number of trainable parameters = 52,297,728 0%| | 0/41250 [00:00> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1 [WARNING|trainer.py:803] 2025-04-25 07:51:15,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:51:16,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2 2 [WARNING|trainer.py:803] 2025-04-25 07:51:18,213 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :No 2 [WARNING|trainer.py:803] 2025-04-25 07:51:18,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:51:19,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3 3 3 [WARNING|trainer.py:803] 2025-04-25 07:51:20,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... [WARNING|trainer.py:803] 2025-04-25 07:51:20,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... [WARNING|trainer.py:803] 2025-04-25 07:51:21,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... 
[INFO|trainer.py:3910] 2025-04-25 07:51:32,907 >> Saving model checkpoint to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast
[INFO|configuration_utils.py:420] 2025-04-25 07:51:32,912 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/config.json
[INFO|configuration_utils.py:909] 2025-04-25 07:51:32,913 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/generation_config.json
[rank1]: Traceback (most recent call last):
[rank1]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1219, in <module>
[rank1]:     main()
[rank1]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1214, in main
[rank1]:     train_result = trainer.train(resume_from_checkpoint=checkpoint)
[rank1]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 2171, in train
[rank1]:     return inner_training_loop(
[rank1]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 2598, in _inner_training_loop
[rank1]:     self._maybe_log_save_evaluate(
[rank1]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 3071, in _maybe_log_save_evaluate
[rank1]:     metrics = self._evaluate(trial, ignore_keys_for_eval)
[rank1]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 3025, in _evaluate
[rank1]:     metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
[rank1]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 282, in evaluate
[rank1]:     self.save_lora_weights(self.model, self.args.output_dir)
[rank1]: AttributeError: 'CustomTrainer' object has no attribute 'save_lora_weights'
[rank2]: Traceback (most recent call last):
[rank2]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1219, in <module>
[rank2]:     main()
[rank2]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 1214, in main
[rank2]:     train_result = trainer.train(resume_from_checkpoint=checkpoint)
[rank2]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 2171, in train
[rank2]:     return inner_training_loop(
[rank2]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 2598, in _inner_training_loop
[rank2]:     self._maybe_log_save_evaluate(
[rank2]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 3071, in _maybe_log_save_evaluate
[rank2]:     metrics = self._evaluate(trial, ignore_keys_for_eval)
[rank2]:   File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/transformers/trainer.py", line 3025, in _evaluate
[rank2]:     metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
[rank2]:   File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 282, in evaluate
[rank2]:     self.save_lora_weights(self.model, self.args.output_dir)
[rank2]: AttributeError: 'CustomTrainer' object has no attribute 'save_lora_weights'
W0425 07:51:40.631898 2673708 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2673802 closing signal SIGTERM
W0425 07:51:40.634155 2673708 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2673804 closing signal SIGTERM
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 21 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
E0425 07:51:45.660335 2673708 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 1 (pid: 2673803) of binary: /home/wangjiarui/.conda/envs/intern25/bin/python
Traceback (most recent call last):
  File "/home/wangjiarui/.conda/envs/intern25/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train/train_qa.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-04-25_07:51:40
  host : amax
  rank : 1 (local_rank: 1)
  exitcode : 1 (pid: 2673803)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
W0425 07:53:36.100325 2679929 site-packages/torch/distributed/run.py:793]
W0425 07:53:36.100325 2679929 site-packages/torch/distributed/run.py:793] *****************************************
W0425 07:53:36.100325 2679929 site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0425 07:53:36.100325 2679929 site-packages/torch/distributed/run.py:793] *****************************************
[2025-04-25 07:53:38,044] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-04-25 07:53:38,046] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-04-25 07:53:38,053] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
Replace train sampler!!
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
petrel_client is not installed. Using PIL to load images.
petrel_client is not installed. Using PIL to load images.
[2025-04-25 07:53:41,686] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:53:41,686] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 04/25/2025 07:53:41 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:53:41 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, 
ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-53-41_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:53:41 - INFO - __main__ - Loading Tokenizer: 
/home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:53:41,862 >> loading file chat_template.jinja [2025-04-25 07:53:41,863] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:53:41,872] [INFO] [comm.py:652:init_distributed] cdb=None 04/25/2025 07:53:41 - WARNING - __main__ - Process rank: 2, device: cuda:2, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:53:41 - WARNING - __main__ - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: False [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:53:42,322 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:53:42 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:53:42,459 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:53:42,462 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:53:42 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:53:42,462 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:53:42,463 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:53:42,465 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:53:42,533 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [INFO|configuration_utils.py:1140] 2025-04-25 07:53:42,534 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } this model [WARNING|logging.py:328] 2025-04-25 07:53:42,611 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. 
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. this model [WARNING|logging.py:328] 2025-04-25 07:53:42,684 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 
0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) 
motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 None False Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False None False Setting backbone: fragments_backbone Setting backbone: fragments_backbone Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel. 
[WARNING|modeling_utils.py:4890] 2025-04-25 07:53:48,556 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
[INFO|configuration_utils.py:1093] 2025-04-25 07:53:48,567 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json
[INFO|configuration_utils.py:1140] 2025-04-25 07:53:48,567 >> Generate config GenerationConfig {}
04/25/2025 07:53:48 - INFO - __main__ - Finished
04/25/2025 07:53:48 - INFO - __main__ - model.config.force_image_size: 448
04/25/2025 07:53:48 - INFO - __main__ - data_args.force_image_size: 448
04/25/2025 07:53:48 - INFO - __main__ - model.config.vision_config.image_size: 448
04/25/2025 07:53:48 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:53:48 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:53:48 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:53:48 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:53:48 - INFO - __main__ - Formatting inputs...Skip in lazy mode
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.98it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.67it/s]
[WARNING|modeling_utils.py:4890] 2025-04-25 07:53:48,573 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 
'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.67it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.29it/s]
[WARNING|modeling_utils.py:4890] 2025-04-25 07:53:48,686 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 
'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 
'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 
'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 
'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
04/25/2025 07:53:49 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500
04/25/2025 07:53:49 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:53:49 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:53:49 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:53:49 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:53:49 - INFO - __main__ - Formatting inputs...Skip in lazy mode
eval_dataset
04/25/2025 07:53:49 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 3
eval_dataset
eval_dataset
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200
training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None,
ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=2, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-53-41_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, 
prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, )
trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight
04/25/2025 07:53:50 - INFO - __main__ -
vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ 
- vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - 
__main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ 
- language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - 
__main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - 
__main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 
- INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 
- INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - 
__main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO 
- __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - 
INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - 
__main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO 
- __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO 
- __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 
04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - 
language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/25/2025 07:53:50 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], 
include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-53-41_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [INFO|trainer.py:741] 2025-04-25 07:53:50,250 >> Using auto half precision backend trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 
0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=1, log_level=passive, log_level_replica=warning, log_on_each_node=True, 
logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-53-41_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [WARNING|trainer.py:803] 2025-04-25 07:53:50,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:53:50,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:53:50,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
[WARNING|trainer.py:803] 2025-04-25 07:53:50,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [2025-04-25 07:53:50,502] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-25 07:53:50,502] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 3 [WARNING|trainer.py:803] 2025-04-25 07:53:50,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:53:50,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... [2025-04-25 07:53:57,168] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Detected CUDA files, patching ldflags Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja... Building extension module fused_adam... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. Loading extension module fused_adam... Time to load fused_adam op: 0.6174883842468262 seconds Loading extension module fused_adam... Time to load fused_adam op: 0.6031739711761475 seconds Loading extension module fused_adam... 
Time to load fused_adam op: 0.6032898426055908 seconds [2025-04-25 07:53:58,221] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-25 07:53:58,221] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-25 07:53:58,307] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-25 07:53:58,308] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-25 07:53:58,308] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-25 07:53:58,308] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-25 07:53:58,308] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-25 07:53:58,308] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-25 07:53:58,308] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-25 07:53:58,700] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-25 07:53:58,701] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.08 GB CA 18.33 GB Max_CA 18 GB [2025-04-25 07:53:58,702] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.44 GB, percent = 10.9% [2025-04-25 07:53:58,887] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-25 07:53:58,888] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.11 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:53:58,888] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.46 GB, percent = 10.9% [2025-04-25 07:53:58,888] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-25 07:53:59,063] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO optimizer [2025-04-25 07:53:59,064] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 
18.05 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:53:59,064] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.6 GB, percent = 11.0% [2025-04-25 07:53:59,070] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-25 07:53:59,070] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-25 07:53:59,070] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-25 07:53:59,070] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-25 07:53:59,080] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] comms_config ................. [2025-04-25 07:53:59,081] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] dump_state ................... 
False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1 [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0 [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100 [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06 [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01 [2025-04-25 07:53:59,082] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] elasticity_enabled ........... False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] flops_profiler_config ........ { "enabled": false, "recompute_fwd_factor": 0.0, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] fp16_auto_cast ............... None [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] fp16_enabled ................. False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] global_rank .................. 0 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] grad_accum_dtype ............. None [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] graph_harvesting ............. 
False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] load_universal_checkpoint .... False [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] loss_scale ................... 1.0 [2025-04-25 07:53:59,083] [INFO] [config.py:1003:print] memory_breakdown ............. False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] mics_shard_size .............. -1 [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] optimizer_name ............... adamw [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01} [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] pipeline ..................... 
{'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True} [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] pld_enabled .................. False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] pld_params ................... False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] prescale_gradients ........... False [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] scheduler_name ............... None [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] scheduler_params ............. None [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32 [2025-04-25 07:53:59,084] [INFO] [config.py:1003:print] sparse_attention ............. None [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] steps_per_print .............. inf [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] train_batch_size ............. 12 [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4 [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] use_node_local_storage ....... False [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] weight_quantization_config ... None [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] world_size ................... 3 [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] zero_config .................. 
stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] zero_enabled ................. True [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True [2025-04-25 07:53:59,085] [INFO] [config.py:1003:print] zero_optimization_stage ...... 
1 [2025-04-25 07:53:59,086] [INFO] [config.py:989:print_user_config] json = { "zero_optimization": { "stage": 1, "allgather_partitions": true, "allgather_bucket_size": 1.000000e+09, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 1.000000e+09, "contiguous_gradients": true }, "fp16": { "enabled": false, "auto_cast": true, "loss_scale": 0, "initial_scale_power": 32, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled": true }, "optimizer": { "type": "AdamW", "params": { "lr": 4e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.01 } }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "steps_per_print": inf, "train_batch_size": 12, "train_micro_batch_size_per_gpu": 4, "wall_clock_breakdown": true } [INFO|trainer.py:2369] 2025-04-25 07:53:59,087 >> ***** Running training ***** [INFO|trainer.py:2370] 2025-04-25 07:53:59,087 >> Num examples = 49,500 [INFO|trainer.py:2371] 2025-04-25 07:53:59,087 >> Num Epochs = 10 [INFO|trainer.py:2372] 2025-04-25 07:53:59,087 >> Instantaneous batch size per device = 4 [INFO|trainer.py:2375] 2025-04-25 07:53:59,087 >> Total train batch size (w. parallel, distributed & accumulation) = 12 [INFO|trainer.py:2376] 2025-04-25 07:53:59,087 >> Gradient Accumulation steps = 1 [INFO|trainer.py:2377] 2025-04-25 07:53:59,087 >> Total optimization steps = 41,250 [INFO|trainer.py:2378] 2025-04-25 07:53:59,094 >> Number of trainable parameters = 52,297,728 0%| | 0/41250 [00:00> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:55:10,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:55:10,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2 2 2 [WARNING|trainer.py:803] 2025-04-25 07:55:12,990 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:55:12,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-25 07:55:13,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3 3 3 [WARNING|trainer.py:803] 2025-04-25 07:55:15,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... [WARNING|trainer.py:803] 2025-04-25 07:55:15,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... [WARNING|trainer.py:803] 2025-04-25 07:55:15,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.0 New best accuracy: 0.0. Saving model... [INFO|trainer.py:3910] 2025-04-25 07:55:27,160 >> Saving model checkpoint to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast [INFO|configuration_utils.py:420] 2025-04-25 07:55:27,165 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/config.json [INFO|configuration_utils.py:909] 2025-04-25 07:55:27,166 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/generation_config.json [INFO|modeling_utils.py:2996] 2025-04-25 07:56:12,162 >> The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 4 checkpoint shards. 
You can find where each parameters has been saved in the index located at /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/model.safetensors.index.json. [INFO|tokenization_utils_base.py:2491] 2025-04-25 07:56:12,164 >> tokenizer config file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/tokenizer_config.json [INFO|tokenization_utils_base.py:2500] 2025-04-25 07:56:12,164 >> Special tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/special_tokens_map.json [INFO|tokenization_utils_base.py:2553] 2025-04-25 07:56:12,164 >> added tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/added_tokens.json 04/25/2025 07:56:14 - INFO - __main__ - Saved LoRA weights to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/lora_weights.pth [2025-04-25 07:56:22,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 1.10 [2025-04-25 07:56:22,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2790.32 | bwd_microstep: 5607.11 | bwd_inner_microstep: 5594.78 | bwd_allreduce_microstep: 12.28 | step_microstep: 19.10 [2025-04-25 07:56:22,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2790.32 | bwd: 5607.12 | bwd_inner: 5594.78 | bwd_allreduce: 12.30 | step: 19.11 0%| | 5/41250 [02:23<406:15:59, 35.46s/it] {'loss': 1.2025, 'grad_norm': 15.128987312316895, 'learning_rate': 1.6155088852988693e-07, 'epoch': 0.0} 0%| | 5/41250 [02:23<406:15:59, 35.46s/it][2025-04-25 07:56:31,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 07:56:31,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3370.26 | bwd_microstep: 5730.02 | bwd_inner_microstep: 5717.05 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.91 [2025-04-25 07:56:31,857] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3370.26 | bwd: 5730.03 | bwd_inner: 5717.05 | bwd_allreduce: 12.94 | step: 18.92 0%| | 6/41250 [02:32<303:54:16, 26.53s/it] {'loss': 1.0311, 'grad_norm': 12.863167762756348, 'learning_rate': 1.938610662358643e-07, 'epoch': 0.0} 0%| | 6/41250 [02:32<303:54:16, 26.53s/it]W0425 07:57:00.173141 2687168 site-packages/torch/distributed/run.py:793] W0425 07:57:00.173141 2687168 site-packages/torch/distributed/run.py:793] ***************************************** W0425 07:57:00.173141 2687168 site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0425 07:57:00.173141 2687168 site-packages/torch/distributed/run.py:793] ***************************************** W0425 07:57:12.391524 2687306 site-packages/torch/distributed/run.py:793] W0425 07:57:12.391524 2687306 site-packages/torch/distributed/run.py:793] ***************************************** W0425 07:57:12.391524 2687306 site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0425 07:57:12.391524 2687306 site-packages/torch/distributed/run.py:793] ***************************************** [2025-04-25 07:57:14,346] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 07:57:14,349] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 07:57:14,349] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-25 07:57:17,584] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:57:17,584] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl Replace train sampler!! petrel_client is not installed. 
Using PIL to load images. Replace train sampler!! petrel_client is not installed. Using PIL to load images. 04/25/2025 07:57:17 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:57:17 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, 
include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-57-17_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:57:17 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file 
tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-25 07:57:17,769 >> loading file chat_template.jinja [2025-04-25 07:57:17,853] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-25 07:57:17,866] [INFO] [comm.py:652:init_distributed] cdb=None 04/25/2025 07:57:17 - WARNING - __main__ - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: False 04/25/2025 07:57:17 - WARNING - __main__ - Process rank: 2, device: cuda:2, n_gpu: 1distributed training: True, 16-bits training: False [INFO|tokenization_utils_base.py:2304] 2025-04-25 07:57:18,232 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/25/2025 07:57:18 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-25 07:57:18,377 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-25 07:57:18,379 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/25/2025 07:57:18 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-25 07:57:18,379 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-25 07:57:18,380 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-25 07:57:18,382 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-25 07:57:18,449 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [INFO|configuration_utils.py:1140] 2025-04-25 07:57:18,449 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } this model this model [WARNING|logging.py:328] 2025-04-25 07:57:18,620 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. 
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [WARNING|logging.py:328] 2025-04-25 07:57:18,640 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. 
motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 
0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 motion_mlp.weight1 Parameter containing: tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01], [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01], [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01], ..., [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01], [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01], [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True) Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True) motion_mlp.weight2 Parameter containing: tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042], [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039], [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098], ..., [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034], [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063], [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight1 motion_mlp.weight2 Parameter containing: tensor([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], 
requires_grad=True) Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False motion_mlp.weight2 Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) motion_mlp.weight2 None False Parameter containing: tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018], [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030], [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017], ..., [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056], [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092], [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True) motion_mlp.bias Parameter containing: tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True) None False Setting backbone: fragments_backbone Setting backbone: fragments_backbone Setting backbone: fragments_backbone Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel. 
[WARNING|modeling_utils.py:4890] 2025-04-25 07:57:24,509 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
[INFO|configuration_utils.py:1093] 2025-04-25 07:57:24,519 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json
[INFO|configuration_utils.py:1140] 2025-04-25 07:57:24,519 >> Generate config GenerationConfig {}
04/25/2025 07:57:24 - INFO - __main__ - Finished
04/25/2025 07:57:24 - INFO - __main__ - model.config.force_image_size: 448
04/25/2025 07:57:24 - INFO - __main__ - data_args.force_image_size: 448
04/25/2025 07:57:24 - INFO - __main__ - model.config.vision_config.image_size: 448
04/25/2025 07:57:24 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:57:24 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:57:24 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:57:24 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:57:24 - INFO - __main__ - Formatting inputs...Skip in lazy mode
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 5.01it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.66it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.83it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.52it/s]
04/25/2025 07:57:25 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500
04/25/2025 07:57:25 - INFO - __main__ - [Dataset] num_image_token: 256
04/25/2025 07:57:25 - INFO - __main__ - [Dataset] dynamic_image_size: True
04/25/2025 07:57:25 - INFO - __main__ - [Dataset] use_thumbnail: True
04/25/2025 07:57:25 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6
04/25/2025 07:57:25 - INFO - __main__ - Formatting inputs...Skip in lazy mode
04/25/2025 07:57:25 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 3
eval_dataset
eval_dataset
eval_dataset
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275
trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ -
vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 
- INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight 
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - 
INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight 
04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - 
__main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - 
__main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ 
- language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ 
- language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - 
__main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO 
- __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO 
- __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/25/2025 
07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 
04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - 
__main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - 
INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight training_args 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/25/2025 
07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 
TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=2, log_level=passive, log_level_replica=warning, log_on_each_node=True, 
logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-57-17_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, )
04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - 
__main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO 
- __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - 
INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight training_args 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - 
INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, 
batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=1, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-57-17_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, 
optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 
04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - 
language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/25/2025 07:57:26 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 
'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr25_07-57-17_amax, logging_first_step=False, 
logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [INFO|trainer.py:741] 2025-04-25 07:57:26,197 >> Using auto half precision backend [WARNING|trainer.py:803] 2025-04-25 07:57:26,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:57:26,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:57:26,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:57:26,397 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:57:26,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-25 07:57:26,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [2025-04-25 07:57:26,431] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-25 07:57:26,431] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 3 Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... [2025-04-25 07:57:32,749] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Detected CUDA files, patching ldflags Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja... Building extension module fused_adam... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. Loading extension module fused_adam... Time to load fused_adam op: 0.5859417915344238 seconds Loading extension module fused_adam... Time to load fused_adam op: 0.6033613681793213 seconds Loading extension module fused_adam... 
Time to load fused_adam op: 0.6033060550689697 seconds [2025-04-25 07:57:33,803] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-25 07:57:33,803] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-25 07:57:33,895] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-25 07:57:33,895] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-25 07:57:33,895] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-25 07:57:33,895] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-25 07:57:33,895] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-25 07:57:33,895] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-25 07:57:33,895] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-25 07:57:34,310] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-25 07:57:34,310] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.08 GB CA 18.33 GB Max_CA 18 GB [2025-04-25 07:57:34,311] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.16 GB, percent = 10.8% [2025-04-25 07:57:34,484] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-25 07:57:34,485] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.11 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:57:34,485] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.15 GB, percent = 10.8% [2025-04-25 07:57:34,485] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-25 07:57:34,656] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO optimizer [2025-04-25 07:57:34,657] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 
18.05 GB CA 18.39 GB Max_CA 18 GB [2025-04-25 07:57:34,657] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 27.26 GB, percent = 10.8% [2025-04-25 07:57:34,663] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-25 07:57:34,663] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-25 07:57:34,663] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-25 07:57:34,664] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-25 07:57:34,674] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-25 07:57:34,674] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] comms_config ................. [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] dump_state ................... 
False
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False
[2025-04-25 07:57:34,675] [INFO] [config.py:1003:print] elasticity_enabled ........... False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] flops_profiler_config ........ {
    "enabled": false,
    "recompute_fwd_factor": 0.0,
    "profile_step": 1,
    "module_depth": -1,
    "top_modules": 1,
    "detailed": true,
    "output_file": null
}
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] fp16_auto_cast ............... None
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] fp16_enabled ................. False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] global_rank .................. 0
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] grad_accum_dtype ............. None
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] graph_harvesting ............. False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] load_universal_checkpoint .... False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] loss_scale ................... 1.0
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] memory_breakdown ............. False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] mics_shard_size .............. -1
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName')
[2025-04-25 07:57:34,676] [INFO] [config.py:1003:print] nebula_config ................ {
    "enabled": false,
    "persistent_storage_path": null,
    "persistent_time_interval": 100,
    "num_of_version_in_retention": 2,
    "enable_nebula_load": true,
    "load_path": null
}
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] optimizer_name ............... adamw
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01}
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True}
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] pld_enabled .................. False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] pld_params ................... False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] prescale_gradients ........... False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] scheduler_name ............... None
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] scheduler_params ............. None
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] sparse_attention ............. None
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] steps_per_print .............. inf
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] train_batch_size ............. 12
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] use_node_local_storage ....... False
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] weight_quantization_config ... None
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] world_size ................... 3
[2025-04-25 07:57:34,677] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False
[2025-04-25 07:57:34,678] [INFO] [config.py:1003:print] zero_config .................. stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True
[2025-04-25 07:57:34,678] [INFO] [config.py:1003:print] zero_enabled ................. True
[2025-04-25 07:57:34,678] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True
[2025-04-25 07:57:34,678] [INFO] [config.py:1003:print] zero_optimization_stage ...... 1
[2025-04-25 07:57:34,678] [INFO] [config.py:989:print_user_config] json = {
    "zero_optimization": {
        "stage": 1,
        "allgather_partitions": true,
        "allgather_bucket_size": 1.000000e+09,
        "overlap_comm": true,
        "reduce_scatter": true,
        "reduce_bucket_size": 1.000000e+09,
        "contiguous_gradients": true
    },
    "fp16": {
        "enabled": false,
        "auto_cast": true,
        "loss_scale": 0,
        "initial_scale_power": 32,
        "loss_scale_window": 1000,
        "hysteresis": 2,
        "min_loss_scale": 1
    },
    "bf16": {
        "enabled": true
    },
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 4e-05,
            "betas": [0.9, 0.999],
            "eps": 1e-08,
            "weight_decay": 0.01
        }
    },
    "gradient_accumulation_steps": 1,
    "gradient_clipping": 1.0,
    "steps_per_print": inf,
    "train_batch_size": 12,
    "train_micro_batch_size_per_gpu": 4,
    "wall_clock_breakdown": true
}
[INFO|trainer.py:2369] 2025-04-25 07:57:34,679 >> ***** Running training *****
[INFO|trainer.py:2370] 2025-04-25 07:57:34,679 >> Num examples = 49,500
[INFO|trainer.py:2371] 2025-04-25 07:57:34,679 >> Num Epochs = 10
[INFO|trainer.py:2372] 2025-04-25 07:57:34,679 >> Instantaneous batch size per device = 4
[INFO|trainer.py:2375] 2025-04-25 07:57:34,679 >> Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:2376] 2025-04-25 07:57:34,679 >> Gradient Accumulation steps = 1
[INFO|trainer.py:2377] 2025-04-25 07:57:34,679 >> Total optimization steps = 41,250
[INFO|trainer.py:2378] 2025-04-25 07:57:34,686 >> Number of trainable parameters = 52,297,728
0%| | 0/41250 [00:00> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
[WARNING|trainer.py:803] 2025-04-25 17:54:34,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
[WARNING|trainer.py:803] 2025-04-25 17:54:34,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
2 2 2
[WARNING|trainer.py:803] 2025-04-25 17:54:36,660 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-25 17:54:36,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-25 17:54:36,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 3 3 [WARNING|trainer.py:803] 2025-04-25 17:54:39,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 New best accuracy: 0.3333333333333333. Saving model... [WARNING|trainer.py:803] 2025-04-25 17:54:39,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 New best accuracy: 0.3333333333333333. Saving model... [WARNING|trainer.py:803] 2025-04-25 17:54:39,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 New best accuracy: 0.3333333333333333. Saving model... [INFO|trainer.py:3910] 2025-04-25 17:54:51,084 >> Saving model checkpoint to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast [INFO|configuration_utils.py:420] 2025-04-25 17:54:51,089 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/config.json [INFO|configuration_utils.py:909] 2025-04-25 17:54:51,090 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/generation_config.json evaluate! evaluate! 1 1 [WARNING|trainer.py:803] 2025-04-25 17:54:56,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-25 17:54:56,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2 2 [WARNING|trainer.py:803] 2025-04-25 17:54:59,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-25 17:54:59,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 3 [WARNING|trainer.py:803] 2025-04-25 17:55:01,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 [WARNING|trainer.py:803] 2025-04-25 17:55:01,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 [2025-04-25 17:55:03,957] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 17:55:03,970] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:55:09,491] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 17:55:09,494] [INFO] 
[real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:55:15,047] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 17:55:15,288] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:55:20,839] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-25 17:55:21,081] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} 
is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [INFO|modeling_utils.py:2996] 2025-04-25 17:55:46,649 >> The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 4 checkpoint shards. You can find where each parameters has been saved in the index located at /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/model.safetensors.index.json. [INFO|tokenization_utils_base.py:2491] 2025-04-25 17:55:46,652 >> tokenizer config file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/tokenizer_config.json [INFO|tokenization_utils_base.py:2500] 2025-04-25 17:55:46,652 >> Special tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/special_tokens_map.json [INFO|tokenization_utils_base.py:2553] 2025-04-25 17:55:46,653 >> added tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/added_tokens.json 04/25/2025 17:55:48 - INFO - __main__ - Saved LoRA weights to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/lora_weights.pth evaluate! 1 [WARNING|trainer.py:803] 2025-04-25 17:55:51,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2 [WARNING|trainer.py:803] 2025-04-25 17:55:54,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 [WARNING|trainer.py:803] 2025-04-25 17:55:57,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 [2025-04-25 17:55:59,769] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:56:05,873] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:56:11,469] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:56:17,177] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-25 17:56:33,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.09 | optimizer_step: 1.12 [2025-04-25 17:56:33,332] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.16 | bwd_microstep: 5665.04 | bwd_inner_microstep: 5652.11 | bwd_allreduce_microstep: 12.88 | step_microstep: 20.39 [2025-04-25 17:56:33,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.13 | bwd: 5665.05 | bwd_inner: 5652.11 | bwd_allreduce: 12.90 | step: 20.39 10%|█ | 4126/41250 [9:58:58<444:27:29, 43.10s/it] {'loss': 0.0967, 'grad_norm': 1.5433621406555176, 'learning_rate': 3.948802125410715e-05, 'epoch': 1.0} 10%|█ | 4126/41250 [9:58:58<444:27:29, 43.10s/it][2025-04-25 17:56:41,866] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.11 | optimizer_step: 1.10 [2025-04-25 17:56:41,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2802.50 | bwd_microstep: 5644.42 | bwd_inner_microstep: 5609.47 | bwd_allreduce_microstep: 34.89 | step_microstep: 20.05 [2025-04-25 17:56:41,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2802.50 | bwd: 5644.44 | bwd_inner: 5609.47 | bwd_allreduce: 34.92 | step: 20.05 10%|█ | 4127/41250 [9:59:07<337:30:54, 32.73s/it] {'loss': 0.1664, 'grad_norm': 1.3101110458374023, 'learning_rate': 3.9487668158630396e-05, 'epoch': 1.0} 10%|█ | 4127/41250 [9:59:07<337:30:54, 32.73s/it][2025-04-25 17:56:50,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.08 | optimizer_step: 1.29 [2025-04-25 17:56:50,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.50 | bwd_microstep: 5715.47 | bwd_inner_microstep: 5672.05 | bwd_allreduce_microstep: 43.37 | step_microstep: 20.24 [2025-04-25 17:56:50,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.50 | bwd: 5715.49 | bwd_inner: 5672.05 | bwd_allreduce: 43.39 | step: 20.24 10%|█ | 4128/41250 [9:59:15<262:59:08, 25.50s/it] {'loss': 0.1747, 'grad_norm': 1.3050565719604492, 'learning_rate': 3.9487314943016006e-05, 'epoch': 1.0} 10%|█ | 
4128/41250 [9:59:15<262:59:08, 25.50s/it][2025-04-25 17:56:59,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.07 | optimizer_step: 1.10 [2025-04-25 17:56:59,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.50 | bwd_microstep: 5694.06 | bwd_inner_microstep: 5651.10 | bwd_allreduce_microstep: 42.91 | step_microstep: 19.92 [2025-04-25 17:56:59,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.50 | bwd: 5694.08 | bwd_inner: 5651.10 | bwd_allreduce: 42.94 | step: 19.92 10%|█ | 4129/41250 [9:59:24<210:45:20, 20.44s/it] {'loss': 0.1765, 'grad_norm': 1.246640682220459, 'learning_rate': 3.9486961607266146e-05, 'epoch': 1.0} 10%|█ | 4129/41250 [9:59:24<210:45:20, 20.44s/it][2025-04-25 17:57:07,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 1.05 [2025-04-25 17:57:07,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2814.24 | bwd_microstep: 5656.06 | bwd_inner_microstep: 5634.02 | bwd_allreduce_microstep: 21.98 | step_microstep: 19.71 [2025-04-25 17:57:07,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2814.24 | bwd: 5656.07 | bwd_inner: 5634.02 | bwd_allreduce: 22.01 | step: 19.71 10%|█ | 4130/41250 [9:59:33<174:00:18, 16.88s/it] {'loss': 0.1559, 'grad_norm': 1.3961122035980225, 'learning_rate': 3.9486608151383007e-05, 'epoch': 1.0} 10%|█ | 4130/41250 [9:59:33<174:00:18, 16.88s/it][2025-04-25 17:57:16,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 17:57:16,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.87 | bwd_microstep: 5729.87 | bwd_inner_microstep: 5630.09 | bwd_allreduce_microstep: 99.73 | step_microstep: 18.97 [2025-04-25 17:57:16,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.87 | bwd: 
5729.88 | bwd_inner: 5630.09 | bwd_allreduce: 99.75 | step: 18.97 10%|█ | 4131/41250 [9:59:41<148:29:59, 14.40s/it] {'loss': 0.1372, 'grad_norm': 1.2956308126449585, 'learning_rate': 3.9486254575368755e-05, 'epoch': 1.0} 10%|█ | 4131/41250 [9:59:41<148:29:59, 14.40s/it][2025-04-25 17:57:24,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 1.01 [2025-04-25 17:57:24,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.61 | bwd_microstep: 5692.23 | bwd_inner_microstep: 5636.88 | bwd_allreduce_microstep: 55.31 | step_microstep: 18.94 [2025-04-25 17:57:24,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.61 | bwd: 5692.25 | bwd_inner: 5636.88 | bwd_allreduce: 55.33 | step: 18.94 10%|█ | 4132/41250 [9:59:50<130:32:00, 12.66s/it] {'loss': 0.2273, 'grad_norm': 7.36805534362793, 'learning_rate': 3.948590087922558e-05, 'epoch': 1.0} 10%|█ | 4132/41250 [9:59:50<130:32:00, 12.66s/it][2025-04-25 17:57:33,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 17:57:33,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.21 | bwd_microstep: 5735.57 | bwd_inner_microstep: 5670.08 | bwd_allreduce_microstep: 65.45 | step_microstep: 18.74 [2025-04-25 17:57:33,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.21 | bwd: 5735.58 | bwd_inner: 5670.08 | bwd_allreduce: 65.46 | step: 18.75 10%|█ | 4133/41250 [9:59:58<118:10:19, 11.46s/it] {'loss': 0.114, 'grad_norm': 0.7711699604988098, 'learning_rate': 3.948554706295565e-05, 'epoch': 1.0} 10%|█ | 4133/41250 [9:59:58<118:10:19, 11.46s/it][2025-04-25 17:57:42,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 17:57:42,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2837.70 | bwd_microstep: 5882.06 | bwd_inner_microstep: 5669.07 | bwd_allreduce_microstep: 212.94 | step_microstep: 18.56 [2025-04-25 17:57:42,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.70 | bwd: 5882.07 | bwd_inner: 5669.07 | bwd_allreduce: 212.95 | step: 18.57 10%|█ | 4134/41250 [10:00:07<109:56:41, 10.66s/it] {'loss': 0.048, 'grad_norm': 0.8537027835845947, 'learning_rate': 3.948519312656116e-05, 'epoch': 1.0} 10%|█ | 4134/41250 [10:00:07<109:56:41, 10.66s/it][2025-04-25 17:57:51,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-25 17:57:51,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.33 | bwd_microstep: 5688.03 | bwd_inner_microstep: 5671.93 | bwd_allreduce_microstep: 16.05 | step_microstep: 19.42 [2025-04-25 17:57:51,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.33 | bwd: 5688.04 | bwd_inner: 5671.93 | bwd_allreduce: 16.07 | step: 19.42 10%|█ | 4135/41250 [10:00:16<103:36:24, 10.05s/it] {'loss': 0.108, 'grad_norm': 2.1665878295898438, 'learning_rate': 3.94848390700443e-05, 'epoch': 1.0} 10%|█ | 4135/41250 [10:00:16<103:36:24, 10.05s/it][2025-04-25 17:57:59,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.04 | optimizer_step: 1.05 [2025-04-25 17:57:59,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.00 | bwd_microstep: 5745.48 | bwd_inner_microstep: 5637.94 | bwd_allreduce_microstep: 107.49 | step_microstep: 19.38 [2025-04-25 17:57:59,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.00 | bwd: 5745.50 | bwd_inner: 5637.94 | bwd_allreduce: 107.51 | step: 19.38 10%|█ | 4136/41250 [10:00:24<99:18:35, 9.63s/it] {'loss': 0.2206, 'grad_norm': 1.2619880437850952, 'learning_rate': 3.948448489340722e-05, 'epoch': 1.0} 10%|█ | 4136/41250 [10:00:24<99:18:35, 
9.63s/it][2025-04-25 17:58:08,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 17:58:08,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.52 | bwd_microstep: 5757.14 | bwd_inner_microstep: 5670.63 | bwd_allreduce_microstep: 86.46 | step_microstep: 18.84 [2025-04-25 17:58:08,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.52 | bwd: 5757.16 | bwd_inner: 5670.63 | bwd_allreduce: 86.48 | step: 18.85 10%|█ | 4137/41250 [10:00:33<96:21:40, 9.35s/it] {'loss': 0.232, 'grad_norm': 0.8654947280883789, 'learning_rate': 3.948413059665214e-05, 'epoch': 1.0} 10%|█ | 4137/41250 [10:00:33<96:21:40, 9.35s/it][2025-04-25 17:58:16,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 17:58:16,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.18 | bwd_microstep: 5692.04 | bwd_inner_microstep: 5633.00 | bwd_allreduce_microstep: 58.99 | step_microstep: 19.02 [2025-04-25 17:58:16,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.18 | bwd: 5692.05 | bwd_inner: 5633.00 | bwd_allreduce: 59.01 | step: 19.02 10%|█ | 4138/41250 [10:00:42<94:02:53, 9.12s/it] {'loss': 0.0877, 'grad_norm': 0.8022608160972595, 'learning_rate': 3.9483776179781216e-05, 'epoch': 1.0} 10%|█ | 4138/41250 [10:00:42<94:02:53, 9.12s/it][2025-04-25 17:58:25,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.00 [2025-04-25 17:58:25,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.12 | bwd_microstep: 5743.95 | bwd_inner_microstep: 5692.64 | bwd_allreduce_microstep: 51.26 | step_microstep: 19.34 [2025-04-25 17:58:25,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.12 | bwd: 5743.96 | bwd_inner: 5692.64 | 
bwd_allreduce: 51.28 | step: 19.34 10%|█ | 4139/41250 [10:00:50<92:39:58, 8.99s/it] {'loss': 0.1978, 'grad_norm': 1.5828359127044678, 'learning_rate': 3.948342164279665e-05, 'epoch': 1.0} 10%|█ | 4139/41250 [10:00:50<92:39:58, 8.99s/it][2025-04-25 17:58:34,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 17:58:34,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.34 | bwd_microstep: 5709.21 | bwd_inner_microstep: 5689.77 | bwd_allreduce_microstep: 19.39 | step_microstep: 19.05 [2025-04-25 17:58:34,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.34 | bwd: 5709.23 | bwd_inner: 5689.77 | bwd_allreduce: 19.41 | step: 19.05 10%|█ | 4140/41250 [10:00:59<91:35:41, 8.89s/it] {'loss': 0.1855, 'grad_norm': 1.5996220111846924, 'learning_rate': 3.948306698570061e-05, 'epoch': 1.0} 10%|█ | 4140/41250 [10:00:59<91:35:41, 8.89s/it][2025-04-25 17:58:42,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-25 17:58:42,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.45 | bwd_microstep: 5697.32 | bwd_inner_microstep: 5684.31 | bwd_allreduce_microstep: 12.96 | step_microstep: 19.17 [2025-04-25 17:58:42,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.45 | bwd: 5697.33 | bwd_inner: 5684.31 | bwd_allreduce: 12.98 | step: 19.18 10%|█ | 4141/41250 [10:01:08<90:46:11, 8.81s/it] {'loss': 0.1522, 'grad_norm': 1.473616600036621, 'learning_rate': 3.948271220849531e-05, 'epoch': 1.0} 10%|█ | 4141/41250 [10:01:08<90:46:11, 8.81s/it][2025-04-25 17:58:51,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 17:58:51,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.60 | bwd_microstep: 5678.18 
| bwd_inner_microstep: 5655.33 | bwd_allreduce_microstep: 22.79 | step_microstep: 19.18 [2025-04-25 17:58:51,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.60 | bwd: 5678.19 | bwd_inner: 5655.33 | bwd_allreduce: 22.82 | step: 19.18 10%|█ | 4142/41250 [10:01:16<90:06:24, 8.74s/it] {'loss': 0.1321, 'grad_norm': 1.1322617530822754, 'learning_rate': 3.948235731118291e-05, 'epoch': 1.0} 10%|█ | 4142/41250 [10:01:16<90:06:24, 8.74s/it][2025-04-25 17:59:00,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-25 17:59:00,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2931.99 | bwd_microstep: 5874.40 | bwd_inner_microstep: 5861.78 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.79 [2025-04-25 17:59:00,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2931.99 | bwd: 5874.41 | bwd_inner: 5861.78 | bwd_allreduce: 12.60 | step: 18.79 10%|█ | 4143/41250 [10:01:25<90:33:36, 8.79s/it] {'loss': 0.1587, 'grad_norm': 1.1864161491394043, 'learning_rate': 3.948200229376561e-05, 'epoch': 1.0} 10%|█ | 4143/41250 [10:01:25<90:33:36, 8.79s/it][2025-04-25 17:59:09,048] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.22 | optimizer_step: 0.93 [2025-04-25 17:59:09,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.95 | bwd_microstep: 5747.84 | bwd_inner_microstep: 5686.30 | bwd_allreduce_microstep: 61.49 | step_microstep: 19.25 [2025-04-25 17:59:09,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.95 | bwd: 5747.85 | bwd_inner: 5686.30 | bwd_allreduce: 61.51 | step: 19.25 10%|█ | 4144/41250 [10:01:34<90:14:15, 8.75s/it] {'loss': 0.1562, 'grad_norm': 2.6250088214874268, 'learning_rate': 3.948164715624561e-05, 'epoch': 1.0} 10%|█ | 4144/41250 [10:01:34<90:14:15, 8.75s/it][2025-04-25 17:59:17,705] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 17:59:17,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.77 | bwd_microstep: 5736.41 | bwd_inner_microstep: 5659.23 | bwd_allreduce_microstep: 77.14 | step_microstep: 18.82 [2025-04-25 17:59:17,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.77 | bwd: 5736.43 | bwd_inner: 5659.23 | bwd_allreduce: 77.15 | step: 18.83 10%|█ | 4145/41250 [10:01:43<89:56:00, 8.73s/it] {'loss': 0.1122, 'grad_norm': 0.9504488110542297, 'learning_rate': 3.948129189862507e-05, 'epoch': 1.0} 10%|█ | 4145/41250 [10:01:43<89:56:00, 8.73s/it][2025-04-25 17:59:26,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 17:59:26,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.50 | bwd_microstep: 5884.71 | bwd_inner_microstep: 5653.80 | bwd_allreduce_microstep: 230.87 | step_microstep: 18.67 [2025-04-25 17:59:26,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.50 | bwd: 5884.73 | bwd_inner: 5653.80 | bwd_allreduce: 230.89 | step: 18.68 10%|█ | 4146/41250 [10:01:51<90:08:30, 8.75s/it] {'loss': 0.1389, 'grad_norm': 1.6825302839279175, 'learning_rate': 3.948093652090621e-05, 'epoch': 1.01} 10%|█ | 4146/41250 [10:01:51<90:08:30, 8.75s/it][2025-04-25 17:59:35,186] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 17:59:35,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.47 | bwd_microstep: 5768.28 | bwd_inner_microstep: 5658.77 | bwd_allreduce_microstep: 109.46 | step_microstep: 19.27 [2025-04-25 17:59:35,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.47 | bwd: 5768.29 | bwd_inner: 5658.77 | bwd_allreduce: 109.48 | step: 19.27 10%|█ | 4147/41250 [10:02:00<89:57:45, 
8.73s/it] {'loss': 0.2164, 'grad_norm': 1.5684460401535034, 'learning_rate': 3.94805810230912e-05, 'epoch': 1.01} 10%|█ | 4147/41250 [10:02:00<89:57:45, 8.73s/it][2025-04-25 17:59:43,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.47 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 17:59:43,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.07 | bwd_microstep: 5796.78 | bwd_inner_microstep: 5658.14 | bwd_allreduce_microstep: 138.60 | step_microstep: 20.32 [2025-04-25 17:59:43,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.07 | bwd: 5796.80 | bwd_inner: 5658.14 | bwd_allreduce: 138.62 | step: 20.32 10%|█ | 4148/41250 [10:02:09<89:53:32, 8.72s/it] {'loss': 0.0538, 'grad_norm': 0.6008443832397461, 'learning_rate': 3.9480225405182246e-05, 'epoch': 1.01} 10%|█ | 4148/41250 [10:02:09<89:53:32, 8.72s/it][2025-04-25 17:59:52,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 17:59:52,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.79 | bwd_microstep: 5789.71 | bwd_inner_microstep: 5670.08 | bwd_allreduce_microstep: 119.58 | step_microstep: 18.53 [2025-04-25 17:59:52,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.79 | bwd: 5789.72 | bwd_inner: 5670.08 | bwd_allreduce: 119.60 | step: 18.53 10%|█ | 4149/41250 [10:02:17<89:50:02, 8.72s/it] {'loss': 0.0956, 'grad_norm': 0.7498725652694702, 'learning_rate': 3.947986966718153e-05, 'epoch': 1.01} 10%|█ | 4149/41250 [10:02:17<89:50:02, 8.72s/it][2025-04-25 18:00:01,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:00:01,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.28 | bwd_microstep: 5711.34 | bwd_inner_microstep: 5639.66 | bwd_allreduce_microstep: 71.63 | 
step_microstep: 19.23 [2025-04-25 18:00:01,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.28 | bwd: 5711.35 | bwd_inner: 5639.66 | bwd_allreduce: 71.65 | step: 19.23 10%|█ | 4150/41250 [10:02:26<89:32:09, 8.69s/it] {'loss': 0.0863, 'grad_norm': 0.7300331592559814, 'learning_rate': 3.9479513809091254e-05, 'epoch': 1.01} 10%|█ | 4150/41250 [10:02:26<89:32:09, 8.69s/it][2025-04-25 18:00:09,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.03 | optimizer_step: 1.13 [2025-04-25 18:00:09,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.90 | bwd_microstep: 5769.63 | bwd_inner_microstep: 5729.90 | bwd_allreduce_microstep: 39.67 | step_microstep: 19.31 [2025-04-25 18:00:09,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.90 | bwd: 5769.64 | bwd_inner: 5729.90 | bwd_allreduce: 39.70 | step: 19.31 10%|█ | 4151/41250 [10:02:35<89:36:21, 8.70s/it] {'loss': 0.1633, 'grad_norm': 1.2381879091262817, 'learning_rate': 3.94791578309136e-05, 'epoch': 1.01} 10%|█ | 4151/41250 [10:02:35<89:36:21, 8.70s/it][2025-04-25 18:00:18,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.94 [2025-04-25 18:00:18,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.41 | bwd_microstep: 5705.72 | bwd_inner_microstep: 5653.16 | bwd_allreduce_microstep: 52.51 | step_microstep: 19.03 [2025-04-25 18:00:18,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.41 | bwd: 5705.73 | bwd_inner: 5653.16 | bwd_allreduce: 52.52 | step: 19.03 10%|█ | 4152/41250 [10:02:43<89:22:13, 8.67s/it] {'loss': 0.1048, 'grad_norm': 1.0615270137786865, 'learning_rate': 3.947880173265077e-05, 'epoch': 1.01} 10%|█ | 4152/41250 [10:02:43<89:22:13, 8.67s/it][2025-04-25 18:00:27,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | 
optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:00:27,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.08 | bwd_microstep: 5773.40 | bwd_inner_microstep: 5662.34 | bwd_allreduce_microstep: 111.01 | step_microstep: 18.73 [2025-04-25 18:00:27,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.08 | bwd: 5773.41 | bwd_inner: 5662.34 | bwd_allreduce: 111.03 | step: 18.73 10%|█ | 4153/41250 [10:02:52<89:23:53, 8.68s/it] {'loss': 0.0781, 'grad_norm': 0.7601968050003052, 'learning_rate': 3.947844551430496e-05, 'epoch': 1.01} 10%|█ | 4153/41250 [10:02:52<89:23:53, 8.68s/it][2025-04-25 18:00:35,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:00:35,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.14 | bwd_microstep: 5751.74 | bwd_inner_microstep: 5665.90 | bwd_allreduce_microstep: 85.79 | step_microstep: 18.61 [2025-04-25 18:00:35,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.14 | bwd: 5751.76 | bwd_inner: 5665.90 | bwd_allreduce: 85.81 | step: 18.62 10%|█ | 4154/41250 [10:03:01<89:23:35, 8.68s/it] {'loss': 0.2023, 'grad_norm': 1.0136252641677856, 'learning_rate': 3.947808917587837e-05, 'epoch': 1.01} 10%|█ | 4154/41250 [10:03:01<89:23:35, 8.68s/it][2025-04-25 18:00:44,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.06 | optimizer_step: 1.24 [2025-04-25 18:00:44,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.61 | bwd_microstep: 5755.21 | bwd_inner_microstep: 5711.59 | bwd_allreduce_microstep: 43.57 | step_microstep: 19.74 [2025-04-25 18:00:44,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.61 | bwd: 5755.22 | bwd_inner: 5711.59 | bwd_allreduce: 43.59 | step: 19.74 10%|█ | 4155/41250 [10:03:09<89:27:34, 8.68s/it] {'loss': 0.1403, 'grad_norm': 
1.389998435974121, 'learning_rate': 3.9477732717373186e-05, 'epoch': 1.01} 10%|█ | 4155/41250 [10:03:09<89:27:34, 8.68s/it][2025-04-25 18:00:53,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.76 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:00:53,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.90 | bwd_microstep: 5733.90 | bwd_inner_microstep: 5708.63 | bwd_allreduce_microstep: 25.23 | step_microstep: 19.88 [2025-04-25 18:00:53,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.90 | bwd: 5733.92 | bwd_inner: 5708.63 | bwd_allreduce: 25.25 | step: 19.89 10%|█ | 4156/41250 [10:03:18<89:25:32, 8.68s/it] {'loss': 0.1432, 'grad_norm': 1.5966302156448364, 'learning_rate': 3.947737613879161e-05, 'epoch': 1.01} 10%|█ | 4156/41250 [10:03:18<89:25:32, 8.68s/it][2025-04-25 18:01:01,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.26 | optimizer_step: 0.90 [2025-04-25 18:01:01,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.90 | bwd_microstep: 5784.57 | bwd_inner_microstep: 5673.09 | bwd_allreduce_microstep: 111.43 | step_microstep: 19.34 [2025-04-25 18:01:01,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.90 | bwd: 5784.59 | bwd_inner: 5673.09 | bwd_allreduce: 111.45 | step: 19.35 10%|█ | 4157/41250 [10:03:27<89:31:21, 8.69s/it] {'loss': 0.1382, 'grad_norm': 1.4558185338974, 'learning_rate': 3.947701944013584e-05, 'epoch': 1.01} 10%|█ | 4157/41250 [10:03:27<89:31:21, 8.69s/it][2025-04-25 18:01:10,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.26 | optimizer_step: 0.98 [2025-04-25 18:01:10,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2941.93 | bwd_microstep: 5903.17 | bwd_inner_microstep: 5890.14 | bwd_allreduce_microstep: 12.97 | step_microstep: 19.51 [2025-04-25 
18:01:10,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2941.93 | bwd: 5903.19 | bwd_inner: 5890.14 | bwd_allreduce: 13.00 | step: 19.51 10%|█ | 4158/41250 [10:03:36<90:16:37, 8.76s/it] {'loss': 0.2006, 'grad_norm': 2.0930919647216797, 'learning_rate': 3.947666262140809e-05, 'epoch': 1.01} 10%|█ | 4158/41250 [10:03:36<90:16:37, 8.76s/it][2025-04-25 18:01:19,601] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 18:01:19,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.17 | bwd_microstep: 5764.09 | bwd_inner_microstep: 5667.65 | bwd_allreduce_microstep: 96.39 | step_microstep: 18.87 [2025-04-25 18:01:19,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.17 | bwd: 5764.10 | bwd_inner: 5667.65 | bwd_allreduce: 96.41 | step: 18.87 10%|█ | 4159/41250 [10:03:44<90:01:16, 8.74s/it] {'loss': 0.2323, 'grad_norm': 1.9333583116531372, 'learning_rate': 3.947630568261053e-05, 'epoch': 1.01} 10%|█ | 4159/41250 [10:03:44<90:01:16, 8.74s/it][2025-04-25 18:01:28,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.06 | optimizer_step: 0.98 [2025-04-25 18:01:28,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.10 | bwd_microstep: 5754.26 | bwd_inner_microstep: 5670.74 | bwd_allreduce_microstep: 83.47 | step_microstep: 19.28 [2025-04-25 18:01:28,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.10 | bwd: 5754.27 | bwd_inner: 5670.74 | bwd_allreduce: 83.49 | step: 19.28 10%|█ | 4160/41250 [10:03:53<89:48:55, 8.72s/it] {'loss': 0.1946, 'grad_norm': 1.4183326959609985, 'learning_rate': 3.9475948623745376e-05, 'epoch': 1.01} 10%|█ | 4160/41250 [10:03:53<89:48:55, 8.72s/it][2025-04-25 18:01:37,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 1.05 
[2025-04-25 18:01:37,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.29 | bwd_microstep: 5802.46 | bwd_inner_microstep: 5669.47 | bwd_allreduce_microstep: 132.95 | step_microstep: 18.58 [2025-04-25 18:01:37,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.29 | bwd: 5802.47 | bwd_inner: 5669.47 | bwd_allreduce: 132.96 | step: 18.58 10%|█ | 4161/41250 [10:04:02<89:50:33, 8.72s/it] {'loss': 0.0824, 'grad_norm': 0.9515659809112549, 'learning_rate': 3.947559144481484e-05, 'epoch': 1.01} 10%|█ | 4161/41250 [10:04:02<89:50:33, 8.72s/it][2025-04-25 18:01:45,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 18:01:45,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.33 | bwd_microstep: 5770.79 | bwd_inner_microstep: 5657.02 | bwd_allreduce_microstep: 113.73 | step_microstep: 18.80 [2025-04-25 18:01:45,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.33 | bwd: 5770.80 | bwd_inner: 5657.02 | bwd_allreduce: 113.74 | step: 18.80 10%|█ | 4162/41250 [10:04:11<89:43:22, 8.71s/it] {'loss': 0.0128, 'grad_norm': 0.12015310674905777, 'learning_rate': 3.9475234145821104e-05, 'epoch': 1.01} 10%|█ | 4162/41250 [10:04:11<89:43:22, 8.71s/it][2025-04-25 18:01:54,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 18:01:54,355] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.72 | bwd_microstep: 5732.12 | bwd_inner_microstep: 5712.94 | bwd_allreduce_microstep: 19.13 | step_microstep: 18.97 [2025-04-25 18:01:54,355] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.73 | bwd: 5732.13 | bwd_inner: 5712.94 | bwd_allreduce: 19.15 | step: 18.97 10%|█ | 4163/41250 [10:04:19<89:36:31, 8.70s/it] {'loss': 0.2137, 'grad_norm': 2.2988533973693848, 'learning_rate': 
3.947487672676638e-05, 'epoch': 1.01} 10%|█ | 4163/41250 [10:04:19<89:36:31, 8.70s/it][2025-04-25 18:02:02,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.59 | optimizer_gradients: 0.97 | optimizer_step: 0.91 [2025-04-25 18:02:02,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.91 | bwd_microstep: 5703.46 | bwd_inner_microstep: 5690.76 | bwd_allreduce_microstep: 12.65 | step_microstep: 19.04 [2025-04-25 18:02:02,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.91 | bwd: 5703.47 | bwd_inner: 5690.76 | bwd_allreduce: 12.67 | step: 19.04 10%|█ | 4164/41250 [10:04:28<89:26:04, 8.68s/it] {'loss': 0.1216, 'grad_norm': 1.7952409982681274, 'learning_rate': 3.947451918765288e-05, 'epoch': 1.01} 10%|█ | 4164/41250 [10:04:28<89:26:04, 8.68s/it][2025-04-25 18:02:11,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:02:11,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2895.02 | bwd_microstep: 5802.98 | bwd_inner_microstep: 5790.30 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.11 [2025-04-25 18:02:11,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2895.02 | bwd: 5802.99 | bwd_inner: 5790.30 | bwd_allreduce: 12.65 | step: 18.11 10%|█ | 4165/41250 [10:04:37<89:44:17, 8.71s/it] {'loss': 0.0652, 'grad_norm': 0.9131054282188416, 'learning_rate': 3.94741615284828e-05, 'epoch': 1.01} 10%|█ | 4165/41250 [10:04:37<89:44:17, 8.71s/it][2025-04-25 18:02:20,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-25 18:02:20,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.74 | bwd_microstep: 5703.44 | bwd_inner_microstep: 5663.99 | bwd_allreduce_microstep: 39.40 | step_microstep: 19.26 [2025-04-25 18:02:20,407] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd: 2837.74 | bwd: 5703.46 | bwd_inner: 5663.99 | bwd_allreduce: 39.43 | step: 19.27 10%|█ | 4166/41250 [10:04:45<89:28:38, 8.69s/it] {'loss': 0.0228, 'grad_norm': 0.2111085206270218, 'learning_rate': 3.947380374925834e-05, 'epoch': 1.01} 10%|█ | 4166/41250 [10:04:45<89:28:38, 8.69s/it][2025-04-25 18:02:29,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 18:02:29,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.43 | bwd_microstep: 5706.11 | bwd_inner_microstep: 5693.39 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.72 [2025-04-25 18:02:29,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.43 | bwd: 5706.13 | bwd_inner: 5693.39 | bwd_allreduce: 12.69 | step: 18.72 10%|█ | 4167/41250 [10:04:54<89:19:47, 8.67s/it] {'loss': 0.0831, 'grad_norm': 1.265779972076416, 'learning_rate': 3.94734458499817e-05, 'epoch': 1.01} 10%|█ | 4167/41250 [10:04:54<89:19:47, 8.67s/it][2025-04-25 18:02:37,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 1.17 [2025-04-25 18:02:37,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.35 | bwd_microstep: 5698.71 | bwd_inner_microstep: 5660.32 | bwd_allreduce_microstep: 38.35 | step_microstep: 19.26 [2025-04-25 18:02:37,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.35 | bwd: 5698.73 | bwd_inner: 5660.32 | bwd_allreduce: 38.37 | step: 19.26 10%|█ | 4168/41250 [10:05:02<89:08:26, 8.65s/it] {'loss': 0.1071, 'grad_norm': 1.9594755172729492, 'learning_rate': 3.947308783065511e-05, 'epoch': 1.01} 10%|█ | 4168/41250 [10:05:02<89:08:26, 8.65s/it][2025-04-25 18:02:46,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.23 | optimizer_step: 0.89 [2025-04-25 18:02:46,279] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.47 | bwd_microstep: 5701.33 | bwd_inner_microstep: 5666.73 | bwd_allreduce_microstep: 34.55 | step_microstep: 19.22 [2025-04-25 18:02:46,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.47 | bwd: 5701.34 | bwd_inner: 5666.73 | bwd_allreduce: 34.57 | step: 19.22 10%|█ | 4169/41250 [10:05:11<89:02:18, 8.64s/it] {'loss': 0.2158, 'grad_norm': 1.2009261846542358, 'learning_rate': 3.947272969128076e-05, 'epoch': 1.01} 10%|█ | 4169/41250 [10:05:11<89:02:18, 8.64s/it][2025-04-25 18:02:54,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 18:02:54,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.24 | bwd_microstep: 5755.00 | bwd_inner_microstep: 5698.52 | bwd_allreduce_microstep: 56.44 | step_microstep: 18.51 [2025-04-25 18:02:54,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.24 | bwd: 5755.02 | bwd_inner: 5698.52 | bwd_allreduce: 56.46 | step: 18.51 10%|█ | 4170/41250 [10:05:20<89:09:51, 8.66s/it] {'loss': 0.0888, 'grad_norm': 1.1178159713745117, 'learning_rate': 3.947237143186086e-05, 'epoch': 1.01} 10%|█ | 4170/41250 [10:05:20<89:09:51, 8.66s/it][2025-04-25 18:03:03,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.96 | optimizer_step: 1.00 [2025-04-25 18:03:03,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.67 | bwd_microstep: 5689.27 | bwd_inner_microstep: 5676.67 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.41 [2025-04-25 18:03:03,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.67 | bwd: 5689.28 | bwd_inner: 5676.67 | bwd_allreduce: 12.57 | step: 18.42 10%|█ | 4171/41250 [10:05:28<89:02:26, 8.64s/it] {'loss': 0.218, 'grad_norm': 1.8544337749481201, 'learning_rate': 3.947201305239763e-05, 'epoch': 1.01} 10%|█ | 
4171/41250 [10:05:28<89:02:26, 8.64s/it][2025-04-25 18:03:12,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.29 | optimizer_step: 1.04 [2025-04-25 18:03:12,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.55 | bwd_microstep: 5723.41 | bwd_inner_microstep: 5700.43 | bwd_allreduce_microstep: 22.92 | step_microstep: 19.94 [2025-04-25 18:03:12,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.55 | bwd: 5723.43 | bwd_inner: 5700.43 | bwd_allreduce: 22.95 | step: 19.94 10%|█ | 4172/41250 [10:05:37<89:05:40, 8.65s/it] {'loss': 0.5387, 'grad_norm': 2.278240442276001, 'learning_rate': 3.947165455289325e-05, 'epoch': 1.01} 10%|█ | 4172/41250 [10:05:37<89:05:40, 8.65s/it][2025-04-25 18:03:20,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:03:20,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.04 | bwd_microstep: 5755.79 | bwd_inner_microstep: 5663.28 | bwd_allreduce_microstep: 92.46 | step_microstep: 18.79 [2025-04-25 18:03:20,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.04 | bwd: 5755.81 | bwd_inner: 5663.28 | bwd_allreduce: 92.48 | step: 18.79 10%|█ | 4173/41250 [10:05:46<89:09:36, 8.66s/it] {'loss': 0.1244, 'grad_norm': 1.2545166015625, 'learning_rate': 3.947129593334996e-05, 'epoch': 1.01} 10%|█ | 4173/41250 [10:05:46<89:09:36, 8.66s/it][2025-04-25 18:03:29,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 18:03:29,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.44 | bwd_microstep: 5762.19 | bwd_inner_microstep: 5646.43 | bwd_allreduce_microstep: 115.71 | step_microstep: 18.86 [2025-04-25 18:03:29,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.44 | bwd: 5762.20 
| bwd_inner: 5646.43 | bwd_allreduce: 115.72 | step: 18.87 10%|█ | 4174/41250 [10:05:54<89:12:46, 8.66s/it] {'loss': 0.1622, 'grad_norm': 0.6883653402328491, 'learning_rate': 3.9470937193769964e-05, 'epoch': 1.01} 10%|█ | 4174/41250 [10:05:54<89:12:46, 8.66s/it][2025-04-25 18:03:38,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-25 18:03:38,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.56 | bwd_microstep: 5714.96 | bwd_inner_microstep: 5701.91 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.82 [2025-04-25 18:03:38,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.56 | bwd: 5714.97 | bwd_inner: 5701.91 | bwd_allreduce: 13.02 | step: 18.82 10%|█ | 4175/41250 [10:06:03<89:09:52, 8.66s/it] {'loss': 0.0602, 'grad_norm': 0.5664167404174805, 'learning_rate': 3.947057833415547e-05, 'epoch': 1.01} 10%|█ | 4175/41250 [10:06:03<89:09:52, 8.66s/it][2025-04-25 18:03:46,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.35 | optimizer_step: 1.06 [2025-04-25 18:03:46,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.90 | bwd_microstep: 5704.94 | bwd_inner_microstep: 5690.84 | bwd_allreduce_microstep: 14.03 | step_microstep: 20.40 [2025-04-25 18:03:46,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.91 | bwd: 5704.95 | bwd_inner: 5690.84 | bwd_allreduce: 14.07 | step: 20.40 10%|█ | 4176/41250 [10:06:12<89:06:00, 8.65s/it] {'loss': 0.0768, 'grad_norm': 1.050527811050415, 'learning_rate': 3.9470219354508685e-05, 'epoch': 1.01} 10%|█ | 4176/41250 [10:06:12<89:06:00, 8.65s/it][2025-04-25 18:03:55,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 18:03:55,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2844.93 | bwd_microstep: 5739.69 | bwd_inner_microstep: 5697.43 | bwd_allreduce_microstep: 42.20 | step_microstep: 19.28 [2025-04-25 18:03:55,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.94 | bwd: 5739.70 | bwd_inner: 5697.43 | bwd_allreduce: 42.23 | step: 19.28 10%|█ | 4177/41250 [10:06:20<89:08:51, 8.66s/it] {'loss': 0.0303, 'grad_norm': 0.3459623456001282, 'learning_rate': 3.946986025483183e-05, 'epoch': 1.01} 10%|█ | 4177/41250 [10:06:20<89:08:51, 8.66s/it][2025-04-25 18:04:04,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:04:04,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.30 | bwd_microstep: 5691.18 | bwd_inner_microstep: 5678.32 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.68 [2025-04-25 18:04:04,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.30 | bwd: 5691.20 | bwd_inner: 5678.32 | bwd_allreduce: 12.83 | step: 18.68 10%|█ | 4178/41250 [10:06:29<89:02:43, 8.65s/it] {'loss': 0.2619, 'grad_norm': 2.2571332454681396, 'learning_rate': 3.9469501035127115e-05, 'epoch': 1.01} 10%|█ | 4178/41250 [10:06:29<89:02:43, 8.65s/it][2025-04-25 18:04:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.08 | optimizer_step: 0.95 [2025-04-25 18:04:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.25 | bwd_microstep: 5868.58 | bwd_inner_microstep: 5652.55 | bwd_allreduce_microstep: 215.99 | step_microstep: 19.00 [2025-04-25 18:04:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.25 | bwd: 5868.60 | bwd_inner: 5652.55 | bwd_allreduce: 216.00 | step: 19.01 10%|█ | 4179/41250 [10:06:38<89:27:16, 8.69s/it] {'loss': 0.1018, 'grad_norm': 1.5447839498519897, 'learning_rate': 3.946914169539675e-05, 'epoch': 1.01} 10%|█ | 4179/41250 [10:06:38<89:27:16, 8.69s/it][2025-04-25 18:04:21,593] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:04:21,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.85 | bwd_microstep: 5721.56 | bwd_inner_microstep: 5680.97 | bwd_allreduce_microstep: 40.54 | step_microstep: 18.68 [2025-04-25 18:04:21,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.85 | bwd: 5721.57 | bwd_inner: 5680.97 | bwd_allreduce: 40.56 | step: 18.68 10%|█ | 4180/41250 [10:06:46<89:18:43, 8.67s/it] {'loss': 0.0372, 'grad_norm': 0.8868253827095032, 'learning_rate': 3.946878223564297e-05, 'epoch': 1.01} 10%|█ | 4180/41250 [10:06:46<89:18:43, 8.67s/it][2025-04-25 18:04:30,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.97 | optimizer_step: 1.12 [2025-04-25 18:04:30,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.31 | bwd_microstep: 5696.13 | bwd_inner_microstep: 5683.47 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.74 [2025-04-25 18:04:30,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.31 | bwd: 5696.14 | bwd_inner: 5683.47 | bwd_allreduce: 12.63 | step: 18.74 10%|█ | 4181/41250 [10:06:55<89:09:39, 8.66s/it] {'loss': 0.111, 'grad_norm': 2.586524486541748, 'learning_rate': 3.946842265586797e-05, 'epoch': 1.01} 10%|█ | 4181/41250 [10:06:55<89:09:39, 8.66s/it][2025-04-25 18:04:39,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:04:39,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.73 | bwd_microstep: 5887.51 | bwd_inner_microstep: 5644.71 | bwd_allreduce_microstep: 242.75 | step_microstep: 18.87 [2025-04-25 18:04:39,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.73 | bwd: 5887.52 | bwd_inner: 5644.71 | bwd_allreduce: 242.77 | step: 18.87 
10%|█ | 4182/41250 [10:07:04<89:33:54, 8.70s/it] {'loss': 0.2207, 'grad_norm': 3.264554023742676, 'learning_rate': 3.946806295607397e-05, 'epoch': 1.01} 10%|█ | 4182/41250 [10:07:04<89:33:54, 8.70s/it][2025-04-25 18:04:47,753] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 18:04:47,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.84 | bwd_microstep: 5774.35 | bwd_inner_microstep: 5761.29 | bwd_allreduce_microstep: 13.02 | step_microstep: 18.57 [2025-04-25 18:04:47,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.84 | bwd: 5774.37 | bwd_inner: 5761.29 | bwd_allreduce: 13.03 | step: 18.57 10%|█ | 4183/41250 [10:07:13<89:42:15, 8.71s/it] {'loss': 0.2048, 'grad_norm': 1.818853497505188, 'learning_rate': 3.946770313626319e-05, 'epoch': 1.01} 10%|█ | 4183/41250 [10:07:13<89:42:15, 8.71s/it][2025-04-25 18:04:56,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:04:56,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.97 | bwd_microstep: 5778.90 | bwd_inner_microstep: 5765.90 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.95 [2025-04-25 18:04:56,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.97 | bwd: 5778.91 | bwd_inner: 5765.90 | bwd_allreduce: 12.97 | step: 18.95 10%|█ | 4184/41250 [10:07:21<89:48:38, 8.72s/it] {'loss': 0.0375, 'grad_norm': 0.754041850566864, 'learning_rate': 3.946734319643785e-05, 'epoch': 1.01} 10%|█ | 4184/41250 [10:07:21<89:48:38, 8.72s/it][2025-04-25 18:05:05,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:05:05,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.85 | bwd_microstep: 5784.06 | bwd_inner_microstep: 5771.13 | 
bwd_allreduce_microstep: 12.89 | step_microstep: 18.68 [2025-04-25 18:05:05,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.85 | bwd: 5784.08 | bwd_inner: 5771.13 | bwd_allreduce: 12.91 | step: 18.68 10%|█ | 4185/41250 [10:07:30<89:54:12, 8.73s/it] {'loss': 0.0606, 'grad_norm': 0.6877186894416809, 'learning_rate': 3.946698313660017e-05, 'epoch': 1.01} 10%|█ | 4185/41250 [10:07:30<89:54:12, 8.73s/it][2025-04-25 18:05:13,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:05:13,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.00 | bwd_microstep: 5741.02 | bwd_inner_microstep: 5645.12 | bwd_allreduce_microstep: 95.85 | step_microstep: 19.00 [2025-04-25 18:05:13,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.00 | bwd: 5741.03 | bwd_inner: 5645.12 | bwd_allreduce: 95.87 | step: 19.00 10%|█ | 4186/41250 [10:07:39<89:38:25, 8.71s/it] {'loss': 0.241, 'grad_norm': 2.102073907852173, 'learning_rate': 3.946662295675237e-05, 'epoch': 1.01} 10%|█ | 4186/41250 [10:07:39<89:38:25, 8.71s/it][2025-04-25 18:05:22,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-25 18:05:22,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.62 | bwd_microstep: 5683.49 | bwd_inner_microstep: 5634.05 | bwd_allreduce_microstep: 49.39 | step_microstep: 18.96 [2025-04-25 18:05:22,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.62 | bwd: 5683.51 | bwd_inner: 5634.05 | bwd_allreduce: 49.42 | step: 18.97 10%|█ | 4187/41250 [10:07:47<89:17:49, 8.67s/it] {'loss': 0.1497, 'grad_norm': 1.5247257947921753, 'learning_rate': 3.946626265689667e-05, 'epoch': 1.02} 10%|█ | 4187/41250 [10:07:47<89:17:49, 8.67s/it][2025-04-25 18:05:31,118] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.43 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 18:05:31,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.42 | bwd_microstep: 5695.41 | bwd_inner_microstep: 5682.53 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.88 [2025-04-25 18:05:31,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.42 | bwd: 5695.43 | bwd_inner: 5682.53 | bwd_allreduce: 12.86 | step: 18.89 10%|█ | 4188/41250 [10:07:56<89:07:47, 8.66s/it] {'loss': 0.0561, 'grad_norm': 0.6838136315345764, 'learning_rate': 3.946590223703528e-05, 'epoch': 1.02} 10%|█ | 4188/41250 [10:07:56<89:07:47, 8.66s/it][2025-04-25 18:05:39,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 18:05:39,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.40 | bwd_microstep: 5678.03 | bwd_inner_microstep: 5646.05 | bwd_allreduce_microstep: 31.93 | step_microstep: 18.44 [2025-04-25 18:05:39,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.40 | bwd: 5678.04 | bwd_inner: 5646.05 | bwd_allreduce: 31.95 | step: 18.44 10%|█ | 4189/41250 [10:08:05<88:56:01, 8.64s/it] {'loss': 0.2374, 'grad_norm': 3.00146222114563, 'learning_rate': 3.946554169717044e-05, 'epoch': 1.02} 10%|█ | 4189/41250 [10:08:05<88:56:01, 8.64s/it][2025-04-25 18:05:48,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:05:48,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.41 | bwd_microstep: 5728.99 | bwd_inner_microstep: 5700.08 | bwd_allreduce_microstep: 28.87 | step_microstep: 18.51 [2025-04-25 18:05:48,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.41 | bwd: 5729.01 | bwd_inner: 5700.08 | bwd_allreduce: 28.89 | step: 18.51 10%|█ | 4190/41250 [10:08:13<89:00:48, 8.65s/it] {'loss': 
0.0283, 'grad_norm': 0.8433277010917664, 'learning_rate': 3.9465181037304365e-05, 'epoch': 1.02} 10%|█ | 4190/41250 [10:08:13<89:00:48, 8.65s/it][2025-04-25 18:05:56,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 18:05:56,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.84 | bwd_microstep: 5683.37 | bwd_inner_microstep: 5662.98 | bwd_allreduce_microstep: 20.35 | step_microstep: 19.23 [2025-04-25 18:05:56,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.84 | bwd: 5683.39 | bwd_inner: 5662.98 | bwd_allreduce: 20.37 | step: 19.23 10%|█ | 4191/41250 [10:08:22<88:53:39, 8.64s/it] {'loss': 0.0372, 'grad_norm': 1.0760380029678345, 'learning_rate': 3.946482025743927e-05, 'epoch': 1.02} 10%|█ | 4191/41250 [10:08:22<88:53:39, 8.64s/it][2025-04-25 18:06:05,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 18:06:05,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3084.09 | bwd_microstep: 5652.60 | bwd_inner_microstep: 5639.79 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.87 [2025-04-25 18:06:05,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3084.09 | bwd: 5652.61 | bwd_inner: 5639.79 | bwd_allreduce: 12.78 | step: 18.87 10%|█ | 4192/41250 [10:08:31<89:27:54, 8.69s/it] {'loss': 0.0519, 'grad_norm': 0.9165551066398621, 'learning_rate': 3.9464459357577394e-05, 'epoch': 1.02} 10%|█ | 4192/41250 [10:08:31<89:27:54, 8.69s/it][2025-04-25 18:06:14,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 18:06:14,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.99 | bwd_microstep: 5737.83 | bwd_inner_microstep: 5710.48 | bwd_allreduce_microstep: 27.30 | step_microstep: 
18.93 [2025-04-25 18:06:14,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.99 | bwd: 5737.84 | bwd_inner: 5710.48 | bwd_allreduce: 27.32 | step: 18.93 10%|█ | 4193/41250 [10:08:39<89:24:23, 8.69s/it] {'loss': 0.3023, 'grad_norm': 2.790635108947754, 'learning_rate': 3.9464098337720955e-05, 'epoch': 1.02} 10%|█ | 4193/41250 [10:08:39<89:24:23, 8.69s/it][2025-04-25 18:06:23,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:06:23,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.75 | bwd_microstep: 5753.12 | bwd_inner_microstep: 5664.17 | bwd_allreduce_microstep: 88.90 | step_microstep: 18.73 [2025-04-25 18:06:23,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.75 | bwd: 5753.13 | bwd_inner: 5664.17 | bwd_allreduce: 88.92 | step: 18.74 10%|█ | 4194/41250 [10:08:48<89:19:58, 8.68s/it] {'loss': 0.1288, 'grad_norm': 1.1642423868179321, 'learning_rate': 3.946373719787217e-05, 'epoch': 1.02} 10%|█ | 4194/41250 [10:08:48<89:19:58, 8.68s/it][2025-04-25 18:06:31,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.95 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 18:06:31,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.27 | bwd_microstep: 5767.66 | bwd_inner_microstep: 5685.11 | bwd_allreduce_microstep: 82.51 | step_microstep: 18.46 [2025-04-25 18:06:31,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.27 | bwd: 5767.67 | bwd_inner: 5685.10 | bwd_allreduce: 82.53 | step: 18.47 10%|█ | 4195/41250 [10:08:57<89:26:15, 8.69s/it] {'loss': 0.0491, 'grad_norm': 0.8368769288063049, 'learning_rate': 3.946337593803328e-05, 'epoch': 1.02} 10%|█ | 4195/41250 [10:08:57<89:26:15, 8.69s/it][2025-04-25 18:06:40,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | 
optimizer_step: 0.92 [2025-04-25 18:06:40,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.85 | bwd_microstep: 5699.21 | bwd_inner_microstep: 5686.29 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.04 [2025-04-25 18:06:40,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.85 | bwd: 5699.22 | bwd_inner: 5686.28 | bwd_allreduce: 12.89 | step: 19.04 10%|█ | 4196/41250 [10:09:05<89:17:24, 8.68s/it] {'loss': 0.0779, 'grad_norm': 0.9053632020950317, 'learning_rate': 3.94630145582065e-05, 'epoch': 1.02} 10%|█ | 4196/41250 [10:09:05<89:17:24, 8.68s/it][2025-04-25 18:06:49,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.05 | optimizer_step: 1.13 [2025-04-25 18:06:49,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.74 | bwd_microstep: 5763.10 | bwd_inner_microstep: 5659.86 | bwd_allreduce_microstep: 103.19 | step_microstep: 19.36 [2025-04-25 18:06:49,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.74 | bwd: 5763.11 | bwd_inner: 5659.86 | bwd_allreduce: 103.21 | step: 19.36 10%|█ | 4197/41250 [10:09:14<89:19:09, 8.68s/it] {'loss': 0.1546, 'grad_norm': 2.071751356124878, 'learning_rate': 3.946265305839407e-05, 'epoch': 1.02} 10%|█ | 4197/41250 [10:09:14<89:19:09, 8.68s/it][2025-04-25 18:06:57,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.10 | optimizer_step: 1.08 [2025-04-25 18:06:57,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.43 | bwd_microstep: 5783.24 | bwd_inner_microstep: 5659.36 | bwd_allreduce_microstep: 123.82 | step_microstep: 20.01 [2025-04-25 18:06:57,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.43 | bwd: 5783.25 | bwd_inner: 5659.36 | bwd_allreduce: 123.84 | step: 20.01 10%|█ | 4198/41250 [10:09:23<89:23:01, 8.68s/it] {'loss': 0.1573, 'grad_norm': 2.034393072128296, 
'learning_rate': 3.946229143859821e-05, 'epoch': 1.02} 10%|█ | 4198/41250 [10:09:23<89:23:01, 8.68s/it][2025-04-25 18:07:06,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.05 | optimizer_step: 0.99 [2025-04-25 18:07:06,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.26 | bwd_microstep: 5747.60 | bwd_inner_microstep: 5643.06 | bwd_allreduce_microstep: 104.48 | step_microstep: 19.04 [2025-04-25 18:07:06,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.26 | bwd: 5747.61 | bwd_inner: 5643.06 | bwd_allreduce: 104.51 | step: 19.04 10%|█ | 4199/41250 [10:09:31<89:21:28, 8.68s/it] {'loss': 0.116, 'grad_norm': 1.955825924873352, 'learning_rate': 3.946192969882115e-05, 'epoch': 1.02} 10%|█ | 4199/41250 [10:09:31<89:21:28, 8.68s/it][2025-04-25 18:07:15,371] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.10 | optimizer_step: 0.97 [2025-04-25 18:07:15,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.89 | bwd_microstep: 5834.92 | bwd_inner_microstep: 5788.37 | bwd_allreduce_microstep: 46.50 | step_microstep: 19.84 [2025-04-25 18:07:15,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.89 | bwd: 5834.94 | bwd_inner: 5788.37 | bwd_allreduce: 46.52 | step: 19.85 10%|█ | 4200/41250 [10:09:40<89:45:31, 8.72s/it] {'loss': 0.1984, 'grad_norm': 2.021057605743408, 'learning_rate': 3.9461567839065117e-05, 'epoch': 1.02} 10%|█ | 4200/41250 [10:09:40<89:45:31, 8.72s/it][2025-04-25 18:07:24,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.07 | optimizer_step: 1.11 [2025-04-25 18:07:24,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.84 | bwd_microstep: 5768.34 | bwd_inner_microstep: 5664.03 | bwd_allreduce_microstep: 104.26 | step_microstep: 19.26 [2025-04-25 18:07:24,064] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.84 | bwd: 5768.36 | bwd_inner: 5664.02 | bwd_allreduce: 104.28 | step: 19.26 10%|█ | 4201/41250 [10:09:49<89:40:11, 8.71s/it] {'loss': 0.1897, 'grad_norm': 1.534706473350525, 'learning_rate': 3.9461205859332355e-05, 'epoch': 1.02} 10%|█ | 4201/41250 [10:09:49<89:40:11, 8.71s/it][2025-04-25 18:07:32,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.07 | optimizer_step: 1.24 [2025-04-25 18:07:32,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.30 | bwd_microstep: 5719.29 | bwd_inner_microstep: 5706.58 | bwd_allreduce_microstep: 12.66 | step_microstep: 19.70 [2025-04-25 18:07:32,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.30 | bwd: 5719.31 | bwd_inner: 5706.58 | bwd_allreduce: 12.68 | step: 19.70 10%|█ | 4202/41250 [10:09:58<89:31:37, 8.70s/it] {'loss': 0.044, 'grad_norm': 1.0958623886108398, 'learning_rate': 3.946084375962509e-05, 'epoch': 1.02} 10%|█ | 4202/41250 [10:09:58<89:31:37, 8.70s/it][2025-04-25 18:07:41,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:07:41,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.45 | bwd_microstep: 5747.08 | bwd_inner_microstep: 5724.05 | bwd_allreduce_microstep: 22.98 | step_microstep: 18.83 [2025-04-25 18:07:41,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.45 | bwd: 5747.09 | bwd_inner: 5724.05 | bwd_allreduce: 23.00 | step: 18.84 10%|█ | 4203/41250 [10:10:06<89:31:07, 8.70s/it] {'loss': 0.131, 'grad_norm': 1.4508492946624756, 'learning_rate': 3.946048153994554e-05, 'epoch': 1.02} 10%|█ | 4203/41250 [10:10:06<89:31:07, 8.70s/it][2025-04-25 18:07:50,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:07:50,123] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.02 | bwd_microstep: 5738.58 | bwd_inner_microstep: 5720.89 | bwd_allreduce_microstep: 17.64 | step_microstep: 19.15 [2025-04-25 18:07:50,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.02 | bwd: 5738.59 | bwd_inner: 5720.89 | bwd_allreduce: 17.66 | step: 19.16 10%|█ | 4204/41250 [10:10:15<89:29:24, 8.70s/it] {'loss': 0.0446, 'grad_norm': 0.5830694437026978, 'learning_rate': 3.946011920029595e-05, 'epoch': 1.02} 10%|█ | 4204/41250 [10:10:15<89:29:24, 8.70s/it][2025-04-25 18:07:59,082] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:07:59,082] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.73 | bwd_microstep: 6037.47 | bwd_inner_microstep: 5652.23 | bwd_allreduce_microstep: 385.20 | step_microstep: 18.54 [2025-04-25 18:07:59,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.73 | bwd: 6037.49 | bwd_inner: 5652.23 | bwd_allreduce: 385.22 | step: 18.55 10%|█ | 4205/41250 [10:10:24<90:17:49, 8.77s/it] {'loss': 0.2696, 'grad_norm': 2.912604331970215, 'learning_rate': 3.945975674067855e-05, 'epoch': 1.02} 10%|█ | 4205/41250 [10:10:24<90:17:49, 8.77s/it][2025-04-25 18:08:07,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:08:07,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.74 | bwd_microstep: 5853.06 | bwd_inner_microstep: 5717.46 | bwd_allreduce_microstep: 135.54 | step_microstep: 18.77 [2025-04-25 18:08:07,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.74 | bwd: 5853.07 | bwd_inner: 5717.46 | bwd_allreduce: 135.56 | step: 18.77 10%|█ | 4206/41250 [10:10:33<90:21:34, 8.78s/it] {'loss': 0.315, 'grad_norm': 1.664613962173462, 'learning_rate': 3.945939416109559e-05, 'epoch': 1.02} 10%|█ 
| 4206/41250 [10:10:33<90:21:34, 8.78s/it][2025-04-25 18:08:16,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 18:08:16,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.55 | bwd_microstep: 5736.51 | bwd_inner_microstep: 5723.70 | bwd_allreduce_microstep: 12.77 | step_microstep: 19.07 [2025-04-25 18:08:16,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.55 | bwd: 5736.53 | bwd_inner: 5723.70 | bwd_allreduce: 12.79 | step: 19.07 10%|█ | 4207/41250 [10:10:41<90:03:55, 8.75s/it] {'loss': 0.2255, 'grad_norm': 2.1335668563842773, 'learning_rate': 3.945903146154928e-05, 'epoch': 1.02} 10%|█ | 4207/41250 [10:10:41<90:03:55, 8.75s/it][2025-04-25 18:08:25,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 18:08:25,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.54 | bwd_microstep: 5722.99 | bwd_inner_microstep: 5710.51 | bwd_allreduce_microstep: 12.44 | step_microstep: 19.00 [2025-04-25 18:08:25,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.54 | bwd: 5723.01 | bwd_inner: 5710.51 | bwd_allreduce: 12.46 | step: 19.00 10%|█ | 4208/41250 [10:10:50<89:49:22, 8.73s/it] {'loss': 0.1128, 'grad_norm': 1.2596015930175781, 'learning_rate': 3.9458668642041875e-05, 'epoch': 1.02} 10%|█ | 4208/41250 [10:10:50<89:49:22, 8.73s/it][2025-04-25 18:08:33,914] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:08:33,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2869.39 | bwd_microstep: 5717.87 | bwd_inner_microstep: 5704.92 | bwd_allreduce_microstep: 12.90 | step_microstep: 18.62 [2025-04-25 18:08:33,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2869.39 | bwd: 
5717.88 | bwd_inner: 5704.92 | bwd_allreduce: 12.92 | step: 18.63 10%|█ | 4209/41250 [10:10:59<89:39:11, 8.71s/it] {'loss': 0.1353, 'grad_norm': 3.8359878063201904, 'learning_rate': 3.9458305702575607e-05, 'epoch': 1.02} 10%|█ | 4209/41250 [10:10:59<89:39:11, 8.71s/it][2025-04-25 18:08:42,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:08:42,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.93 | bwd_microstep: 5717.90 | bwd_inner_microstep: 5654.78 | bwd_allreduce_microstep: 63.07 | step_microstep: 18.87 [2025-04-25 18:08:42,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.93 | bwd: 5717.91 | bwd_inner: 5654.78 | bwd_allreduce: 63.09 | step: 18.87 10%|█ | 4210/41250 [10:11:07<89:23:39, 8.69s/it] {'loss': 0.2175, 'grad_norm': 1.7950379848480225, 'learning_rate': 3.94579426431527e-05, 'epoch': 1.02} 10%|█ | 4210/41250 [10:11:07<89:23:39, 8.69s/it][2025-04-25 18:08:51,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:08:51,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.39 | bwd_microstep: 5778.98 | bwd_inner_microstep: 5650.69 | bwd_allreduce_microstep: 128.25 | step_microstep: 18.62 [2025-04-25 18:08:51,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.39 | bwd: 5778.99 | bwd_inner: 5650.69 | bwd_allreduce: 128.26 | step: 18.62 10%|█ | 4211/41250 [10:11:16<89:23:55, 8.69s/it] {'loss': 0.2346, 'grad_norm': 3.5496091842651367, 'learning_rate': 3.945757946377542e-05, 'epoch': 1.02} 10%|█ | 4211/41250 [10:11:16<89:23:55, 8.69s/it][2025-04-25 18:08:59,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:08:59,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2831.94 | bwd_microstep: 5771.51 | bwd_inner_microstep: 5665.01 | bwd_allreduce_microstep: 106.44 | step_microstep: 18.57 [2025-04-25 18:08:59,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.94 | bwd: 5771.52 | bwd_inner: 5665.01 | bwd_allreduce: 106.46 | step: 18.57 10%|█ | 4212/41250 [10:11:25<89:25:32, 8.69s/it] {'loss': 0.0522, 'grad_norm': 0.5724160671234131, 'learning_rate': 3.9457216164445974e-05, 'epoch': 1.02} 10%|█ | 4212/41250 [10:11:25<89:25:32, 8.69s/it][2025-04-25 18:09:08,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.04 | optimizer_step: 0.96 [2025-04-25 18:09:08,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.96 | bwd_microstep: 5876.31 | bwd_inner_microstep: 5650.41 | bwd_allreduce_microstep: 225.85 | step_microstep: 19.17 [2025-04-25 18:09:08,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.96 | bwd: 5876.32 | bwd_inner: 5650.41 | bwd_allreduce: 225.87 | step: 19.17 10%|█ | 4213/41250 [10:11:34<89:46:37, 8.73s/it] {'loss': 0.1156, 'grad_norm': 2.5822794437408447, 'learning_rate': 3.9456852745166624e-05, 'epoch': 1.02} 10%|█ | 4213/41250 [10:11:34<89:46:37, 8.73s/it][2025-04-25 18:09:17,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 18:09:17,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.56 | bwd_microstep: 5733.66 | bwd_inner_microstep: 5697.19 | bwd_allreduce_microstep: 36.43 | step_microstep: 18.59 [2025-04-25 18:09:17,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.56 | bwd: 5733.68 | bwd_inner: 5697.19 | bwd_allreduce: 36.45 | step: 18.59 10%|█ | 4214/41250 [10:11:42<89:37:43, 8.71s/it] {'loss': 0.1478, 'grad_norm': 2.150280714035034, 'learning_rate': 3.945648920593961e-05, 'epoch': 1.02} 10%|█ | 4214/41250 [10:11:42<89:37:43, 
8.71s/it][2025-04-25 18:09:26,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 18:09:26,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.20 | bwd_microstep: 5731.10 | bwd_inner_microstep: 5697.52 | bwd_allreduce_microstep: 33.53 | step_microstep: 18.81 [2025-04-25 18:09:26,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.20 | bwd: 5731.11 | bwd_inner: 5697.52 | bwd_allreduce: 33.55 | step: 18.81 10%|█ | 4215/41250 [10:11:51<89:31:01, 8.70s/it] {'loss': 0.1028, 'grad_norm': 2.2400944232940674, 'learning_rate': 3.945612554676716e-05, 'epoch': 1.02} 10%|█ | 4215/41250 [10:11:51<89:31:01, 8.70s/it][2025-04-25 18:09:34,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 18:09:34,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.81 | bwd_microstep: 5793.65 | bwd_inner_microstep: 5780.85 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.70 [2025-04-25 18:09:34,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.81 | bwd: 5793.66 | bwd_inner: 5780.85 | bwd_allreduce: 12.77 | step: 18.70 10%|█ | 4216/41250 [10:12:00<89:43:36, 8.72s/it] {'loss': 0.2295, 'grad_norm': 1.7833269834518433, 'learning_rate': 3.945576176765152e-05, 'epoch': 1.02} 10%|█ | 4216/41250 [10:12:00<89:43:36, 8.72s/it][2025-04-25 18:09:43,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:09:43,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.25 | bwd_microstep: 5709.47 | bwd_inner_microstep: 5696.48 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.94 [2025-04-25 18:09:43,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.25 | bwd: 5709.49 | bwd_inner: 5696.48 | 
bwd_allreduce: 12.97 | step: 18.95 10%|█ | 4217/41250 [10:12:08<89:28:56, 8.70s/it] {'loss': 0.0828, 'grad_norm': 1.1248224973678589, 'learning_rate': 3.9455397868594944e-05, 'epoch': 1.02} 10%|█ | 4217/41250 [10:12:08<89:28:56, 8.70s/it][2025-04-25 18:09:52,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 18:09:52,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.99 | bwd_microstep: 5700.80 | bwd_inner_microstep: 5668.57 | bwd_allreduce_microstep: 32.19 | step_microstep: 18.58 [2025-04-25 18:09:52,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.99 | bwd: 5700.81 | bwd_inner: 5668.57 | bwd_allreduce: 32.20 | step: 18.59 10%|█ | 4218/41250 [10:12:17<89:13:06, 8.67s/it] {'loss': 0.3929, 'grad_norm': 2.5217535495758057, 'learning_rate': 3.945503384959966e-05, 'epoch': 1.02} 10%|█ | 4218/41250 [10:12:17<89:13:06, 8.67s/it][2025-04-25 18:10:00,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:10:00,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.26 | bwd_microstep: 5700.45 | bwd_inner_microstep: 5648.89 | bwd_allreduce_microstep: 51.52 | step_microstep: 18.82 [2025-04-25 18:10:00,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.26 | bwd: 5700.47 | bwd_inner: 5648.89 | bwd_allreduce: 51.54 | step: 18.82 10%|█ | 4219/41250 [10:12:26<89:02:38, 8.66s/it] {'loss': 0.047, 'grad_norm': 1.6845982074737549, 'learning_rate': 3.945466971066792e-05, 'epoch': 1.02} 10%|█ | 4219/41250 [10:12:26<89:02:38, 8.66s/it][2025-04-25 18:10:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:10:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.02 | bwd_microstep: 
5766.49 | bwd_inner_microstep: 5647.19 | bwd_allreduce_microstep: 119.25 | step_microstep: 18.81 [2025-04-25 18:10:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.02 | bwd: 5766.50 | bwd_inner: 5647.19 | bwd_allreduce: 119.27 | step: 18.81 10%|█ | 4220/41250 [10:12:34<89:07:32, 8.66s/it] {'loss': 0.2931, 'grad_norm': 2.276106595993042, 'learning_rate': 3.945430545180197e-05, 'epoch': 1.02} 10%|█ | 4220/41250 [10:12:34<89:07:32, 8.66s/it][2025-04-25 18:10:18,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:10:18,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.32 | bwd_microstep: 5738.72 | bwd_inner_microstep: 5673.80 | bwd_allreduce_microstep: 64.87 | step_microstep: 18.85 [2025-04-25 18:10:18,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.32 | bwd: 5738.74 | bwd_inner: 5673.80 | bwd_allreduce: 64.89 | step: 18.85 10%|█ | 4221/41250 [10:12:43<89:08:24, 8.67s/it] {'loss': 0.0809, 'grad_norm': 1.3934506177902222, 'learning_rate': 3.945394107300406e-05, 'epoch': 1.02} 10%|█ | 4221/41250 [10:12:43<89:08:24, 8.67s/it][2025-04-25 18:10:26,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.06 | optimizer_step: 1.16 [2025-04-25 18:10:26,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.83 | bwd_microstep: 5691.72 | bwd_inner_microstep: 5654.11 | bwd_allreduce_microstep: 37.54 | step_microstep: 19.73 [2025-04-25 18:10:26,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.83 | bwd: 5691.73 | bwd_inner: 5654.11 | bwd_allreduce: 37.57 | step: 19.73 10%|█ | 4222/41250 [10:12:52<88:59:15, 8.65s/it] {'loss': 0.1893, 'grad_norm': 2.4775314331054688, 'learning_rate': 3.945357657427642e-05, 'epoch': 1.02} 10%|█ | 4222/41250 [10:12:52<88:59:15, 8.65s/it][2025-04-25 18:10:35,302] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-25 18:10:35,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.05 | bwd_microstep: 5675.06 | bwd_inner_microstep: 5657.59 | bwd_allreduce_microstep: 17.42 | step_microstep: 19.30 [2025-04-25 18:10:35,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.05 | bwd: 5675.08 | bwd_inner: 5657.59 | bwd_allreduce: 17.44 | step: 19.30 10%|█ | 4223/41250 [10:13:00<88:47:32, 8.63s/it] {'loss': 0.1513, 'grad_norm': 1.8368407487869263, 'learning_rate': 3.945321195562131e-05, 'epoch': 1.02} 10%|█ | 4223/41250 [10:13:00<88:47:32, 8.63s/it][2025-04-25 18:10:43,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:10:43,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.31 | bwd_microstep: 5738.74 | bwd_inner_microstep: 5683.10 | bwd_allreduce_microstep: 55.59 | step_microstep: 18.44 [2025-04-25 18:10:43,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.31 | bwd: 5738.75 | bwd_inner: 5683.10 | bwd_allreduce: 55.61 | step: 18.44 10%|█ | 4224/41250 [10:13:09<88:54:32, 8.64s/it] {'loss': 0.117, 'grad_norm': 2.159970998764038, 'learning_rate': 3.9452847217040976e-05, 'epoch': 1.02} 10%|█ | 4224/41250 [10:13:09<88:54:32, 8.64s/it][2025-04-25 18:10:52,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 18:10:52,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.91 | bwd_microstep: 5717.85 | bwd_inner_microstep: 5705.08 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.71 [2025-04-25 18:10:52,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.91 | bwd: 5717.86 | bwd_inner: 5705.08 | bwd_allreduce: 12.74 | step: 18.71 10%|█ | 
4225/41250 [10:13:17<88:55:35, 8.65s/it] {'loss': 0.1362, 'grad_norm': 1.8845840692520142, 'learning_rate': 3.9452482358537655e-05, 'epoch': 1.02} 10%|█ | 4225/41250 [10:13:17<88:55:35, 8.65s/it][2025-04-25 18:11:01,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:11:01,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.14 | bwd_microstep: 5728.44 | bwd_inner_microstep: 5679.40 | bwd_allreduce_microstep: 48.99 | step_microstep: 18.57 [2025-04-25 18:11:01,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.14 | bwd: 5728.45 | bwd_inner: 5679.40 | bwd_allreduce: 49.01 | step: 18.58 10%|█ | 4226/41250 [10:13:26<88:58:52, 8.65s/it] {'loss': 0.0815, 'grad_norm': 2.1824045181274414, 'learning_rate': 3.945211738011362e-05, 'epoch': 1.02} 10%|█ | 4226/41250 [10:13:26<88:58:52, 8.65s/it][2025-04-25 18:11:09,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:11:09,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.69 | bwd_microstep: 5684.12 | bwd_inner_microstep: 5641.79 | bwd_allreduce_microstep: 42.29 | step_microstep: 18.64 [2025-04-25 18:11:09,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.69 | bwd: 5684.13 | bwd_inner: 5641.79 | bwd_allreduce: 42.30 | step: 18.64 10%|█ | 4227/41250 [10:13:35<88:51:17, 8.64s/it] {'loss': 0.2321, 'grad_norm': 1.972327709197998, 'learning_rate': 3.9451752281771096e-05, 'epoch': 1.02} 10%|█ | 4227/41250 [10:13:35<88:51:17, 8.64s/it][2025-04-25 18:11:18,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:11:18,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.03 | bwd_microstep: 5722.73 | bwd_inner_microstep: 5709.90 | 
bwd_allreduce_microstep: 12.78 | step_microstep: 18.43 [2025-04-25 18:11:18,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.03 | bwd: 5722.74 | bwd_inner: 5709.91 | bwd_allreduce: 12.80 | step: 18.43 10%|█ | 4228/41250 [10:13:43<88:55:21, 8.65s/it] {'loss': 0.1322, 'grad_norm': 0.9054320454597473, 'learning_rate': 3.945138706351236e-05, 'epoch': 1.02} 10%|█ | 4228/41250 [10:13:43<88:55:21, 8.65s/it][2025-04-25 18:11:27,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 18:11:27,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.17 | bwd_microstep: 6063.37 | bwd_inner_microstep: 5687.07 | bwd_allreduce_microstep: 376.25 | step_microstep: 18.64 [2025-04-25 18:11:27,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.17 | bwd: 6063.38 | bwd_inner: 5687.07 | bwd_allreduce: 376.27 | step: 18.65 10%|█ | 4229/41250 [10:13:52<89:59:09, 8.75s/it] {'loss': 0.0639, 'grad_norm': 0.4967726469039917, 'learning_rate': 3.945102172533963e-05, 'epoch': 1.03} 10%|█ | 4229/41250 [10:13:52<89:59:09, 8.75s/it][2025-04-25 18:11:36,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 18:11:36,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.54 | bwd_microstep: 5720.67 | bwd_inner_microstep: 5689.66 | bwd_allreduce_microstep: 30.95 | step_microstep: 18.66 [2025-04-25 18:11:36,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.54 | bwd: 5720.68 | bwd_inner: 5689.66 | bwd_allreduce: 30.98 | step: 18.66 10%|█ | 4230/41250 [10:14:01<89:41:25, 8.72s/it] {'loss': 0.1186, 'grad_norm': 1.6046632528305054, 'learning_rate': 3.945065626725519e-05, 'epoch': 1.03} 10%|█ | 4230/41250 [10:14:01<89:41:25, 8.72s/it][2025-04-25 18:11:44,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 18:11:44,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.97 | bwd_microstep: 5687.12 | bwd_inner_microstep: 5674.38 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.77 [2025-04-25 18:11:44,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.97 | bwd: 5687.14 | bwd_inner: 5674.38 | bwd_allreduce: 12.71 | step: 18.77 10%|█ | 4231/41250 [10:14:10<89:24:29, 8.69s/it] {'loss': 0.0621, 'grad_norm': 0.9168894290924072, 'learning_rate': 3.9450290689261274e-05, 'epoch': 1.03} 10%|█ | 4231/41250 [10:14:10<89:24:29, 8.69s/it][2025-04-25 18:11:53,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 18:11:53,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.81 | bwd_microstep: 5722.13 | bwd_inner_microstep: 5680.21 | bwd_allreduce_microstep: 41.88 | step_microstep: 18.79 [2025-04-25 18:11:53,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.81 | bwd: 5722.14 | bwd_inner: 5680.21 | bwd_allreduce: 41.90 | step: 18.79 10%|█ | 4232/41250 [10:14:18<89:18:12, 8.68s/it] {'loss': 0.3139, 'grad_norm': 2.5868072509765625, 'learning_rate': 3.944992499136015e-05, 'epoch': 1.03} 10%|█ | 4232/41250 [10:14:18<89:18:12, 8.68s/it][2025-04-25 18:12:02,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 0.92 [2025-04-25 18:12:02,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5731.30 | bwd_inner_microstep: 5689.21 | bwd_allreduce_microstep: 42.05 | step_microstep: 18.84 [2025-04-25 18:12:02,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.45 | bwd: 5731.31 | bwd_inner: 5689.21 | bwd_allreduce: 42.07 | step: 18.84 10%|█ | 4233/41250 [10:14:27<89:15:18, 8.68s/it] {'loss': 
0.2273, 'grad_norm': 3.0464329719543457, 'learning_rate': 3.944955917355405e-05, 'epoch': 1.03} 10%|█ | 4233/41250 [10:14:27<89:15:18, 8.68s/it][2025-04-25 18:12:10,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:12:10,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.05 | bwd_microstep: 5690.04 | bwd_inner_microstep: 5649.87 | bwd_allreduce_microstep: 40.12 | step_microstep: 18.68 [2025-04-25 18:12:10,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5690.05 | bwd_inner: 5649.87 | bwd_allreduce: 40.14 | step: 18.69 10%|█ | 4234/41250 [10:14:36<89:00:13, 8.66s/it] {'loss': 0.1933, 'grad_norm': 1.6493109464645386, 'learning_rate': 3.9449193235845254e-05, 'epoch': 1.03} 10%|█ | 4234/41250 [10:14:36<89:00:13, 8.66s/it][2025-04-25 18:12:19,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:12:19,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.43 | bwd_microstep: 5715.10 | bwd_inner_microstep: 5702.41 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.56 [2025-04-25 18:12:19,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.43 | bwd: 5715.11 | bwd_inner: 5702.41 | bwd_allreduce: 12.66 | step: 18.56 10%|█ | 4235/41250 [10:14:44<88:59:36, 8.66s/it] {'loss': 0.0285, 'grad_norm': 0.5292505025863647, 'learning_rate': 3.9448827178236e-05, 'epoch': 1.03} 10%|█ | 4235/41250 [10:14:44<88:59:36, 8.66s/it][2025-04-25 18:12:28,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:12:28,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.10 | bwd_microstep: 5720.34 | bwd_inner_microstep: 5685.29 | bwd_allreduce_microstep: 35.00 | step_microstep: 18.65 
[2025-04-25 18:12:28,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.11 | bwd: 5720.35 | bwd_inner: 5685.29 | bwd_allreduce: 35.02 | step: 18.66 10%|█ | 4236/41250 [10:14:53<89:00:16, 8.66s/it] {'loss': 0.1028, 'grad_norm': 0.9120008945465088, 'learning_rate': 3.944846100072856e-05, 'epoch': 1.03} 10%|█ | 4236/41250 [10:14:53<89:00:16, 8.66s/it][2025-04-25 18:12:36,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:12:36,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.28 | bwd_microstep: 5750.45 | bwd_inner_microstep: 5692.27 | bwd_allreduce_microstep: 58.14 | step_microstep: 18.44 [2025-04-25 18:12:36,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.28 | bwd: 5750.46 | bwd_inner: 5692.27 | bwd_allreduce: 58.15 | step: 18.44 10%|█ | 4237/41250 [10:15:02<89:06:59, 8.67s/it] {'loss': 0.0605, 'grad_norm': 0.9694133400917053, 'learning_rate': 3.944809470332518e-05, 'epoch': 1.03} 10%|█ | 4237/41250 [10:15:02<89:06:59, 8.67s/it][2025-04-25 18:12:45,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 18:12:45,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.86 | bwd_microstep: 5746.19 | bwd_inner_microstep: 5653.28 | bwd_allreduce_microstep: 92.87 | step_microstep: 18.49 [2025-04-25 18:12:45,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.86 | bwd: 5746.21 | bwd_inner: 5653.28 | bwd_allreduce: 92.88 | step: 18.50 10%|█ | 4238/41250 [10:15:10<89:06:25, 8.67s/it] {'loss': 0.1495, 'grad_norm': 1.792251706123352, 'learning_rate': 3.944772828602812e-05, 'epoch': 1.03} 10%|█ | 4238/41250 [10:15:10<89:06:25, 8.67s/it][2025-04-25 18:12:54,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | 
optimizer_step: 0.90 [2025-04-25 18:12:54,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2894.03 | bwd_microstep: 5807.43 | bwd_inner_microstep: 5767.83 | bwd_allreduce_microstep: 39.54 | step_microstep: 18.45 [2025-04-25 18:12:54,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2894.03 | bwd: 5807.44 | bwd_inner: 5767.83 | bwd_allreduce: 39.57 | step: 18.45 10%|█ | 4239/41250 [10:15:19<89:28:10, 8.70s/it] {'loss': 0.2474, 'grad_norm': 1.8370968103408813, 'learning_rate': 3.944736174883964e-05, 'epoch': 1.03} 10%|█ | 4239/41250 [10:15:19<89:28:10, 8.70s/it][2025-04-25 18:13:02,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.05 | optimizer_step: 0.98 [2025-04-25 18:13:02,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.04 | bwd_microstep: 5733.44 | bwd_inner_microstep: 5650.90 | bwd_allreduce_microstep: 82.50 | step_microstep: 19.12 [2025-04-25 18:13:02,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.04 | bwd: 5733.46 | bwd_inner: 5650.90 | bwd_allreduce: 82.51 | step: 19.12 10%|█ | 4240/41250 [10:15:28<89:20:18, 8.69s/it] {'loss': 0.2255, 'grad_norm': 4.632996082305908, 'learning_rate': 3.9446995091762e-05, 'epoch': 1.03} 10%|█ | 4240/41250 [10:15:28<89:20:18, 8.69s/it][2025-04-25 18:13:11,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:13:11,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.66 | bwd_microstep: 5753.16 | bwd_inner_microstep: 5693.19 | bwd_allreduce_microstep: 59.93 | step_microstep: 18.47 [2025-04-25 18:13:11,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.66 | bwd: 5753.17 | bwd_inner: 5693.19 | bwd_allreduce: 59.95 | step: 18.48 10%|█ | 4241/41250 [10:15:36<89:20:50, 8.69s/it] {'loss': 0.0609, 'grad_norm': 0.6934396028518677, 'learning_rate': 
3.9446628314797456e-05, 'epoch': 1.03} 10%|█ | 4241/41250 [10:15:36<89:20:50, 8.69s/it][2025-04-25 18:13:20,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 18:13:20,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.38 | bwd_microstep: 5773.34 | bwd_inner_microstep: 5656.52 | bwd_allreduce_microstep: 116.77 | step_microstep: 18.65 [2025-04-25 18:13:20,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.38 | bwd: 5773.35 | bwd_inner: 5656.52 | bwd_allreduce: 116.79 | step: 18.65 10%|█ | 4242/41250 [10:15:45<89:22:32, 8.69s/it] {'loss': 0.2469, 'grad_norm': 1.8963345289230347, 'learning_rate': 3.9446261417948284e-05, 'epoch': 1.03} 10%|█ | 4242/41250 [10:15:45<89:22:32, 8.69s/it][2025-04-25 18:13:28,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:13:28,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.49 | bwd_microstep: 5763.91 | bwd_inner_microstep: 5669.30 | bwd_allreduce_microstep: 94.56 | step_microstep: 18.38 [2025-04-25 18:13:28,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.49 | bwd: 5763.92 | bwd_inner: 5669.30 | bwd_allreduce: 94.58 | step: 18.38 10%|█ | 4243/41250 [10:15:54<89:21:06, 8.69s/it] {'loss': 0.0451, 'grad_norm': 0.7321364879608154, 'learning_rate': 3.9445894401216723e-05, 'epoch': 1.03} 10%|█ | 4243/41250 [10:15:54<89:21:06, 8.69s/it][2025-04-25 18:13:37,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:13:37,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.11 | bwd_microstep: 5730.57 | bwd_inner_microstep: 5717.53 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.52 [2025-04-25 18:13:37,653] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.11 | bwd: 5730.59 | bwd_inner: 5717.53 | bwd_allreduce: 13.02 | step: 18.53 10%|█ | 4244/41250 [10:16:02<89:17:52, 8.69s/it] {'loss': 0.1571, 'grad_norm': 2.0496609210968018, 'learning_rate': 3.944552726460506e-05, 'epoch': 1.03} 10%|█ | 4244/41250 [10:16:02<89:17:52, 8.69s/it][2025-04-25 18:13:46,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:13:46,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.73 | bwd_microstep: 5751.42 | bwd_inner_microstep: 5647.47 | bwd_allreduce_microstep: 103.91 | step_microstep: 18.52 [2025-04-25 18:13:46,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.73 | bwd: 5751.44 | bwd_inner: 5647.47 | bwd_allreduce: 103.93 | step: 18.52 10%|█ | 4245/41250 [10:16:11<89:15:26, 8.68s/it] {'loss': 0.1856, 'grad_norm': 0.9276850819587708, 'learning_rate': 3.944516000811554e-05, 'epoch': 1.03} 10%|█ | 4245/41250 [10:16:11<89:15:26, 8.68s/it][2025-04-25 18:13:54,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:13:54,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.28 | bwd_microstep: 5683.63 | bwd_inner_microstep: 5670.81 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.72 [2025-04-25 18:13:54,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.28 | bwd: 5683.64 | bwd_inner: 5670.81 | bwd_allreduce: 12.78 | step: 18.73 10%|█ | 4246/41250 [10:16:20<89:02:24, 8.66s/it] {'loss': 0.2254, 'grad_norm': 1.5918402671813965, 'learning_rate': 3.9444792631750434e-05, 'epoch': 1.03} 10%|█ | 4246/41250 [10:16:20<89:02:24, 8.66s/it][2025-04-25 18:14:03,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 
18:14:03,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.11 | bwd_microstep: 5686.45 | bwd_inner_microstep: 5670.17 | bwd_allreduce_microstep: 16.24 | step_microstep: 18.52 [2025-04-25 18:14:03,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.11 | bwd: 5686.47 | bwd_inner: 5670.17 | bwd_allreduce: 16.26 | step: 18.52 10%|█ | 4247/41250 [10:16:28<88:54:14, 8.65s/it] {'loss': 0.2249, 'grad_norm': 1.73623788356781, 'learning_rate': 3.9444425135512e-05, 'epoch': 1.03} 10%|█ | 4247/41250 [10:16:28<88:54:14, 8.65s/it][2025-04-25 18:14:12,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 18:14:12,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.01 | bwd_microstep: 5689.46 | bwd_inner_microstep: 5666.41 | bwd_allreduce_microstep: 23.00 | step_microstep: 18.83 [2025-04-25 18:14:12,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.01 | bwd: 5689.48 | bwd_inner: 5666.41 | bwd_allreduce: 23.02 | step: 18.83 10%|█ | 4248/41250 [10:16:37<88:48:34, 8.64s/it] {'loss': 0.083, 'grad_norm': 1.151031732559204, 'learning_rate': 3.944405751940252e-05, 'epoch': 1.03} 10%|█ | 4248/41250 [10:16:37<88:48:34, 8.64s/it][2025-04-25 18:14:20,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 18:14:20,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.82 | bwd_microstep: 5735.96 | bwd_inner_microstep: 5723.07 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.72 [2025-04-25 18:14:20,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.82 | bwd: 5735.97 | bwd_inner: 5723.07 | bwd_allreduce: 12.86 | step: 18.73 10%|█ | 4249/41250 [10:16:46<88:55:01, 8.65s/it] {'loss': 0.0537, 'grad_norm': 0.8216007351875305, 'learning_rate': 3.944368978342424e-05, 'epoch': 1.03} 
10%|█ | 4249/41250 [10:16:46<88:55:01, 8.65s/it][2025-04-25 18:14:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 0.94 [2025-04-25 18:14:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.02 | bwd_microstep: 5691.65 | bwd_inner_microstep: 5671.39 | bwd_allreduce_microstep: 20.21 | step_microstep: 20.14 [2025-04-25 18:14:29,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.02 | bwd: 5691.67 | bwd_inner: 5671.39 | bwd_allreduce: 20.24 | step: 20.14 10%|█ | 4250/41250 [10:16:54<88:48:25, 8.64s/it] {'loss': 0.0128, 'grad_norm': 0.18519435822963715, 'learning_rate': 3.9443321927579444e-05, 'epoch': 1.03} 10%|█ | 4250/41250 [10:16:54<88:48:25, 8.64s/it][2025-04-25 18:14:38,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 18:14:38,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.40 | bwd_microstep: 5743.73 | bwd_inner_microstep: 5663.10 | bwd_allreduce_microstep: 80.58 | step_microstep: 18.68 [2025-04-25 18:14:38,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.40 | bwd: 5743.74 | bwd_inner: 5663.10 | bwd_allreduce: 80.60 | step: 18.69 10%|█ | 4251/41250 [10:17:03<88:52:13, 8.65s/it] {'loss': 0.324, 'grad_norm': 1.9927241802215576, 'learning_rate': 3.944295395187039e-05, 'epoch': 1.03} 10%|█ | 4251/41250 [10:17:03<88:52:13, 8.65s/it][2025-04-25 18:14:46,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-25 18:14:46,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.98 | bwd_microstep: 5757.74 | bwd_inner_microstep: 5715.84 | bwd_allreduce_microstep: 41.84 | step_microstep: 19.18 [2025-04-25 18:14:46,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.98 | 
bwd: 5757.76 | bwd_inner: 5715.84 | bwd_allreduce: 41.87 | step: 19.18 10%|█ | 4252/41250 [10:17:12<89:00:44, 8.66s/it] {'loss': 0.2774, 'grad_norm': 2.106180429458618, 'learning_rate': 3.944258585629935e-05, 'epoch': 1.03} 10%|█ | 4252/41250 [10:17:12<89:00:44, 8.66s/it][2025-04-25 18:14:55,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:14:55,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.90 | bwd_microstep: 5801.28 | bwd_inner_microstep: 5657.74 | bwd_allreduce_microstep: 143.50 | step_microstep: 18.81 [2025-04-25 18:14:55,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.90 | bwd: 5801.30 | bwd_inner: 5657.74 | bwd_allreduce: 143.52 | step: 18.81 10%|█ | 4253/41250 [10:17:20<89:11:18, 8.68s/it] {'loss': 0.0655, 'grad_norm': 0.8271169662475586, 'learning_rate': 3.944221764086859e-05, 'epoch': 1.03} 10%|█ | 4253/41250 [10:17:20<89:11:18, 8.68s/it][2025-04-25 18:15:04,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-25 18:15:04,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.50 | bwd_microstep: 5769.84 | bwd_inner_microstep: 5673.08 | bwd_allreduce_microstep: 96.69 | step_microstep: 19.44 [2025-04-25 18:15:04,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.50 | bwd: 5769.85 | bwd_inner: 5673.08 | bwd_allreduce: 96.72 | step: 19.44 10%|█ | 4254/41250 [10:17:29<89:13:35, 8.68s/it] {'loss': 0.196, 'grad_norm': 1.4587161540985107, 'learning_rate': 3.944184930558039e-05, 'epoch': 1.03} 10%|█ | 4254/41250 [10:17:29<89:13:35, 8.68s/it][2025-04-25 18:15:12,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 18:15:12,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2852.35 | bwd_microstep: 5721.73 | bwd_inner_microstep: 5708.86 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.99 [2025-04-25 18:15:12,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.35 | bwd: 5721.74 | bwd_inner: 5708.86 | bwd_allreduce: 12.85 | step: 19.00 10%|█ | 4255/41250 [10:17:38<89:09:07, 8.68s/it] {'loss': 0.0371, 'grad_norm': 1.0150933265686035, 'learning_rate': 3.944148085043701e-05, 'epoch': 1.03} 10%|█ | 4255/41250 [10:17:38<89:09:07, 8.68s/it][2025-04-25 18:15:21,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 18:15:21,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.53 | bwd_microstep: 5728.19 | bwd_inner_microstep: 5714.72 | bwd_allreduce_microstep: 13.40 | step_microstep: 19.12 [2025-04-25 18:15:21,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.53 | bwd: 5728.20 | bwd_inner: 5714.72 | bwd_allreduce: 13.42 | step: 19.12 10%|█ | 4256/41250 [10:17:46<89:08:59, 8.68s/it] {'loss': 0.0778, 'grad_norm': 0.9586832523345947, 'learning_rate': 3.944111227544071e-05, 'epoch': 1.03} 10%|█ | 4256/41250 [10:17:46<89:08:59, 8.68s/it][2025-04-25 18:15:30,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 18:15:30,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.88 | bwd_microstep: 5753.64 | bwd_inner_microstep: 5679.16 | bwd_allreduce_microstep: 74.42 | step_microstep: 18.79 [2025-04-25 18:15:30,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.88 | bwd: 5753.65 | bwd_inner: 5679.16 | bwd_allreduce: 74.45 | step: 18.79 10%|█ | 4257/41250 [10:17:55<89:11:25, 8.68s/it] {'loss': 0.0494, 'grad_norm': 0.8940937519073486, 'learning_rate': 3.9440743580593795e-05, 'epoch': 1.03} 10%|█ | 4257/41250 [10:17:55<89:11:25, 8.68s/it][2025-04-25 
18:15:38,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:15:38,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2873.86 | bwd_microstep: 5731.32 | bwd_inner_microstep: 5716.75 | bwd_allreduce_microstep: 14.53 | step_microstep: 18.32 [2025-04-25 18:15:38,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2873.86 | bwd: 5731.33 | bwd_inner: 5716.75 | bwd_allreduce: 14.54 | step: 18.33 10%|█ | 4258/41250 [10:18:04<89:13:16, 8.68s/it] {'loss': 0.0259, 'grad_norm': 0.29654863476753235, 'learning_rate': 3.944037476589851e-05, 'epoch': 1.03} 10%|█ | 4258/41250 [10:18:04<89:13:16, 8.68s/it][2025-04-25 18:15:47,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 18:15:47,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.16 | bwd_microstep: 5760.24 | bwd_inner_microstep: 5702.20 | bwd_allreduce_microstep: 58.00 | step_microstep: 18.38 [2025-04-25 18:15:47,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.16 | bwd: 5760.25 | bwd_inner: 5702.20 | bwd_allreduce: 58.02 | step: 18.38 10%|█ | 4259/41250 [10:18:12<89:16:33, 8.69s/it] {'loss': 0.108, 'grad_norm': 1.480064034461975, 'learning_rate': 3.944000583135714e-05, 'epoch': 1.03} 10%|█ | 4259/41250 [10:18:12<89:16:33, 8.69s/it][2025-04-25 18:15:56,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.05 [2025-04-25 18:15:56,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.67 | bwd_microstep: 5695.17 | bwd_inner_microstep: 5668.09 | bwd_allreduce_microstep: 27.03 | step_microstep: 18.73 [2025-04-25 18:15:56,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.67 | bwd: 5695.18 | bwd_inner: 5668.09 | bwd_allreduce: 27.05 | 
step: 18.73 10%|█ | 4260/41250 [10:18:21<89:03:58, 8.67s/it] {'loss': 0.0937, 'grad_norm': 4.116370677947998, 'learning_rate': 3.9439636776971965e-05, 'epoch': 1.03} 10%|█ | 4260/41250 [10:18:21<89:03:58, 8.67s/it][2025-04-25 18:16:04,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:16:04,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.18 | bwd_microstep: 5741.60 | bwd_inner_microstep: 5723.00 | bwd_allreduce_microstep: 18.56 | step_microstep: 18.62 [2025-04-25 18:16:04,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.19 | bwd: 5741.62 | bwd_inner: 5723.00 | bwd_allreduce: 18.57 | step: 18.62 10%|█ | 4261/41250 [10:18:30<89:08:15, 8.68s/it] {'loss': 0.062, 'grad_norm': 0.885590672492981, 'learning_rate': 3.943926760274525e-05, 'epoch': 1.03} 10%|█ | 4261/41250 [10:18:30<89:08:15, 8.68s/it][2025-04-25 18:16:13,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.94 | optimizer_step: 0.95 [2025-04-25 18:16:13,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.05 | bwd_microstep: 5733.48 | bwd_inner_microstep: 5720.67 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.24 [2025-04-25 18:16:13,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.05 | bwd: 5733.49 | bwd_inner: 5720.67 | bwd_allreduce: 12.78 | step: 18.24 10%|█ | 4262/41250 [10:18:38<89:09:19, 8.68s/it] {'loss': 0.0845, 'grad_norm': 0.7767930030822754, 'learning_rate': 3.9438898308679264e-05, 'epoch': 1.03} 10%|█ | 4262/41250 [10:18:38<89:09:19, 8.68s/it][2025-04-25 18:16:22,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.96 [2025-04-25 18:16:22,346] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.12 | bwd_microstep: 5749.68 | 
bwd_inner_microstep: 5707.52 | bwd_allreduce_microstep: 42.11 | step_microstep: 18.70 [2025-04-25 18:16:22,346] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.13 | bwd: 5749.69 | bwd_inner: 5707.52 | bwd_allreduce: 42.13 | step: 18.70 10%|█ | 4263/41250 [10:18:47<89:12:55, 8.68s/it] {'loss': 0.0673, 'grad_norm': 1.0258992910385132, 'learning_rate': 3.94385288947763e-05, 'epoch': 1.03} 10%|█ | 4263/41250 [10:18:47<89:12:55, 8.68s/it][2025-04-25 18:16:30,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 18:16:30,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.34 | bwd_microstep: 5699.09 | bwd_inner_microstep: 5662.08 | bwd_allreduce_microstep: 36.96 | step_microstep: 19.60 [2025-04-25 18:16:30,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.34 | bwd: 5699.10 | bwd_inner: 5662.08 | bwd_allreduce: 36.98 | step: 19.61 10%|█ | 4264/41250 [10:18:56<89:01:32, 8.67s/it] {'loss': 0.0856, 'grad_norm': 1.3644062280654907, 'learning_rate': 3.943815936103863e-05, 'epoch': 1.03} 10%|█ | 4264/41250 [10:18:56<89:01:32, 8.67s/it][2025-04-25 18:16:39,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 18:16:39,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.59 | bwd_microstep: 5727.23 | bwd_inner_microstep: 5714.38 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.88 [2025-04-25 18:16:39,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.59 | bwd: 5727.24 | bwd_inner: 5714.38 | bwd_allreduce: 12.82 | step: 18.88 10%|█ | 4265/41250 [10:19:04<89:02:49, 8.67s/it] {'loss': 0.2842, 'grad_norm': 2.544776678085327, 'learning_rate': 3.943778970746853e-05, 'epoch': 1.03} 10%|█ | 4265/41250 [10:19:04<89:02:49, 8.67s/it][2025-04-25 18:16:48,332] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.31 | optimizer_step: 0.96 [2025-04-25 18:16:48,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.83 | bwd_microstep: 5741.71 | bwd_inner_microstep: 5728.33 | bwd_allreduce_microstep: 13.33 | step_microstep: 19.81 [2025-04-25 18:16:48,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.83 | bwd: 5741.72 | bwd_inner: 5728.33 | bwd_allreduce: 13.35 | step: 19.81 10%|█ | 4266/41250 [10:19:13<89:07:14, 8.67s/it] {'loss': 0.0987, 'grad_norm': 1.7793251276016235, 'learning_rate': 3.9437419934068274e-05, 'epoch': 1.03} 10%|█ | 4266/41250 [10:19:13<89:07:14, 8.67s/it][2025-04-25 18:16:56,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.01 | optimizer_step: 1.06 [2025-04-25 18:16:56,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.96 | bwd_microstep: 5708.51 | bwd_inner_microstep: 5695.63 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.60 [2025-04-25 18:16:56,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.96 | bwd: 5708.53 | bwd_inner: 5695.63 | bwd_allreduce: 12.85 | step: 18.60 10%|█ | 4267/41250 [10:19:22<88:59:40, 8.66s/it] {'loss': 0.0744, 'grad_norm': 1.1843347549438477, 'learning_rate': 3.943705004084016e-05, 'epoch': 1.03} 10%|█ | 4267/41250 [10:19:22<88:59:40, 8.66s/it][2025-04-25 18:17:05,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 18:17:05,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.99 | bwd_microstep: 5707.14 | bwd_inner_microstep: 5694.42 | bwd_allreduce_microstep: 12.67 | step_microstep: 19.02 [2025-04-25 18:17:05,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.99 | bwd: 5707.16 | bwd_inner: 5694.42 | bwd_allreduce: 12.69 | step: 19.02 10%|█ | 4268/41250 [10:19:30<88:55:00, 
8.66s/it] {'loss': 0.0993, 'grad_norm': 0.7667320966720581, 'learning_rate': 3.943668002778644e-05, 'epoch': 1.03} 10%|█ | 4268/41250 [10:19:30<88:55:00, 8.66s/it][2025-04-25 18:17:14,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:17:14,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.97 | bwd_microstep: 5745.28 | bwd_inner_microstep: 5689.23 | bwd_allreduce_microstep: 56.00 | step_microstep: 19.14 [2025-04-25 18:17:14,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.97 | bwd: 5745.30 | bwd_inner: 5689.23 | bwd_allreduce: 56.02 | step: 19.14 10%|█ | 4269/41250 [10:19:39<88:57:56, 8.66s/it] {'loss': 0.1238, 'grad_norm': 1.6069034337997437, 'learning_rate': 3.943630989490942e-05, 'epoch': 1.03} 10%|█ | 4269/41250 [10:19:39<88:57:56, 8.66s/it][2025-04-25 18:17:22,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-25 18:17:22,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.23 | bwd_microstep: 5756.84 | bwd_inner_microstep: 5646.66 | bwd_allreduce_microstep: 110.13 | step_microstep: 19.29 [2025-04-25 18:17:22,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.23 | bwd: 5756.85 | bwd_inner: 5646.66 | bwd_allreduce: 110.15 | step: 19.29 10%|█ | 4270/41250 [10:19:48<88:57:48, 8.66s/it] {'loss': 0.1232, 'grad_norm': 1.6003575325012207, 'learning_rate': 3.9435939642211375e-05, 'epoch': 1.04} 10%|█ | 4270/41250 [10:19:48<88:57:48, 8.66s/it][2025-04-25 18:17:31,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.28 | optimizer_step: 0.90 [2025-04-25 18:17:31,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.93 | bwd_microstep: 5861.68 | bwd_inner_microstep: 5654.19 | bwd_allreduce_microstep: 207.43 | 
step_microstep: 19.78 [2025-04-25 18:17:31,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.93 | bwd: 5861.69 | bwd_inner: 5654.19 | bwd_allreduce: 207.46 | step: 19.78 10%|█ | 4271/41250 [10:19:57<89:18:23, 8.69s/it] {'loss': 0.151, 'grad_norm': 1.754258394241333, 'learning_rate': 3.943556926969458e-05, 'epoch': 1.04} 10%|█ | 4271/41250 [10:19:57<89:18:23, 8.69s/it][2025-04-25 18:17:40,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.11 [2025-04-25 18:17:40,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.25 | bwd_microstep: 5682.34 | bwd_inner_microstep: 5655.50 | bwd_allreduce_microstep: 26.80 | step_microstep: 19.13 [2025-04-25 18:17:40,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.25 | bwd: 5682.35 | bwd_inner: 5655.50 | bwd_allreduce: 26.81 | step: 19.13 10%|█ | 4272/41250 [10:20:05<89:00:33, 8.67s/it] {'loss': 0.1932, 'grad_norm': 2.4293034076690674, 'learning_rate': 3.943519877736132e-05, 'epoch': 1.04} 10%|█ | 4272/41250 [10:20:05<89:00:33, 8.67s/it][2025-04-25 18:17:48,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 18:17:48,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.49 | bwd_microstep: 5758.28 | bwd_inner_microstep: 5643.08 | bwd_allreduce_microstep: 115.16 | step_microstep: 18.66 [2025-04-25 18:17:48,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.49 | bwd: 5758.30 | bwd_inner: 5643.08 | bwd_allreduce: 115.18 | step: 18.67 10%|█ | 4273/41250 [10:20:14<88:59:55, 8.66s/it] {'loss': 0.0499, 'grad_norm': 0.6260738372802734, 'learning_rate': 3.943482816521389e-05, 'epoch': 1.04} 10%|█ | 4273/41250 [10:20:14<88:59:55, 8.66s/it][2025-04-25 18:17:57,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | 
optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 18:17:57,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.13 | bwd_microstep: 5697.10 | bwd_inner_microstep: 5684.27 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.76 [2025-04-25 18:17:57,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.13 | bwd: 5697.11 | bwd_inner: 5684.27 | bwd_allreduce: 12.79 | step: 18.76 10%|█ | 4274/41250 [10:20:22<88:51:39, 8.65s/it] {'loss': 0.1496, 'grad_norm': 1.40872061252594, 'learning_rate': 3.943445743325456e-05, 'epoch': 1.04} 10%|█ | 4274/41250 [10:20:22<88:51:39, 8.65s/it][2025-04-25 18:18:06,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 18:18:06,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.34 | bwd_microstep: 5691.46 | bwd_inner_microstep: 5653.08 | bwd_allreduce_microstep: 38.34 | step_microstep: 18.72 [2025-04-25 18:18:06,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.34 | bwd: 5691.47 | bwd_inner: 5653.08 | bwd_allreduce: 38.36 | step: 18.72 10%|█ | 4275/41250 [10:20:31<88:41:52, 8.64s/it] {'loss': 0.0636, 'grad_norm': 0.5519015192985535, 'learning_rate': 3.943408658148563e-05, 'epoch': 1.04} 10%|█ | 4275/41250 [10:20:31<88:41:52, 8.64s/it][2025-04-25 18:18:14,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:18:14,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.70 | bwd_microstep: 5750.69 | bwd_inner_microstep: 5688.25 | bwd_allreduce_microstep: 62.39 | step_microstep: 19.06 [2025-04-25 18:18:14,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.70 | bwd: 5750.71 | bwd_inner: 5688.25 | bwd_allreduce: 62.41 | step: 19.06 10%|█ | 4276/41250 [10:20:40<88:49:09, 8.65s/it] {'loss': 0.036, 'grad_norm': 
0.6887385845184326, 'learning_rate': 3.9433715609909376e-05, 'epoch': 1.04} 10%|█ | 4276/41250 [10:20:40<88:49:09, 8.65s/it][2025-04-25 18:18:23,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:18:23,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.83 | bwd_microstep: 5751.76 | bwd_inner_microstep: 5690.95 | bwd_allreduce_microstep: 60.76 | step_microstep: 18.65 [2025-04-25 18:18:23,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.83 | bwd: 5751.78 | bwd_inner: 5690.95 | bwd_allreduce: 60.78 | step: 18.65 10%|█ | 4277/41250 [10:20:48<88:54:46, 8.66s/it] {'loss': 0.1042, 'grad_norm': 2.644340753555298, 'learning_rate': 3.943334451852808e-05, 'epoch': 1.04} 10%|█ | 4277/41250 [10:20:48<88:54:46, 8.66s/it][2025-04-25 18:18:32,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 18:18:32,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.71 | bwd_microstep: 5739.49 | bwd_inner_microstep: 5688.84 | bwd_allreduce_microstep: 50.60 | step_microstep: 19.48 [2025-04-25 18:18:32,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.71 | bwd: 5739.51 | bwd_inner: 5688.84 | bwd_allreduce: 50.62 | step: 19.48 10%|█ | 4278/41250 [10:20:57<88:56:50, 8.66s/it] {'loss': 0.0894, 'grad_norm': 1.0842552185058594, 'learning_rate': 3.943297330734405e-05, 'epoch': 1.04} 10%|█ | 4278/41250 [10:20:57<88:56:50, 8.66s/it][2025-04-25 18:18:40,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:18:40,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.46 | bwd_microstep: 5737.17 | bwd_inner_microstep: 5701.74 | bwd_allreduce_microstep: 35.39 | step_microstep: 18.43 [2025-04-25 
18:18:40,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.46 | bwd: 5737.18 | bwd_inner: 5701.74 | bwd_allreduce: 35.40 | step: 18.44 10%|█ | 4279/41250 [10:21:06<88:59:33, 8.67s/it] {'loss': 0.1123, 'grad_norm': 2.578372001647949, 'learning_rate': 3.943260197635955e-05, 'epoch': 1.04} 10%|█ | 4279/41250 [10:21:06<88:59:33, 8.67s/it][2025-04-25 18:18:49,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:18:49,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.09 | bwd_microstep: 5692.32 | bwd_inner_microstep: 5635.46 | bwd_allreduce_microstep: 56.81 | step_microstep: 18.59 [2025-04-25 18:18:49,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.09 | bwd: 5692.33 | bwd_inner: 5635.46 | bwd_allreduce: 56.83 | step: 18.60 10%|█ | 4280/41250 [10:21:14<88:48:04, 8.65s/it] {'loss': 0.1077, 'grad_norm': 1.6248053312301636, 'learning_rate': 3.943223052557689e-05, 'epoch': 1.04} 10%|█ | 4280/41250 [10:21:14<88:48:04, 8.65s/it][2025-04-25 18:18:58,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 18:18:58,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5710.85 | bwd_inner_microstep: 5697.92 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.12 [2025-04-25 18:18:58,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.45 | bwd: 5710.87 | bwd_inner: 5697.92 | bwd_allreduce: 12.91 | step: 19.13 10%|█ | 4281/41250 [10:21:23<88:47:56, 8.65s/it] {'loss': 0.1511, 'grad_norm': 2.5546717643737793, 'learning_rate': 3.9431858954998345e-05, 'epoch': 1.04} 10%|█ | 4281/41250 [10:21:23<88:47:56, 8.65s/it][2025-04-25 18:19:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.98 | optimizer_step: 0.90 
[2025-04-25 18:19:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.56 | bwd_microstep: 5723.07 | bwd_inner_microstep: 5649.99 | bwd_allreduce_microstep: 73.03 | step_microstep: 18.57 [2025-04-25 18:19:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.56 | bwd: 5723.08 | bwd_inner: 5649.99 | bwd_allreduce: 73.05 | step: 18.58 10%|█ | 4282/41250 [10:21:32<88:46:58, 8.65s/it] {'loss': 0.0914, 'grad_norm': 2.1073076725006104, 'learning_rate': 3.943148726462621e-05, 'epoch': 1.04} 10%|█ | 4282/41250 [10:21:32<88:46:58, 8.65s/it][2025-04-25 18:19:15,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:19:15,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.27 | bwd_microstep: 5681.90 | bwd_inner_microstep: 5644.72 | bwd_allreduce_microstep: 37.14 | step_microstep: 18.28 [2025-04-25 18:19:15,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.27 | bwd: 5681.91 | bwd_inner: 5644.72 | bwd_allreduce: 37.15 | step: 18.28 10%|█ | 4283/41250 [10:21:40<88:36:53, 8.63s/it] {'loss': 0.2201, 'grad_norm': 2.288560390472412, 'learning_rate': 3.9431115454462785e-05, 'epoch': 1.04} 10%|█ | 4283/41250 [10:21:40<88:36:53, 8.63s/it][2025-04-25 18:19:24,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:19:24,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.59 | bwd_microstep: 5730.39 | bwd_inner_microstep: 5704.55 | bwd_allreduce_microstep: 25.80 | step_microstep: 18.70 [2025-04-25 18:19:24,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.59 | bwd: 5730.40 | bwd_inner: 5704.55 | bwd_allreduce: 25.82 | step: 18.71 10%|█ | 4284/41250 [10:21:49<88:45:14, 8.64s/it] {'loss': 0.1239, 'grad_norm': 1.5183223485946655, 'learning_rate': 
3.943074352451035e-05, 'epoch': 1.04} 10%|█ | 4284/41250 [10:21:49<88:45:14, 8.64s/it][2025-04-25 18:19:32,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:19:32,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.05 | bwd_microstep: 5676.27 | bwd_inner_microstep: 5647.03 | bwd_allreduce_microstep: 29.20 | step_microstep: 18.10 [2025-04-25 18:19:32,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.05 | bwd: 5676.29 | bwd_inner: 5647.03 | bwd_allreduce: 29.21 | step: 18.10 10%|█ | 4285/41250 [10:21:57<88:34:44, 8.63s/it] {'loss': 0.1194, 'grad_norm': 1.3224231004714966, 'learning_rate': 3.943037147477121e-05, 'epoch': 1.04} 10%|█ | 4285/41250 [10:21:57<88:34:44, 8.63s/it][2025-04-25 18:19:41,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 18:19:41,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.51 | bwd_microstep: 5695.93 | bwd_inner_microstep: 5682.90 | bwd_allreduce_microstep: 12.98 | step_microstep: 19.08 [2025-04-25 18:19:41,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.51 | bwd: 5695.94 | bwd_inner: 5682.90 | bwd_allreduce: 13.00 | step: 19.08 10%|█ | 4286/41250 [10:22:06<88:36:20, 8.63s/it] {'loss': 0.0763, 'grad_norm': 0.8323181867599487, 'learning_rate': 3.9429999305247646e-05, 'epoch': 1.04} 10%|█ | 4286/41250 [10:22:06<88:36:20, 8.63s/it][2025-04-25 18:19:49,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:19:49,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.64 | bwd_microstep: 5672.56 | bwd_inner_microstep: 5659.87 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.54 [2025-04-25 18:19:49,863] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.64 | bwd: 5672.58 | bwd_inner: 5659.86 | bwd_allreduce: 12.67 | step: 18.54 10%|█ | 4287/41250 [10:22:15<88:27:29, 8.62s/it] {'loss': 0.1341, 'grad_norm': 1.9119104146957397, 'learning_rate': 3.942962701594196e-05, 'epoch': 1.04} 10%|█ | 4287/41250 [10:22:15<88:27:29, 8.62s/it][2025-04-25 18:19:58,511] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 18:19:58,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.89 | bwd_microstep: 5706.30 | bwd_inner_microstep: 5693.53 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.90 [2025-04-25 18:19:58,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.89 | bwd: 5706.32 | bwd_inner: 5693.53 | bwd_allreduce: 12.75 | step: 18.90 10%|█ | 4288/41250 [10:22:23<88:33:26, 8.63s/it] {'loss': 0.0691, 'grad_norm': 1.1976312398910522, 'learning_rate': 3.942925460685644e-05, 'epoch': 1.04} 10%|█ | 4288/41250 [10:22:23<88:33:26, 8.63s/it][2025-04-25 18:20:07,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:20:07,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.97 | bwd_microstep: 5686.35 | bwd_inner_microstep: 5655.78 | bwd_allreduce_microstep: 30.53 | step_microstep: 18.25 [2025-04-25 18:20:07,118] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.97 | bwd: 5686.37 | bwd_inner: 5655.78 | bwd_allreduce: 30.55 | step: 18.25 10%|█ | 4289/41250 [10:22:32<88:29:37, 8.62s/it] {'loss': 0.27, 'grad_norm': 2.857119083404541, 'learning_rate': 3.9428882077993386e-05, 'epoch': 1.04} 10%|█ | 4289/41250 [10:22:32<88:29:37, 8.62s/it][2025-04-25 18:20:15,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:20:15,725] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.47 | bwd_microstep: 5674.56 | bwd_inner_microstep: 5661.93 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.91 [2025-04-25 18:20:15,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.47 | bwd: 5674.58 | bwd_inner: 5661.93 | bwd_allreduce: 12.60 | step: 18.92 10%|█ | 4290/41250 [10:22:41<88:27:48, 8.62s/it] {'loss': 0.0666, 'grad_norm': 1.0534813404083252, 'learning_rate': 3.942850942935511e-05, 'epoch': 1.04} 10%|█ | 4290/41250 [10:22:41<88:27:48, 8.62s/it][2025-04-25 18:20:24,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.93 | optimizer_gradients: 1.02 | optimizer_step: 1.17 [2025-04-25 18:20:24,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.02 | bwd_microstep: 5859.69 | bwd_inner_microstep: 5660.17 | bwd_allreduce_microstep: 199.47 | step_microstep: 18.87 [2025-04-25 18:20:24,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.02 | bwd: 5859.70 | bwd_inner: 5660.17 | bwd_allreduce: 199.49 | step: 18.87 10%|█ | 4291/41250 [10:22:49<88:56:26, 8.66s/it] {'loss': 0.0947, 'grad_norm': 2.228644609451294, 'learning_rate': 3.942813666094388e-05, 'epoch': 1.04} 10%|█ | 4291/41250 [10:22:49<88:56:26, 8.66s/it][2025-04-25 18:20:33,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:20:33,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.71 | bwd_microstep: 5709.61 | bwd_inner_microstep: 5696.65 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.82 [2025-04-25 18:20:33,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.72 | bwd: 5709.62 | bwd_inner: 5696.65 | bwd_allreduce: 12.93 | step: 18.82 10%|█ | 4292/41250 [10:22:58<88:53:35, 8.66s/it] {'loss': 0.0841, 'grad_norm': 1.236938714981079, 'learning_rate': 3.942776377276201e-05, 'epoch': 1.04} 10%|█ | 
4292/41250 [10:22:58<88:53:35, 8.66s/it][2025-04-25 18:20:41,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 18:20:41,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.64 | bwd_microstep: 5758.32 | bwd_inner_microstep: 5659.13 | bwd_allreduce_microstep: 99.15 | step_microstep: 18.84 [2025-04-25 18:20:41,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.64 | bwd: 5758.33 | bwd_inner: 5659.13 | bwd_allreduce: 99.16 | step: 18.84 10%|█ | 4293/41250 [10:23:07<88:56:12, 8.66s/it] {'loss': 0.4083, 'grad_norm': 3.261368989944458, 'learning_rate': 3.9427390764811805e-05, 'epoch': 1.04} 10%|█ | 4293/41250 [10:23:07<88:56:12, 8.66s/it][2025-04-25 18:20:50,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 1.16 [2025-04-25 18:20:50,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.91 | bwd_microstep: 5766.24 | bwd_inner_microstep: 5656.88 | bwd_allreduce_microstep: 109.31 | step_microstep: 19.07 [2025-04-25 18:20:50,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.91 | bwd: 5766.26 | bwd_inner: 5656.88 | bwd_allreduce: 109.33 | step: 19.07 10%|█ | 4294/41250 [10:23:15<88:59:47, 8.67s/it] {'loss': 0.0648, 'grad_norm': 0.7541224360466003, 'learning_rate': 3.9427017637095554e-05, 'epoch': 1.04} 10%|█ | 4294/41250 [10:23:15<88:59:47, 8.67s/it][2025-04-25 18:20:59,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 18:20:59,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.65 | bwd_microstep: 5769.10 | bwd_inner_microstep: 5704.68 | bwd_allreduce_microstep: 64.38 | step_microstep: 18.75 [2025-04-25 18:20:59,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.65 | bwd: 
5769.12 | bwd_inner: 5704.68 | bwd_allreduce: 64.39 | step: 18.75 10%|█ | 4295/41250 [10:23:24<89:07:43, 8.68s/it] {'loss': 0.2856, 'grad_norm': 2.0439558029174805, 'learning_rate': 3.942664438961556e-05, 'epoch': 1.04} 10%|█ | 4295/41250 [10:23:24<89:07:43, 8.68s/it][2025-04-25 18:21:07,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 18:21:07,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.50 | bwd_microstep: 5702.79 | bwd_inner_microstep: 5690.01 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.80 [2025-04-25 18:21:07,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.50 | bwd: 5702.80 | bwd_inner: 5690.01 | bwd_allreduce: 12.76 | step: 18.80 10%|█ | 4296/41250 [10:23:33<89:00:47, 8.67s/it] {'loss': 0.0198, 'grad_norm': 0.29425621032714844, 'learning_rate': 3.942627102237412e-05, 'epoch': 1.04} 10%|█ | 4296/41250 [10:23:33<89:00:47, 8.67s/it][2025-04-25 18:21:16,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-25 18:21:16,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.59 | bwd_microstep: 5786.56 | bwd_inner_microstep: 5660.52 | bwd_allreduce_microstep: 125.98 | step_microstep: 18.95 [2025-04-25 18:21:16,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.59 | bwd: 5786.57 | bwd_inner: 5660.52 | bwd_allreduce: 126.01 | step: 18.95 10%|█ | 4297/41250 [10:23:41<89:06:20, 8.68s/it] {'loss': 0.0252, 'grad_norm': 0.4940983057022095, 'learning_rate': 3.942589753537355e-05, 'epoch': 1.04} 10%|█ | 4297/41250 [10:23:41<89:06:20, 8.68s/it][2025-04-25 18:21:25,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.94 [2025-04-25 18:21:25,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2861.24 | bwd_microstep: 5882.80 | bwd_inner_microstep: 5712.22 | bwd_allreduce_microstep: 170.53 | step_microstep: 19.30 [2025-04-25 18:21:25,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.24 | bwd: 5882.82 | bwd_inner: 5712.22 | bwd_allreduce: 170.55 | step: 19.31 10%|█ | 4298/41250 [10:23:50<89:33:45, 8.73s/it] {'loss': 0.2121, 'grad_norm': 2.6713039875030518, 'learning_rate': 3.942552392861613e-05, 'epoch': 1.04} 10%|█ | 4298/41250 [10:23:50<89:33:45, 8.73s/it][2025-04-25 18:21:34,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 18:21:34,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.82 | bwd_microstep: 5768.51 | bwd_inner_microstep: 5702.90 | bwd_allreduce_microstep: 65.56 | step_microstep: 19.01 [2025-04-25 18:21:34,100] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.81 | bwd: 5768.53 | bwd_inner: 5702.90 | bwd_allreduce: 65.58 | step: 19.01 10%|█ | 4299/41250 [10:23:59<89:29:58, 8.72s/it] {'loss': 0.1348, 'grad_norm': 1.8745875358581543, 'learning_rate': 3.942515020210418e-05, 'epoch': 1.04} 10%|█ | 4299/41250 [10:23:59<89:29:58, 8.72s/it][2025-04-25 18:21:42,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 18:21:42,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.98 | bwd_microstep: 5776.01 | bwd_inner_microstep: 5667.91 | bwd_allreduce_microstep: 108.04 | step_microstep: 19.36 [2025-04-25 18:21:42,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.98 | bwd: 5776.02 | bwd_inner: 5667.91 | bwd_allreduce: 108.06 | step: 19.36 10%|█ | 4300/41250 [10:24:08<89:24:36, 8.71s/it] {'loss': 0.0319, 'grad_norm': 0.7025327086448669, 'learning_rate': 3.942477635584e-05, 'epoch': 1.04} 10%|█ | 4300/41250 [10:24:08<89:24:36, 8.71s/it][2025-04-25 
18:21:51,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:21:51,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.47 | bwd_microstep: 5748.61 | bwd_inner_microstep: 5686.79 | bwd_allreduce_microstep: 61.77 | step_microstep: 18.85 [2025-04-25 18:21:51,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.47 | bwd: 5748.62 | bwd_inner: 5686.79 | bwd_allreduce: 61.79 | step: 18.85 10%|█ | 4301/41250 [10:24:16<89:19:05, 8.70s/it] {'loss': 0.2064, 'grad_norm': 1.1024681329727173, 'learning_rate': 3.94244023898259e-05, 'epoch': 1.04} 10%|█ | 4301/41250 [10:24:16<89:19:05, 8.70s/it][2025-04-25 18:22:00,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 18:22:00,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.07 | bwd_microstep: 5701.31 | bwd_inner_microstep: 5651.96 | bwd_allreduce_microstep: 49.30 | step_microstep: 18.38 [2025-04-25 18:22:00,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.07 | bwd: 5701.33 | bwd_inner: 5651.96 | bwd_allreduce: 49.32 | step: 18.38 10%|█ | 4302/41250 [10:24:25<89:02:03, 8.67s/it] {'loss': 0.1254, 'grad_norm': 2.6810591220855713, 'learning_rate': 3.942402830406418e-05, 'epoch': 1.04} 10%|█ | 4302/41250 [10:24:25<89:02:03, 8.67s/it][2025-04-25 18:22:08,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.97 | optimizer_step: 1.01 [2025-04-25 18:22:08,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.09 | bwd_microstep: 5812.36 | bwd_inner_microstep: 5674.18 | bwd_allreduce_microstep: 138.13 | step_microstep: 18.49 [2025-04-25 18:22:08,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.09 | bwd: 5812.37 | bwd_inner: 5674.18 | bwd_allreduce: 138.15 | 
step: 18.50 10%|█ | 4303/41250 [10:24:34<89:12:45, 8.69s/it] {'loss': 0.0804, 'grad_norm': 1.3666743040084839, 'learning_rate': 3.9423654098557136e-05, 'epoch': 1.04} 10%|█ | 4303/41250 [10:24:34<89:12:45, 8.69s/it][2025-04-25 18:22:17,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 18:22:17,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.19 | bwd_microstep: 5726.75 | bwd_inner_microstep: 5713.75 | bwd_allreduce_microstep: 12.95 | step_microstep: 19.35 [2025-04-25 18:22:17,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.19 | bwd: 5726.76 | bwd_inner: 5713.75 | bwd_allreduce: 12.97 | step: 19.36 10%|█ | 4304/41250 [10:24:42<89:07:51, 8.68s/it] {'loss': 0.0574, 'grad_norm': 0.8311811089515686, 'learning_rate': 3.94232797733071e-05, 'epoch': 1.04} 10%|█ | 4304/41250 [10:24:42<89:07:51, 8.68s/it][2025-04-25 18:22:26,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 18:22:26,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.39 | bwd_microstep: 5784.82 | bwd_inner_microstep: 5685.95 | bwd_allreduce_microstep: 98.82 | step_microstep: 18.72 [2025-04-25 18:22:26,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.39 | bwd: 5784.83 | bwd_inner: 5685.95 | bwd_allreduce: 98.84 | step: 18.73 10%|█ | 4305/41250 [10:24:51<89:13:28, 8.69s/it] {'loss': 0.0762, 'grad_norm': 1.1117693185806274, 'learning_rate': 3.942290532831635e-05, 'epoch': 1.04} 10%|█ | 4305/41250 [10:24:51<89:13:28, 8.69s/it][2025-04-25 18:22:34,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 18:22:34,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.27 | bwd_microstep: 5759.23 | 
bwd_inner_microstep: 5693.90 | bwd_allreduce_microstep: 65.28 | step_microstep: 18.95 [2025-04-25 18:22:34,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.27 | bwd: 5759.24 | bwd_inner: 5693.90 | bwd_allreduce: 65.30 | step: 18.95 10%|█ | 4306/41250 [10:25:00<89:13:02, 8.69s/it] {'loss': 0.0643, 'grad_norm': 1.2738224267959595, 'learning_rate': 3.942253076358721e-05, 'epoch': 1.04} 10%|█ | 4306/41250 [10:25:00<89:13:02, 8.69s/it][2025-04-25 18:22:43,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:22:43,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.56 | bwd_microstep: 5787.73 | bwd_inner_microstep: 5657.59 | bwd_allreduce_microstep: 130.10 | step_microstep: 18.72 [2025-04-25 18:22:43,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.56 | bwd: 5787.74 | bwd_inner: 5657.59 | bwd_allreduce: 130.11 | step: 18.72 10%|█ | 4307/41250 [10:25:08<89:15:54, 8.70s/it] {'loss': 0.1853, 'grad_norm': 2.93125581741333, 'learning_rate': 3.9422156079122e-05, 'epoch': 1.04} 10%|█ | 4307/41250 [10:25:08<89:15:54, 8.70s/it][2025-04-25 18:22:52,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 18:22:52,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.62 | bwd_microstep: 5766.06 | bwd_inner_microstep: 5671.22 | bwd_allreduce_microstep: 94.79 | step_microstep: 19.17 [2025-04-25 18:22:52,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.62 | bwd: 5766.07 | bwd_inner: 5671.22 | bwd_allreduce: 94.81 | step: 19.17 10%|█ | 4308/41250 [10:25:17<89:14:11, 8.70s/it] {'loss': 0.0437, 'grad_norm': 1.386686086654663, 'learning_rate': 3.9421781274923006e-05, 'epoch': 1.04} 10%|█ | 4308/41250 [10:25:17<89:14:11, 8.70s/it][2025-04-25 18:23:01,003] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | optimizer_allgather: 5.59 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 18:23:01,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.16 | bwd_microstep: 5770.97 | bwd_inner_microstep: 5708.27 | bwd_allreduce_microstep: 62.66 | step_microstep: 18.79 [2025-04-25 18:23:01,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.16 | bwd: 5770.98 | bwd_inner: 5708.27 | bwd_allreduce: 62.68 | step: 18.80 10%|█ | 4309/41250 [10:25:26<89:16:00, 8.70s/it] {'loss': 0.0632, 'grad_norm': 0.7490788102149963, 'learning_rate': 3.9421406350992555e-05, 'epoch': 1.04} 10%|█ | 4309/41250 [10:25:26<89:16:00, 8.70s/it][2025-04-25 18:23:09,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:23:09,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.03 | bwd_microstep: 5717.98 | bwd_inner_microstep: 5705.11 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.07 [2025-04-25 18:23:09,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.03 | bwd: 5718.00 | bwd_inner: 5705.11 | bwd_allreduce: 12.84 | step: 19.07 10%|█ | 4310/41250 [10:25:34<89:07:34, 8.69s/it] {'loss': 0.3915, 'grad_norm': 2.596740245819092, 'learning_rate': 3.9421031307332954e-05, 'epoch': 1.04} 10%|█ | 4310/41250 [10:25:34<89:07:34, 8.69s/it][2025-04-25 18:23:18,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:23:18,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.67 | bwd_microstep: 5782.61 | bwd_inner_microstep: 5661.75 | bwd_allreduce_microstep: 120.82 | step_microstep: 18.46 [2025-04-25 18:23:18,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.67 | bwd: 5782.63 | bwd_inner: 5661.74 | bwd_allreduce: 120.84 | step: 18.46 10%|█ | 4311/41250 [10:25:43<89:08:50, 
8.69s/it] {'loss': 0.1742, 'grad_norm': 3.0394287109375, 'learning_rate': 3.9420656143946514e-05, 'epoch': 1.05} 10%|█ | 4311/41250 [10:25:43<89:08:50, 8.69s/it][2025-04-25 18:23:27,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.87 | optimizer_gradients: 1.14 | optimizer_step: 0.90 [2025-04-25 18:23:27,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.79 | bwd_microstep: 5735.10 | bwd_inner_microstep: 5716.71 | bwd_allreduce_microstep: 18.34 | step_microstep: 20.92 [2025-04-25 18:23:27,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.79 | bwd: 5735.11 | bwd_inner: 5716.71 | bwd_allreduce: 18.36 | step: 20.92 10%|█ | 4312/41250 [10:25:52<89:06:54, 8.69s/it] {'loss': 0.2819, 'grad_norm': 2.173081398010254, 'learning_rate': 3.942028086083555e-05, 'epoch': 1.05} 10%|█ | 4312/41250 [10:25:52<89:06:54, 8.69s/it][2025-04-25 18:23:35,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:23:35,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.27 | bwd_microstep: 5768.96 | bwd_inner_microstep: 5699.06 | bwd_allreduce_microstep: 69.85 | step_microstep: 18.83 [2025-04-25 18:23:35,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.27 | bwd: 5768.98 | bwd_inner: 5699.06 | bwd_allreduce: 69.87 | step: 18.83 10%|█ | 4313/41250 [10:26:01<89:11:01, 8.69s/it] {'loss': 0.2025, 'grad_norm': 3.664738178253174, 'learning_rate': 3.941990545800238e-05, 'epoch': 1.05} 10%|█ | 4313/41250 [10:26:01<89:11:01, 8.69s/it][2025-04-25 18:23:44,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:23:44,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.89 | bwd_microstep: 5776.91 | bwd_inner_microstep: 5645.75 | bwd_allreduce_microstep: 131.11 | 
step_microstep: 18.77 [2025-04-25 18:23:44,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.89 | bwd: 5776.92 | bwd_inner: 5645.75 | bwd_allreduce: 131.13 | step: 18.77 10%|█ | 4314/41250 [10:26:09<89:08:47, 8.69s/it] {'loss': 0.1121, 'grad_norm': 1.8317548036575317, 'learning_rate': 3.94195299354493e-05, 'epoch': 1.05} 10%|█ | 4314/41250 [10:26:09<89:08:47, 8.69s/it][2025-04-25 18:23:53,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.05 | optimizer_step: 0.96 [2025-04-25 18:23:53,032] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.68 | bwd_microstep: 5700.62 | bwd_inner_microstep: 5657.65 | bwd_allreduce_microstep: 42.92 | step_microstep: 19.16 [2025-04-25 18:23:53,032] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.68 | bwd: 5700.64 | bwd_inner: 5657.65 | bwd_allreduce: 42.94 | step: 19.17 10%|█ | 4315/41250 [10:26:18<88:55:20, 8.67s/it] {'loss': 0.0822, 'grad_norm': 1.649374008178711, 'learning_rate': 3.9419154293178646e-05, 'epoch': 1.05} 10%|█ | 4315/41250 [10:26:18<88:55:20, 8.67s/it][2025-04-25 18:24:01,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.24 | optimizer_step: 0.91 [2025-04-25 18:24:01,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.88 | bwd_microstep: 5764.13 | bwd_inner_microstep: 5654.46 | bwd_allreduce_microstep: 109.62 | step_microstep: 19.26 [2025-04-25 18:24:01,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.88 | bwd: 5764.15 | bwd_inner: 5654.46 | bwd_allreduce: 109.64 | step: 19.27 10%|█ | 4316/41250 [10:26:27<89:01:06, 8.68s/it] {'loss': 0.0518, 'grad_norm': 0.9528141617774963, 'learning_rate': 3.941877853119272e-05, 'epoch': 1.05} 10%|█ | 4316/41250 [10:26:27<89:01:06, 8.68s/it][2025-04-25 18:24:10,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | 
optimizer_gradients: 1.07 | optimizer_step: 0.98 [2025-04-25 18:24:10,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.10 | bwd_microstep: 5759.00 | bwd_inner_microstep: 5661.18 | bwd_allreduce_microstep: 97.78 | step_microstep: 18.62 [2025-04-25 18:24:10,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.09 | bwd: 5759.02 | bwd_inner: 5661.18 | bwd_allreduce: 97.80 | step: 18.63 10%|█ | 4317/41250 [10:26:35<88:59:46, 8.67s/it] {'loss': 0.0378, 'grad_norm': 0.7443650364875793, 'learning_rate': 3.941840264949385e-05, 'epoch': 1.05} 10%|█ | 4317/41250 [10:26:35<88:59:46, 8.67s/it][2025-04-25 18:24:19,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.95 [2025-04-25 18:24:19,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.58 | bwd_microstep: 5681.94 | bwd_inner_microstep: 5656.25 | bwd_allreduce_microstep: 25.64 | step_microstep: 18.58 [2025-04-25 18:24:19,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.58 | bwd: 5681.95 | bwd_inner: 5656.25 | bwd_allreduce: 25.66 | step: 18.58 10%|█ | 4318/41250 [10:26:44<88:45:12, 8.65s/it] {'loss': 0.1728, 'grad_norm': 2.928795337677002, 'learning_rate': 3.941802664808434e-05, 'epoch': 1.05} 10%|█ | 4318/41250 [10:26:44<88:45:12, 8.65s/it][2025-04-25 18:24:27,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 18:24:27,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.85 | bwd_microstep: 5760.97 | bwd_inner_microstep: 5716.52 | bwd_allreduce_microstep: 44.40 | step_microstep: 18.85 [2025-04-25 18:24:27,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.85 | bwd: 5760.98 | bwd_inner: 5716.52 | bwd_allreduce: 44.42 | step: 18.85 10%|█ | 4319/41250 [10:26:53<88:53:25, 8.66s/it] {'loss': 0.2351, 'grad_norm': 
1.6294091939926147, 'learning_rate': 3.941765052696652e-05, 'epoch': 1.05} 10%|█ | 4319/41250 [10:26:53<88:53:25, 8.66s/it][2025-04-25 18:24:36,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.07 | optimizer_step: 0.91 [2025-04-25 18:24:36,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.52 | bwd_microstep: 5799.32 | bwd_inner_microstep: 5638.91 | bwd_allreduce_microstep: 160.36 | step_microstep: 19.00 [2025-04-25 18:24:36,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.52 | bwd: 5799.33 | bwd_inner: 5638.91 | bwd_allreduce: 160.38 | step: 19.00 10%|█ | 4320/41250 [10:27:01<89:00:53, 8.68s/it] {'loss': 0.3101, 'grad_norm': 2.0307178497314453, 'learning_rate': 3.941727428614271e-05, 'epoch': 1.05} 10%|█ | 4320/41250 [10:27:01<89:00:53, 8.68s/it][2025-04-25 18:24:45,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:24:45,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.12 | bwd_microstep: 5741.88 | bwd_inner_microstep: 5701.97 | bwd_allreduce_microstep: 39.87 | step_microstep: 18.60 [2025-04-25 18:24:45,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.12 | bwd: 5741.90 | bwd_inner: 5701.96 | bwd_allreduce: 39.89 | step: 18.61 10%|█ | 4321/41250 [10:27:10<89:00:11, 8.68s/it] {'loss': 0.2983, 'grad_norm': 1.9110794067382812, 'learning_rate': 3.9416897925615206e-05, 'epoch': 1.05} 10%|█ | 4321/41250 [10:27:10<89:00:11, 8.68s/it][2025-04-25 18:24:53,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 18:24:53,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.73 | bwd_microstep: 5724.27 | bwd_inner_microstep: 5688.72 | bwd_allreduce_microstep: 35.51 | step_microstep: 18.71 [2025-04-25 
18:24:53,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.73 | bwd: 5724.29 | bwd_inner: 5688.72 | bwd_allreduce: 35.53 | step: 18.71 10%|█ | 4322/41250 [10:27:19<88:57:29, 8.67s/it] {'loss': 0.2062, 'grad_norm': 1.1079087257385254, 'learning_rate': 3.941652144538636e-05, 'epoch': 1.05} 10%|█ | 4322/41250 [10:27:19<88:57:29, 8.67s/it][2025-04-25 18:25:02,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:25:02,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.42 | bwd_microstep: 5730.18 | bwd_inner_microstep: 5680.00 | bwd_allreduce_microstep: 50.13 | step_microstep: 18.87 [2025-04-25 18:25:02,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.42 | bwd: 5730.19 | bwd_inner: 5680.00 | bwd_allreduce: 50.15 | step: 18.87 10%|█ | 4323/41250 [10:27:27<88:54:04, 8.67s/it] {'loss': 0.1511, 'grad_norm': 2.8997669219970703, 'learning_rate': 3.941614484545848e-05, 'epoch': 1.05} 10%|█ | 4323/41250 [10:27:27<88:54:04, 8.67s/it][2025-04-25 18:25:11,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:25:11,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.67 | bwd_microstep: 5745.47 | bwd_inner_microstep: 5705.18 | bwd_allreduce_microstep: 40.24 | step_microstep: 18.61 [2025-04-25 18:25:11,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.67 | bwd: 5745.48 | bwd_inner: 5705.18 | bwd_allreduce: 40.26 | step: 18.61 10%|█ | 4324/41250 [10:27:36<88:55:42, 8.67s/it] {'loss': 0.1747, 'grad_norm': 1.6146695613861084, 'learning_rate': 3.941576812583387e-05, 'epoch': 1.05} 10%|█ | 4324/41250 [10:27:36<88:55:42, 8.67s/it][2025-04-25 18:25:19,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.33 | optimizer_gradients: 0.97 | optimizer_step: 0.90 
[2025-04-25 18:25:19,752] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.09 | bwd_microstep: 5768.32 | bwd_inner_microstep: 5646.52 | bwd_allreduce_microstep: 121.75 | step_microstep: 19.47 [2025-04-25 18:25:19,752] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.09 | bwd: 5768.33 | bwd_inner: 5646.52 | bwd_allreduce: 121.77 | step: 19.48 10%|█ | 4325/41250 [10:27:45<88:57:19, 8.67s/it] {'loss': 0.1352, 'grad_norm': 0.5939320921897888, 'learning_rate': 3.941539128651488e-05, 'epoch': 1.05} 10%|█ | 4325/41250 [10:27:45<88:57:19, 8.67s/it][2025-04-25 18:25:28,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.66 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 18:25:28,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.43 | bwd_microstep: 5731.78 | bwd_inner_microstep: 5678.58 | bwd_allreduce_microstep: 53.16 | step_microstep: 19.26 [2025-04-25 18:25:28,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.43 | bwd: 5731.79 | bwd_inner: 5678.58 | bwd_allreduce: 53.18 | step: 19.26 10%|█ | 4326/41250 [10:27:53<88:56:56, 8.67s/it] {'loss': 0.2079, 'grad_norm': 2.6648716926574707, 'learning_rate': 3.9415014327503825e-05, 'epoch': 1.05} 10%|█ | 4326/41250 [10:27:53<88:56:56, 8.67s/it][2025-04-25 18:25:37,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:25:37,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.79 | bwd_microstep: 5742.36 | bwd_inner_microstep: 5641.19 | bwd_allreduce_microstep: 101.12 | step_microstep: 18.65 [2025-04-25 18:25:37,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.79 | bwd: 5742.37 | bwd_inner: 5641.19 | bwd_allreduce: 101.14 | step: 18.66 10%|█ | 4327/41250 [10:28:02<88:52:29, 8.67s/it] {'loss': 0.1469, 'grad_norm': 1.7123693227767944, 'learning_rate': 
3.9414637248803014e-05, 'epoch': 1.05} 10%|█ | 4327/41250 [10:28:02<88:52:29, 8.67s/it][2025-04-25 18:25:45,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 18:25:45,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.78 | bwd_microstep: 5693.90 | bwd_inner_microstep: 5641.47 | bwd_allreduce_microstep: 52.39 | step_microstep: 18.90 [2025-04-25 18:25:45,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.78 | bwd: 5693.92 | bwd_inner: 5641.47 | bwd_allreduce: 52.41 | step: 18.90 10%|█ | 4328/41250 [10:28:10<88:39:18, 8.64s/it] {'loss': 0.1442, 'grad_norm': 1.333683729171753, 'learning_rate': 3.9414260050414795e-05, 'epoch': 1.05} 10%|█ | 4328/41250 [10:28:10<88:39:18, 8.64s/it][2025-04-25 18:25:54,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:25:54,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.94 | bwd_microstep: 5766.25 | bwd_inner_microstep: 5645.31 | bwd_allreduce_microstep: 120.89 | step_microstep: 18.59 [2025-04-25 18:25:54,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.94 | bwd: 5766.27 | bwd_inner: 5645.31 | bwd_allreduce: 120.91 | step: 18.59 10%|█ | 4329/41250 [10:28:19<88:44:09, 8.65s/it] {'loss': 0.1586, 'grad_norm': 1.489928126335144, 'learning_rate': 3.941388273234147e-05, 'epoch': 1.05} 10%|█ | 4329/41250 [10:28:19<88:44:09, 8.65s/it][2025-04-25 18:26:03,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-25 18:26:03,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.36 | bwd_microstep: 5744.34 | bwd_inner_microstep: 5685.55 | bwd_allreduce_microstep: 58.74 | step_microstep: 18.73 [2025-04-25 18:26:03,007] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.36 | bwd: 5744.36 | bwd_inner: 5685.55 | bwd_allreduce: 58.76 | step: 18.73 10%|█ | 4330/41250 [10:28:28<88:47:13, 8.66s/it] {'loss': 0.1794, 'grad_norm': 3.48454213142395, 'learning_rate': 3.9413505294585386e-05, 'epoch': 1.05} 10%|█ | 4330/41250 [10:28:28<88:47:13, 8.66s/it][2025-04-25 18:26:11,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 18:26:11,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.47 | bwd_microstep: 5736.42 | bwd_inner_microstep: 5647.72 | bwd_allreduce_microstep: 88.65 | step_microstep: 18.57 [2025-04-25 18:26:11,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.47 | bwd: 5736.43 | bwd_inner: 5647.72 | bwd_allreduce: 88.67 | step: 18.57 10%|█ | 4331/41250 [10:28:36<88:43:42, 8.65s/it] {'loss': 0.1551, 'grad_norm': 6.716338157653809, 'learning_rate': 3.941312773714885e-05, 'epoch': 1.05} 10%|█ | 4331/41250 [10:28:36<88:43:42, 8.65s/it][2025-04-25 18:26:20,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 18:26:20,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.91 | bwd_microstep: 5696.31 | bwd_inner_microstep: 5683.36 | bwd_allreduce_microstep: 12.90 | step_microstep: 19.06 [2025-04-25 18:26:20,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.91 | bwd: 5696.33 | bwd_inner: 5683.36 | bwd_allreduce: 12.92 | step: 19.06 11%|█ | 4332/41250 [10:28:45<88:39:49, 8.65s/it] {'loss': 0.1955, 'grad_norm': 2.0575735569000244, 'learning_rate': 3.9412750060034205e-05, 'epoch': 1.05} 11%|█ | 4332/41250 [10:28:45<88:39:49, 8.65s/it][2025-04-25 18:26:28,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:26:28,924] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.04 | bwd_microstep: 5712.72 | bwd_inner_microstep: 5699.89 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.78 [2025-04-25 18:26:28,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.04 | bwd: 5712.73 | bwd_inner: 5699.89 | bwd_allreduce: 12.80 | step: 18.78 11%|█ | 4333/41250 [10:28:54<88:39:27, 8.65s/it] {'loss': 0.1713, 'grad_norm': 1.5923789739608765, 'learning_rate': 3.941237226324378e-05, 'epoch': 1.05} 11%|█ | 4333/41250 [10:28:54<88:39:27, 8.65s/it][2025-04-25 18:26:37,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:26:37,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.40 | bwd_microstep: 5696.88 | bwd_inner_microstep: 5684.05 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.82 [2025-04-25 18:26:37,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.40 | bwd: 5696.89 | bwd_inner: 5684.05 | bwd_allreduce: 12.80 | step: 18.83 11%|█ | 4334/41250 [10:29:02<88:36:45, 8.64s/it] {'loss': 0.1475, 'grad_norm': 2.819801092147827, 'learning_rate': 3.9411994346779895e-05, 'epoch': 1.05} 11%|█ | 4334/41250 [10:29:02<88:36:45, 8.64s/it][2025-04-25 18:26:46,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.06 | optimizer_step: 0.99 [2025-04-25 18:26:46,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.15 | bwd_microstep: 5777.50 | bwd_inner_microstep: 5764.16 | bwd_allreduce_microstep: 13.28 | step_microstep: 19.39 [2025-04-25 18:26:46,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.15 | bwd: 5777.51 | bwd_inner: 5764.16 | bwd_allreduce: 13.31 | step: 19.39 11%|█ | 4335/41250 [10:29:11<88:55:28, 8.67s/it] {'loss': 0.1029, 'grad_norm': 1.757896900177002, 'learning_rate': 3.941161631064488e-05, 'epoch': 1.05} 11%|█ | 
4335/41250 [10:29:11<88:55:28, 8.67s/it][2025-04-25 18:26:55,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.15 | optimizer_step: 0.91 [2025-04-25 18:26:55,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.47 | bwd_microstep: 5779.93 | bwd_inner_microstep: 5767.28 | bwd_allreduce_microstep: 12.61 | step_microstep: 19.80 [2025-04-25 18:26:55,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.47 | bwd: 5779.94 | bwd_inner: 5767.28 | bwd_allreduce: 12.62 | step: 19.81 11%|█ | 4336/41250 [10:29:20<89:08:06, 8.69s/it] {'loss': 0.0633, 'grad_norm': 0.9106292128562927, 'learning_rate': 3.941123815484107e-05, 'epoch': 1.05} 11%|█ | 4336/41250 [10:29:20<89:08:06, 8.69s/it][2025-04-25 18:27:03,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:27:03,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.28 | bwd_microstep: 5738.38 | bwd_inner_microstep: 5691.27 | bwd_allreduce_microstep: 47.06 | step_microstep: 18.33 [2025-04-25 18:27:03,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.28 | bwd: 5738.39 | bwd_inner: 5691.27 | bwd_allreduce: 47.08 | step: 18.33 11%|█ | 4337/41250 [10:29:29<89:02:50, 8.68s/it] {'loss': 0.3589, 'grad_norm': 4.45988655090332, 'learning_rate': 3.9410859879370795e-05, 'epoch': 1.05} 11%|█ | 4337/41250 [10:29:29<89:02:50, 8.68s/it][2025-04-25 18:27:12,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:27:12,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.78 | bwd_microstep: 5747.92 | bwd_inner_microstep: 5657.56 | bwd_allreduce_microstep: 90.31 | step_microstep: 18.27 [2025-04-25 18:27:12,370] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.78 | bwd: 
5747.94 | bwd_inner: 5657.56 | bwd_allreduce: 90.33 | step: 18.27 11%|█ | 4338/41250 [10:29:37<88:58:51, 8.68s/it] {'loss': 0.0354, 'grad_norm': 0.37355369329452515, 'learning_rate': 3.941048148423638e-05, 'epoch': 1.05} 11%|█ | 4338/41250 [10:29:37<88:58:51, 8.68s/it][2025-04-25 18:27:21,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:27:21,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.27 | bwd_microstep: 5722.96 | bwd_inner_microstep: 5709.97 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.52 [2025-04-25 18:27:21,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.27 | bwd: 5722.97 | bwd_inner: 5709.97 | bwd_allreduce: 12.96 | step: 18.52 11%|█ | 4339/41250 [10:29:46<88:54:46, 8.67s/it] {'loss': 0.1773, 'grad_norm': 1.9248709678649902, 'learning_rate': 3.9410102969440175e-05, 'epoch': 1.05} 11%|█ | 4339/41250 [10:29:46<88:54:46, 8.67s/it][2025-04-25 18:27:29,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.06 | optimizer_step: 1.01 [2025-04-25 18:27:29,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.63 | bwd_microstep: 5685.02 | bwd_inner_microstep: 5668.05 | bwd_allreduce_microstep: 16.86 | step_microstep: 20.26 [2025-04-25 18:27:29,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.63 | bwd: 5685.04 | bwd_inner: 5668.05 | bwd_allreduce: 16.91 | step: 20.25 11%|█ | 4340/41250 [10:29:54<88:42:16, 8.65s/it] {'loss': 0.1613, 'grad_norm': 1.8060245513916016, 'learning_rate': 3.94097243349845e-05, 'epoch': 1.05} 11%|█ | 4340/41250 [10:29:54<88:42:16, 8.65s/it][2025-04-25 18:27:38,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:27:38,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2937.01 | bwd_microstep: 5881.43 | bwd_inner_microstep: 5868.95 | bwd_allreduce_microstep: 12.43 | step_microstep: 18.31 [2025-04-25 18:27:38,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2937.01 | bwd: 5881.45 | bwd_inner: 5868.95 | bwd_allreduce: 12.45 | step: 18.31 11%|█ | 4341/41250 [10:30:03<89:28:04, 8.73s/it] {'loss': 0.0538, 'grad_norm': 1.7891043424606323, 'learning_rate': 3.9409345580871694e-05, 'epoch': 1.05} 11%|█ | 4341/41250 [10:30:03<89:28:04, 8.73s/it][2025-04-25 18:27:47,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 18:27:47,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.55 | bwd_microstep: 5762.35 | bwd_inner_microstep: 5700.51 | bwd_allreduce_microstep: 61.80 | step_microstep: 18.34 [2025-04-25 18:27:47,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.55 | bwd: 5762.36 | bwd_inner: 5700.51 | bwd_allreduce: 61.81 | step: 18.35 11%|█ | 4342/41250 [10:30:12<89:20:33, 8.71s/it] {'loss': 0.0509, 'grad_norm': 1.172315239906311, 'learning_rate': 3.940896670710408e-05, 'epoch': 1.05} 11%|█ | 4342/41250 [10:30:12<89:20:33, 8.71s/it][2025-04-25 18:27:55,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:27:55,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.48 | bwd_microstep: 5778.89 | bwd_inner_microstep: 5650.90 | bwd_allreduce_microstep: 127.95 | step_microstep: 18.60 [2025-04-25 18:27:55,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.48 | bwd: 5778.91 | bwd_inner: 5650.90 | bwd_allreduce: 127.97 | step: 18.60 11%|█ | 4343/41250 [10:30:21<89:15:01, 8.71s/it] {'loss': 0.0796, 'grad_norm': 0.6610034108161926, 'learning_rate': 3.940858771368401e-05, 'epoch': 1.05} 11%|█ | 4343/41250 [10:30:21<89:15:01, 8.71s/it][2025-04-25 
18:28:04,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:28:04,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.29 | bwd_microstep: 5738.50 | bwd_inner_microstep: 5697.20 | bwd_allreduce_microstep: 41.25 | step_microstep: 18.52 [2025-04-25 18:28:04,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.29 | bwd: 5738.51 | bwd_inner: 5697.20 | bwd_allreduce: 41.27 | step: 18.52 11%|█ | 4344/41250 [10:30:29<89:10:06, 8.70s/it] {'loss': 0.1083, 'grad_norm': 2.063920736312866, 'learning_rate': 3.940820860061382e-05, 'epoch': 1.05} 11%|█ | 4344/41250 [10:30:29<89:10:06, 8.70s/it][2025-04-25 18:28:13,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-25 18:28:13,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.88 | bwd_microstep: 5707.48 | bwd_inner_microstep: 5694.61 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.88 [2025-04-25 18:28:13,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.88 | bwd: 5707.49 | bwd_inner: 5694.61 | bwd_allreduce: 12.84 | step: 18.88 11%|█ | 4345/41250 [10:30:38<89:01:24, 8.68s/it] {'loss': 0.1689, 'grad_norm': 1.796738624572754, 'learning_rate': 3.940782936789584e-05, 'epoch': 1.05} 11%|█ | 4345/41250 [10:30:38<89:01:24, 8.68s/it][2025-04-25 18:28:21,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.21 | optimizer_step: 0.92 [2025-04-25 18:28:21,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.29 | bwd_microstep: 5737.65 | bwd_inner_microstep: 5701.94 | bwd_allreduce_microstep: 35.66 | step_microstep: 19.66 [2025-04-25 18:28:21,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.29 | bwd: 5737.67 | bwd_inner: 5701.94 | bwd_allreduce: 35.68 | 
step: 19.66 11%|█ | 4346/41250 [10:30:47<89:01:04, 8.68s/it] {'loss': 0.0712, 'grad_norm': 0.8431259393692017, 'learning_rate': 3.9407450015532404e-05, 'epoch': 1.05} 11%|█ | 4346/41250 [10:30:47<89:01:04, 8.68s/it][2025-04-25 18:28:30,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.18 | optimizer_step: 1.13 [2025-04-25 18:28:30,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.36 | bwd_microstep: 5739.44 | bwd_inner_microstep: 5669.24 | bwd_allreduce_microstep: 70.13 | step_microstep: 20.13 [2025-04-25 18:28:30,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.36 | bwd: 5739.46 | bwd_inner: 5669.24 | bwd_allreduce: 70.17 | step: 20.13 11%|█ | 4347/41250 [10:30:55<88:59:39, 8.68s/it] {'loss': 0.123, 'grad_norm': 0.7493043541908264, 'learning_rate': 3.940707054352586e-05, 'epoch': 1.05} 11%|█ | 4347/41250 [10:30:55<88:59:39, 8.68s/it][2025-04-25 18:28:39,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 0.93 [2025-04-25 18:28:39,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.15 | bwd_microstep: 5760.31 | bwd_inner_microstep: 5704.43 | bwd_allreduce_microstep: 55.83 | step_microstep: 19.15 [2025-04-25 18:28:39,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.15 | bwd: 5760.33 | bwd_inner: 5704.43 | bwd_allreduce: 55.86 | step: 19.15 11%|█ | 4348/41250 [10:31:04<89:05:08, 8.69s/it] {'loss': 0.0886, 'grad_norm': 1.3719218969345093, 'learning_rate': 3.940669095187853e-05, 'epoch': 1.05} 11%|█ | 4348/41250 [10:31:04<89:05:08, 8.69s/it][2025-04-25 18:28:47,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:28:47,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.95 | bwd_microstep: 5747.40 | 
bwd_inner_microstep: 5657.27 | bwd_allreduce_microstep: 90.08 | step_microstep: 18.63 [2025-04-25 18:28:47,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.95 | bwd: 5747.41 | bwd_inner: 5657.27 | bwd_allreduce: 90.10 | step: 18.63 11%|█ | 4349/41250 [10:31:13<89:02:31, 8.69s/it] {'loss': 0.1333, 'grad_norm': 0.8937363624572754, 'learning_rate': 3.940631124059278e-05, 'epoch': 1.05} 11%|█ | 4349/41250 [10:31:13<89:02:31, 8.69s/it][2025-04-25 18:28:56,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:28:56,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.78 | bwd_microstep: 5700.86 | bwd_inner_microstep: 5675.35 | bwd_allreduce_microstep: 25.46 | step_microstep: 18.62 [2025-04-25 18:28:56,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.78 | bwd: 5700.87 | bwd_inner: 5675.35 | bwd_allreduce: 25.48 | step: 18.62 11%|█ | 4350/41250 [10:31:21<88:51:32, 8.67s/it] {'loss': 0.0367, 'grad_norm': 0.5375616550445557, 'learning_rate': 3.940593140967094e-05, 'epoch': 1.05} 11%|█ | 4350/41250 [10:31:21<88:51:32, 8.67s/it][2025-04-25 18:29:05,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 18:29:05,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.65 | bwd_microstep: 5792.45 | bwd_inner_microstep: 5665.28 | bwd_allreduce_microstep: 127.12 | step_microstep: 18.84 [2025-04-25 18:29:05,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.65 | bwd: 5792.46 | bwd_inner: 5665.28 | bwd_allreduce: 127.14 | step: 18.84 11%|█ | 4351/41250 [10:31:30<89:03:08, 8.69s/it] {'loss': 0.1449, 'grad_norm': 1.3790946006774902, 'learning_rate': 3.9405551459115344e-05, 'epoch': 1.05} 11%|█ | 4351/41250 [10:31:30<89:03:08, 8.69s/it][2025-04-25 18:29:14,035] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 18:29:14,035] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.08 | bwd_microstep: 5747.26 | bwd_inner_microstep: 5709.18 | bwd_allreduce_microstep: 38.03 | step_microstep: 18.87 [2025-04-25 18:29:14,035] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.08 | bwd: 5747.27 | bwd_inner: 5709.18 | bwd_allreduce: 38.05 | step: 18.88 11%|█ | 4352/41250 [10:31:39<89:03:22, 8.69s/it] {'loss': 0.1458, 'grad_norm': 1.9944881200790405, 'learning_rate': 3.9405171388928345e-05, 'epoch': 1.06} 11%|█ | 4352/41250 [10:31:39<89:03:22, 8.69s/it][2025-04-25 18:29:22,668] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 18:29:22,669] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.25 | bwd_microstep: 5693.97 | bwd_inner_microstep: 5672.12 | bwd_allreduce_microstep: 21.80 | step_microstep: 18.85 [2025-04-25 18:29:22,669] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.25 | bwd: 5693.98 | bwd_inner: 5672.12 | bwd_allreduce: 21.82 | step: 18.85 11%|█ | 4353/41250 [10:31:47<88:53:14, 8.67s/it] {'loss': 0.1745, 'grad_norm': 1.9208825826644897, 'learning_rate': 3.940479119911228e-05, 'epoch': 1.06} 11%|█ | 4353/41250 [10:31:47<88:53:14, 8.67s/it][2025-04-25 18:29:31,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.58 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:29:31,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.66 | bwd_microstep: 5773.66 | bwd_inner_microstep: 5665.47 | bwd_allreduce_microstep: 108.15 | step_microstep: 18.89 [2025-04-25 18:29:31,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.66 | bwd: 5773.68 | bwd_inner: 5665.47 | bwd_allreduce: 108.17 | step: 18.90 11%|█ | 4354/41250 
[10:31:56<88:57:31, 8.68s/it] {'loss': 0.2992, 'grad_norm': 4.0982255935668945, 'learning_rate': 3.940441088966949e-05, 'epoch': 1.06} 11%|█ | 4354/41250 [10:31:56<88:57:31, 8.68s/it][2025-04-25 18:29:40,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-25 18:29:40,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.90 | bwd_microstep: 5744.23 | bwd_inner_microstep: 5660.06 | bwd_allreduce_microstep: 84.12 | step_microstep: 18.67 [2025-04-25 18:29:40,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.90 | bwd: 5744.24 | bwd_inner: 5660.06 | bwd_allreduce: 84.14 | step: 18.68 11%|█ | 4355/41250 [10:32:05<88:55:59, 8.68s/it] {'loss': 0.0164, 'grad_norm': 0.25418123602867126, 'learning_rate': 3.940403046060232e-05, 'epoch': 1.06} 11%|█ | 4355/41250 [10:32:05<88:55:59, 8.68s/it][2025-04-25 18:29:48,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.03 | optimizer_step: 1.08 [2025-04-25 18:29:48,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.91 | bwd_microstep: 5718.44 | bwd_inner_microstep: 5705.68 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.89 [2025-04-25 18:29:48,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.91 | bwd: 5718.45 | bwd_inner: 5705.68 | bwd_allreduce: 12.73 | step: 18.89 11%|█ | 4356/41250 [10:32:14<88:53:01, 8.67s/it] {'loss': 0.0577, 'grad_norm': 0.9849978685379028, 'learning_rate': 3.9403649911913134e-05, 'epoch': 1.06} 11%|█ | 4356/41250 [10:32:14<88:53:01, 8.67s/it][2025-04-25 18:29:57,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 18:29:57,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.16 | bwd_microstep: 5817.31 | bwd_inner_microstep: 5648.49 | 
bwd_allreduce_microstep: 168.76 | step_microstep: 18.91 [2025-04-25 18:29:57,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.16 | bwd: 5817.32 | bwd_inner: 5648.49 | bwd_allreduce: 168.78 | step: 18.92 11%|█ | 4357/41250 [10:32:22<89:04:32, 8.69s/it] {'loss': 0.1355, 'grad_norm': 1.2141590118408203, 'learning_rate': 3.940326924360425e-05, 'epoch': 1.06} 11%|█ | 4357/41250 [10:32:22<89:04:32, 8.69s/it][2025-04-25 18:30:06,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:30:06,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.13 | bwd_microstep: 5728.91 | bwd_inner_microstep: 5700.79 | bwd_allreduce_microstep: 28.06 | step_microstep: 19.00 [2025-04-25 18:30:06,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.13 | bwd: 5728.93 | bwd_inner: 5700.79 | bwd_allreduce: 28.09 | step: 19.01 11%|█ | 4358/41250 [10:32:31<89:01:17, 8.69s/it] {'loss': 0.1808, 'grad_norm': 1.2154712677001953, 'learning_rate': 3.940288845567803e-05, 'epoch': 1.06} 11%|█ | 4358/41250 [10:32:31<89:01:17, 8.69s/it][2025-04-25 18:30:14,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:30:14,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.41 | bwd_microstep: 5729.34 | bwd_inner_microstep: 5716.62 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.88 [2025-04-25 18:30:14,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.41 | bwd: 5729.35 | bwd_inner: 5716.62 | bwd_allreduce: 12.69 | step: 18.88 11%|█ | 4359/41250 [10:32:40<88:57:29, 8.68s/it] {'loss': 0.1459, 'grad_norm': 1.4988512992858887, 'learning_rate': 3.940250754813682e-05, 'epoch': 1.06} 11%|█ | 4359/41250 [10:32:40<88:57:29, 8.68s/it][2025-04-25 18:30:23,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.97 [2025-04-25 18:30:23,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.57 | bwd_microstep: 5699.98 | bwd_inner_microstep: 5673.21 | bwd_allreduce_microstep: 26.72 | step_microstep: 19.20 [2025-04-25 18:30:23,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.57 | bwd: 5699.99 | bwd_inner: 5673.21 | bwd_allreduce: 26.74 | step: 19.20 11%|█ | 4360/41250 [10:32:48<88:45:16, 8.66s/it] {'loss': 0.018, 'grad_norm': 0.18255771696567535, 'learning_rate': 3.940212652098297e-05, 'epoch': 1.06} 11%|█ | 4360/41250 [10:32:48<88:45:16, 8.66s/it][2025-04-25 18:30:32,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 18:30:32,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.24 | bwd_microstep: 5727.11 | bwd_inner_microstep: 5713.88 | bwd_allreduce_microstep: 13.18 | step_microstep: 19.32 [2025-04-25 18:30:32,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.24 | bwd: 5727.12 | bwd_inner: 5713.88 | bwd_allreduce: 13.20 | step: 19.32 11%|█ | 4361/41250 [10:32:57<88:46:12, 8.66s/it] {'loss': 0.1089, 'grad_norm': 1.5533779859542847, 'learning_rate': 3.940174537421882e-05, 'epoch': 1.06} 11%|█ | 4361/41250 [10:32:57<88:46:12, 8.66s/it][2025-04-25 18:30:40,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:30:40,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.67 | bwd_microstep: 5752.55 | bwd_inner_microstep: 5714.97 | bwd_allreduce_microstep: 37.54 | step_microstep: 19.03 [2025-04-25 18:30:40,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.67 | bwd: 5752.57 | bwd_inner: 5714.97 | bwd_allreduce: 37.56 | step: 19.03 11%|█ | 4362/41250 [10:33:06<88:51:41, 8.67s/it] {'loss': 
0.1644, 'grad_norm': 1.7783571481704712, 'learning_rate': 3.940136410784673e-05, 'epoch': 1.06} 11%|█ | 4362/41250 [10:33:06<88:51:41, 8.67s/it][2025-04-25 18:30:49,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.16 | optimizer_step: 1.03 [2025-04-25 18:30:49,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.39 | bwd_microstep: 5712.44 | bwd_inner_microstep: 5699.55 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.40 [2025-04-25 18:30:49,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.39 | bwd: 5712.46 | bwd_inner: 5699.55 | bwd_allreduce: 12.87 | step: 19.41 11%|█ | 4363/41250 [10:33:14<88:46:49, 8.66s/it] {'loss': 0.1516, 'grad_norm': 2.5252654552459717, 'learning_rate': 3.9400982721869045e-05, 'epoch': 1.06} 11%|█ | 4363/41250 [10:33:14<88:46:49, 8.66s/it][2025-04-25 18:30:58,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 18:30:58,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.09 | bwd_microstep: 5733.56 | bwd_inner_microstep: 5709.60 | bwd_allreduce_microstep: 23.90 | step_microstep: 19.08 [2025-04-25 18:30:58,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.09 | bwd: 5733.57 | bwd_inner: 5709.60 | bwd_allreduce: 23.92 | step: 19.08 11%|█ | 4364/41250 [10:33:23<88:48:34, 8.67s/it] {'loss': 0.183, 'grad_norm': 3.0939369201660156, 'learning_rate': 3.940060121628812e-05, 'epoch': 1.06} 11%|█ | 4364/41250 [10:33:23<88:48:34, 8.67s/it][2025-04-25 18:31:06,677] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.11 | optimizer_step: 1.03 [2025-04-25 18:31:06,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.65 | bwd_microstep: 5689.57 | bwd_inner_microstep: 5642.86 | bwd_allreduce_microstep: 46.65 | step_microstep: 19.29 
[2025-04-25 18:31:06,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.65 | bwd: 5689.58 | bwd_inner: 5642.86 | bwd_allreduce: 46.67 | step: 19.29 11%|█ | 4365/41250 [10:33:32<88:36:07, 8.65s/it] {'loss': 0.1147, 'grad_norm': 1.7657740116119385, 'learning_rate': 3.94002195911063e-05, 'epoch': 1.06} 11%|█ | 4365/41250 [10:33:32<88:36:07, 8.65s/it][2025-04-25 18:31:15,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 18:31:15,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.17 | bwd_microstep: 5725.42 | bwd_inner_microstep: 5709.34 | bwd_allreduce_microstep: 16.03 | step_microstep: 18.85 [2025-04-25 18:31:15,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.17 | bwd: 5725.44 | bwd_inner: 5709.34 | bwd_allreduce: 16.05 | step: 18.85 11%|█ | 4366/41250 [10:33:40<88:39:22, 8.65s/it] {'loss': 0.0868, 'grad_norm': 1.0539568662643433, 'learning_rate': 3.939983784632595e-05, 'epoch': 1.06} 11%|█ | 4366/41250 [10:33:40<88:39:22, 8.65s/it][2025-04-25 18:31:23,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-25 18:31:23,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.04 | bwd_microstep: 5701.07 | bwd_inner_microstep: 5688.32 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.80 [2025-04-25 18:31:23,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.05 | bwd: 5701.08 | bwd_inner: 5688.32 | bwd_allreduce: 12.72 | step: 18.80 11%|█ | 4367/41250 [10:33:49<88:35:17, 8.65s/it] {'loss': 0.1783, 'grad_norm': 1.577417016029358, 'learning_rate': 3.939945598194941e-05, 'epoch': 1.06} 11%|█ | 4367/41250 [10:33:49<88:35:17, 8.65s/it][2025-04-25 18:31:32,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 
1.07 [2025-04-25 18:31:32,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.96 | bwd_microstep: 5711.85 | bwd_inner_microstep: 5699.09 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.88 [2025-04-25 18:31:32,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.96 | bwd: 5711.87 | bwd_inner: 5699.09 | bwd_allreduce: 12.74 | step: 18.89 11%|█ | 4368/41250 [10:33:57<88:37:00, 8.65s/it] {'loss': 0.0508, 'grad_norm': 0.614050030708313, 'learning_rate': 3.9399073997979036e-05, 'epoch': 1.06} 11%|█ | 4368/41250 [10:33:57<88:37:00, 8.65s/it][2025-04-25 18:31:41,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:31:41,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.28 | bwd_microstep: 5674.00 | bwd_inner_microstep: 5661.16 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.80 [2025-04-25 18:31:41,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.28 | bwd: 5674.02 | bwd_inner: 5661.16 | bwd_allreduce: 12.82 | step: 18.80 11%|█ | 4369/41250 [10:34:06<88:26:17, 8.63s/it] {'loss': 0.0943, 'grad_norm': 1.0155844688415527, 'learning_rate': 3.939869189441719e-05, 'epoch': 1.06} 11%|█ | 4369/41250 [10:34:06<88:26:17, 8.63s/it][2025-04-25 18:31:49,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 18:31:49,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.05 | bwd_microstep: 5725.26 | bwd_inner_microstep: 5651.62 | bwd_allreduce_microstep: 73.59 | step_microstep: 18.69 [2025-04-25 18:31:49,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.05 | bwd: 5725.27 | bwd_inner: 5651.62 | bwd_allreduce: 73.61 | step: 18.71 11%|█ | 4370/41250 [10:34:15<88:28:48, 8.64s/it] {'loss': 0.1436, 'grad_norm': 1.3770383596420288, 'learning_rate': 
3.939830967126622e-05, 'epoch': 1.06} 11%|█ | 4370/41250 [10:34:15<88:28:48, 8.64s/it][2025-04-25 18:31:58,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:31:58,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.62 | bwd_microstep: 5731.62 | bwd_inner_microstep: 5642.94 | bwd_allreduce_microstep: 88.64 | step_microstep: 18.83 [2025-04-25 18:31:58,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.62 | bwd: 5731.63 | bwd_inner: 5642.94 | bwd_allreduce: 88.65 | step: 18.84 11%|█ | 4371/41250 [10:34:23<88:32:44, 8.64s/it] {'loss': 0.1558, 'grad_norm': 2.2909882068634033, 'learning_rate': 3.9397927328528496e-05, 'epoch': 1.06} 11%|█ | 4371/41250 [10:34:23<88:32:44, 8.64s/it][2025-04-25 18:32:07,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:32:07,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.07 | bwd_microstep: 5753.41 | bwd_inner_microstep: 5655.53 | bwd_allreduce_microstep: 97.83 | step_microstep: 18.74 [2025-04-25 18:32:07,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.07 | bwd: 5753.42 | bwd_inner: 5655.53 | bwd_allreduce: 97.85 | step: 18.74 11%|█ | 4372/41250 [10:34:32<88:38:53, 8.65s/it] {'loss': 0.1227, 'grad_norm': 1.7595747709274292, 'learning_rate': 3.9397544866206355e-05, 'epoch': 1.06} 11%|█ | 4372/41250 [10:34:32<88:38:53, 8.65s/it][2025-04-25 18:32:15,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:32:15,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5697.02 | bwd_inner_microstep: 5639.95 | bwd_allreduce_microstep: 57.01 | step_microstep: 18.82 [2025-04-25 18:32:15,823] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.00 | bwd: 5697.04 | bwd_inner: 5639.95 | bwd_allreduce: 57.04 | step: 18.82 11%|█ | 4373/41250 [10:34:41<88:31:23, 8.64s/it] {'loss': 0.1915, 'grad_norm': 2.7969486713409424, 'learning_rate': 3.939716228430217e-05, 'epoch': 1.06} 11%|█ | 4373/41250 [10:34:41<88:31:23, 8.64s/it][2025-04-25 18:32:24,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 18:32:24,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.37 | bwd_microstep: 5710.59 | bwd_inner_microstep: 5697.92 | bwd_allreduce_microstep: 12.63 | step_microstep: 19.34 [2025-04-25 18:32:24,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.37 | bwd: 5710.60 | bwd_inner: 5697.92 | bwd_allreduce: 12.65 | step: 19.34 11%|█ | 4374/41250 [10:34:49<88:32:11, 8.64s/it] {'loss': 0.0887, 'grad_norm': 0.7422292232513428, 'learning_rate': 3.93967795828183e-05, 'epoch': 1.06} 11%|█ | 4374/41250 [10:34:49<88:32:11, 8.64s/it][2025-04-25 18:32:33,065] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-25 18:32:33,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.91 | bwd_microstep: 5676.56 | bwd_inner_microstep: 5663.78 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.46 [2025-04-25 18:32:33,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.91 | bwd: 5676.57 | bwd_inner: 5663.78 | bwd_allreduce: 12.75 | step: 18.46 11%|█ | 4375/41250 [10:34:58<88:23:17, 8.63s/it] {'loss': 0.2203, 'grad_norm': 1.7668137550354004, 'learning_rate': 3.939639676175709e-05, 'epoch': 1.06} 11%|█ | 4375/41250 [10:34:58<88:23:17, 8.63s/it][2025-04-25 18:32:41,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:32:41,710] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.76 | bwd_microstep: 5723.07 | bwd_inner_microstep: 5647.12 | bwd_allreduce_microstep: 75.91 | step_microstep: 18.69 [2025-04-25 18:32:41,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.76 | bwd: 5723.08 | bwd_inner: 5647.12 | bwd_allreduce: 75.92 | step: 18.69 11%|█ | 4376/41250 [10:35:07<88:26:01, 8.63s/it] {'loss': 0.2211, 'grad_norm': 1.5097174644470215, 'learning_rate': 3.939601382112091e-05, 'epoch': 1.06} 11%|█ | 4376/41250 [10:35:07<88:26:01, 8.63s/it][2025-04-25 18:32:50,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 18:32:50,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.99 | bwd_microstep: 5754.16 | bwd_inner_microstep: 5655.49 | bwd_allreduce_microstep: 98.61 | step_microstep: 18.54 [2025-04-25 18:32:50,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.99 | bwd: 5754.17 | bwd_inner: 5655.49 | bwd_allreduce: 98.63 | step: 18.54 11%|█ | 4377/41250 [10:35:15<88:31:22, 8.64s/it] {'loss': 0.1688, 'grad_norm': 2.0830674171447754, 'learning_rate': 3.939563076091213e-05, 'epoch': 1.06} 11%|█ | 4377/41250 [10:35:15<88:31:22, 8.64s/it][2025-04-25 18:32:59,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:32:59,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.78 | bwd_microstep: 5756.03 | bwd_inner_microstep: 5638.52 | bwd_allreduce_microstep: 117.47 | step_microstep: 18.60 [2025-04-25 18:32:59,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.78 | bwd: 5756.05 | bwd_inner: 5638.52 | bwd_allreduce: 117.49 | step: 18.60 11%|█ | 4378/41250 [10:35:24<88:36:43, 8.65s/it] {'loss': 0.3217, 'grad_norm': 2.794948101043701, 'learning_rate': 3.939524758113309e-05, 'epoch': 1.06} 11%|█ 
| 4378/41250 [10:35:24<88:36:43, 8.65s/it][2025-04-25 18:33:07,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:33:07,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.76 | bwd_microstep: 5820.64 | bwd_inner_microstep: 5652.76 | bwd_allreduce_microstep: 167.83 | step_microstep: 18.89 [2025-04-25 18:33:07,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.77 | bwd: 5820.65 | bwd_inner: 5652.76 | bwd_allreduce: 167.85 | step: 18.90 11%|█ | 4379/41250 [10:35:33<88:54:30, 8.68s/it] {'loss': 0.3309, 'grad_norm': 2.016970634460449, 'learning_rate': 3.939486428178616e-05, 'epoch': 1.06} 11%|█ | 4379/41250 [10:35:33<88:54:30, 8.68s/it][2025-04-25 18:33:16,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:33:16,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.04 | bwd_microstep: 5690.29 | bwd_inner_microstep: 5650.56 | bwd_allreduce_microstep: 39.69 | step_microstep: 18.79 [2025-04-25 18:33:16,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.04 | bwd: 5690.30 | bwd_inner: 5650.56 | bwd_allreduce: 39.70 | step: 18.79 11%|█ | 4380/41250 [10:35:41<88:38:24, 8.65s/it] {'loss': 0.1316, 'grad_norm': 4.002390384674072, 'learning_rate': 3.939448086287372e-05, 'epoch': 1.06} 11%|█ | 4380/41250 [10:35:41<88:38:24, 8.65s/it][2025-04-25 18:33:25,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:33:25,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.36 | bwd_microstep: 5771.24 | bwd_inner_microstep: 5655.70 | bwd_allreduce_microstep: 115.49 | step_microstep: 18.79 [2025-04-25 18:33:25,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.36 | bwd: 
5771.25 | bwd_inner: 5655.70 | bwd_allreduce: 115.51 | step: 18.79 11%|█ | 4381/41250 [10:35:50<88:42:50, 8.66s/it] {'loss': 0.1699, 'grad_norm': 1.4788113832473755, 'learning_rate': 3.939409732439811e-05, 'epoch': 1.06} 11%|█ | 4381/41250 [10:35:50<88:42:50, 8.66s/it][2025-04-25 18:33:33,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:33:33,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.08 | bwd_microstep: 5691.68 | bwd_inner_microstep: 5639.46 | bwd_allreduce_microstep: 52.17 | step_microstep: 18.75 [2025-04-25 18:33:33,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.08 | bwd: 5691.69 | bwd_inner: 5639.46 | bwd_allreduce: 52.18 | step: 18.76 11%|█ | 4382/41250 [10:35:58<88:31:09, 8.64s/it] {'loss': 0.4661, 'grad_norm': 4.3773980140686035, 'learning_rate': 3.939371366636171e-05, 'epoch': 1.06} 11%|█ | 4382/41250 [10:35:58<88:31:09, 8.64s/it][2025-04-25 18:33:42,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.05 [2025-04-25 18:33:42,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.19 | bwd_microstep: 5711.50 | bwd_inner_microstep: 5698.27 | bwd_allreduce_microstep: 13.18 | step_microstep: 19.35 [2025-04-25 18:33:42,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.19 | bwd: 5711.52 | bwd_inner: 5698.27 | bwd_allreduce: 13.20 | step: 19.35 11%|█ | 4383/41250 [10:36:07<88:30:01, 8.64s/it] {'loss': 0.1928, 'grad_norm': 1.420462965965271, 'learning_rate': 3.939332988876688e-05, 'epoch': 1.06} 11%|█ | 4383/41250 [10:36:07<88:30:01, 8.64s/it][2025-04-25 18:33:51,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.04 | optimizer_step: 0.93 [2025-04-25 18:33:51,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2856.91 | bwd_microstep: 5985.14 | bwd_inner_microstep: 5700.71 | bwd_allreduce_microstep: 284.38 | step_microstep: 19.44 [2025-04-25 18:33:51,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.91 | bwd: 5985.15 | bwd_inner: 5700.71 | bwd_allreduce: 284.40 | step: 19.44 11%|█ | 4384/41250 [10:36:16<89:22:06, 8.73s/it] {'loss': 0.203, 'grad_norm': 1.6352598667144775, 'learning_rate': 3.9392945991615986e-05, 'epoch': 1.06} 11%|█ | 4384/41250 [10:36:16<89:22:06, 8.73s/it][2025-04-25 18:33:59,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:33:59,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.30 | bwd_microstep: 5728.66 | bwd_inner_microstep: 5703.27 | bwd_allreduce_microstep: 25.34 | step_microstep: 18.89 [2025-04-25 18:33:59,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.30 | bwd: 5728.68 | bwd_inner: 5703.27 | bwd_allreduce: 25.36 | step: 18.89 11%|█ | 4385/41250 [10:36:25<89:11:56, 8.71s/it] {'loss': 0.2102, 'grad_norm': 2.190444231033325, 'learning_rate': 3.939256197491139e-05, 'epoch': 1.06} 11%|█ | 4385/41250 [10:36:25<89:11:56, 8.71s/it][2025-04-25 18:34:08,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.12 | optimizer_step: 1.04 [2025-04-25 18:34:08,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.24 | bwd_microstep: 5743.27 | bwd_inner_microstep: 5657.28 | bwd_allreduce_microstep: 85.93 | step_microstep: 19.47 [2025-04-25 18:34:08,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.24 | bwd: 5743.29 | bwd_inner: 5657.28 | bwd_allreduce: 85.96 | step: 19.47 11%|█ | 4386/41250 [10:36:33<89:03:07, 8.70s/it] {'loss': 0.0791, 'grad_norm': 0.6603943705558777, 'learning_rate': 3.939217783865546e-05, 'epoch': 1.06} 11%|█ | 4386/41250 [10:36:33<89:03:07, 8.70s/it][2025-04-25 
18:34:17,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-25 18:34:17,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.37 | bwd_microstep: 5790.57 | bwd_inner_microstep: 5777.71 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.25 [2025-04-25 18:34:17,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.37 | bwd: 5790.59 | bwd_inner: 5777.71 | bwd_allreduce: 12.83 | step: 19.25 11%|█ | 4387/41250 [10:36:42<89:15:17, 8.72s/it] {'loss': 0.277, 'grad_norm': 3.764476776123047, 'learning_rate': 3.939179358285058e-05, 'epoch': 1.06} 11%|█ | 4387/41250 [10:36:42<89:15:17, 8.72s/it][2025-04-25 18:34:25,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 1.02 [2025-04-25 18:34:25,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.72 | bwd_microstep: 5710.55 | bwd_inner_microstep: 5697.91 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.80 [2025-04-25 18:34:25,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.72 | bwd: 5710.57 | bwd_inner: 5697.91 | bwd_allreduce: 12.62 | step: 18.80 11%|█ | 4388/41250 [10:36:51<89:01:40, 8.69s/it] {'loss': 0.1864, 'grad_norm': 2.2933590412139893, 'learning_rate': 3.93914092074991e-05, 'epoch': 1.06} 11%|█ | 4388/41250 [10:36:51<89:01:40, 8.69s/it][2025-04-25 18:34:34,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:34:34,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.73 | bwd_microstep: 5687.46 | bwd_inner_microstep: 5663.21 | bwd_allreduce_microstep: 24.20 | step_microstep: 18.47 [2025-04-25 18:34:34,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.73 | bwd: 5687.47 | bwd_inner: 5663.21 | bwd_allreduce: 24.22 | step: 
18.47 11%|█ | 4389/41250 [10:36:59<88:44:24, 8.67s/it] {'loss': 0.1718, 'grad_norm': 1.1613348722457886, 'learning_rate': 3.93910247126034e-05, 'epoch': 1.06} 11%|█ | 4389/41250 [10:36:59<88:44:24, 8.67s/it][2025-04-25 18:34:43,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-25 18:34:43,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.84 | bwd_microstep: 5710.94 | bwd_inner_microstep: 5658.33 | bwd_allreduce_microstep: 52.55 | step_microstep: 20.26 [2025-04-25 18:34:43,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.84 | bwd: 5710.96 | bwd_inner: 5658.33 | bwd_allreduce: 52.57 | step: 20.26 11%|█ | 4390/41250 [10:37:08<88:36:07, 8.65s/it] {'loss': 0.1018, 'grad_norm': 2.734412670135498, 'learning_rate': 3.939064009816584e-05, 'epoch': 1.06} 11%|█ | 4390/41250 [10:37:08<88:36:07, 8.65s/it][2025-04-25 18:34:51,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:34:51,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.82 | bwd_microstep: 5732.64 | bwd_inner_microstep: 5719.92 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.71 [2025-04-25 18:34:51,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.82 | bwd: 5732.65 | bwd_inner: 5719.92 | bwd_allreduce: 12.69 | step: 18.71 11%|█ | 4391/41250 [10:37:17<88:39:22, 8.66s/it] {'loss': 0.0552, 'grad_norm': 1.0721778869628906, 'learning_rate': 3.93902553641888e-05, 'epoch': 1.06} 11%|█ | 4391/41250 [10:37:17<88:39:22, 8.66s/it][2025-04-25 18:35:00,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:35:00,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.84 | bwd_microstep: 5771.75 | bwd_inner_microstep: 
5689.19 | bwd_allreduce_microstep: 82.51 | step_microstep: 18.82 [2025-04-25 18:35:00,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.84 | bwd: 5771.76 | bwd_inner: 5689.19 | bwd_allreduce: 82.53 | step: 18.82 11%|█ | 4392/41250 [10:37:25<88:47:47, 8.67s/it] {'loss': 0.272, 'grad_norm': 2.5128087997436523, 'learning_rate': 3.938987051067465e-05, 'epoch': 1.06} 11%|█ | 4392/41250 [10:37:25<88:47:47, 8.67s/it][2025-04-25 18:35:09,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 18:35:09,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.81 | bwd_microstep: 5896.24 | bwd_inner_microstep: 5700.53 | bwd_allreduce_microstep: 195.66 | step_microstep: 18.44 [2025-04-25 18:35:09,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.81 | bwd: 5896.25 | bwd_inner: 5700.53 | bwd_allreduce: 195.68 | step: 18.44 11%|█ | 4393/41250 [10:37:34<89:15:50, 8.72s/it] {'loss': 0.0399, 'grad_norm': 0.4485118091106415, 'learning_rate': 3.938948553762576e-05, 'epoch': 1.06} 11%|█ | 4393/41250 [10:37:34<89:15:50, 8.72s/it][2025-04-25 18:35:18,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 1.11 [2025-04-25 18:35:18,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.96 | bwd_microstep: 5790.84 | bwd_inner_microstep: 5703.61 | bwd_allreduce_microstep: 87.18 | step_microstep: 18.82 [2025-04-25 18:35:18,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.96 | bwd: 5790.85 | bwd_inner: 5703.61 | bwd_allreduce: 87.20 | step: 18.82 11%|█ | 4394/41250 [10:37:43<89:16:28, 8.72s/it] {'loss': 0.3051, 'grad_norm': 1.8765010833740234, 'learning_rate': 3.938910044504451e-05, 'epoch': 1.07} 11%|█ | 4394/41250 [10:37:43<89:16:28, 8.72s/it][2025-04-25 18:35:26,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.29 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 18:35:26,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.62 | bwd_microstep: 5770.41 | bwd_inner_microstep: 5662.93 | bwd_allreduce_microstep: 107.43 | step_microstep: 19.09 [2025-04-25 18:35:26,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.62 | bwd: 5770.43 | bwd_inner: 5662.93 | bwd_allreduce: 107.45 | step: 19.10 11%|█ | 4395/41250 [10:37:52<89:13:12, 8.72s/it] {'loss': 0.1475, 'grad_norm': 1.8870352506637573, 'learning_rate': 3.938871523293327e-05, 'epoch': 1.07} 11%|█ | 4395/41250 [10:37:52<89:13:12, 8.72s/it][2025-04-25 18:35:35,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-25 18:35:35,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.08 | bwd_microstep: 5780.01 | bwd_inner_microstep: 5655.17 | bwd_allreduce_microstep: 124.79 | step_microstep: 18.93 [2025-04-25 18:35:35,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.08 | bwd: 5780.02 | bwd_inner: 5655.17 | bwd_allreduce: 124.81 | step: 18.93 11%|█ | 4396/41250 [10:38:00<89:09:34, 8.71s/it] {'loss': 0.1032, 'grad_norm': 1.218840479850769, 'learning_rate': 3.938832990129441e-05, 'epoch': 1.07} 11%|█ | 4396/41250 [10:38:00<89:09:34, 8.71s/it][2025-04-25 18:35:44,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 18:35:44,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.20 | bwd_microstep: 5739.42 | bwd_inner_microstep: 5664.86 | bwd_allreduce_microstep: 74.51 | step_microstep: 19.16 [2025-04-25 18:35:44,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.20 | bwd: 5739.44 | bwd_inner: 5664.86 | bwd_allreduce: 74.53 | step: 19.16 11%|█ | 4397/41250 [10:38:09<88:59:01, 8.69s/it] 
{'loss': 0.1292, 'grad_norm': 1.3917323350906372, 'learning_rate': 3.9387944450130306e-05, 'epoch': 1.07} 11%|█ | 4397/41250 [10:38:09<88:59:01, 8.69s/it][2025-04-25 18:35:52,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.20 | optimizer_step: 0.94 [2025-04-25 18:35:52,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.06 | bwd_microstep: 5763.58 | bwd_inner_microstep: 5693.94 | bwd_allreduce_microstep: 69.59 | step_microstep: 19.43 [2025-04-25 18:35:52,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.06 | bwd: 5763.59 | bwd_inner: 5693.94 | bwd_allreduce: 69.61 | step: 19.43 11%|█ | 4398/41250 [10:38:18<88:59:53, 8.69s/it] {'loss': 0.0682, 'grad_norm': 0.7918466329574585, 'learning_rate': 3.938755887944335e-05, 'epoch': 1.07} 11%|█ | 4398/41250 [10:38:18<88:59:53, 8.69s/it][2025-04-25 18:36:01,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-25 18:36:01,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.46 | bwd_microstep: 5832.25 | bwd_inner_microstep: 5659.06 | bwd_allreduce_microstep: 173.14 | step_microstep: 18.70 [2025-04-25 18:36:01,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.46 | bwd: 5832.27 | bwd_inner: 5659.06 | bwd_allreduce: 173.16 | step: 18.70 11%|█ | 4399/41250 [10:38:26<89:10:31, 8.71s/it] {'loss': 0.191, 'grad_norm': 1.6946619749069214, 'learning_rate': 3.9387173189235896e-05, 'epoch': 1.07} 11%|█ | 4399/41250 [10:38:26<89:10:31, 8.71s/it][2025-04-25 18:36:10,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-25 18:36:10,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.48 | bwd_microstep: 5776.73 | bwd_inner_microstep: 5707.07 | bwd_allreduce_microstep: 69.61 | 
step_microstep: 19.27 [2025-04-25 18:36:10,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.48 | bwd: 5776.74 | bwd_inner: 5707.07 | bwd_allreduce: 69.63 | step: 19.27 11%|█ | 4400/41250 [10:38:35<89:12:10, 8.71s/it] {'loss': 0.2191, 'grad_norm': 1.5005919933319092, 'learning_rate': 3.938678737951034e-05, 'epoch': 1.07} 11%|█ | 4400/41250 [10:38:35<89:12:10, 8.71s/it][2025-04-25 18:36:19,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:36:19,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.18 | bwd_microstep: 5773.94 | bwd_inner_microstep: 5654.60 | bwd_allreduce_microstep: 119.29 | step_microstep: 18.62 [2025-04-25 18:36:19,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.18 | bwd: 5773.96 | bwd_inner: 5654.60 | bwd_allreduce: 119.31 | step: 18.63 11%|█ | 4401/41250 [10:38:44<89:08:21, 8.71s/it] {'loss': 0.0291, 'grad_norm': 0.30797114968299866, 'learning_rate': 3.938640145026905e-05, 'epoch': 1.07} 11%|█ | 4401/41250 [10:38:44<89:08:21, 8.71s/it][2025-04-25 18:36:27,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:36:27,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.58 | bwd_microstep: 5728.39 | bwd_inner_microstep: 5715.52 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.66 [2025-04-25 18:36:27,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.58 | bwd: 5728.40 | bwd_inner: 5715.52 | bwd_allreduce: 12.84 | step: 18.66 11%|█ | 4402/41250 [10:38:53<89:01:40, 8.70s/it] {'loss': 0.0985, 'grad_norm': 0.942489504814148, 'learning_rate': 3.9386015401514406e-05, 'epoch': 1.07} 11%|█ | 4402/41250 [10:38:53<89:01:40, 8.70s/it][2025-04-25 18:36:36,424] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 18:36:36,424] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.33 | bwd_microstep: 5768.48 | bwd_inner_microstep: 5671.46 | bwd_allreduce_microstep: 96.97 | step_microstep: 18.80 [2025-04-25 18:36:36,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.33 | bwd: 5768.49 | bwd_inner: 5671.46 | bwd_allreduce: 96.99 | step: 18.80 11%|█ | 4403/41250 [10:39:01<89:03:20, 8.70s/it] {'loss': 0.1946, 'grad_norm': 1.446420431137085, 'learning_rate': 3.9385629233248795e-05, 'epoch': 1.07} 11%|█ | 4403/41250 [10:39:01<89:03:20, 8.70s/it][2025-04-25 18:36:45,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-25 18:36:45,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.61 | bwd_microstep: 5739.95 | bwd_inner_microstep: 5664.57 | bwd_allreduce_microstep: 75.33 | step_microstep: 18.75 [2025-04-25 18:36:45,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.61 | bwd: 5739.96 | bwd_inner: 5664.57 | bwd_allreduce: 75.35 | step: 18.75 11%|█ | 4404/41250 [10:39:10<88:56:39, 8.69s/it] {'loss': 0.0737, 'grad_norm': 0.6900420188903809, 'learning_rate': 3.9385242945474586e-05, 'epoch': 1.07} 11%|█ | 4404/41250 [10:39:10<88:56:39, 8.69s/it][2025-04-25 18:36:53,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:36:53,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.58 | bwd_microstep: 5731.26 | bwd_inner_microstep: 5718.56 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.64 [2025-04-25 18:36:53,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.58 | bwd: 5731.27 | bwd_inner: 5718.56 | bwd_allreduce: 12.67 | step: 18.64 11%|█ | 4405/41250 [10:39:19<88:54:01, 8.69s/it] {'loss': 0.1729, 'grad_norm': 
1.9499508142471313, 'learning_rate': 3.938485653819417e-05, 'epoch': 1.07} 11%|█ | 4405/41250 [10:39:19<88:54:01, 8.69s/it][2025-04-25 18:37:02,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:37:02,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.34 | bwd_microstep: 5717.18 | bwd_inner_microstep: 5704.10 | bwd_allreduce_microstep: 13.03 | step_microstep: 18.90 [2025-04-25 18:37:02,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.34 | bwd: 5717.19 | bwd_inner: 5704.10 | bwd_allreduce: 13.06 | step: 18.91 11%|█ | 4406/41250 [10:39:27<88:48:00, 8.68s/it] {'loss': 0.0776, 'grad_norm': 0.8000380396842957, 'learning_rate': 3.938447001140993e-05, 'epoch': 1.07} 11%|█ | 4406/41250 [10:39:27<88:48:00, 8.68s/it][2025-04-25 18:37:11,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:37:11,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.14 | bwd_microstep: 5757.69 | bwd_inner_microstep: 5687.64 | bwd_allreduce_microstep: 70.01 | step_microstep: 18.54 [2025-04-25 18:37:11,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.14 | bwd: 5757.70 | bwd_inner: 5687.64 | bwd_allreduce: 70.02 | step: 18.54 11%|█ | 4407/41250 [10:39:36<88:50:43, 8.68s/it] {'loss': 0.0941, 'grad_norm': 1.1551799774169922, 'learning_rate': 3.938408336512424e-05, 'epoch': 1.07} 11%|█ | 4407/41250 [10:39:36<88:50:43, 8.68s/it][2025-04-25 18:37:19,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 18:37:19,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.30 | bwd_microstep: 5885.92 | bwd_inner_microstep: 5701.14 | bwd_allreduce_microstep: 184.72 | step_microstep: 18.65 [2025-04-25 
18:37:19,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.30 | bwd: 5885.93 | bwd_inner: 5701.14 | bwd_allreduce: 184.75 | step: 18.65 11%|█ | 4408/41250 [10:39:45<89:17:33, 8.73s/it] {'loss': 0.0816, 'grad_norm': 1.5748038291931152, 'learning_rate': 3.938369659933949e-05, 'epoch': 1.07} 11%|█ | 4408/41250 [10:39:45<89:17:33, 8.73s/it][2025-04-25 18:37:28,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 18:37:28,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.17 | bwd_microstep: 5713.07 | bwd_inner_microstep: 5653.30 | bwd_allreduce_microstep: 59.72 | step_microstep: 18.78 [2025-04-25 18:37:28,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.17 | bwd: 5713.08 | bwd_inner: 5653.30 | bwd_allreduce: 59.74 | step: 18.79 11%|█ | 4409/41250 [10:39:53<89:00:20, 8.70s/it] {'loss': 0.3335, 'grad_norm': 2.518967628479004, 'learning_rate': 3.938330971405806e-05, 'epoch': 1.07} 11%|█ | 4409/41250 [10:39:53<89:00:20, 8.70s/it][2025-04-25 18:37:37,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-25 18:37:37,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.76 | bwd_microstep: 5741.48 | bwd_inner_microstep: 5660.97 | bwd_allreduce_microstep: 80.47 | step_microstep: 19.22 [2025-04-25 18:37:37,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.76 | bwd: 5741.50 | bwd_inner: 5660.97 | bwd_allreduce: 80.49 | step: 19.22 11%|█ | 4410/41250 [10:40:02<88:54:42, 8.69s/it] {'loss': 0.2221, 'grad_norm': 1.7776679992675781, 'learning_rate': 3.938292270928234e-05, 'epoch': 1.07} 11%|█ | 4410/41250 [10:40:02<88:54:42, 8.69s/it][2025-04-25 18:37:45,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.72 | optimizer_gradients: 1.00 | optimizer_step: 0.89 
[2025-04-25 18:37:45,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.28 | bwd_microstep: 5698.90 | bwd_inner_microstep: 5686.40 | bwd_allreduce_microstep: 12.45 | step_microstep: 19.14 [2025-04-25 18:37:45,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.28 | bwd: 5698.91 | bwd_inner: 5686.40 | bwd_allreduce: 12.47 | step: 19.14 11%|█ | 4411/41250 [10:40:11<88:43:51, 8.67s/it] {'loss': 0.0244, 'grad_norm': 0.7130568623542786, 'learning_rate': 3.938253558501472e-05, 'epoch': 1.07} 11%|█ | 4411/41250 [10:40:11<88:43:51, 8.67s/it][2025-04-25 18:37:54,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.27 | optimizer_step: 1.02 [2025-04-25 18:37:54,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.49 | bwd_microstep: 5785.57 | bwd_inner_microstep: 5662.56 | bwd_allreduce_microstep: 122.96 | step_microstep: 19.91 [2025-04-25 18:37:54,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.49 | bwd: 5785.59 | bwd_inner: 5662.56 | bwd_allreduce: 122.98 | step: 19.91 11%|█ | 4412/41250 [10:40:19<88:49:05, 8.68s/it] {'loss': 0.0699, 'grad_norm': 1.0552712678909302, 'learning_rate': 3.9382148341257575e-05, 'epoch': 1.07} 11%|█ | 4412/41250 [10:40:19<88:49:05, 8.68s/it][2025-04-25 18:38:03,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 18:38:03,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.89 | bwd_microstep: 5807.97 | bwd_inner_microstep: 5651.98 | bwd_allreduce_microstep: 155.94 | step_microstep: 19.02 [2025-04-25 18:38:03,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.89 | bwd: 5807.98 | bwd_inner: 5651.98 | bwd_allreduce: 155.96 | step: 19.02 11%|█ | 4413/41250 [10:40:28<88:56:45, 8.69s/it] {'loss': 0.2272, 'grad_norm': 1.8068305253982544, 'learning_rate': 
3.9381760978013304e-05, 'epoch': 1.07} 11%|█ | 4413/41250 [10:40:28<88:56:45, 8.69s/it][2025-04-25 18:38:11,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:38:11,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.23 | bwd_microstep: 5706.06 | bwd_inner_microstep: 5693.42 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.75 [2025-04-25 18:38:11,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.23 | bwd: 5706.07 | bwd_inner: 5693.42 | bwd_allreduce: 12.61 | step: 18.75 11%|█ | 4414/41250 [10:40:37<88:46:09, 8.68s/it] {'loss': 0.2516, 'grad_norm': 2.5417561531066895, 'learning_rate': 3.938137349528428e-05, 'epoch': 1.07} 11%|█ | 4414/41250 [10:40:37<88:46:09, 8.68s/it][2025-04-25 18:38:20,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:38:20,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.39 | bwd_microstep: 5703.05 | bwd_inner_microstep: 5690.17 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.87 [2025-04-25 18:38:20,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.40 | bwd: 5703.07 | bwd_inner: 5690.17 | bwd_allreduce: 12.86 | step: 18.88 11%|█ | 4415/41250 [10:40:45<88:38:53, 8.66s/it] {'loss': 0.2104, 'grad_norm': 1.9450719356536865, 'learning_rate': 3.938098589307291e-05, 'epoch': 1.07} 11%|█ | 4415/41250 [10:40:45<88:38:53, 8.66s/it][2025-04-25 18:38:29,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:38:29,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.62 | bwd_microstep: 5769.23 | bwd_inner_microstep: 5654.08 | bwd_allreduce_microstep: 115.11 | step_microstep: 18.64 [2025-04-25 18:38:29,247] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.62 | bwd: 5769.25 | bwd_inner: 5654.08 | bwd_allreduce: 115.13 | step: 18.64 11%|█ | 4416/41250 [10:40:54<88:41:49, 8.67s/it] {'loss': 0.0633, 'grad_norm': 0.6060265898704529, 'learning_rate': 3.938059817138157e-05, 'epoch': 1.07} 11%|█ | 4416/41250 [10:40:54<88:41:49, 8.67s/it][2025-04-25 18:38:37,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.12 | optimizer_step: 1.16 [2025-04-25 18:38:37,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.51 | bwd_microstep: 5770.46 | bwd_inner_microstep: 5650.24 | bwd_allreduce_microstep: 120.16 | step_microstep: 19.72 [2025-04-25 18:38:37,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.51 | bwd: 5770.47 | bwd_inner: 5650.24 | bwd_allreduce: 120.19 | step: 19.72 11%|█ | 4417/41250 [10:41:03<88:44:54, 8.67s/it] {'loss': 0.044, 'grad_norm': 0.7225815057754517, 'learning_rate': 3.9380210330212644e-05, 'epoch': 1.07} 11%|█ | 4417/41250 [10:41:03<88:44:54, 8.67s/it][2025-04-25 18:38:46,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.30 | optimizer_step: 0.95 [2025-04-25 18:38:46,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.99 | bwd_microstep: 5679.14 | bwd_inner_microstep: 5654.12 | bwd_allreduce_microstep: 24.96 | step_microstep: 19.96 [2025-04-25 18:38:46,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.99 | bwd: 5679.15 | bwd_inner: 5654.12 | bwd_allreduce: 24.98 | step: 19.96 11%|█ | 4418/41250 [10:41:11<88:32:27, 8.65s/it] {'loss': 0.3643, 'grad_norm': 2.5174543857574463, 'learning_rate': 3.9379822369568544e-05, 'epoch': 1.07} 11%|█ | 4418/41250 [10:41:11<88:32:27, 8.65s/it][2025-04-25 18:38:55,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 
18:38:55,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.90 | bwd_microstep: 5717.67 | bwd_inner_microstep: 5704.85 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.04 [2025-04-25 18:38:55,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.90 | bwd: 5717.69 | bwd_inner: 5704.85 | bwd_allreduce: 12.80 | step: 19.04 11%|█ | 4419/41250 [10:41:20<88:31:30, 8.65s/it] {'loss': 0.0935, 'grad_norm': 1.332194209098816, 'learning_rate': 3.937943428945165e-05, 'epoch': 1.07} 11%|█ | 4419/41250 [10:41:20<88:31:30, 8.65s/it][2025-04-25 18:39:03,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.19 | optimizer_step: 0.89 [2025-04-25 18:39:03,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.14 | bwd_microstep: 5747.91 | bwd_inner_microstep: 5645.02 | bwd_allreduce_microstep: 102.84 | step_microstep: 19.16 [2025-04-25 18:39:03,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.14 | bwd: 5747.92 | bwd_inner: 5645.02 | bwd_allreduce: 102.85 | step: 19.16 11%|█ | 4420/41250 [10:41:29<88:32:28, 8.65s/it] {'loss': 0.2032, 'grad_norm': 1.5955408811569214, 'learning_rate': 3.937904608986435e-05, 'epoch': 1.07} 11%|█ | 4420/41250 [10:41:29<88:32:28, 8.65s/it][2025-04-25 18:39:12,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.15 | optimizer_step: 0.92 [2025-04-25 18:39:12,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.00 | bwd_microstep: 5873.31 | bwd_inner_microstep: 5662.96 | bwd_allreduce_microstep: 210.30 | step_microstep: 18.97 [2025-04-25 18:39:12,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.00 | bwd: 5873.32 | bwd_inner: 5662.96 | bwd_allreduce: 210.32 | step: 18.97 11%|█ | 4421/41250 [10:41:37<88:56:25, 8.69s/it] {'loss': 0.1767, 'grad_norm': 1.8288027048110962, 'learning_rate': 3.937865777080905e-05, 
'epoch': 1.07} 11%|█ | 4421/41250 [10:41:37<88:56:25, 8.69s/it][2025-04-25 18:39:21,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.14 | optimizer_step: 0.92 [2025-04-25 18:39:21,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.49 | bwd_microstep: 5858.40 | bwd_inner_microstep: 5686.32 | bwd_allreduce_microstep: 172.03 | step_microstep: 18.74 [2025-04-25 18:39:21,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.49 | bwd: 5858.41 | bwd_inner: 5686.32 | bwd_allreduce: 172.05 | step: 18.74 11%|█ | 4422/41250 [10:41:46<89:13:26, 8.72s/it] {'loss': 0.1271, 'grad_norm': 1.1595779657363892, 'learning_rate': 3.937826933228813e-05, 'epoch': 1.07} 11%|█ | 4422/41250 [10:41:46<89:13:26, 8.72s/it][2025-04-25 18:39:30,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 18:39:30,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.82 | bwd_microstep: 5711.36 | bwd_inner_microstep: 5698.42 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.03 [2025-04-25 18:39:30,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.82 | bwd: 5711.38 | bwd_inner: 5698.42 | bwd_allreduce: 12.91 | step: 19.03 11%|█ | 4423/41250 [10:41:55<88:58:09, 8.70s/it] {'loss': 0.1151, 'grad_norm': 1.0234304666519165, 'learning_rate': 3.9377880774303994e-05, 'epoch': 1.07} 11%|█ | 4423/41250 [10:41:55<88:58:09, 8.70s/it][2025-04-25 18:39:38,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:39:38,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.75 | bwd_microstep: 5692.81 | bwd_inner_microstep: 5652.46 | bwd_allreduce_microstep: 40.30 | step_microstep: 19.30 [2025-04-25 18:39:38,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2827.75 | bwd: 5692.82 | bwd_inner: 5652.46 | bwd_allreduce: 40.32 | step: 19.31 11%|█ | 4424/41250 [10:42:03<88:40:43, 8.67s/it] {'loss': 0.2381, 'grad_norm': 2.004948139190674, 'learning_rate': 3.9377492096859036e-05, 'epoch': 1.07} 11%|█ | 4424/41250 [10:42:03<88:40:43, 8.67s/it][2025-04-25 18:39:47,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 18:39:47,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.67 | bwd_microstep: 5671.55 | bwd_inner_microstep: 5635.10 | bwd_allreduce_microstep: 36.40 | step_microstep: 18.97 [2025-04-25 18:39:47,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.67 | bwd: 5671.56 | bwd_inner: 5635.10 | bwd_allreduce: 36.42 | step: 18.97 11%|█ | 4425/41250 [10:42:12<88:24:06, 8.64s/it] {'loss': 0.2867, 'grad_norm': 1.8418034315109253, 'learning_rate': 3.9377103299955645e-05, 'epoch': 1.07} 11%|█ | 4425/41250 [10:42:12<88:24:06, 8.64s/it][2025-04-25 18:39:56,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:39:56,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.63 | bwd_microstep: 5820.61 | bwd_inner_microstep: 5755.61 | bwd_allreduce_microstep: 64.96 | step_microstep: 18.63 [2025-04-25 18:39:56,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.63 | bwd: 5820.63 | bwd_inner: 5755.61 | bwd_allreduce: 64.98 | step: 18.63 11%|█ | 4426/41250 [10:42:21<88:49:23, 8.68s/it] {'loss': 0.1271, 'grad_norm': 0.7210027575492859, 'learning_rate': 3.937671438359623e-05, 'epoch': 1.07} 11%|█ | 4426/41250 [10:42:21<88:49:23, 8.68s/it][2025-04-25 18:40:04,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:40:04,685] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2841.23 | bwd_microstep: 5736.82 | bwd_inner_microstep: 5688.26 | bwd_allreduce_microstep: 48.52 | step_microstep: 18.98 [2025-04-25 18:40:04,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.23 | bwd: 5736.84 | bwd_inner: 5688.26 | bwd_allreduce: 48.53 | step: 18.98 11%|█ | 4427/41250 [10:42:30<88:44:54, 8.68s/it] {'loss': 0.1041, 'grad_norm': 1.1251496076583862, 'learning_rate': 3.9376325347783176e-05, 'epoch': 1.07} 11%|█ | 4427/41250 [10:42:30<88:44:54, 8.68s/it][2025-04-25 18:40:13,310] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-25 18:40:13,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.74 | bwd_microstep: 5700.59 | bwd_inner_microstep: 5687.72 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.12 [2025-04-25 18:40:13,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.74 | bwd: 5700.61 | bwd_inner: 5687.72 | bwd_allreduce: 12.85 | step: 19.12 11%|█ | 4428/41250 [10:42:38<88:35:35, 8.66s/it] {'loss': 0.2991, 'grad_norm': 2.2353999614715576, 'learning_rate': 3.9375936192518895e-05, 'epoch': 1.07} 11%|█ | 4428/41250 [10:42:38<88:35:35, 8.66s/it][2025-04-25 18:40:21,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 18:40:21,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.28 | bwd_microstep: 5734.22 | bwd_inner_microstep: 5652.64 | bwd_allreduce_microstep: 81.53 | step_microstep: 18.95 [2025-04-25 18:40:21,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.28 | bwd: 5734.24 | bwd_inner: 5652.64 | bwd_allreduce: 81.55 | step: 18.95 11%|█ | 4429/41250 [10:42:47<88:31:58, 8.66s/it] {'loss': 0.0438, 'grad_norm': 0.40286049246788025, 'learning_rate': 3.937554691780577e-05, 'epoch': 1.07} 11%|█ | 4429/41250 [10:42:47<88:31:58, 
8.66s/it][2025-04-25 18:40:30,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.67 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:40:30,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.04 | bwd_microstep: 5701.39 | bwd_inner_microstep: 5688.39 | bwd_allreduce_microstep: 12.95 | step_microstep: 19.03 [2025-04-25 18:40:30,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.04 | bwd: 5701.40 | bwd_inner: 5688.39 | bwd_allreduce: 12.97 | step: 19.04 11%|█ | 4430/41250 [10:42:55<88:26:52, 8.65s/it] {'loss': 0.2159, 'grad_norm': 2.207287073135376, 'learning_rate': 3.9375157523646215e-05, 'epoch': 1.07} 11%|█ | 4430/41250 [10:42:55<88:26:52, 8.65s/it][2025-04-25 18:40:39,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.73 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:40:39,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.11 | bwd_microstep: 5709.56 | bwd_inner_microstep: 5696.56 | bwd_allreduce_microstep: 12.96 | step_microstep: 19.40 [2025-04-25 18:40:39,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.11 | bwd: 5709.57 | bwd_inner: 5696.56 | bwd_allreduce: 12.97 | step: 19.40 11%|█ | 4431/41250 [10:43:04<88:27:36, 8.65s/it] {'loss': 0.2461, 'grad_norm': 2.833239793777466, 'learning_rate': 3.9374768010042626e-05, 'epoch': 1.07} 11%|█ | 4431/41250 [10:43:04<88:27:36, 8.65s/it][2025-04-25 18:40:47,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:40:47,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.21 | bwd_microstep: 5674.00 | bwd_inner_microstep: 5647.81 | bwd_allreduce_microstep: 26.14 | step_microstep: 18.61 [2025-04-25 18:40:47,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.21 | bwd: 5674.01 | bwd_inner: 5647.81 | 
bwd_allreduce: 26.16 | step: 18.62 11%|█ | 4432/41250 [10:43:13<88:15:00, 8.63s/it] {'loss': 0.2277, 'grad_norm': 1.44098699092865, 'learning_rate': 3.93743783769974e-05, 'epoch': 1.07} 11%|█ | 4432/41250 [10:43:13<88:15:00, 8.63s/it][2025-04-25 18:40:56,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:40:56,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.88 | bwd_microstep: 5692.02 | bwd_inner_microstep: 5679.43 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.31 [2025-04-25 18:40:56,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.88 | bwd: 5692.03 | bwd_inner: 5679.43 | bwd_allreduce: 12.57 | step: 18.31 11%|█ | 4433/41250 [10:43:21<88:14:17, 8.63s/it] {'loss': 0.1688, 'grad_norm': 2.2626988887786865, 'learning_rate': 3.937398862451294e-05, 'epoch': 1.07} 11%|█ | 4433/41250 [10:43:21<88:14:17, 8.63s/it][2025-04-25 18:41:05,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-25 18:41:05,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.50 | bwd_microstep: 5749.11 | bwd_inner_microstep: 5648.51 | bwd_allreduce_microstep: 100.55 | step_microstep: 19.19 [2025-04-25 18:41:05,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.50 | bwd: 5749.12 | bwd_inner: 5648.51 | bwd_allreduce: 100.57 | step: 19.19 11%|█ | 4434/41250 [10:43:30<88:18:16, 8.63s/it] {'loss': 0.1729, 'grad_norm': 1.4599367380142212, 'learning_rate': 3.9373598752591647e-05, 'epoch': 1.07} 11%|█ | 4434/41250 [10:43:30<88:18:16, 8.63s/it][2025-04-25 18:41:13,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.58 | optimizer_gradients: 1.34 | optimizer_step: 0.90 [2025-04-25 18:41:13,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.79 | bwd_microstep: 
5797.91 | bwd_inner_microstep: 5785.12 | bwd_allreduce_microstep: 12.74 | step_microstep: 21.10 [2025-04-25 18:41:13,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.79 | bwd: 5797.92 | bwd_inner: 5785.12 | bwd_allreduce: 12.76 | step: 21.11 11%|█ | 4435/41250 [10:43:39<88:42:57, 8.68s/it] {'loss': 0.164, 'grad_norm': 2.6037352085113525, 'learning_rate': 3.937320876123594e-05, 'epoch': 1.08} 11%|█ | 4435/41250 [10:43:39<88:42:57, 8.68s/it][2025-04-25 18:41:22,511] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:41:22,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.07 | bwd_microstep: 5717.66 | bwd_inner_microstep: 5689.35 | bwd_allreduce_microstep: 28.26 | step_microstep: 21.21 [2025-04-25 18:41:22,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.07 | bwd: 5717.67 | bwd_inner: 5689.35 | bwd_allreduce: 28.28 | step: 21.21 11%|█ | 4436/41250 [10:43:47<88:38:13, 8.67s/it] {'loss': 0.2557, 'grad_norm': 2.169285535812378, 'learning_rate': 3.937281865044821e-05, 'epoch': 1.08} 11%|█ | 4436/41250 [10:43:47<88:38:13, 8.67s/it][2025-04-25 18:41:31,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 18:41:31,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.58 | bwd_microstep: 5748.85 | bwd_inner_microstep: 5693.81 | bwd_allreduce_microstep: 54.99 | step_microstep: 19.04 [2025-04-25 18:41:31,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.58 | bwd: 5748.87 | bwd_inner: 5693.81 | bwd_allreduce: 55.02 | step: 19.04 11%|█ | 4437/41250 [10:43:56<88:40:10, 8.67s/it] {'loss': 0.081, 'grad_norm': 0.5970475077629089, 'learning_rate': 3.937242842023085e-05, 'epoch': 1.08} 11%|█ | 4437/41250 [10:43:56<88:40:10, 8.67s/it][2025-04-25 18:41:39,788] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 18:41:39,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.41 | bwd_microstep: 5689.60 | bwd_inner_microstep: 5640.21 | bwd_allreduce_microstep: 49.34 | step_microstep: 18.67 [2025-04-25 18:41:39,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.41 | bwd: 5689.61 | bwd_inner: 5640.21 | bwd_allreduce: 49.35 | step: 18.67 11%|█ | 4438/41250 [10:44:05<88:26:12, 8.65s/it] {'loss': 0.316, 'grad_norm': 1.703989863395691, 'learning_rate': 3.937203807058629e-05, 'epoch': 1.08} 11%|█ | 4438/41250 [10:44:05<88:26:12, 8.65s/it][2025-04-25 18:41:48,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-25 18:41:48,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5703.21 | bwd_inner_microstep: 5690.24 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.17 [2025-04-25 18:41:48,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.78 | bwd: 5703.22 | bwd_inner: 5690.24 | bwd_allreduce: 12.94 | step: 19.17 11%|█ | 4439/41250 [10:44:13<88:23:53, 8.65s/it] {'loss': 0.2568, 'grad_norm': 2.739767074584961, 'learning_rate': 3.937164760151692e-05, 'epoch': 1.08} 11%|█ | 4439/41250 [10:44:13<88:23:53, 8.65s/it][2025-04-25 18:41:57,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:41:57,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.46 | bwd_microstep: 5743.57 | bwd_inner_microstep: 5695.00 | bwd_allreduce_microstep: 48.53 | step_microstep: 18.78 [2025-04-25 18:41:57,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.46 | bwd: 5743.59 | bwd_inner: 5695.00 | bwd_allreduce: 48.54 | step: 18.78 11%|█ | 
4440/41250 [10:44:22<88:28:59, 8.65s/it] {'loss': 0.0195, 'grad_norm': 0.3446187674999237, 'learning_rate': 3.937125701302516e-05, 'epoch': 1.08} 11%|█ | 4440/41250 [10:44:22<88:28:59, 8.65s/it][2025-04-25 18:42:05,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:42:05,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.29 | bwd_microstep: 5762.01 | bwd_inner_microstep: 5656.52 | bwd_allreduce_microstep: 105.45 | step_microstep: 18.85 [2025-04-25 18:42:05,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.29 | bwd: 5762.02 | bwd_inner: 5656.52 | bwd_allreduce: 105.46 | step: 18.85 11%|█ | 4441/41250 [10:44:31<88:32:37, 8.66s/it] {'loss': 0.0641, 'grad_norm': 0.949478268623352, 'learning_rate': 3.937086630511341e-05, 'epoch': 1.08} 11%|█ | 4441/41250 [10:44:31<88:32:37, 8.66s/it][2025-04-25 18:42:14,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:42:14,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.68 | bwd_microstep: 5715.58 | bwd_inner_microstep: 5702.77 | bwd_allreduce_microstep: 12.76 | step_microstep: 19.01 [2025-04-25 18:42:14,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.68 | bwd: 5715.59 | bwd_inner: 5702.77 | bwd_allreduce: 12.77 | step: 19.01 11%|█ | 4442/41250 [10:44:39<88:31:05, 8.66s/it] {'loss': 0.0444, 'grad_norm': 0.6931912302970886, 'learning_rate': 3.9370475477784076e-05, 'epoch': 1.08} 11%|█ | 4442/41250 [10:44:39<88:31:05, 8.66s/it][2025-04-25 18:42:23,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.26 | optimizer_step: 1.03 [2025-04-25 18:42:23,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.53 | bwd_microstep: 5752.77 | bwd_inner_microstep: 5682.25 | 
bwd_allreduce_microstep: 70.46 | step_microstep: 19.83 [2025-04-25 18:42:23,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.53 | bwd: 5752.78 | bwd_inner: 5682.25 | bwd_allreduce: 70.49 | step: 19.83 11%|█ | 4443/41250 [10:44:48<88:36:25, 8.67s/it] {'loss': 0.0191, 'grad_norm': 0.33196255564689636, 'learning_rate': 3.937008453103957e-05, 'epoch': 1.08} 11%|█ | 4443/41250 [10:44:48<88:36:25, 8.67s/it][2025-04-25 18:42:31,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:42:31,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.22 | bwd_microstep: 5748.03 | bwd_inner_microstep: 5694.61 | bwd_allreduce_microstep: 53.38 | step_microstep: 18.52 [2025-04-25 18:42:31,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.22 | bwd: 5748.05 | bwd_inner: 5694.61 | bwd_allreduce: 53.40 | step: 18.52 11%|█ | 4444/41250 [10:44:57<88:39:24, 8.67s/it] {'loss': 0.1721, 'grad_norm': 1.4878524541854858, 'learning_rate': 3.93696934648823e-05, 'epoch': 1.08} 11%|█ | 4444/41250 [10:44:57<88:39:24, 8.67s/it][2025-04-25 18:42:40,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 18:42:40,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.55 | bwd_microstep: 5718.77 | bwd_inner_microstep: 5659.62 | bwd_allreduce_microstep: 59.11 | step_microstep: 18.58 [2025-04-25 18:42:40,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.55 | bwd: 5718.79 | bwd_inner: 5659.62 | bwd_allreduce: 59.13 | step: 18.58 11%|█ | 4445/41250 [10:45:05<88:31:56, 8.66s/it] {'loss': 0.115, 'grad_norm': 1.5637929439544678, 'learning_rate': 3.936930227931469e-05, 'epoch': 1.08} 11%|█ | 4445/41250 [10:45:05<88:31:56, 8.66s/it][2025-04-25 18:42:49,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:42:49,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.28 | bwd_microstep: 5728.51 | bwd_inner_microstep: 5715.83 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.80 [2025-04-25 18:42:49,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.28 | bwd: 5728.53 | bwd_inner: 5715.83 | bwd_allreduce: 12.66 | step: 18.80 11%|█ | 4446/41250 [10:45:14<88:32:51, 8.66s/it] {'loss': 0.0759, 'grad_norm': 0.9053823947906494, 'learning_rate': 3.936891097433913e-05, 'epoch': 1.08} 11%|█ | 4446/41250 [10:45:14<88:32:51, 8.66s/it][2025-04-25 18:42:57,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:42:57,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.52 | bwd_microstep: 5748.00 | bwd_inner_microstep: 5703.77 | bwd_allreduce_microstep: 44.19 | step_microstep: 18.91 [2025-04-25 18:42:57,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.52 | bwd: 5748.02 | bwd_inner: 5703.77 | bwd_allreduce: 44.21 | step: 18.92 11%|█ | 4447/41250 [10:45:23<88:36:55, 8.67s/it] {'loss': 0.0395, 'grad_norm': 0.5109105706214905, 'learning_rate': 3.936851954995805e-05, 'epoch': 1.08} 11%|█ | 4447/41250 [10:45:23<88:36:55, 8.67s/it][2025-04-25 18:43:06,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.25 | optimizer_step: 1.01 [2025-04-25 18:43:06,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.71 | bwd_microstep: 5740.61 | bwd_inner_microstep: 5653.27 | bwd_allreduce_microstep: 87.28 | step_microstep: 19.50 [2025-04-25 18:43:06,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.72 | bwd: 5740.62 | bwd_inner: 5653.27 | bwd_allreduce: 87.31 | step: 19.51 11%|█ | 4448/41250 [10:45:31<88:34:10, 8.66s/it] {'loss': 
0.132, 'grad_norm': 2.068089723587036, 'learning_rate': 3.936812800617385e-05, 'epoch': 1.08} 11%|█ | 4448/41250 [10:45:31<88:34:10, 8.66s/it][2025-04-25 18:43:15,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.06 | optimizer_step: 1.09 [2025-04-25 18:43:15,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.50 | bwd_microstep: 5718.23 | bwd_inner_microstep: 5703.83 | bwd_allreduce_microstep: 14.33 | step_microstep: 19.30 [2025-04-25 18:43:15,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.50 | bwd: 5718.24 | bwd_inner: 5703.83 | bwd_allreduce: 14.36 | step: 19.30 11%|█ | 4449/41250 [10:45:40<88:33:13, 8.66s/it] {'loss': 0.171, 'grad_norm': 1.8694559335708618, 'learning_rate': 3.936773634298896e-05, 'epoch': 1.08} 11%|█ | 4449/41250 [10:45:40<88:33:13, 8.66s/it][2025-04-25 18:43:23,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 18:43:23,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.48 | bwd_microstep: 5801.44 | bwd_inner_microstep: 5666.21 | bwd_allreduce_microstep: 135.18 | step_microstep: 18.59 [2025-04-25 18:43:23,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.48 | bwd: 5801.45 | bwd_inner: 5666.21 | bwd_allreduce: 135.20 | step: 18.59 11%|█ | 4450/41250 [10:45:49<88:42:00, 8.68s/it] {'loss': 0.0372, 'grad_norm': 0.7233862280845642, 'learning_rate': 3.936734456040578e-05, 'epoch': 1.08} 11%|█ | 4450/41250 [10:45:49<88:42:00, 8.68s/it][2025-04-25 18:43:32,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.82 | optimizer_gradients: 1.25 | optimizer_step: 1.03 [2025-04-25 18:43:32,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.62 | bwd_microstep: 5707.46 | bwd_inner_microstep: 5694.12 | bwd_allreduce_microstep: 13.27 | step_microstep: 20.04 
[2025-04-25 18:43:32,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.62 | bwd: 5707.48 | bwd_inner: 5694.12 | bwd_allreduce: 13.30 | step: 20.04 11%|█ | 4451/41250 [10:45:57<88:37:41, 8.67s/it] {'loss': 0.0897, 'grad_norm': 2.108818292617798, 'learning_rate': 3.936695265842673e-05, 'epoch': 1.08} 11%|█ | 4451/41250 [10:45:57<88:37:41, 8.67s/it][2025-04-25 18:43:41,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 1.09 [2025-04-25 18:43:41,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.36 | bwd_microstep: 5710.09 | bwd_inner_microstep: 5696.84 | bwd_allreduce_microstep: 13.20 | step_microstep: 18.91 [2025-04-25 18:43:41,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.36 | bwd: 5710.10 | bwd_inner: 5696.84 | bwd_allreduce: 13.22 | step: 18.92 11%|█ | 4452/41250 [10:46:06<88:32:47, 8.66s/it] {'loss': 0.3424, 'grad_norm': 5.507380485534668, 'learning_rate': 3.9366560637054226e-05, 'epoch': 1.08} 11%|█ | 4452/41250 [10:46:06<88:32:47, 8.66s/it][2025-04-25 18:43:49,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 18:43:49,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.63 | bwd_microstep: 5756.50 | bwd_inner_microstep: 5717.76 | bwd_allreduce_microstep: 38.70 | step_microstep: 18.65 [2025-04-25 18:43:49,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.63 | bwd: 5756.52 | bwd_inner: 5717.76 | bwd_allreduce: 38.72 | step: 18.66 11%|█ | 4453/41250 [10:46:15<88:38:59, 8.67s/it] {'loss': 0.1579, 'grad_norm': 1.580683708190918, 'learning_rate': 3.936616849629069e-05, 'epoch': 1.08} 11%|█ | 4453/41250 [10:46:15<88:38:59, 8.67s/it][2025-04-25 18:43:58,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 
0.90 [2025-04-25 18:43:58,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.74 | bwd_microstep: 5805.99 | bwd_inner_microstep: 5657.93 | bwd_allreduce_microstep: 148.01 | step_microstep: 18.83 [2025-04-25 18:43:58,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.74 | bwd: 5806.00 | bwd_inner: 5657.93 | bwd_allreduce: 148.02 | step: 18.84 11%|█ | 4454/41250 [10:46:23<88:50:04, 8.69s/it] {'loss': 0.0792, 'grad_norm': 1.2625483274459839, 'learning_rate': 3.9365776236138524e-05, 'epoch': 1.08} 11%|█ | 4454/41250 [10:46:23<88:50:04, 8.69s/it][2025-04-25 18:44:07,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 18:44:07,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.67 | bwd_microstep: 5788.52 | bwd_inner_microstep: 5661.93 | bwd_allreduce_microstep: 126.53 | step_microstep: 18.84 [2025-04-25 18:44:07,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.67 | bwd: 5788.53 | bwd_inner: 5661.93 | bwd_allreduce: 126.55 | step: 18.84 11%|█ | 4455/41250 [10:46:32<88:52:17, 8.70s/it] {'loss': 0.0386, 'grad_norm': 0.5451412200927734, 'learning_rate': 3.936538385660016e-05, 'epoch': 1.08} 11%|█ | 4455/41250 [10:46:32<88:52:17, 8.70s/it][2025-04-25 18:44:16,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.19 | optimizer_step: 1.02 [2025-04-25 18:44:16,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.38 | bwd_microstep: 5899.00 | bwd_inner_microstep: 5715.71 | bwd_allreduce_microstep: 183.24 | step_microstep: 19.48 [2025-04-25 18:44:16,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.38 | bwd: 5899.01 | bwd_inner: 5715.71 | bwd_allreduce: 183.26 | step: 19.48 11%|█ | 4456/41250 [10:46:41<89:18:54, 8.74s/it] {'loss': 0.0532, 'grad_norm': 0.4197375476360321, 'learning_rate': 
3.9364991357678014e-05, 'epoch': 1.08} 11%|█ | 4456/41250 [10:46:41<89:18:54, 8.74s/it][2025-04-25 18:44:24,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 18:44:24,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.58 | bwd_microstep: 5715.71 | bwd_inner_microstep: 5670.93 | bwd_allreduce_microstep: 44.72 | step_microstep: 19.00 [2025-04-25 18:44:24,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.58 | bwd: 5715.72 | bwd_inner: 5670.93 | bwd_allreduce: 44.74 | step: 19.00 11%|█ | 4457/41250 [10:46:50<89:00:34, 8.71s/it] {'loss': 0.0285, 'grad_norm': 0.5160013437271118, 'learning_rate': 3.936459873937451e-05, 'epoch': 1.08} 11%|█ | 4457/41250 [10:46:50<89:00:34, 8.71s/it][2025-04-25 18:44:33,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.12 | optimizer_step: 0.97 [2025-04-25 18:44:33,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.74 | bwd_microstep: 5770.21 | bwd_inner_microstep: 5666.51 | bwd_allreduce_microstep: 103.65 | step_microstep: 19.11 [2025-04-25 18:44:33,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.74 | bwd: 5770.23 | bwd_inner: 5666.51 | bwd_allreduce: 103.67 | step: 19.11 11%|█ | 4458/41250 [10:46:58<88:57:52, 8.70s/it] {'loss': 0.0929, 'grad_norm': 1.7485343217849731, 'learning_rate': 3.9364206001692055e-05, 'epoch': 1.08} 11%|█ | 4458/41250 [10:46:58<88:57:52, 8.70s/it][2025-04-25 18:44:42,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 18:44:42,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.76 | bwd_microstep: 5783.23 | bwd_inner_microstep: 5661.81 | bwd_allreduce_microstep: 121.36 | step_microstep: 18.88 [2025-04-25 18:44:42,112] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.76 | bwd: 5783.24 | bwd_inner: 5661.81 | bwd_allreduce: 121.38 | step: 18.88 11%|█ | 4459/41250 [10:47:07<88:57:01, 8.70s/it] {'loss': 0.1865, 'grad_norm': 1.6928972005844116, 'learning_rate': 3.936381314463309e-05, 'epoch': 1.08} 11%|█ | 4459/41250 [10:47:07<88:57:01, 8.70s/it][2025-04-25 18:44:50,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:44:50,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.47 | bwd_microstep: 5772.10 | bwd_inner_microstep: 5702.41 | bwd_allreduce_microstep: 69.64 | step_microstep: 19.00 [2025-04-25 18:44:50,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.47 | bwd: 5772.11 | bwd_inner: 5702.41 | bwd_allreduce: 69.66 | step: 19.00 11%|█ | 4460/41250 [10:47:16<88:59:55, 8.71s/it] {'loss': 0.3383, 'grad_norm': 1.6606882810592651, 'learning_rate': 3.936342016820001e-05, 'epoch': 1.08} 11%|█ | 4460/41250 [10:47:16<88:59:55, 8.71s/it][2025-04-25 18:44:59,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.97 | optimizer_step: 0.91 [2025-04-25 18:44:59,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.89 | bwd_microstep: 5703.83 | bwd_inner_microstep: 5663.06 | bwd_allreduce_microstep: 40.73 | step_microstep: 18.61 [2025-04-25 18:44:59,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.89 | bwd: 5703.84 | bwd_inner: 5663.06 | bwd_allreduce: 40.74 | step: 18.62 11%|█ | 4461/41250 [10:47:24<88:42:42, 8.68s/it] {'loss': 0.083, 'grad_norm': 1.0882588624954224, 'learning_rate': 3.936302707239527e-05, 'epoch': 1.08} 11%|█ | 4461/41250 [10:47:24<88:42:42, 8.68s/it][2025-04-25 18:45:08,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 18:45:08,131] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5740.58 | bwd_inner_microstep: 5704.75 | bwd_allreduce_microstep: 35.79 | step_microstep: 18.86 [2025-04-25 18:45:08,131] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5740.60 | bwd_inner: 5704.75 | bwd_allreduce: 35.81 | step: 18.86 11%|█ | 4462/41250 [10:47:33<88:42:41, 8.68s/it] {'loss': 0.0763, 'grad_norm': 1.3595836162567139, 'learning_rate': 3.936263385722127e-05, 'epoch': 1.08} 11%|█ | 4462/41250 [10:47:33<88:42:41, 8.68s/it][2025-04-25 18:45:16,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.06 | optimizer_step: 1.02 [2025-04-25 18:45:16,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.87 | bwd_microstep: 5798.88 | bwd_inner_microstep: 5652.17 | bwd_allreduce_microstep: 146.65 | step_microstep: 19.48 [2025-04-25 18:45:16,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.87 | bwd: 5798.90 | bwd_inner: 5652.17 | bwd_allreduce: 146.68 | step: 19.49 11%|█ | 4463/41250 [10:47:42<88:48:09, 8.69s/it] {'loss': 0.1718, 'grad_norm': 1.495214819908142, 'learning_rate': 3.936224052268044e-05, 'epoch': 1.08} 11%|█ | 4463/41250 [10:47:42<88:48:09, 8.69s/it][2025-04-25 18:45:25,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-25 18:45:25,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.83 | bwd_microstep: 5764.10 | bwd_inner_microstep: 5690.99 | bwd_allreduce_microstep: 73.07 | step_microstep: 19.16 [2025-04-25 18:45:25,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.83 | bwd: 5764.12 | bwd_inner: 5690.99 | bwd_allreduce: 73.09 | step: 19.17 11%|█ | 4464/41250 [10:47:50<88:51:03, 8.70s/it] {'loss': 0.1638, 'grad_norm': 2.1290366649627686, 'learning_rate': 3.93618470687752e-05, 'epoch': 1.08} 11%|█ | 
4464/41250 [10:47:50<88:51:03, 8.70s/it][2025-04-25 18:45:34,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:45:34,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.53 | bwd_microstep: 5704.99 | bwd_inner_microstep: 5692.04 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.79 [2025-04-25 18:45:34,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.53 | bwd: 5705.01 | bwd_inner: 5692.04 | bwd_allreduce: 12.92 | step: 18.79 11%|█ | 4465/41250 [10:47:59<88:42:18, 8.68s/it] {'loss': 0.0521, 'grad_norm': 0.7306700348854065, 'learning_rate': 3.936145349550799e-05, 'epoch': 1.08} 11%|█ | 4465/41250 [10:47:59<88:42:18, 8.68s/it][2025-04-25 18:45:42,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.17 | optimizer_step: 0.96 [2025-04-25 18:45:42,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.47 | bwd_microstep: 5756.77 | bwd_inner_microstep: 5658.85 | bwd_allreduce_microstep: 97.87 | step_microstep: 19.05 [2025-04-25 18:45:42,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.47 | bwd: 5756.78 | bwd_inner: 5658.85 | bwd_allreduce: 97.89 | step: 19.06 11%|█ | 4466/41250 [10:48:08<88:41:47, 8.68s/it] {'loss': 0.0279, 'grad_norm': 0.8303221464157104, 'learning_rate': 3.9361059802881226e-05, 'epoch': 1.08} 11%|█ | 4466/41250 [10:48:08<88:41:47, 8.68s/it][2025-04-25 18:45:51,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:45:51,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.13 | bwd_microstep: 5725.13 | bwd_inner_microstep: 5695.68 | bwd_allreduce_microstep: 29.40 | step_microstep: 18.46 [2025-04-25 18:45:51,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.13 | bwd: 
5725.14 | bwd_inner: 5695.68 | bwd_allreduce: 29.42 | step: 18.46 11%|█ | 4467/41250 [10:48:16<88:39:50, 8.68s/it] {'loss': 0.247, 'grad_norm': 2.485633611679077, 'learning_rate': 3.936066599089733e-05, 'epoch': 1.08} 11%|█ | 4467/41250 [10:48:16<88:39:50, 8.68s/it][2025-04-25 18:46:00,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 18:46:00,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.84 | bwd_microstep: 5788.85 | bwd_inner_microstep: 5647.61 | bwd_allreduce_microstep: 141.19 | step_microstep: 19.18 [2025-04-25 18:46:00,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.84 | bwd: 5788.86 | bwd_inner: 5647.61 | bwd_allreduce: 141.21 | step: 19.18 11%|█ | 4468/41250 [10:48:25<88:44:02, 8.68s/it] {'loss': 0.1751, 'grad_norm': 1.1373461484909058, 'learning_rate': 3.936027205955874e-05, 'epoch': 1.08} 11%|█ | 4468/41250 [10:48:25<88:44:02, 8.68s/it][2025-04-25 18:46:08,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:46:08,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.70 | bwd_microstep: 5710.03 | bwd_inner_microstep: 5639.20 | bwd_allreduce_microstep: 70.77 | step_microstep: 18.43 [2025-04-25 18:46:08,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.70 | bwd: 5710.05 | bwd_inner: 5639.20 | bwd_allreduce: 70.80 | step: 18.44 11%|█ | 4469/41250 [10:48:34<88:32:53, 8.67s/it] {'loss': 0.2761, 'grad_norm': 1.48638117313385, 'learning_rate': 3.935987800886788e-05, 'epoch': 1.08} 11%|█ | 4469/41250 [10:48:34<88:32:53, 8.67s/it][2025-04-25 18:46:17,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 0.91 [2025-04-25 18:46:17,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2831.45 | bwd_microstep: 5755.28 | bwd_inner_microstep: 5651.84 | bwd_allreduce_microstep: 103.40 | step_microstep: 18.81 [2025-04-25 18:46:17,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.46 | bwd: 5755.29 | bwd_inner: 5651.83 | bwd_allreduce: 103.42 | step: 18.82 11%|█ | 4470/41250 [10:48:42<88:33:42, 8.67s/it] {'loss': 0.1612, 'grad_norm': 1.872536063194275, 'learning_rate': 3.9359483838827185e-05, 'epoch': 1.08} 11%|█ | 4470/41250 [10:48:42<88:33:42, 8.67s/it][2025-04-25 18:46:26,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-25 18:46:26,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.18 | bwd_microstep: 5710.10 | bwd_inner_microstep: 5697.35 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.67 [2025-04-25 18:46:26,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.18 | bwd: 5710.11 | bwd_inner: 5697.35 | bwd_allreduce: 12.72 | step: 18.68 11%|█ | 4471/41250 [10:48:51<88:30:48, 8.66s/it] {'loss': 0.0624, 'grad_norm': 1.7096320390701294, 'learning_rate': 3.935908954943908e-05, 'epoch': 1.08} 11%|█ | 4471/41250 [10:48:51<88:30:48, 8.66s/it][2025-04-25 18:46:34,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:46:34,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.19 | bwd_microstep: 5712.24 | bwd_inner_microstep: 5698.11 | bwd_allreduce_microstep: 14.09 | step_microstep: 18.68 [2025-04-25 18:46:34,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.19 | bwd: 5712.25 | bwd_inner: 5698.11 | bwd_allreduce: 14.10 | step: 18.69 11%|█ | 4472/41250 [10:49:00<88:28:29, 8.66s/it] {'loss': 0.2341, 'grad_norm': 2.134669542312622, 'learning_rate': 3.935869514070599e-05, 'epoch': 1.08} 11%|█ | 4472/41250 [10:49:00<88:28:29, 8.66s/it][2025-04-25 18:46:43,507] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:46:43,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.56 | bwd_microstep: 5743.90 | bwd_inner_microstep: 5639.97 | bwd_allreduce_microstep: 103.88 | step_microstep: 18.40 [2025-04-25 18:46:43,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.56 | bwd: 5743.91 | bwd_inner: 5639.97 | bwd_allreduce: 103.90 | step: 18.40 11%|█ | 4473/41250 [10:49:08<88:27:18, 8.66s/it] {'loss': 0.2038, 'grad_norm': 2.6601359844207764, 'learning_rate': 3.9358300612630354e-05, 'epoch': 1.08} 11%|█ | 4473/41250 [10:49:08<88:27:18, 8.66s/it][2025-04-25 18:46:52,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.09 | optimizer_step: 1.04 [2025-04-25 18:46:52,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.53 | bwd_microstep: 5743.03 | bwd_inner_microstep: 5679.99 | bwd_allreduce_microstep: 62.98 | step_microstep: 19.78 [2025-04-25 18:46:52,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.53 | bwd: 5743.05 | bwd_inner: 5679.99 | bwd_allreduce: 63.01 | step: 19.78 11%|█ | 4474/41250 [10:49:17<88:31:44, 8.67s/it] {'loss': 0.1967, 'grad_norm': 2.7388408184051514, 'learning_rate': 3.93579059652146e-05, 'epoch': 1.08} 11%|█ | 4474/41250 [10:49:17<88:31:44, 8.67s/it][2025-04-25 18:47:00,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.23 | optimizer_step: 0.93 [2025-04-25 18:47:00,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.59 | bwd_microstep: 5732.55 | bwd_inner_microstep: 5697.02 | bwd_allreduce_microstep: 35.48 | step_microstep: 19.39 [2025-04-25 18:47:00,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.59 | bwd: 5732.57 | bwd_inner: 5697.02 | bwd_allreduce: 35.51 | step: 19.40 
11%|█ | 4475/41250 [10:49:26<88:32:28, 8.67s/it] {'loss': 0.0922, 'grad_norm': 1.4723496437072754, 'learning_rate': 3.935751119846117e-05, 'epoch': 1.08} 11%|█ | 4475/41250 [10:49:26<88:32:28, 8.67s/it][2025-04-25 18:47:09,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 18:47:09,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.99 | bwd_microstep: 5743.52 | bwd_inner_microstep: 5683.85 | bwd_allreduce_microstep: 59.63 | step_microstep: 19.12 [2025-04-25 18:47:09,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.00 | bwd: 5743.54 | bwd_inner: 5683.85 | bwd_allreduce: 59.65 | step: 19.13 11%|█ | 4476/41250 [10:49:34<88:34:10, 8.67s/it] {'loss': 0.1156, 'grad_norm': 1.481529712677002, 'learning_rate': 3.935711631237248e-05, 'epoch': 1.09} 11%|█ | 4476/41250 [10:49:34<88:34:10, 8.67s/it][2025-04-25 18:47:18,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 18:47:18,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.36 | bwd_microstep: 5732.00 | bwd_inner_microstep: 5683.47 | bwd_allreduce_microstep: 48.48 | step_microstep: 19.09 [2025-04-25 18:47:18,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.36 | bwd: 5732.02 | bwd_inner: 5683.47 | bwd_allreduce: 48.50 | step: 19.09 11%|█ | 4477/41250 [10:49:43<88:31:03, 8.67s/it] {'loss': 0.4719, 'grad_norm': 4.076874256134033, 'learning_rate': 3.9356721306950984e-05, 'epoch': 1.09} 11%|█ | 4477/41250 [10:49:43<88:31:03, 8.67s/it][2025-04-25 18:47:26,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:47:26,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.13 | bwd_microstep: 5682.84 | bwd_inner_microstep: 5645.49 
| bwd_allreduce_microstep: 37.30 | step_microstep: 18.81 [2025-04-25 18:47:26,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.13 | bwd: 5682.86 | bwd_inner: 5645.49 | bwd_allreduce: 37.32 | step: 18.80 11%|█ | 4478/41250 [10:49:52<88:18:29, 8.65s/it] {'loss': 0.1309, 'grad_norm': 2.8889074325561523, 'learning_rate': 3.935632618219911e-05, 'epoch': 1.09} 11%|█ | 4478/41250 [10:49:52<88:18:29, 8.65s/it][2025-04-25 18:47:35,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 18:47:35,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.77 | bwd_microstep: 5701.79 | bwd_inner_microstep: 5652.76 | bwd_allreduce_microstep: 48.98 | step_microstep: 18.61 [2025-04-25 18:47:35,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.77 | bwd: 5701.81 | bwd_inner: 5652.76 | bwd_allreduce: 49.00 | step: 18.61 11%|█ | 4479/41250 [10:50:00<88:11:28, 8.63s/it] {'loss': 0.2098, 'grad_norm': 2.43686580657959, 'learning_rate': 3.935593093811928e-05, 'epoch': 1.09} 11%|█ | 4479/41250 [10:50:00<88:11:28, 8.63s/it][2025-04-25 18:47:44,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 18:47:44,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.64 | bwd_microstep: 5745.45 | bwd_inner_microstep: 5644.68 | bwd_allreduce_microstep: 100.73 | step_microstep: 19.36 [2025-04-25 18:47:44,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.64 | bwd: 5745.47 | bwd_inner: 5644.68 | bwd_allreduce: 100.75 | step: 19.36 11%|█ | 4480/41250 [10:50:09<88:15:17, 8.64s/it] {'loss': 0.2445, 'grad_norm': 2.05794095993042, 'learning_rate': 3.935553557471396e-05, 'epoch': 1.09} 11%|█ | 4480/41250 [10:50:09<88:15:17, 8.64s/it][2025-04-25 18:47:52,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.00 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:47:52,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.47 | bwd_microstep: 5705.06 | bwd_inner_microstep: 5692.18 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.61 [2025-04-25 18:47:52,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.47 | bwd: 5705.07 | bwd_inner: 5692.18 | bwd_allreduce: 12.85 | step: 18.61 11%|█ | 4481/41250 [10:50:18<88:13:45, 8.64s/it] {'loss': 0.204, 'grad_norm': 2.763901472091675, 'learning_rate': 3.935514009198556e-05, 'epoch': 1.09} 11%|█ | 4481/41250 [10:50:18<88:13:45, 8.64s/it][2025-04-25 18:48:01,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:48:01,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3113.80 | bwd_microstep: 5697.63 | bwd_inner_microstep: 5684.86 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.87 [2025-04-25 18:48:01,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3113.80 | bwd: 5697.65 | bwd_inner: 5684.85 | bwd_allreduce: 12.75 | step: 18.87 11%|█ | 4482/41250 [10:50:26<89:00:32, 8.71s/it] {'loss': 0.1184, 'grad_norm': 1.1089304685592651, 'learning_rate': 3.9354744489936526e-05, 'epoch': 1.09} 11%|█ | 4482/41250 [10:50:26<89:00:32, 8.71s/it][2025-04-25 18:48:10,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:48:10,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.38 | bwd_microstep: 5680.12 | bwd_inner_microstep: 5632.91 | bwd_allreduce_microstep: 47.17 | step_microstep: 18.65 [2025-04-25 18:48:10,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.38 | bwd: 5680.14 | bwd_inner: 5632.91 | bwd_allreduce: 47.18 | step: 18.66 11%|█ | 4483/41250 [10:50:35<88:35:22, 8.67s/it] {'loss': 
0.0556, 'grad_norm': 0.6284872889518738, 'learning_rate': 3.9354348768569305e-05, 'epoch': 1.09} 11%|█ | 4483/41250 [10:50:35<88:35:22, 8.67s/it][2025-04-25 18:48:18,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:48:18,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2814.99 | bwd_microstep: 5742.28 | bwd_inner_microstep: 5649.28 | bwd_allreduce_microstep: 92.95 | step_microstep: 18.74 [2025-04-25 18:48:18,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2814.99 | bwd: 5742.29 | bwd_inner: 5649.28 | bwd_allreduce: 92.97 | step: 18.74 11%|█ | 4484/41250 [10:50:44<88:29:10, 8.66s/it] {'loss': 0.0851, 'grad_norm': 1.2230061292648315, 'learning_rate': 3.9353952927886326e-05, 'epoch': 1.09} 11%|█ | 4484/41250 [10:50:44<88:29:10, 8.66s/it][2025-04-25 18:48:27,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:48:27,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.59 | bwd_microstep: 5749.53 | bwd_inner_microstep: 5647.89 | bwd_allreduce_microstep: 101.60 | step_microstep: 18.49 [2025-04-25 18:48:27,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.59 | bwd: 5749.54 | bwd_inner: 5647.89 | bwd_allreduce: 101.62 | step: 18.49 11%|█ | 4485/41250 [10:50:52<88:27:41, 8.66s/it] {'loss': 0.0899, 'grad_norm': 2.1784865856170654, 'learning_rate': 3.9353556967890036e-05, 'epoch': 1.09} 11%|█ | 4485/41250 [10:50:52<88:27:41, 8.66s/it][2025-04-25 18:48:36,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 1.19 [2025-04-25 18:48:36,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.89 | bwd_microstep: 5698.29 | bwd_inner_microstep: 5646.19 | bwd_allreduce_microstep: 52.05 | step_microstep: 
19.48 [2025-04-25 18:48:36,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.89 | bwd: 5698.31 | bwd_inner: 5646.19 | bwd_allreduce: 52.08 | step: 19.48 11%|█ | 4486/41250 [10:51:01<88:16:08, 8.64s/it] {'loss': 0.1985, 'grad_norm': 2.308450698852539, 'learning_rate': 3.935316088858287e-05, 'epoch': 1.09} 11%|█ | 4486/41250 [10:51:01<88:16:08, 8.64s/it][2025-04-25 18:48:44,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 18:48:44,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.66 | bwd_microstep: 5732.78 | bwd_inner_microstep: 5692.91 | bwd_allreduce_microstep: 39.83 | step_microstep: 18.90 [2025-04-25 18:48:44,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.66 | bwd: 5732.79 | bwd_inner: 5692.90 | bwd_allreduce: 39.84 | step: 18.90 11%|█ | 4487/41250 [10:51:10<88:20:48, 8.65s/it] {'loss': 0.2893, 'grad_norm': 2.4250080585479736, 'learning_rate': 3.935276468996728e-05, 'epoch': 1.09} 11%|█ | 4487/41250 [10:51:10<88:20:48, 8.65s/it][2025-04-25 18:48:53,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 18:48:53,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.47 | bwd_microstep: 5684.53 | bwd_inner_microstep: 5651.91 | bwd_allreduce_microstep: 32.58 | step_microstep: 18.79 [2025-04-25 18:48:53,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.47 | bwd: 5684.55 | bwd_inner: 5651.91 | bwd_allreduce: 32.60 | step: 18.79 11%|█ | 4488/41250 [10:51:18<88:10:51, 8.64s/it] {'loss': 0.4054, 'grad_norm': 2.6328678131103516, 'learning_rate': 3.935236837204571e-05, 'epoch': 1.09} 11%|█ | 4488/41250 [10:51:18<88:10:51, 8.64s/it][2025-04-25 18:49:01,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | 
optimizer_step: 0.90 [2025-04-25 18:49:01,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.88 | bwd_microstep: 5717.11 | bwd_inner_microstep: 5704.39 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.92 [2025-04-25 18:49:01,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.88 | bwd: 5717.13 | bwd_inner: 5704.39 | bwd_allreduce: 12.69 | step: 18.92 11%|█ | 4489/41250 [10:51:27<88:13:48, 8.64s/it] {'loss': 0.2347, 'grad_norm': 3.02933406829834, 'learning_rate': 3.935197193482058e-05, 'epoch': 1.09} 11%|█ | 4489/41250 [10:51:27<88:13:48, 8.64s/it][2025-04-25 18:49:10,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 0.99 [2025-04-25 18:49:10,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.41 | bwd_microstep: 5716.48 | bwd_inner_microstep: 5703.43 | bwd_allreduce_microstep: 12.99 | step_microstep: 19.20 [2025-04-25 18:49:10,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.41 | bwd: 5716.49 | bwd_inner: 5703.43 | bwd_allreduce: 13.02 | step: 19.20 11%|█ | 4490/41250 [10:51:35<88:16:03, 8.64s/it] {'loss': 0.1738, 'grad_norm': 1.503122329711914, 'learning_rate': 3.9351575378294356e-05, 'epoch': 1.09} 11%|█ | 4490/41250 [10:51:35<88:16:03, 8.64s/it][2025-04-25 18:49:19,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.04 | optimizer_step: 0.92 [2025-04-25 18:49:19,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.90 | bwd_microstep: 5773.24 | bwd_inner_microstep: 5655.35 | bwd_allreduce_microstep: 117.84 | step_microstep: 19.08 [2025-04-25 18:49:19,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.90 | bwd: 5773.25 | bwd_inner: 5655.35 | bwd_allreduce: 117.86 | step: 19.08 11%|█ | 4491/41250 [10:51:44<88:23:20, 8.66s/it] {'loss': 0.2459, 'grad_norm': 1.5094022750854492, 
'learning_rate': 3.935117870246947e-05, 'epoch': 1.09} 11%|█ | 4491/41250 [10:51:44<88:23:20, 8.66s/it][2025-04-25 18:49:27,964] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-25 18:49:27,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.96 | bwd_microstep: 5710.35 | bwd_inner_microstep: 5697.78 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.62 [2025-04-25 18:49:27,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.96 | bwd: 5710.37 | bwd_inner: 5697.78 | bwd_allreduce: 12.55 | step: 18.63 11%|█ | 4492/41250 [10:51:53<88:21:35, 8.65s/it] {'loss': 0.0761, 'grad_norm': 1.1473582983016968, 'learning_rate': 3.935078190734838e-05, 'epoch': 1.09} 11%|█ | 4492/41250 [10:51:53<88:21:35, 8.65s/it][2025-04-25 18:49:36,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.02 | optimizer_step: 1.24 [2025-04-25 18:49:36,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.70 | bwd_microstep: 5741.56 | bwd_inner_microstep: 5702.15 | bwd_allreduce_microstep: 39.35 | step_microstep: 19.27 [2025-04-25 18:49:36,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.70 | bwd: 5741.57 | bwd_inner: 5702.15 | bwd_allreduce: 39.38 | step: 19.28 11%|█ | 4493/41250 [10:52:01<88:25:47, 8.66s/it] {'loss': 0.0838, 'grad_norm': 0.8658097386360168, 'learning_rate': 3.9350384992933523e-05, 'epoch': 1.09} 11%|█ | 4493/41250 [10:52:01<88:25:47, 8.66s/it][2025-04-25 18:49:45,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 18:49:45,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.97 | bwd_microstep: 5705.23 | bwd_inner_microstep: 5692.42 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.63 [2025-04-25 18:49:45,274] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.97 | bwd: 5705.24 | bwd_inner: 5692.42 | bwd_allreduce: 12.78 | step: 18.63 11%|█ | 4494/41250 [10:52:10<88:20:20, 8.65s/it] {'loss': 0.1634, 'grad_norm': 3.0577893257141113, 'learning_rate': 3.934998795922735e-05, 'epoch': 1.09} 11%|█ | 4494/41250 [10:52:10<88:20:20, 8.65s/it][2025-04-25 18:49:53,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:49:53,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.05 | bwd_microstep: 5694.63 | bwd_inner_microstep: 5654.02 | bwd_allreduce_microstep: 40.56 | step_microstep: 18.58 [2025-04-25 18:49:53,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.05 | bwd: 5694.64 | bwd_inner: 5654.02 | bwd_allreduce: 40.58 | step: 18.59 11%|█ | 4495/41250 [10:52:19<88:12:35, 8.64s/it] {'loss': 0.0591, 'grad_norm': 0.6008265614509583, 'learning_rate': 3.9349590806232305e-05, 'epoch': 1.09} 11%|█ | 4495/41250 [10:52:19<88:12:35, 8.64s/it][2025-04-25 18:50:02,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 18:50:02,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.91 | bwd_microstep: 5717.15 | bwd_inner_microstep: 5661.16 | bwd_allreduce_microstep: 55.95 | step_microstep: 19.12 [2025-04-25 18:50:02,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.91 | bwd: 5717.17 | bwd_inner: 5661.16 | bwd_allreduce: 55.96 | step: 19.12 11%|█ | 4496/41250 [10:52:27<88:10:46, 8.64s/it] {'loss': 0.2158, 'grad_norm': 3.804691791534424, 'learning_rate': 3.934919353395084e-05, 'epoch': 1.09} 11%|█ | 4496/41250 [10:52:27<88:10:46, 8.64s/it][2025-04-25 18:50:11,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 18:50:11,221] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.25 | bwd_microstep: 5765.60 | bwd_inner_microstep: 5697.94 | bwd_allreduce_microstep: 67.61 | step_microstep: 19.07 [2025-04-25 18:50:11,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.25 | bwd: 5765.62 | bwd_inner: 5697.94 | bwd_allreduce: 67.63 | step: 19.07 11%|█ | 4497/41250 [10:52:36<88:23:10, 8.66s/it] {'loss': 0.1161, 'grad_norm': 1.613997459411621, 'learning_rate': 3.9348796142385406e-05, 'epoch': 1.09} 11%|█ | 4497/41250 [10:52:36<88:23:10, 8.66s/it][2025-04-25 18:50:19,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.89 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:50:19,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.03 | bwd_microstep: 5788.79 | bwd_inner_microstep: 5653.50 | bwd_allreduce_microstep: 135.25 | step_microstep: 18.60 [2025-04-25 18:50:19,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.03 | bwd: 5788.80 | bwd_inner: 5653.50 | bwd_allreduce: 135.26 | step: 18.60 11%|█ | 4498/41250 [10:52:45<88:31:11, 8.67s/it] {'loss': 0.1865, 'grad_norm': 1.805816650390625, 'learning_rate': 3.934839863153845e-05, 'epoch': 1.09} 11%|█ | 4498/41250 [10:52:45<88:31:11, 8.67s/it][2025-04-25 18:50:28,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 18:50:28,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.30 | bwd_microstep: 5708.04 | bwd_inner_microstep: 5695.34 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.77 [2025-04-25 18:50:28,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.30 | bwd: 5708.05 | bwd_inner: 5695.34 | bwd_allreduce: 12.67 | step: 18.77 11%|█ | 4499/41250 [10:52:54<89:01:49, 8.72s/it] {'loss': 0.1771, 'grad_norm': 1.8707342147827148, 'learning_rate': 3.934800100141242e-05, 'epoch': 1.09} 11%|█ 
| 4499/41250 [10:52:54<89:01:49, 8.72s/it][2025-04-25 18:50:37,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.67 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 18:50:37,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.16 | bwd_microstep: 5725.76 | bwd_inner_microstep: 5672.40 | bwd_allreduce_microstep: 53.31 | step_microstep: 18.79 [2025-04-25 18:50:37,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.16 | bwd: 5725.77 | bwd_inner: 5672.40 | bwd_allreduce: 53.33 | step: 18.79 11%|█ | 4500/41250 [10:53:02<88:48:27, 8.70s/it] {'loss': 0.1694, 'grad_norm': 1.9052107334136963, 'learning_rate': 3.934760325200977e-05, 'epoch': 1.09} 11%|█ | 4500/41250 [10:53:02<88:48:27, 8.70s/it][2025-04-25 18:50:46,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-25 18:50:46,065] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.48 | bwd_microstep: 5717.78 | bwd_inner_microstep: 5704.98 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.96 [2025-04-25 18:50:46,065] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.48 | bwd: 5717.79 | bwd_inner: 5704.98 | bwd_allreduce: 12.77 | step: 18.97 11%|█ | 4501/41250 [10:53:11<88:40:12, 8.69s/it] {'loss': 0.2118, 'grad_norm': 1.5735963582992554, 'learning_rate': 3.934720538333294e-05, 'epoch': 1.09} 11%|█ | 4501/41250 [10:53:11<88:40:12, 8.69s/it][2025-04-25 18:50:54,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 18:50:54,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.57 | bwd_microstep: 5811.51 | bwd_inner_microstep: 5665.51 | bwd_allreduce_microstep: 145.95 | step_microstep: 18.79 [2025-04-25 18:50:54,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.57 | bwd: 
5811.52 | bwd_inner: 5665.51 | bwd_allreduce: 145.97 | step: 18.79 11%|█ | 4502/41250 [10:53:20<88:48:19, 8.70s/it] {'loss': 0.2224, 'grad_norm': 1.467522144317627, 'learning_rate': 3.934680739538441e-05, 'epoch': 1.09} 11%|█ | 4502/41250 [10:53:20<88:48:19, 8.70s/it][2025-04-25 18:51:03,456] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 18:51:03,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.92 | bwd_microstep: 5744.68 | bwd_inner_microstep: 5661.33 | bwd_allreduce_microstep: 83.30 | step_microstep: 18.73 [2025-04-25 18:51:03,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.92 | bwd: 5744.69 | bwd_inner: 5661.33 | bwd_allreduce: 83.32 | step: 18.73 11%|█ | 4503/41250 [10:53:28<88:40:42, 8.69s/it] {'loss': 0.2918, 'grad_norm': 1.7469674348831177, 'learning_rate': 3.9346409288166603e-05, 'epoch': 1.09} 11%|█ | 4503/41250 [10:53:28<88:40:42, 8.69s/it][2025-04-25 18:51:12,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:51:12,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.10 | bwd_microstep: 5785.54 | bwd_inner_microstep: 5693.21 | bwd_allreduce_microstep: 92.28 | step_microstep: 18.53 [2025-04-25 18:51:12,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.10 | bwd: 5785.55 | bwd_inner: 5693.21 | bwd_allreduce: 92.30 | step: 18.53 11%|█ | 4504/41250 [10:53:37<88:45:29, 8.70s/it] {'loss': 0.1021, 'grad_norm': 1.6817435026168823, 'learning_rate': 3.9346011061682e-05, 'epoch': 1.09} 11%|█ | 4504/41250 [10:53:37<88:45:29, 8.70s/it][2025-04-25 18:51:20,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.86 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:51:20,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2827.78 | bwd_microstep: 5792.68 | bwd_inner_microstep: 5645.98 | bwd_allreduce_microstep: 146.65 | step_microstep: 19.40 [2025-04-25 18:51:20,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.78 | bwd: 5792.69 | bwd_inner: 5645.98 | bwd_allreduce: 146.67 | step: 19.41 11%|█ | 4505/41250 [10:53:46<88:47:13, 8.70s/it] {'loss': 0.2349, 'grad_norm': 1.7538763284683228, 'learning_rate': 3.934561271593304e-05, 'epoch': 1.09} 11%|█ | 4505/41250 [10:53:46<88:47:13, 8.70s/it][2025-04-25 18:51:29,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 18:51:29,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.84 | bwd_microstep: 5727.23 | bwd_inner_microstep: 5653.32 | bwd_allreduce_microstep: 73.86 | step_microstep: 19.17 [2025-04-25 18:51:29,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.84 | bwd: 5727.25 | bwd_inner: 5653.32 | bwd_allreduce: 73.89 | step: 19.17 11%|█ | 4506/41250 [10:53:54<88:36:09, 8.68s/it] {'loss': 0.114, 'grad_norm': 1.1388229131698608, 'learning_rate': 3.9345214250922176e-05, 'epoch': 1.09} 11%|█ | 4506/41250 [10:53:54<88:36:09, 8.68s/it][2025-04-25 18:51:38,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:51:38,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.68 | bwd_microstep: 5783.56 | bwd_inner_microstep: 5719.62 | bwd_allreduce_microstep: 63.90 | step_microstep: 18.61 [2025-04-25 18:51:38,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.68 | bwd: 5783.58 | bwd_inner: 5719.62 | bwd_allreduce: 63.92 | step: 18.61 11%|█ | 4507/41250 [10:54:03<88:43:23, 8.69s/it] {'loss': 0.1658, 'grad_norm': 1.545922875404358, 'learning_rate': 3.9344815666651885e-05, 'epoch': 1.09} 11%|█ | 4507/41250 [10:54:03<88:43:23, 8.69s/it][2025-04-25 
18:51:46,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:51:46,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.90 | bwd_microstep: 5717.67 | bwd_inner_microstep: 5680.66 | bwd_allreduce_microstep: 36.96 | step_microstep: 18.74 [2025-04-25 18:51:46,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.90 | bwd: 5717.68 | bwd_inner: 5680.66 | bwd_allreduce: 36.98 | step: 18.75 11%|█ | 4508/41250 [10:54:12<88:34:52, 8.68s/it] {'loss': 0.0643, 'grad_norm': 0.8357555270195007, 'learning_rate': 3.93444169631246e-05, 'epoch': 1.09} 11%|█ | 4508/41250 [10:54:12<88:34:52, 8.68s/it][2025-04-25 18:51:55,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:51:55,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.74 | bwd_microstep: 5796.00 | bwd_inner_microstep: 5659.96 | bwd_allreduce_microstep: 135.99 | step_microstep: 18.73 [2025-04-25 18:51:55,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.74 | bwd: 5796.01 | bwd_inner: 5659.96 | bwd_allreduce: 136.01 | step: 18.73 11%|█ | 4509/41250 [10:54:20<88:39:54, 8.69s/it] {'loss': 0.1487, 'grad_norm': 1.9823205471038818, 'learning_rate': 3.93440181403428e-05, 'epoch': 1.09} 11%|█ | 4509/41250 [10:54:20<88:39:54, 8.69s/it][2025-04-25 18:52:04,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.95 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:52:04,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.69 | bwd_microstep: 5707.46 | bwd_inner_microstep: 5694.67 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.16 [2025-04-25 18:52:04,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.69 | bwd: 5707.47 | bwd_inner: 5694.67 | bwd_allreduce: 12.77 | 
step: 18.17 11%|█ | 4510/41250 [10:54:29<88:31:36, 8.67s/it] {'loss': 0.1022, 'grad_norm': 1.128544569015503, 'learning_rate': 3.934361919830892e-05, 'epoch': 1.09} 11%|█ | 4510/41250 [10:54:29<88:31:36, 8.67s/it][2025-04-25 18:52:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 18:52:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.73 | bwd_microstep: 5805.60 | bwd_inner_microstep: 5666.83 | bwd_allreduce_microstep: 138.72 | step_microstep: 18.60 [2025-04-25 18:52:12,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.73 | bwd: 5805.61 | bwd_inner: 5666.83 | bwd_allreduce: 138.74 | step: 18.60 11%|█ | 4511/41250 [10:54:38<88:39:08, 8.69s/it] {'loss': 0.0558, 'grad_norm': 0.6568917632102966, 'learning_rate': 3.9343220137025436e-05, 'epoch': 1.09} 11%|█ | 4511/41250 [10:54:38<88:39:08, 8.69s/it][2025-04-25 18:52:21,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.62 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:52:21,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.63 | bwd_microstep: 5708.79 | bwd_inner_microstep: 5694.84 | bwd_allreduce_microstep: 13.91 | step_microstep: 18.68 [2025-04-25 18:52:21,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.63 | bwd: 5708.81 | bwd_inner: 5694.84 | bwd_allreduce: 13.92 | step: 18.69 11%|█ | 4512/41250 [10:54:46<88:29:57, 8.67s/it] {'loss': 0.2255, 'grad_norm': 1.5887351036071777, 'learning_rate': 3.9342820956494805e-05, 'epoch': 1.09} 11%|█ | 4512/41250 [10:54:46<88:29:57, 8.67s/it][2025-04-25 18:52:30,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.95 | optimizer_step: 1.09 [2025-04-25 18:52:30,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.30 | bwd_microstep: 5786.09 | 
bwd_inner_microstep: 5704.56 | bwd_allreduce_microstep: 81.49 | step_microstep: 18.68 [2025-04-25 18:52:30,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.30 | bwd: 5786.10 | bwd_inner: 5704.56 | bwd_allreduce: 81.51 | step: 18.68 11%|█ | 4513/41250 [10:54:55<88:38:09, 8.69s/it] {'loss': 0.1601, 'grad_norm': 3.4159319400787354, 'learning_rate': 3.9342421656719486e-05, 'epoch': 1.09} 11%|█ | 4513/41250 [10:54:55<88:38:09, 8.69s/it][2025-04-25 18:52:39,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 0.93 [2025-04-25 18:52:39,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.35 | bwd_microstep: 5803.62 | bwd_inner_microstep: 5659.74 | bwd_allreduce_microstep: 143.82 | step_microstep: 18.58 [2025-04-25 18:52:39,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.35 | bwd: 5803.63 | bwd_inner: 5659.74 | bwd_allreduce: 143.84 | step: 18.58 11%|█ | 4514/41250 [10:55:04<88:43:05, 8.69s/it] {'loss': 0.2819, 'grad_norm': 2.117114543914795, 'learning_rate': 3.9342022237701945e-05, 'epoch': 1.09} 11%|█ | 4514/41250 [10:55:04<88:43:05, 8.69s/it][2025-04-25 18:52:47,720] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.04 | optimizer_step: 1.15 [2025-04-25 18:52:47,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.35 | bwd_microstep: 5794.01 | bwd_inner_microstep: 5660.28 | bwd_allreduce_microstep: 133.68 | step_microstep: 19.15 [2025-04-25 18:52:47,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.35 | bwd: 5794.02 | bwd_inner: 5660.28 | bwd_allreduce: 133.70 | step: 19.15 11%|█ | 4515/41250 [10:55:13<88:44:17, 8.70s/it] {'loss': 0.1234, 'grad_norm': 1.4453462362289429, 'learning_rate': 3.934162269944464e-05, 'epoch': 1.09} 11%|█ | 4515/41250 [10:55:13<88:44:17, 8.70s/it][2025-04-25 18:52:56,434] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 18:52:56,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.72 | bwd_microstep: 5809.43 | bwd_inner_microstep: 5644.27 | bwd_allreduce_microstep: 165.12 | step_microstep: 18.69 [2025-04-25 18:52:56,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.72 | bwd: 5809.44 | bwd_inner: 5644.27 | bwd_allreduce: 165.13 | step: 18.69 11%|█ | 4516/41250 [10:55:21<88:47:23, 8.70s/it] {'loss': 0.2478, 'grad_norm': 1.698708176612854, 'learning_rate': 3.9341223041950044e-05, 'epoch': 1.09} 11%|█ | 4516/41250 [10:55:21<88:47:23, 8.70s/it][2025-04-25 18:53:05,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 18:53:05,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.29 | bwd_microstep: 5702.72 | bwd_inner_microstep: 5641.12 | bwd_allreduce_microstep: 61.54 | step_microstep: 18.86 [2025-04-25 18:53:05,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.29 | bwd: 5702.73 | bwd_inner: 5641.12 | bwd_allreduce: 61.56 | step: 18.86 11%|█ | 4517/41250 [10:55:30<88:29:07, 8.67s/it] {'loss': 0.0554, 'grad_norm': 0.6598738431930542, 'learning_rate': 3.93408232652206e-05, 'epoch': 1.1} 11%|█ | 4517/41250 [10:55:30<88:29:07, 8.67s/it][2025-04-25 18:53:13,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:53:13,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.56 | bwd_microstep: 5770.25 | bwd_inner_microstep: 5676.33 | bwd_allreduce_microstep: 93.87 | step_microstep: 18.73 [2025-04-25 18:53:13,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.56 | bwd: 5770.26 | bwd_inner: 5676.33 | bwd_allreduce: 93.89 | step: 18.74 11%|█ | 4518/41250 
[10:55:39<88:34:07, 8.68s/it] {'loss': 0.3153, 'grad_norm': 2.1002492904663086, 'learning_rate': 3.934042336925879e-05, 'epoch': 1.1} 11%|█ | 4518/41250 [10:55:39<88:34:07, 8.68s/it][2025-04-25 18:53:22,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:53:22,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.92 | bwd_microstep: 5732.97 | bwd_inner_microstep: 5707.64 | bwd_allreduce_microstep: 25.28 | step_microstep: 19.35 [2025-04-25 18:53:22,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.92 | bwd: 5732.98 | bwd_inner: 5707.64 | bwd_allreduce: 25.30 | step: 19.35 11%|█ | 4519/41250 [10:55:47<88:31:09, 8.68s/it] {'loss': 0.0862, 'grad_norm': 1.5788171291351318, 'learning_rate': 3.934002335406707e-05, 'epoch': 1.1} 11%|█ | 4519/41250 [10:55:47<88:31:09, 8.68s/it][2025-04-25 18:53:31,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 18:53:31,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.45 | bwd_microstep: 5756.19 | bwd_inner_microstep: 5654.21 | bwd_allreduce_microstep: 101.93 | step_microstep: 18.75 [2025-04-25 18:53:31,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.45 | bwd: 5756.21 | bwd_inner: 5654.21 | bwd_allreduce: 101.95 | step: 18.75 11%|█ | 4520/41250 [10:55:56<88:29:09, 8.67s/it] {'loss': 0.0859, 'grad_norm': 1.1612995862960815, 'learning_rate': 3.9339623219647917e-05, 'epoch': 1.1} 11%|█ | 4520/41250 [10:55:56<88:29:09, 8.67s/it][2025-04-25 18:53:39,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 18:53:39,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.15 | bwd_microstep: 5779.55 | bwd_inner_microstep: 5654.90 | 
bwd_allreduce_microstep: 124.61 | step_microstep: 18.63 [2025-04-25 18:53:39,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.15 | bwd: 5779.57 | bwd_inner: 5654.90 | bwd_allreduce: 124.63 | step: 18.63 11%|█ | 4521/41250 [10:56:05<88:32:48, 8.68s/it] {'loss': 0.2228, 'grad_norm': 1.6770527362823486, 'learning_rate': 3.9339222966003784e-05, 'epoch': 1.1} 11%|█ | 4521/41250 [10:56:05<88:32:48, 8.68s/it][2025-04-25 18:53:48,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:53:48,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.80 | bwd_microstep: 5721.26 | bwd_inner_microstep: 5682.38 | bwd_allreduce_microstep: 38.82 | step_microstep: 18.80 [2025-04-25 18:53:48,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.80 | bwd: 5721.27 | bwd_inner: 5682.38 | bwd_allreduce: 38.85 | step: 18.80 11%|█ | 4522/41250 [10:56:13<88:27:39, 8.67s/it] {'loss': 0.2356, 'grad_norm': 1.0870553255081177, 'learning_rate': 3.933882259313714e-05, 'epoch': 1.1} 11%|█ | 4522/41250 [10:56:13<88:27:39, 8.67s/it][2025-04-25 18:53:57,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 18:53:57,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.62 | bwd_microstep: 5704.87 | bwd_inner_microstep: 5692.18 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.92 [2025-04-25 18:53:57,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.62 | bwd: 5704.89 | bwd_inner: 5692.18 | bwd_allreduce: 12.66 | step: 18.92 11%|█ | 4523/41250 [10:56:22<88:22:32, 8.66s/it] {'loss': 0.1273, 'grad_norm': 1.0783185958862305, 'learning_rate': 3.933842210105048e-05, 'epoch': 1.1} 11%|█ | 4523/41250 [10:56:22<88:22:32, 8.66s/it][2025-04-25 18:54:05,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.12 [2025-04-25 18:54:05,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.78 | bwd_microstep: 5678.04 | bwd_inner_microstep: 5658.83 | bwd_allreduce_microstep: 19.17 | step_microstep: 19.02 [2025-04-25 18:54:05,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.78 | bwd: 5678.06 | bwd_inner: 5658.83 | bwd_allreduce: 19.19 | step: 19.02 11%|█ | 4524/41250 [10:56:30<88:09:56, 8.64s/it] {'loss': 0.0592, 'grad_norm': 1.3513944149017334, 'learning_rate': 3.933802148974623e-05, 'epoch': 1.1} 11%|█ | 4524/41250 [10:56:30<88:09:56, 8.64s/it][2025-04-25 18:54:14,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-25 18:54:14,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.37 | bwd_microstep: 5696.35 | bwd_inner_microstep: 5637.39 | bwd_allreduce_microstep: 58.92 | step_microstep: 18.51 [2025-04-25 18:54:14,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.37 | bwd: 5696.36 | bwd_inner: 5637.38 | bwd_allreduce: 58.94 | step: 18.51 11%|█ | 4525/41250 [10:56:39<88:01:17, 8.63s/it] {'loss': 0.2173, 'grad_norm': 2.3078246116638184, 'learning_rate': 3.933762075922689e-05, 'epoch': 1.1} 11%|█ | 4525/41250 [10:56:39<88:01:17, 8.63s/it][2025-04-25 18:54:22,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.96 | optimizer_step: 1.02 [2025-04-25 18:54:22,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.50 | bwd_microstep: 5756.66 | bwd_inner_microstep: 5692.22 | bwd_allreduce_microstep: 64.40 | step_microstep: 18.29 [2025-04-25 18:54:22,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.50 | bwd: 5756.67 | bwd_inner: 5692.22 | bwd_allreduce: 64.42 | step: 18.29 11%|█ | 4526/41250 [10:56:48<88:11:23, 8.65s/it] {'loss': 
0.0387, 'grad_norm': 0.3723589777946472, 'learning_rate': 3.933721990949492e-05, 'epoch': 1.1} 11%|█ | 4526/41250 [10:56:48<88:11:23, 8.65s/it][2025-04-25 18:54:31,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-25 18:54:31,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.14 | bwd_microstep: 5868.20 | bwd_inner_microstep: 5643.90 | bwd_allreduce_microstep: 224.26 | step_microstep: 18.55 [2025-04-25 18:54:31,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.14 | bwd: 5868.22 | bwd_inner: 5643.90 | bwd_allreduce: 224.28 | step: 18.56 11%|█ | 4527/41250 [10:56:57<88:34:21, 8.68s/it] {'loss': 0.1727, 'grad_norm': 1.9564958810806274, 'learning_rate': 3.933681894055279e-05, 'epoch': 1.1} 11%|█ | 4527/41250 [10:56:57<88:34:21, 8.68s/it][2025-04-25 18:54:40,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 18:54:40,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2935.73 | bwd_microstep: 5888.24 | bwd_inner_microstep: 5875.41 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.78 [2025-04-25 18:54:40,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2935.73 | bwd: 5888.25 | bwd_inner: 5875.41 | bwd_allreduce: 12.80 | step: 18.78 11%|█ | 4528/41250 [10:57:05<89:15:49, 8.75s/it] {'loss': 0.0875, 'grad_norm': 1.4494255781173706, 'learning_rate': 3.933641785240298e-05, 'epoch': 1.1} 11%|█ | 4528/41250 [10:57:05<89:15:49, 8.75s/it][2025-04-25 18:54:49,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 18:54:49,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.39 | bwd_microstep: 5701.78 | bwd_inner_microstep: 5688.96 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.75 
[2025-04-25 18:54:49,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.39 | bwd: 5701.79 | bwd_inner: 5688.96 | bwd_allreduce: 12.79 | step: 18.75 11%|█ | 4529/41250 [10:57:14<88:53:16, 8.71s/it] {'loss': 0.1698, 'grad_norm': 1.0681403875350952, 'learning_rate': 3.933601664504796e-05, 'epoch': 1.1} 11%|█ | 4529/41250 [10:57:14<88:53:16, 8.71s/it][2025-04-25 18:54:57,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 18:54:57,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.47 | bwd_microstep: 5758.73 | bwd_inner_microstep: 5645.72 | bwd_allreduce_microstep: 112.96 | step_microstep: 19.04 [2025-04-25 18:54:57,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.47 | bwd: 5758.74 | bwd_inner: 5645.72 | bwd_allreduce: 112.97 | step: 19.05 11%|█ | 4530/41250 [10:57:23<88:44:54, 8.70s/it] {'loss': 0.1359, 'grad_norm': 2.0846006870269775, 'learning_rate': 3.933561531849019e-05, 'epoch': 1.1} 11%|█ | 4530/41250 [10:57:23<88:44:54, 8.70s/it][2025-04-25 18:55:06,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-25 18:55:06,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.30 | bwd_microstep: 5866.84 | bwd_inner_microstep: 5641.43 | bwd_allreduce_microstep: 225.37 | step_microstep: 19.23 [2025-04-25 18:55:06,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.30 | bwd: 5866.86 | bwd_inner: 5641.43 | bwd_allreduce: 225.38 | step: 19.24 11%|█ | 4531/41250 [10:57:32<88:58:30, 8.72s/it] {'loss': 0.1426, 'grad_norm': 1.4769189357757568, 'learning_rate': 3.933521387273216e-05, 'epoch': 1.1} 11%|█ | 4531/41250 [10:57:32<88:58:30, 8.72s/it][2025-04-25 18:55:15,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | 
optimizer_step: 0.90 [2025-04-25 18:55:15,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.47 | bwd_microstep: 5752.47 | bwd_inner_microstep: 5698.58 | bwd_allreduce_microstep: 53.84 | step_microstep: 18.92 [2025-04-25 18:55:15,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.47 | bwd: 5752.48 | bwd_inner: 5698.58 | bwd_allreduce: 53.86 | step: 18.93 11%|█ | 4532/41250 [10:57:40<88:51:27, 8.71s/it] {'loss': 0.1074, 'grad_norm': 1.531858205795288, 'learning_rate': 3.933481230777634e-05, 'epoch': 1.1} 11%|█ | 4532/41250 [10:57:40<88:51:27, 8.71s/it][2025-04-25 18:55:23,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 18:55:23,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.22 | bwd_microstep: 5700.79 | bwd_inner_microstep: 5687.88 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.38 [2025-04-25 18:55:23,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.22 | bwd: 5700.80 | bwd_inner: 5687.88 | bwd_allreduce: 12.88 | step: 18.38 11%|█ | 4533/41250 [10:57:49<88:35:45, 8.69s/it] {'loss': 0.3277, 'grad_norm': 2.9874019622802734, 'learning_rate': 3.9334410623625203e-05, 'epoch': 1.1} 11%|█ | 4533/41250 [10:57:49<88:35:45, 8.69s/it][2025-04-25 18:55:32,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 18:55:32,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.23 | bwd_microstep: 5691.25 | bwd_inner_microstep: 5655.16 | bwd_allreduce_microstep: 36.04 | step_microstep: 18.76 [2025-04-25 18:55:32,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.23 | bwd: 5691.27 | bwd_inner: 5655.16 | bwd_allreduce: 36.06 | step: 18.77 11%|█ | 4534/41250 [10:57:57<88:21:45, 8.66s/it] {'loss': 0.2192, 'grad_norm': 2.2776594161987305, 'learning_rate': 
3.9334008820281236e-05, 'epoch': 1.1} 11%|█ | 4534/41250 [10:57:57<88:21:45, 8.66s/it][2025-04-25 18:55:41,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:55:41,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.61 | bwd_microstep: 5685.29 | bwd_inner_microstep: 5642.17 | bwd_allreduce_microstep: 43.07 | step_microstep: 18.75 [2025-04-25 18:55:41,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.62 | bwd: 5685.30 | bwd_inner: 5642.17 | bwd_allreduce: 43.09 | step: 18.75 11%|█ | 4535/41250 [10:58:06<88:07:25, 8.64s/it] {'loss': 0.1959, 'grad_norm': 2.2234511375427246, 'learning_rate': 3.93336068977469e-05, 'epoch': 1.1} 11%|█ | 4535/41250 [10:58:06<88:07:25, 8.64s/it][2025-04-25 18:55:49,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 18:55:49,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.16 | bwd_microstep: 5720.14 | bwd_inner_microstep: 5707.42 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.94 [2025-04-25 18:55:49,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.17 | bwd: 5720.15 | bwd_inner: 5707.42 | bwd_allreduce: 12.69 | step: 18.94 11%|█ | 4536/41250 [10:58:15<88:10:53, 8.65s/it] {'loss': 0.1388, 'grad_norm': 1.2779003381729126, 'learning_rate': 3.933320485602468e-05, 'epoch': 1.1} 11%|█ | 4536/41250 [10:58:15<88:10:53, 8.65s/it][2025-04-25 18:55:58,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 18:55:58,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.17 | bwd_microstep: 5720.42 | bwd_inner_microstep: 5690.87 | bwd_allreduce_microstep: 29.51 | step_microstep: 19.15 [2025-04-25 18:55:58,517] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd: 2855.17 | bwd: 5720.44 | bwd_inner: 5690.87 | bwd_allreduce: 29.53 | step: 19.15 11%|█ | 4537/41250 [10:58:23<88:13:17, 8.65s/it] {'loss': 0.1598, 'grad_norm': 1.5102986097335815, 'learning_rate': 3.933280269511705e-05, 'epoch': 1.1} 11%|█ | 4537/41250 [10:58:23<88:13:17, 8.65s/it][2025-04-25 18:56:07,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 18:56:07,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.84 | bwd_microstep: 5745.22 | bwd_inner_microstep: 5657.45 | bwd_allreduce_microstep: 87.72 | step_microstep: 18.81 [2025-04-25 18:56:07,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.84 | bwd: 5745.23 | bwd_inner: 5657.45 | bwd_allreduce: 87.74 | step: 18.81 11%|█ | 4538/41250 [10:58:32<88:14:31, 8.65s/it] {'loss': 0.0922, 'grad_norm': 1.1246477365493774, 'learning_rate': 3.93324004150265e-05, 'epoch': 1.1} 11%|█ | 4538/41250 [10:58:32<88:14:31, 8.65s/it][2025-04-25 18:56:15,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 18:56:15,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.31 | bwd_microstep: 5688.52 | bwd_inner_microstep: 5645.39 | bwd_allreduce_microstep: 43.09 | step_microstep: 18.41 [2025-04-25 18:56:15,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.31 | bwd: 5688.53 | bwd_inner: 5645.39 | bwd_allreduce: 43.11 | step: 18.42 11%|█ | 4539/41250 [10:58:41<88:05:44, 8.64s/it] {'loss': 0.0815, 'grad_norm': 1.1636536121368408, 'learning_rate': 3.93319980157555e-05, 'epoch': 1.1} 11%|█ | 4539/41250 [10:58:41<88:05:44, 8.64s/it][2025-04-25 18:56:24,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:56:24,397] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.63 | bwd_microstep: 5697.21 | bwd_inner_microstep: 5650.50 | bwd_allreduce_microstep: 46.66 | step_microstep: 18.93 [2025-04-25 18:56:24,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.63 | bwd: 5697.22 | bwd_inner: 5650.50 | bwd_allreduce: 46.68 | step: 18.93 11%|█ | 4540/41250 [10:58:49<88:01:00, 8.63s/it] {'loss': 0.1543, 'grad_norm': 2.8506689071655273, 'learning_rate': 3.933159549730654e-05, 'epoch': 1.1} 11%|█ | 4540/41250 [10:58:49<88:01:00, 8.63s/it][2025-04-25 18:56:33,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.19 | optimizer_step: 0.91 [2025-04-25 18:56:33,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.40 | bwd_microstep: 5712.51 | bwd_inner_microstep: 5698.87 | bwd_allreduce_microstep: 13.59 | step_microstep: 19.33 [2025-04-25 18:56:33,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.40 | bwd: 5712.52 | bwd_inner: 5698.87 | bwd_allreduce: 13.61 | step: 19.33 11%|█ | 4541/41250 [10:58:58<88:05:24, 8.64s/it] {'loss': 0.1242, 'grad_norm': 1.7450355291366577, 'learning_rate': 3.93311928596821e-05, 'epoch': 1.1} 11%|█ | 4541/41250 [10:58:58<88:05:24, 8.64s/it][2025-04-25 18:56:41,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 1.05 [2025-04-25 18:56:41,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.15 | bwd_microstep: 5761.86 | bwd_inner_microstep: 5654.33 | bwd_allreduce_microstep: 107.48 | step_microstep: 18.61 [2025-04-25 18:56:41,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.15 | bwd: 5761.88 | bwd_inner: 5654.33 | bwd_allreduce: 107.50 | step: 18.61 11%|█ | 4542/41250 [10:59:07<88:12:37, 8.65s/it] {'loss': 0.1236, 'grad_norm': 2.2751147747039795, 'learning_rate': 3.9330790102884646e-05, 'epoch': 1.1} 11%|█ | 
4542/41250 [10:59:07<88:12:37, 8.65s/it][2025-04-25 18:56:50,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 18:56:50,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.76 | bwd_microstep: 5752.33 | bwd_inner_microstep: 5699.93 | bwd_allreduce_microstep: 52.34 | step_microstep: 18.63 [2025-04-25 18:56:50,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.76 | bwd: 5752.35 | bwd_inner: 5699.93 | bwd_allreduce: 52.37 | step: 18.63 11%|█ | 4543/41250 [10:59:15<88:21:06, 8.66s/it] {'loss': 0.1268, 'grad_norm': 1.0480449199676514, 'learning_rate': 3.933038722691669e-05, 'epoch': 1.1} 11%|█ | 4543/41250 [10:59:15<88:21:06, 8.66s/it][2025-04-25 18:56:59,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.02 [2025-04-25 18:56:59,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.03 | bwd_microstep: 5787.55 | bwd_inner_microstep: 5670.04 | bwd_allreduce_microstep: 117.45 | step_microstep: 19.36 [2025-04-25 18:56:59,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.03 | bwd: 5787.56 | bwd_inner: 5670.04 | bwd_allreduce: 117.48 | step: 19.36 11%|█ | 4544/41250 [10:59:24<88:28:56, 8.68s/it] {'loss': 0.0462, 'grad_norm': 0.7029061913490295, 'learning_rate': 3.932998423178068e-05, 'epoch': 1.1} 11%|█ | 4544/41250 [10:59:24<88:28:56, 8.68s/it][2025-04-25 18:57:07,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 18:57:07,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.36 | bwd_microstep: 5724.17 | bwd_inner_microstep: 5711.27 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.48 [2025-04-25 18:57:07,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.36 | bwd: 
5724.19 | bwd_inner: 5711.27 | bwd_allreduce: 12.88 | step: 18.49 11%|█ | 4545/41250 [10:59:33<88:26:38, 8.67s/it] {'loss': 0.1589, 'grad_norm': 2.0694775581359863, 'learning_rate': 3.9329581117479125e-05, 'epoch': 1.1} 11%|█ | 4545/41250 [10:59:33<88:26:38, 8.67s/it][2025-04-25 18:57:16,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 18:57:16,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.01 | bwd_microstep: 5787.26 | bwd_inner_microstep: 5664.79 | bwd_allreduce_microstep: 122.42 | step_microstep: 19.07 [2025-04-25 18:57:16,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.01 | bwd: 5787.28 | bwd_inner: 5664.79 | bwd_allreduce: 122.44 | step: 19.07 11%|█ | 4546/41250 [10:59:41<88:32:03, 8.68s/it] {'loss': 0.3852, 'grad_norm': 2.812570810317993, 'learning_rate': 3.932917788401451e-05, 'epoch': 1.1} 11%|█ | 4546/41250 [10:59:41<88:32:03, 8.68s/it][2025-04-25 18:57:25,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:57:25,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.05 | bwd_microstep: 5772.90 | bwd_inner_microstep: 5715.12 | bwd_allreduce_microstep: 57.73 | step_microstep: 18.36 [2025-04-25 18:57:25,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.05 | bwd: 5772.91 | bwd_inner: 5715.12 | bwd_allreduce: 57.75 | step: 18.36 11%|█ | 4547/41250 [10:59:50<88:36:34, 8.69s/it] {'loss': 0.1538, 'grad_norm': 1.2244116067886353, 'learning_rate': 3.932877453138931e-05, 'epoch': 1.1} 11%|█ | 4547/41250 [10:59:50<88:36:34, 8.69s/it][2025-04-25 18:57:33,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:57:33,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2833.13 | bwd_microstep: 5806.85 | bwd_inner_microstep: 5671.17 | bwd_allreduce_microstep: 135.63 | step_microstep: 18.67 [2025-04-25 18:57:33,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.13 | bwd: 5806.87 | bwd_inner: 5671.17 | bwd_allreduce: 135.66 | step: 18.67 11%|█ | 4548/41250 [10:59:59<88:42:44, 8.70s/it] {'loss': 0.109, 'grad_norm': 1.8208914995193481, 'learning_rate': 3.9328371059606015e-05, 'epoch': 1.1} 11%|█ | 4548/41250 [10:59:59<88:42:44, 8.70s/it][2025-04-25 18:57:42,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 18:57:42,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.92 | bwd_microstep: 5734.26 | bwd_inner_microstep: 5659.11 | bwd_allreduce_microstep: 75.11 | step_microstep: 18.78 [2025-04-25 18:57:42,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.92 | bwd: 5734.27 | bwd_inner: 5659.11 | bwd_allreduce: 75.12 | step: 18.78 11%|█ | 4549/41250 [11:00:07<88:32:20, 8.68s/it] {'loss': 0.2296, 'grad_norm': 1.3054227828979492, 'learning_rate': 3.932796746866712e-05, 'epoch': 1.1} 11%|█ | 4549/41250 [11:00:07<88:32:20, 8.68s/it][2025-04-25 18:57:51,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 18:57:51,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.92 | bwd_microstep: 5813.36 | bwd_inner_microstep: 5649.88 | bwd_allreduce_microstep: 163.44 | step_microstep: 19.84 [2025-04-25 18:57:51,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.92 | bwd: 5813.38 | bwd_inner: 5649.88 | bwd_allreduce: 163.45 | step: 19.84 11%|█ | 4550/41250 [11:00:16<88:39:06, 8.70s/it] {'loss': 0.177, 'grad_norm': 1.2824547290802002, 'learning_rate': 3.9327563758575106e-05, 'epoch': 1.1} 11%|█ | 4550/41250 [11:00:16<88:39:06, 8.70s/it][2025-04-25 
18:58:00,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 18:58:00,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.29 | bwd_microstep: 5763.40 | bwd_inner_microstep: 5717.05 | bwd_allreduce_microstep: 46.31 | step_microstep: 18.69 [2025-04-25 18:58:00,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.29 | bwd: 5763.41 | bwd_inner: 5717.04 | bwd_allreduce: 46.32 | step: 18.69 11%|█ | 4551/41250 [11:00:25<88:40:33, 8.70s/it] {'loss': 0.189, 'grad_norm': 1.4757014513015747, 'learning_rate': 3.9327159929332464e-05, 'epoch': 1.1} 11%|█ | 4551/41250 [11:00:25<88:40:33, 8.70s/it][2025-04-25 18:58:08,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 18:58:08,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.37 | bwd_microstep: 5835.11 | bwd_inner_microstep: 5665.33 | bwd_allreduce_microstep: 169.74 | step_microstep: 18.34 [2025-04-25 18:58:08,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.37 | bwd: 5835.13 | bwd_inner: 5665.33 | bwd_allreduce: 169.75 | step: 18.34 11%|█ | 4552/41250 [11:00:34<88:48:46, 8.71s/it] {'loss': 0.2469, 'grad_norm': 2.194135904312134, 'learning_rate': 3.932675598094168e-05, 'epoch': 1.1} 11%|█ | 4552/41250 [11:00:34<88:48:46, 8.71s/it][2025-04-25 18:58:17,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 18:58:17,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.70 | bwd_microstep: 5744.70 | bwd_inner_microstep: 5653.09 | bwd_allreduce_microstep: 91.57 | step_microstep: 18.63 [2025-04-25 18:58:17,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.70 | bwd: 5744.71 | bwd_inner: 5653.09 | bwd_allreduce: 91.58 | 
step: 18.63 11%|█ | 4553/41250 [11:00:42<88:37:38, 8.69s/it] {'loss': 0.0401, 'grad_norm': 0.41865476965904236, 'learning_rate': 3.9326351913405246e-05, 'epoch': 1.1} 11%|█ | 4553/41250 [11:00:42<88:37:38, 8.69s/it][2025-04-25 18:58:26,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 18:58:26,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.47 | bwd_microstep: 5723.09 | bwd_inner_microstep: 5699.39 | bwd_allreduce_microstep: 23.66 | step_microstep: 18.57 [2025-04-25 18:58:26,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.47 | bwd: 5723.11 | bwd_inner: 5699.39 | bwd_allreduce: 23.68 | step: 18.58 11%|█ | 4554/41250 [11:00:51<88:29:19, 8.68s/it] {'loss': 0.0834, 'grad_norm': 2.745962619781494, 'learning_rate': 3.932594772672565e-05, 'epoch': 1.1} 11%|█ | 4554/41250 [11:00:51<88:29:19, 8.68s/it][2025-04-25 18:58:34,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 18:58:34,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.49 | bwd_microstep: 5827.98 | bwd_inner_microstep: 5663.10 | bwd_allreduce_microstep: 164.83 | step_microstep: 18.68 [2025-04-25 18:58:34,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.49 | bwd: 5827.99 | bwd_inner: 5663.10 | bwd_allreduce: 164.85 | step: 18.68 11%|█ | 4555/41250 [11:01:00<88:39:18, 8.70s/it] {'loss': 0.0622, 'grad_norm': 0.9960505962371826, 'learning_rate': 3.9325543420905395e-05, 'epoch': 1.1} 11%|█ | 4555/41250 [11:01:00<88:39:18, 8.70s/it][2025-04-25 18:58:43,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.09 | optimizer_step: 0.98 [2025-04-25 18:58:43,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.06 | bwd_microstep: 5781.82 | 
bwd_inner_microstep: 5768.46 | bwd_allreduce_microstep: 13.30 | step_microstep: 19.49 [2025-04-25 18:58:43,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.06 | bwd: 5781.83 | bwd_inner: 5768.46 | bwd_allreduce: 13.32 | step: 19.50 11%|█ | 4556/41250 [11:01:08<88:48:39, 8.71s/it] {'loss': 0.2152, 'grad_norm': 1.3789695501327515, 'learning_rate': 3.932513899594697e-05, 'epoch': 1.1} 11%|█ | 4556/41250 [11:01:08<88:48:39, 8.71s/it][2025-04-25 18:58:52,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 18:58:52,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.49 | bwd_microstep: 5780.42 | bwd_inner_microstep: 5667.80 | bwd_allreduce_microstep: 112.58 | step_microstep: 18.40 [2025-04-25 18:58:52,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.49 | bwd: 5780.44 | bwd_inner: 5667.80 | bwd_allreduce: 112.60 | step: 18.40 11%|█ | 4557/41250 [11:01:17<88:45:01, 8.71s/it] {'loss': 0.0465, 'grad_norm': 0.6915189623832703, 'learning_rate': 3.9324734451852856e-05, 'epoch': 1.1} 11%|█ | 4557/41250 [11:01:17<88:45:01, 8.71s/it][2025-04-25 18:59:00,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 18:59:00,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.98 | bwd_microstep: 5723.56 | bwd_inner_microstep: 5711.00 | bwd_allreduce_microstep: 12.52 | step_microstep: 19.31 [2025-04-25 18:59:00,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.98 | bwd: 5723.58 | bwd_inner: 5711.00 | bwd_allreduce: 12.53 | step: 19.32 11%|█ | 4558/41250 [11:01:26<88:34:36, 8.69s/it] {'loss': 0.0555, 'grad_norm': 0.7151979207992554, 'learning_rate': 3.932432978862555e-05, 'epoch': 1.1} 11%|█ | 4558/41250 [11:01:26<88:34:36, 8.69s/it][2025-04-25 18:59:09,545] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 18:59:09,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.08 | bwd_microstep: 5737.71 | bwd_inner_microstep: 5668.87 | bwd_allreduce_microstep: 68.79 | step_microstep: 18.70 [2025-04-25 18:59:09,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.08 | bwd: 5737.72 | bwd_inner: 5668.87 | bwd_allreduce: 68.81 | step: 18.70 11%|█ | 4559/41250 [11:01:34<88:27:10, 8.68s/it] {'loss': 0.0819, 'grad_norm': 1.004449486732483, 'learning_rate': 3.932392500626756e-05, 'epoch': 1.11} 11%|█ | 4559/41250 [11:01:34<88:27:10, 8.68s/it][2025-04-25 18:59:18,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 18:59:18,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.87 | bwd_microstep: 5802.33 | bwd_inner_microstep: 5657.59 | bwd_allreduce_microstep: 144.69 | step_microstep: 18.42 [2025-04-25 18:59:18,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.87 | bwd: 5802.35 | bwd_inner: 5657.59 | bwd_allreduce: 144.71 | step: 18.43 11%|█ | 4560/41250 [11:01:43<88:33:06, 8.69s/it] {'loss': 0.0978, 'grad_norm': 1.0364960432052612, 'learning_rate': 3.932352010478137e-05, 'epoch': 1.11} 11%|█ | 4560/41250 [11:01:43<88:33:06, 8.69s/it][2025-04-25 18:59:26,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-25 18:59:26,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.25 | bwd_microstep: 5810.93 | bwd_inner_microstep: 5663.69 | bwd_allreduce_microstep: 147.20 | step_microstep: 19.08 [2025-04-25 18:59:26,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.26 | bwd: 5810.95 | bwd_inner: 5663.69 | bwd_allreduce: 147.21 | step: 19.08 11%|█ | 4561/41250 
[11:01:52<88:39:41, 8.70s/it] {'loss': 0.2334, 'grad_norm': 1.7201008796691895, 'learning_rate': 3.9323115084169485e-05, 'epoch': 1.11} 11%|█ | 4561/41250 [11:01:52<88:39:41, 8.70s/it][2025-04-25 18:59:35,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 18:59:35,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.57 | bwd_microstep: 5707.77 | bwd_inner_microstep: 5695.05 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.77 [2025-04-25 18:59:35,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.57 | bwd: 5707.78 | bwd_inner: 5695.05 | bwd_allreduce: 12.69 | step: 18.78 11%|█ | 4562/41250 [11:02:00<88:28:56, 8.68s/it] {'loss': 0.1034, 'grad_norm': 1.5235018730163574, 'learning_rate': 3.932270994443439e-05, 'epoch': 1.11} 11%|█ | 4562/41250 [11:02:00<88:28:56, 8.68s/it][2025-04-25 18:59:44,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 18:59:44,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.38 | bwd_microstep: 5796.53 | bwd_inner_microstep: 5783.86 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.54 [2025-04-25 18:59:44,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.38 | bwd: 5796.54 | bwd_inner: 5783.86 | bwd_allreduce: 12.64 | step: 18.54 11%|█ | 4563/41250 [11:02:09<88:44:24, 8.71s/it] {'loss': 0.2135, 'grad_norm': 2.1221041679382324, 'learning_rate': 3.932230468557859e-05, 'epoch': 1.11} 11%|█ | 4563/41250 [11:02:09<88:44:24, 8.71s/it][2025-04-25 18:59:53,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 18:59:53,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.25 | bwd_microstep: 5747.15 | bwd_inner_microstep: 5709.60 | 
bwd_allreduce_microstep: 37.50 | step_microstep: 19.02 [2025-04-25 18:59:53,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.25 | bwd: 5747.16 | bwd_inner: 5709.60 | bwd_allreduce: 37.52 | step: 19.02 11%|█ | 4564/41250 [11:02:18<88:40:00, 8.70s/it] {'loss': 0.2895, 'grad_norm': 1.8766146898269653, 'learning_rate': 3.9321899307604586e-05, 'epoch': 1.11} 11%|█ | 4564/41250 [11:02:18<88:40:00, 8.70s/it][2025-04-25 19:00:01,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 19:00:01,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.88 | bwd_microstep: 5751.62 | bwd_inner_microstep: 5692.86 | bwd_allreduce_microstep: 58.71 | step_microstep: 18.79 [2025-04-25 19:00:01,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.88 | bwd: 5751.63 | bwd_inner: 5692.86 | bwd_allreduce: 58.73 | step: 18.79 11%|█ | 4565/41250 [11:02:27<88:36:32, 8.70s/it] {'loss': 0.0804, 'grad_norm': 1.1136682033538818, 'learning_rate': 3.9321493810514864e-05, 'epoch': 1.11} 11%|█ | 4565/41250 [11:02:27<88:36:32, 8.70s/it][2025-04-25 19:00:10,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 0.92 [2025-04-25 19:00:10,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.00 | bwd_microstep: 5757.96 | bwd_inner_microstep: 5661.44 | bwd_allreduce_microstep: 96.47 | step_microstep: 18.84 [2025-04-25 19:00:10,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.00 | bwd: 5757.97 | bwd_inner: 5661.44 | bwd_allreduce: 96.49 | step: 18.85 11%|█ | 4566/41250 [11:02:35<88:34:02, 8.69s/it] {'loss': 0.2405, 'grad_norm': 2.052882432937622, 'learning_rate': 3.932108819431194e-05, 'epoch': 1.11} 11%|█ | 4566/41250 [11:02:35<88:34:02, 8.69s/it][2025-04-25 19:00:19,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:00:19,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.30 | bwd_microstep: 5776.64 | bwd_inner_microstep: 5652.13 | bwd_allreduce_microstep: 124.46 | step_microstep: 18.68 [2025-04-25 19:00:19,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.30 | bwd: 5776.65 | bwd_inner: 5652.13 | bwd_allreduce: 124.48 | step: 18.68 11%|█ | 4567/41250 [11:02:44<88:34:27, 8.69s/it] {'loss': 0.0923, 'grad_norm': 0.8716554045677185, 'learning_rate': 3.9320682458998306e-05, 'epoch': 1.11} 11%|█ | 4567/41250 [11:02:44<88:34:27, 8.69s/it][2025-04-25 19:00:27,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.36 | optimizer_step: 1.06 [2025-04-25 19:00:27,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.58 | bwd_microstep: 5710.86 | bwd_inner_microstep: 5696.65 | bwd_allreduce_microstep: 14.15 | step_microstep: 20.35 [2025-04-25 19:00:27,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.58 | bwd: 5710.88 | bwd_inner: 5696.65 | bwd_allreduce: 14.18 | step: 20.35 11%|█ | 4568/41250 [11:02:53<88:25:59, 8.68s/it] {'loss': 0.028, 'grad_norm': 0.6333507299423218, 'learning_rate': 3.9320276604576466e-05, 'epoch': 1.11} 11%|█ | 4568/41250 [11:02:53<88:25:59, 8.68s/it][2025-04-25 19:00:36,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 19:00:36,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2940.22 | bwd_microstep: 5878.07 | bwd_inner_microstep: 5865.00 | bwd_allreduce_microstep: 13.02 | step_microstep: 19.11 [2025-04-25 19:00:36,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2940.22 | bwd: 5878.08 | bwd_inner: 5865.00 | bwd_allreduce: 13.04 | step: 19.11 11%|█ | 4569/41250 [11:03:02<89:07:24, 8.75s/it] 
{'loss': 0.2313, 'grad_norm': 1.7262234687805176, 'learning_rate': 3.931987063104892e-05, 'epoch': 1.11} 11%|█ | 4569/41250 [11:03:02<89:07:24, 8.75s/it][2025-04-25 19:00:45,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:00:45,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.18 | bwd_microstep: 5745.92 | bwd_inner_microstep: 5646.98 | bwd_allreduce_microstep: 98.89 | step_microstep: 18.50 [2025-04-25 19:00:45,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.18 | bwd: 5745.94 | bwd_inner: 5646.98 | bwd_allreduce: 98.91 | step: 18.51 11%|█ | 4570/41250 [11:03:10<88:50:50, 8.72s/it] {'loss': 0.1781, 'grad_norm': 2.5242016315460205, 'learning_rate': 3.931946453841817e-05, 'epoch': 1.11} 11%|█ | 4570/41250 [11:03:10<88:50:50, 8.72s/it][2025-04-25 19:00:53,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.24 | optimizer_step: 1.03 [2025-04-25 19:00:53,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.65 | bwd_microstep: 5714.11 | bwd_inner_microstep: 5673.84 | bwd_allreduce_microstep: 40.20 | step_microstep: 19.61 [2025-04-25 19:00:53,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.67 | bwd: 5714.11 | bwd_inner: 5673.82 | bwd_allreduce: 40.22 | step: 19.61 11%|█ | 4571/41250 [11:03:19<88:38:10, 8.70s/it] {'loss': 0.2504, 'grad_norm': 1.8800952434539795, 'learning_rate': 3.931905832668672e-05, 'epoch': 1.11} 11%|█ | 4571/41250 [11:03:19<88:38:10, 8.70s/it][2025-04-25 19:01:02,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:01:02,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2936.99 | bwd_microstep: 5872.67 | bwd_inner_microstep: 5859.82 | bwd_allreduce_microstep: 12.80 | 
step_microstep: 18.89 [2025-04-25 19:01:02,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2936.99 | bwd: 5872.68 | bwd_inner: 5859.82 | bwd_allreduce: 12.82 | step: 18.89 11%|█ | 4572/41250 [11:03:28<89:13:32, 8.76s/it] {'loss': 0.2138, 'grad_norm': 2.530850410461426, 'learning_rate': 3.931865199585708e-05, 'epoch': 1.11} 11%|█ | 4572/41250 [11:03:28<89:13:32, 8.76s/it][2025-04-25 19:01:11,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.73 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 19:01:11,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.44 | bwd_microstep: 5717.71 | bwd_inner_microstep: 5700.83 | bwd_allreduce_microstep: 16.83 | step_microstep: 20.67 [2025-04-25 19:01:11,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.44 | bwd: 5717.72 | bwd_inner: 5700.83 | bwd_allreduce: 16.85 | step: 20.68 11%|█ | 4573/41250 [11:03:36<88:56:40, 8.73s/it] {'loss': 0.0599, 'grad_norm': 1.7945847511291504, 'learning_rate': 3.931824554593175e-05, 'epoch': 1.11} 11%|█ | 4573/41250 [11:03:36<88:56:40, 8.73s/it][2025-04-25 19:01:20,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 0.96 [2025-04-25 19:01:20,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.60 | bwd_microstep: 5737.45 | bwd_inner_microstep: 5685.95 | bwd_allreduce_microstep: 51.44 | step_microstep: 19.23 [2025-04-25 19:01:20,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.60 | bwd: 5737.46 | bwd_inner: 5685.95 | bwd_allreduce: 51.47 | step: 19.23 11%|█ | 4574/41250 [11:03:45<88:45:49, 8.71s/it] {'loss': 0.1837, 'grad_norm': 6.341294765472412, 'learning_rate': 3.9317838976913234e-05, 'epoch': 1.11} 11%|█ | 4574/41250 [11:03:45<88:45:49, 8.71s/it][2025-04-25 19:01:28,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 
1.06 | optimizer_step: 0.93 [2025-04-25 19:01:28,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.07 | bwd_microstep: 5710.13 | bwd_inner_microstep: 5696.99 | bwd_allreduce_microstep: 13.09 | step_microstep: 18.94 [2025-04-25 19:01:28,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.07 | bwd: 5710.14 | bwd_inner: 5696.99 | bwd_allreduce: 13.11 | step: 18.94 11%|█ | 4575/41250 [11:03:54<88:33:20, 8.69s/it] {'loss': 0.0943, 'grad_norm': 0.9967871904373169, 'learning_rate': 3.931743228880404e-05, 'epoch': 1.11} 11%|█ | 4575/41250 [11:03:54<88:33:20, 8.69s/it][2025-04-25 19:01:37,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:01:37,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.55 | bwd_microstep: 5754.86 | bwd_inner_microstep: 5689.92 | bwd_allreduce_microstep: 64.90 | step_microstep: 18.64 [2025-04-25 19:01:37,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.55 | bwd: 5754.87 | bwd_inner: 5689.92 | bwd_allreduce: 64.91 | step: 18.64 11%|█ | 4576/41250 [11:04:02<88:31:03, 8.69s/it] {'loss': 0.035, 'grad_norm': 0.4574890732765198, 'learning_rate': 3.9317025481606676e-05, 'epoch': 1.11} 11%|█ | 4576/41250 [11:04:02<88:31:03, 8.69s/it][2025-04-25 19:01:46,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:01:46,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.43 | bwd_microstep: 5696.70 | bwd_inner_microstep: 5637.18 | bwd_allreduce_microstep: 59.46 | step_microstep: 18.72 [2025-04-25 19:01:46,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.43 | bwd: 5696.71 | bwd_inner: 5637.18 | bwd_allreduce: 59.49 | step: 18.72 11%|█ | 4577/41250 [11:04:11<88:15:57, 8.66s/it] {'loss': 0.1866, 'grad_norm': 1.736091136932373, 
'learning_rate': 3.931661855532365e-05, 'epoch': 1.11} 11%|█ | 4577/41250 [11:04:11<88:15:57, 8.66s/it][2025-04-25 19:01:54,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.06 | optimizer_step: 1.15 [2025-04-25 19:01:54,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.60 | bwd_microstep: 5748.97 | bwd_inner_microstep: 5638.14 | bwd_allreduce_microstep: 110.78 | step_microstep: 19.71 [2025-04-25 19:01:54,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.60 | bwd: 5748.98 | bwd_inner: 5638.14 | bwd_allreduce: 110.80 | step: 19.71 11%|█ | 4578/41250 [11:04:20<88:14:44, 8.66s/it] {'loss': 0.2633, 'grad_norm': 1.8835053443908691, 'learning_rate': 3.9316211509957465e-05, 'epoch': 1.11} 11%|█ | 4578/41250 [11:04:20<88:14:44, 8.66s/it][2025-04-25 19:02:03,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:02:03,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.76 | bwd_microstep: 5737.32 | bwd_inner_microstep: 5656.52 | bwd_allreduce_microstep: 80.76 | step_microstep: 18.51 [2025-04-25 19:02:03,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.76 | bwd: 5737.33 | bwd_inner: 5656.52 | bwd_allreduce: 80.77 | step: 18.51 11%|█ | 4579/41250 [11:04:28<88:11:44, 8.66s/it] {'loss': 0.1029, 'grad_norm': 1.436594009399414, 'learning_rate': 3.931580434551064e-05, 'epoch': 1.11} 11%|█ | 4579/41250 [11:04:28<88:11:44, 8.66s/it][2025-04-25 19:02:12,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:02:12,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.15 | bwd_microstep: 5766.98 | bwd_inner_microstep: 5635.77 | bwd_allreduce_microstep: 131.15 | step_microstep: 18.94 [2025-04-25 19:02:12,140] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.15 | bwd: 5766.99 | bwd_inner: 5635.77 | bwd_allreduce: 131.18 | step: 18.94 11%|█ | 4580/41250 [11:04:37<88:13:54, 8.66s/it] {'loss': 0.3264, 'grad_norm': 3.4999420642852783, 'learning_rate': 3.931539706198568e-05, 'epoch': 1.11} 11%|█ | 4580/41250 [11:04:37<88:13:54, 8.66s/it][2025-04-25 19:02:20,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.06 | optimizer_step: 1.07 [2025-04-25 19:02:20,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.34 | bwd_microstep: 5752.00 | bwd_inner_microstep: 5642.99 | bwd_allreduce_microstep: 108.96 | step_microstep: 19.25 [2025-04-25 19:02:20,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.34 | bwd: 5752.01 | bwd_inner: 5642.99 | bwd_allreduce: 108.98 | step: 19.25 11%|█ | 4581/41250 [11:04:46<88:13:48, 8.66s/it] {'loss': 0.2241, 'grad_norm': 2.4001758098602295, 'learning_rate': 3.93149896593851e-05, 'epoch': 1.11} 11%|█ | 4581/41250 [11:04:46<88:13:48, 8.66s/it][2025-04-25 19:02:29,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 19:02:29,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.00 | bwd_microstep: 5702.62 | bwd_inner_microstep: 5689.27 | bwd_allreduce_microstep: 13.30 | step_microstep: 18.89 [2025-04-25 19:02:29,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.00 | bwd: 5702.63 | bwd_inner: 5689.27 | bwd_allreduce: 13.32 | step: 18.89 11%|█ | 4582/41250 [11:04:54<88:07:22, 8.65s/it] {'loss': 0.4059, 'grad_norm': 2.4316213130950928, 'learning_rate': 3.9314582137711405e-05, 'epoch': 1.11} 11%|█ | 4582/41250 [11:04:54<88:07:22, 8.65s/it][2025-04-25 19:02:38,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 
19:02:38,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.28 | bwd_microstep: 5742.70 | bwd_inner_microstep: 5714.40 | bwd_allreduce_microstep: 28.25 | step_microstep: 19.32 [2025-04-25 19:02:38,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.28 | bwd: 5742.72 | bwd_inner: 5714.40 | bwd_allreduce: 28.27 | step: 19.32 11%|█ | 4583/41250 [11:05:03<88:13:45, 8.66s/it] {'loss': 0.1653, 'grad_norm': 1.2079308032989502, 'learning_rate': 3.9314174496967104e-05, 'epoch': 1.11} 11%|█ | 4583/41250 [11:05:03<88:13:45, 8.66s/it][2025-04-25 19:02:46,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.07 | optimizer_step: 0.93 [2025-04-25 19:02:46,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.23 | bwd_microstep: 5754.72 | bwd_inner_microstep: 5708.28 | bwd_allreduce_microstep: 46.39 | step_microstep: 19.47 [2025-04-25 19:02:46,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.23 | bwd: 5754.74 | bwd_inner: 5708.28 | bwd_allreduce: 46.41 | step: 19.47 11%|█ | 4584/41250 [11:05:12<88:19:20, 8.67s/it] {'loss': 0.1602, 'grad_norm': 1.3148317337036133, 'learning_rate': 3.9313766737154725e-05, 'epoch': 1.11} 11%|█ | 4584/41250 [11:05:12<88:19:20, 8.67s/it][2025-04-25 19:02:55,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:02:55,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2868.01 | bwd_microstep: 5700.05 | bwd_inner_microstep: 5687.10 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.91 [2025-04-25 19:02:55,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2868.01 | bwd: 5700.07 | bwd_inner: 5687.10 | bwd_allreduce: 12.93 | step: 18.91 11%|█ | 4585/41250 [11:05:20<88:15:01, 8.66s/it] {'loss': 0.1225, 'grad_norm': 1.532497525215149, 'learning_rate': 3.931335885827677e-05, 
'epoch': 1.11} 11%|█ | 4585/41250 [11:05:20<88:15:01, 8.66s/it][2025-04-25 19:03:04,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.14 | optimizer_step: 0.97 [2025-04-25 19:03:04,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.17 | bwd_microstep: 5749.80 | bwd_inner_microstep: 5653.09 | bwd_allreduce_microstep: 96.66 | step_microstep: 19.17 [2025-04-25 19:03:04,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.17 | bwd: 5749.82 | bwd_inner: 5653.09 | bwd_allreduce: 96.68 | step: 19.17 11%|█ | 4586/41250 [11:05:29<88:13:58, 8.66s/it] {'loss': 0.066, 'grad_norm': 0.6060198545455933, 'learning_rate': 3.931295086033575e-05, 'epoch': 1.11} 11%|█ | 4586/41250 [11:05:29<88:13:58, 8.66s/it][2025-04-25 19:03:12,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.09 | optimizer_step: 1.25 [2025-04-25 19:03:12,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.15 | bwd_microstep: 5763.65 | bwd_inner_microstep: 5751.00 | bwd_allreduce_microstep: 12.60 | step_microstep: 20.02 [2025-04-25 19:03:12,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.15 | bwd: 5763.66 | bwd_inner: 5751.00 | bwd_allreduce: 12.62 | step: 20.02 11%|█ | 4587/41250 [11:05:38<88:27:01, 8.69s/it] {'loss': 0.1341, 'grad_norm': 1.756473422050476, 'learning_rate': 3.9312542743334195e-05, 'epoch': 1.11} 11%|█ | 4587/41250 [11:05:38<88:27:01, 8.69s/it][2025-04-25 19:03:21,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.33 | optimizer_step: 0.90 [2025-04-25 19:03:21,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.14 | bwd_microstep: 5747.52 | bwd_inner_microstep: 5697.89 | bwd_allreduce_microstep: 49.57 | step_microstep: 19.73 [2025-04-25 19:03:21,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd: 2849.14 | bwd: 5747.53 | bwd_inner: 5697.89 | bwd_allreduce: 49.60 | step: 19.73 11%|█ | 4588/41250 [11:05:46<88:26:17, 8.68s/it] {'loss': 0.1727, 'grad_norm': 1.1536343097686768, 'learning_rate': 3.931213450727461e-05, 'epoch': 1.11} 11%|█ | 4588/41250 [11:05:46<88:26:17, 8.68s/it][2025-04-25 19:03:30,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 19:03:30,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.32 | bwd_microstep: 5699.52 | bwd_inner_microstep: 5686.19 | bwd_allreduce_microstep: 13.28 | step_microstep: 19.68 [2025-04-25 19:03:30,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.32 | bwd: 5699.53 | bwd_inner: 5686.19 | bwd_allreduce: 13.30 | step: 19.68 11%|█ | 4589/41250 [11:05:55<88:17:05, 8.67s/it] {'loss': 0.1086, 'grad_norm': 1.668796181678772, 'learning_rate': 3.931172615215952e-05, 'epoch': 1.11} 11%|█ | 4589/41250 [11:05:55<88:17:05, 8.67s/it][2025-04-25 19:03:38,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-25 19:03:38,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.44 | bwd_microstep: 5718.99 | bwd_inner_microstep: 5706.39 | bwd_allreduce_microstep: 12.55 | step_microstep: 19.28 [2025-04-25 19:03:38,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.44 | bwd: 5719.00 | bwd_inner: 5706.39 | bwd_allreduce: 12.57 | step: 19.28 11%|█ | 4590/41250 [11:06:04<88:15:38, 8.67s/it] {'loss': 0.0721, 'grad_norm': 1.0471773147583008, 'learning_rate': 3.9311317677991426e-05, 'epoch': 1.11} 11%|█ | 4590/41250 [11:06:04<88:15:38, 8.67s/it][2025-04-25 19:03:47,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 19:03:47,458] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd_microstep: 2825.02 | bwd_microstep: 5714.73 | bwd_inner_microstep: 5649.48 | bwd_allreduce_microstep: 65.20 | step_microstep: 19.07 [2025-04-25 19:03:47,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.02 | bwd: 5714.75 | bwd_inner: 5649.48 | bwd_allreduce: 65.22 | step: 19.07 11%|█ | 4591/41250 [11:06:12<88:08:13, 8.66s/it] {'loss': 0.1529, 'grad_norm': 1.2147377729415894, 'learning_rate': 3.931090908477286e-05, 'epoch': 1.11} 11%|█ | 4591/41250 [11:06:12<88:08:13, 8.66s/it][2025-04-25 19:03:56,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.08 | optimizer_step: 1.06 [2025-04-25 19:03:56,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.14 | bwd_microstep: 5774.64 | bwd_inner_microstep: 5698.14 | bwd_allreduce_microstep: 76.44 | step_microstep: 20.08 [2025-04-25 19:03:56,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.14 | bwd: 5774.65 | bwd_inner: 5698.14 | bwd_allreduce: 76.46 | step: 20.08 11%|█ | 4592/41250 [11:06:21<88:19:42, 8.67s/it] {'loss': 0.1501, 'grad_norm': 2.055354356765747, 'learning_rate': 3.931050037250634e-05, 'epoch': 1.11} 11%|█ | 4592/41250 [11:06:21<88:19:42, 8.67s/it][2025-04-25 19:04:04,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 1.11 | optimizer_step: 0.96 [2025-04-25 19:04:04,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.51 | bwd_microstep: 5711.26 | bwd_inner_microstep: 5664.33 | bwd_allreduce_microstep: 46.88 | step_microstep: 19.31 [2025-04-25 19:04:04,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.51 | bwd: 5711.27 | bwd_inner: 5664.33 | bwd_allreduce: 46.89 | step: 19.31 11%|█ | 4593/41250 [11:06:30<88:11:58, 8.66s/it] {'loss': 0.0886, 'grad_norm': 1.7564442157745361, 'learning_rate': 3.931009154119438e-05, 'epoch': 1.11} 11%|█ | 4593/41250 [11:06:30<88:11:58, 
8.66s/it][2025-04-25 19:04:13,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 19:04:13,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.65 | bwd_microstep: 5758.53 | bwd_inner_microstep: 5699.92 | bwd_allreduce_microstep: 58.52 | step_microstep: 19.75 [2025-04-25 19:04:13,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.66 | bwd: 5758.55 | bwd_inner: 5699.92 | bwd_allreduce: 58.56 | step: 19.74 11%|█ | 4594/41250 [11:06:38<88:17:32, 8.67s/it] {'loss': 0.2523, 'grad_norm': 1.2622593641281128, 'learning_rate': 3.930968259083951e-05, 'epoch': 1.11} 11%|█ | 4594/41250 [11:06:38<88:17:32, 8.67s/it][2025-04-25 19:04:22,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:04:22,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.47 | bwd_microstep: 5787.69 | bwd_inner_microstep: 5656.55 | bwd_allreduce_microstep: 131.08 | step_microstep: 18.67 [2025-04-25 19:04:22,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.47 | bwd: 5787.70 | bwd_inner: 5656.55 | bwd_allreduce: 131.11 | step: 18.67 11%|█ | 4595/41250 [11:06:47<88:23:21, 8.68s/it] {'loss': 0.1424, 'grad_norm': 2.740917205810547, 'learning_rate': 3.9309273521444236e-05, 'epoch': 1.11} 11%|█ | 4595/41250 [11:06:47<88:23:21, 8.68s/it][2025-04-25 19:04:30,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 19:04:30,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.49 | bwd_microstep: 5759.07 | bwd_inner_microstep: 5713.78 | bwd_allreduce_microstep: 45.25 | step_microstep: 18.93 [2025-04-25 19:04:30,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.49 | bwd: 5759.08 | bwd_inner: 5713.78 | 
bwd_allreduce: 45.26 | step: 18.93 11%|█ | 4596/41250 [11:06:56<88:26:17, 8.69s/it] {'loss': 0.3702, 'grad_norm': 3.1296234130859375, 'learning_rate': 3.930886433301109e-05, 'epoch': 1.11} 11%|█ | 4596/41250 [11:06:56<88:26:17, 8.69s/it][2025-04-25 19:04:39,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-25 19:04:39,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.52 | bwd_microstep: 5774.18 | bwd_inner_microstep: 5700.66 | bwd_allreduce_microstep: 73.47 | step_microstep: 19.04 [2025-04-25 19:04:39,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.52 | bwd: 5774.19 | bwd_inner: 5700.66 | bwd_allreduce: 73.49 | step: 19.05 11%|█ | 4597/41250 [11:07:04<88:30:19, 8.69s/it] {'loss': 0.1269, 'grad_norm': 1.6055759191513062, 'learning_rate': 3.9308455025542606e-05, 'epoch': 1.11} 11%|█ | 4597/41250 [11:07:04<88:30:19, 8.69s/it][2025-04-25 19:04:48,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 19:04:48,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.76 | bwd_microstep: 5705.34 | bwd_inner_microstep: 5665.35 | bwd_allreduce_microstep: 39.94 | step_microstep: 18.72 [2025-04-25 19:04:48,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.76 | bwd: 5705.36 | bwd_inner: 5665.35 | bwd_allreduce: 39.96 | step: 18.73 11%|█ | 4598/41250 [11:07:13<88:18:48, 8.67s/it] {'loss': 0.088, 'grad_norm': 1.1385233402252197, 'learning_rate': 3.930804559904128e-05, 'epoch': 1.11} 11%|█ | 4598/41250 [11:07:13<88:18:48, 8.67s/it][2025-04-25 19:04:56,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.33 | optimizer_step: 0.94 [2025-04-25 19:04:56,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.46 | bwd_microstep: 
5724.67 | bwd_inner_microstep: 5667.18 | bwd_allreduce_microstep: 57.44 | step_microstep: 21.01 [2025-04-25 19:04:56,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.46 | bwd: 5724.68 | bwd_inner: 5667.18 | bwd_allreduce: 57.46 | step: 21.02 11%|█ | 4599/41250 [11:07:22<88:13:54, 8.67s/it] {'loss': 0.3029, 'grad_norm': 1.9439752101898193, 'learning_rate': 3.930763605350966e-05, 'epoch': 1.11} 11%|█ | 4599/41250 [11:07:22<88:13:54, 8.67s/it][2025-04-25 19:05:05,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 0.93 [2025-04-25 19:05:05,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.61 | bwd_microstep: 5714.40 | bwd_inner_microstep: 5666.76 | bwd_allreduce_microstep: 47.60 | step_microstep: 18.31 [2025-04-25 19:05:05,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.61 | bwd: 5714.41 | bwd_inner: 5666.76 | bwd_allreduce: 47.61 | step: 18.31 11%|█ | 4600/41250 [11:07:30<88:07:11, 8.66s/it] {'loss': 0.1051, 'grad_norm': 1.7912379503250122, 'learning_rate': 3.930722638895025e-05, 'epoch': 1.12} 11%|█ | 4600/41250 [11:07:30<88:07:11, 8.66s/it][2025-04-25 19:05:14,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:05:14,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.20 | bwd_microstep: 5688.22 | bwd_inner_microstep: 5671.68 | bwd_allreduce_microstep: 16.50 | step_microstep: 17.97 [2025-04-25 19:05:14,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.20 | bwd: 5688.23 | bwd_inner: 5671.68 | bwd_allreduce: 16.52 | step: 17.97 11%|█ | 4601/41250 [11:07:39<87:59:57, 8.64s/it] {'loss': 0.2708, 'grad_norm': 2.329646587371826, 'learning_rate': 3.9306816605365596e-05, 'epoch': 1.12} 11%|█ | 4601/41250 [11:07:39<87:59:57, 8.64s/it][2025-04-25 19:05:22,863] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:05:22,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.86 | bwd_microstep: 5772.55 | bwd_inner_microstep: 5713.27 | bwd_allreduce_microstep: 59.24 | step_microstep: 18.56 [2025-04-25 19:05:22,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.86 | bwd: 5772.56 | bwd_inner: 5713.27 | bwd_allreduce: 59.26 | step: 18.57 11%|█ | 4602/41250 [11:07:48<88:13:26, 8.67s/it] {'loss': 0.1339, 'grad_norm': 1.0368999242782593, 'learning_rate': 3.9306406702758215e-05, 'epoch': 1.12} 11%|█ | 4602/41250 [11:07:48<88:13:26, 8.67s/it][2025-04-25 19:05:31,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:05:31,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2905.80 | bwd_microstep: 5804.95 | bwd_inner_microstep: 5792.31 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.05 [2025-04-25 19:05:31,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2905.80 | bwd: 5804.96 | bwd_inner: 5792.31 | bwd_allreduce: 12.61 | step: 18.06 11%|█ | 4603/41250 [11:07:56<88:37:11, 8.71s/it] {'loss': 0.0559, 'grad_norm': 0.9343948364257812, 'learning_rate': 3.9305996681130637e-05, 'epoch': 1.12} 11%|█ | 4603/41250 [11:07:56<88:37:11, 8.71s/it][2025-04-25 19:05:40,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 19:05:40,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.51 | bwd_microstep: 5795.55 | bwd_inner_microstep: 5670.55 | bwd_allreduce_microstep: 124.95 | step_microstep: 19.02 [2025-04-25 19:05:40,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.51 | bwd: 5795.56 | bwd_inner: 5670.55 | bwd_allreduce: 124.97 | step: 19.02 11%|█ | 
4604/41250 [11:08:05<88:38:44, 8.71s/it] {'loss': 0.1589, 'grad_norm': 1.4977457523345947, 'learning_rate': 3.930558654048538e-05, 'epoch': 1.12} 11%|█ | 4604/41250 [11:08:05<88:38:44, 8.71s/it][2025-04-25 19:05:49,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:05:49,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.28 | bwd_microstep: 5802.01 | bwd_inner_microstep: 5709.81 | bwd_allreduce_microstep: 92.15 | step_microstep: 18.38 [2025-04-25 19:05:49,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.28 | bwd: 5802.03 | bwd_inner: 5709.81 | bwd_allreduce: 92.17 | step: 18.39 11%|█ | 4605/41250 [11:08:14<88:44:34, 8.72s/it] {'loss': 0.0441, 'grad_norm': 1.8959487676620483, 'learning_rate': 3.930517628082498e-05, 'epoch': 1.12} 11%|█ | 4605/41250 [11:08:14<88:44:34, 8.72s/it][2025-04-25 19:05:57,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 1.12 [2025-04-25 19:05:57,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.13 | bwd_microstep: 5730.48 | bwd_inner_microstep: 5672.77 | bwd_allreduce_microstep: 57.66 | step_microstep: 19.02 [2025-04-25 19:05:57,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.13 | bwd: 5730.49 | bwd_inner: 5672.77 | bwd_allreduce: 57.68 | step: 19.02 11%|█ | 4606/41250 [11:08:23<88:30:34, 8.70s/it] {'loss': 0.2896, 'grad_norm': 1.7620083093643188, 'learning_rate': 3.9304765902151964e-05, 'epoch': 1.12} 11%|█ | 4606/41250 [11:08:23<88:30:34, 8.70s/it][2025-04-25 19:06:06,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 19:06:06,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.79 | bwd_microstep: 5925.45 | bwd_inner_microstep: 5654.86 | 
bwd_allreduce_microstep: 270.54 | step_microstep: 18.70 [2025-04-25 19:06:06,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.80 | bwd: 5925.47 | bwd_inner: 5654.86 | bwd_allreduce: 270.56 | step: 18.70 11%|█ | 4607/41250 [11:08:31<88:57:00, 8.74s/it] {'loss': 0.1025, 'grad_norm': 0.8973283171653748, 'learning_rate': 3.930435540446887e-05, 'epoch': 1.12} 11%|█ | 4607/41250 [11:08:31<88:57:00, 8.74s/it][2025-04-25 19:06:15,371] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.25 | optimizer_step: 0.90 [2025-04-25 19:06:15,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.23 | bwd_microstep: 5796.76 | bwd_inner_microstep: 5783.00 | bwd_allreduce_microstep: 13.69 | step_microstep: 19.42 [2025-04-25 19:06:15,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.23 | bwd: 5796.77 | bwd_inner: 5783.00 | bwd_allreduce: 13.72 | step: 19.44 11%|█ | 4608/41250 [11:08:40<89:03:18, 8.75s/it] {'loss': 0.0736, 'grad_norm': 1.2439167499542236, 'learning_rate': 3.930394478777822e-05, 'epoch': 1.12} 11%|█ | 4608/41250 [11:08:40<89:03:18, 8.75s/it][2025-04-25 19:06:24,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-25 19:06:24,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.09 | bwd_microstep: 5722.33 | bwd_inner_microstep: 5709.66 | bwd_allreduce_microstep: 12.62 | step_microstep: 19.02 [2025-04-25 19:06:24,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.09 | bwd: 5722.35 | bwd_inner: 5709.66 | bwd_allreduce: 12.64 | step: 19.02 11%|█ | 4609/41250 [11:08:49<88:47:17, 8.72s/it] {'loss': 0.2007, 'grad_norm': 1.7923405170440674, 'learning_rate': 3.930353405208254e-05, 'epoch': 1.12} 11%|█ | 4609/41250 [11:08:49<88:47:17, 8.72s/it][2025-04-25 19:06:32,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.27 | optimizer_gradients: 1.33 | optimizer_step: 1.03 [2025-04-25 19:06:32,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2870.05 | bwd_microstep: 5751.39 | bwd_inner_microstep: 5711.59 | bwd_allreduce_microstep: 39.74 | step_microstep: 19.86 [2025-04-25 19:06:32,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2870.05 | bwd: 5751.40 | bwd_inner: 5711.58 | bwd_allreduce: 39.77 | step: 19.86 11%|█ | 4610/41250 [11:08:58<88:43:50, 8.72s/it] {'loss': 0.2066, 'grad_norm': 1.4302186965942383, 'learning_rate': 3.9303123197384375e-05, 'epoch': 1.12} 11%|█ | 4610/41250 [11:08:58<88:43:50, 8.72s/it][2025-04-25 19:06:41,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 19:06:41,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.76 | bwd_microstep: 5717.49 | bwd_inner_microstep: 5704.52 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.11 [2025-04-25 19:06:41,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.76 | bwd: 5717.50 | bwd_inner: 5704.52 | bwd_allreduce: 12.94 | step: 19.11 11%|█ | 4611/41250 [11:09:06<88:31:55, 8.70s/it] {'loss': 0.2966, 'grad_norm': 2.8984038829803467, 'learning_rate': 3.930271222368625e-05, 'epoch': 1.12} 11%|█ | 4611/41250 [11:09:06<88:31:55, 8.70s/it][2025-04-25 19:06:50,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:06:50,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.42 | bwd_microstep: 5758.19 | bwd_inner_microstep: 5678.84 | bwd_allreduce_microstep: 79.30 | step_microstep: 19.10 [2025-04-25 19:06:50,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.42 | bwd: 5758.20 | bwd_inner: 5678.84 | bwd_allreduce: 79.32 | step: 19.10 11%|█ | 4612/41250 [11:09:15<88:29:10, 8.69s/it] {'loss': 
0.1879, 'grad_norm': 2.2638251781463623, 'learning_rate': 3.9302301130990705e-05, 'epoch': 1.12} 11%|█ | 4612/41250 [11:09:15<88:29:10, 8.69s/it][2025-04-25 19:06:58,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:06:58,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.90 | bwd_microstep: 5761.66 | bwd_inner_microstep: 5659.24 | bwd_allreduce_microstep: 102.38 | step_microstep: 18.84 [2025-04-25 19:06:58,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.90 | bwd: 5761.68 | bwd_inner: 5659.24 | bwd_allreduce: 102.40 | step: 18.85 11%|█ | 4613/41250 [11:09:24<88:25:55, 8.69s/it] {'loss': 0.273, 'grad_norm': 1.7546948194503784, 'learning_rate': 3.930188991930027e-05, 'epoch': 1.12} 11%|█ | 4613/41250 [11:09:24<88:25:55, 8.69s/it][2025-04-25 19:07:07,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.18 | optimizer_step: 0.90 [2025-04-25 19:07:07,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.50 | bwd_microstep: 5793.31 | bwd_inner_microstep: 5658.84 | bwd_allreduce_microstep: 134.42 | step_microstep: 19.40 [2025-04-25 19:07:07,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.50 | bwd: 5793.32 | bwd_inner: 5658.84 | bwd_allreduce: 134.44 | step: 19.40 11%|█ | 4614/41250 [11:09:32<88:28:22, 8.69s/it] {'loss': 0.0459, 'grad_norm': 0.6387472152709961, 'learning_rate': 3.930147858861747e-05, 'epoch': 1.12} 11%|█ | 4614/41250 [11:09:32<88:28:22, 8.69s/it][2025-04-25 19:07:16,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:07:16,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.27 | bwd_microstep: 5706.09 | bwd_inner_microstep: 5643.75 | bwd_allreduce_microstep: 62.29 | step_microstep: 
18.85 [2025-04-25 19:07:16,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.27 | bwd: 5706.10 | bwd_inner: 5643.75 | bwd_allreduce: 62.31 | step: 18.85 11%|█ | 4615/41250 [11:09:41<88:14:33, 8.67s/it] {'loss': 0.1399, 'grad_norm': 1.0723421573638916, 'learning_rate': 3.9301067138944864e-05, 'epoch': 1.12} 11%|█ | 4615/41250 [11:09:41<88:14:33, 8.67s/it][2025-04-25 19:07:24,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.11 | optimizer_step: 1.03 [2025-04-25 19:07:24,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.98 | bwd_microstep: 5784.01 | bwd_inner_microstep: 5647.78 | bwd_allreduce_microstep: 136.16 | step_microstep: 19.67 [2025-04-25 19:07:24,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.98 | bwd: 5784.03 | bwd_inner: 5647.78 | bwd_allreduce: 136.20 | step: 19.68 11%|█ | 4616/41250 [11:09:50<88:18:51, 8.68s/it] {'loss': 0.178, 'grad_norm': 1.1782130002975464, 'learning_rate': 3.930065557028497e-05, 'epoch': 1.12} 11%|█ | 4616/41250 [11:09:50<88:18:51, 8.68s/it][2025-04-25 19:07:33,465] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:07:33,465] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.89 | bwd_microstep: 5751.16 | bwd_inner_microstep: 5702.65 | bwd_allreduce_microstep: 48.47 | step_microstep: 18.20 [2025-04-25 19:07:33,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.89 | bwd: 5751.17 | bwd_inner: 5702.65 | bwd_allreduce: 48.48 | step: 18.20 11%|█ | 4617/41250 [11:09:58<88:20:44, 8.68s/it] {'loss': 0.0744, 'grad_norm': 1.0950591564178467, 'learning_rate': 3.9300243882640326e-05, 'epoch': 1.12} 11%|█ | 4617/41250 [11:09:58<88:20:44, 8.68s/it][2025-04-25 19:07:42,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.95 | 
optimizer_step: 0.89 [2025-04-25 19:07:42,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.53 | bwd_microstep: 5762.65 | bwd_inner_microstep: 5647.84 | bwd_allreduce_microstep: 114.77 | step_microstep: 17.92 [2025-04-25 19:07:42,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.53 | bwd: 5762.66 | bwd_inner: 5647.84 | bwd_allreduce: 114.78 | step: 17.93 11%|█ | 4618/41250 [11:10:07<88:21:57, 8.68s/it] {'loss': 0.1051, 'grad_norm': 0.9404296278953552, 'learning_rate': 3.929983207601348e-05, 'epoch': 1.12} 11%|█ | 4618/41250 [11:10:07<88:21:57, 8.68s/it][2025-04-25 19:07:50,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-25 19:07:50,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.27 | bwd_microstep: 5706.75 | bwd_inner_microstep: 5643.62 | bwd_allreduce_microstep: 63.09 | step_microstep: 18.70 [2025-04-25 19:07:50,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.27 | bwd: 5706.76 | bwd_inner: 5643.62 | bwd_allreduce: 63.11 | step: 18.71 11%|█ | 4619/41250 [11:10:16<88:10:23, 8.67s/it] {'loss': 0.1121, 'grad_norm': 0.7978342771530151, 'learning_rate': 3.929942015040697e-05, 'epoch': 1.12} 11%|█ | 4619/41250 [11:10:16<88:10:23, 8.67s/it][2025-04-25 19:07:59,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:07:59,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.50 | bwd_microstep: 5776.47 | bwd_inner_microstep: 5649.43 | bwd_allreduce_microstep: 127.00 | step_microstep: 18.87 [2025-04-25 19:07:59,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.50 | bwd: 5776.49 | bwd_inner: 5649.43 | bwd_allreduce: 127.01 | step: 18.87 11%|█ | 4620/41250 [11:10:24<88:16:02, 8.67s/it] {'loss': 0.2065, 'grad_norm': 2.305922031402588, 
'learning_rate': 3.929900810582332e-05, 'epoch': 1.12} 11%|█ | 4620/41250 [11:10:24<88:16:02, 8.67s/it][2025-04-25 19:08:08,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 19:08:08,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.48 | bwd_microstep: 5689.70 | bwd_inner_microstep: 5661.14 | bwd_allreduce_microstep: 28.52 | step_microstep: 18.33 [2025-04-25 19:08:08,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.48 | bwd: 5689.72 | bwd_inner: 5661.14 | bwd_allreduce: 28.54 | step: 18.33 11%|█ | 4621/41250 [11:10:33<88:03:24, 8.65s/it] {'loss': 0.1135, 'grad_norm': 2.8591206073760986, 'learning_rate': 3.929859594226509e-05, 'epoch': 1.12} 11%|█ | 4621/41250 [11:10:33<88:03:24, 8.65s/it][2025-04-25 19:08:16,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.34 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-25 19:08:16,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.19 | bwd_microstep: 5680.32 | bwd_inner_microstep: 5650.14 | bwd_allreduce_microstep: 30.14 | step_microstep: 19.61 [2025-04-25 19:08:16,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.19 | bwd: 5680.33 | bwd_inner: 5650.14 | bwd_allreduce: 30.16 | step: 19.62 11%|█ | 4622/41250 [11:10:42<87:55:20, 8.64s/it] {'loss': 0.1458, 'grad_norm': 2.3223838806152344, 'learning_rate': 3.92981836597348e-05, 'epoch': 1.12} 11%|█ | 4622/41250 [11:10:42<87:55:20, 8.64s/it][2025-04-25 19:08:25,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:08:25,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.46 | bwd_microstep: 5716.34 | bwd_inner_microstep: 5700.47 | bwd_allreduce_microstep: 15.83 | step_microstep: 18.58 [2025-04-25 19:08:25,352] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.46 | bwd: 5716.36 | bwd_inner: 5700.47 | bwd_allreduce: 15.85 | step: 18.58 11%|█ | 4623/41250 [11:10:50<87:58:43, 8.65s/it] {'loss': 0.1919, 'grad_norm': 2.426478862762451, 'learning_rate': 3.929777125823502e-05, 'epoch': 1.12} 11%|█ | 4623/41250 [11:10:50<87:58:43, 8.65s/it][2025-04-25 19:08:34,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.16 | optimizer_step: 1.05 [2025-04-25 19:08:34,012] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.30 | bwd_microstep: 5746.79 | bwd_inner_microstep: 5650.64 | bwd_allreduce_microstep: 96.09 | step_microstep: 19.49 [2025-04-25 19:08:34,012] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.30 | bwd: 5746.81 | bwd_inner: 5650.64 | bwd_allreduce: 96.12 | step: 19.50 11%|█ | 4624/41250 [11:10:59<88:01:04, 8.65s/it] {'loss': 0.3352, 'grad_norm': 2.637467622756958, 'learning_rate': 3.9297358737768265e-05, 'epoch': 1.12} 11%|█ | 4624/41250 [11:10:59<88:01:04, 8.65s/it][2025-04-25 19:08:42,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:08:42,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.44 | bwd_microstep: 5739.74 | bwd_inner_microstep: 5683.85 | bwd_allreduce_microstep: 55.85 | step_microstep: 18.31 [2025-04-25 19:08:42,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.44 | bwd: 5739.76 | bwd_inner: 5683.85 | bwd_allreduce: 55.87 | step: 18.31 11%|█ | 4625/41250 [11:11:08<88:04:34, 8.66s/it] {'loss': 0.0641, 'grad_norm': 1.4774078130722046, 'learning_rate': 3.929694609833709e-05, 'epoch': 1.12} 11%|█ | 4625/41250 [11:11:08<88:04:34, 8.66s/it][2025-04-25 19:08:51,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.93 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 19:08:51,333] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.66 | bwd_microstep: 5709.20 | bwd_inner_microstep: 5696.59 | bwd_allreduce_microstep: 12.56 | step_microstep: 17.76 [2025-04-25 19:08:51,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.66 | bwd: 5709.21 | bwd_inner: 5696.59 | bwd_allreduce: 12.58 | step: 17.76 11%|█ | 4626/41250 [11:11:16<88:02:51, 8.65s/it] {'loss': 0.1073, 'grad_norm': 1.5386073589324951, 'learning_rate': 3.929653333994404e-05, 'epoch': 1.12} 11%|█ | 4626/41250 [11:11:16<88:02:51, 8.65s/it][2025-04-25 19:09:00,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:09:00,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.86 | bwd_microstep: 5770.32 | bwd_inner_microstep: 5656.89 | bwd_allreduce_microstep: 113.38 | step_microstep: 18.55 [2025-04-25 19:09:00,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.86 | bwd: 5770.33 | bwd_inner: 5656.89 | bwd_allreduce: 113.40 | step: 18.56 11%|█ | 4627/41250 [11:11:25<88:07:58, 8.66s/it] {'loss': 0.2121, 'grad_norm': 1.5341253280639648, 'learning_rate': 3.9296120462591655e-05, 'epoch': 1.12} 11%|█ | 4627/41250 [11:11:25<88:07:58, 8.66s/it][2025-04-25 19:09:08,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:09:08,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.20 | bwd_microstep: 5995.86 | bwd_inner_microstep: 5681.38 | bwd_allreduce_microstep: 314.43 | step_microstep: 18.55 [2025-04-25 19:09:08,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.20 | bwd: 5995.88 | bwd_inner: 5681.38 | bwd_allreduce: 314.45 | step: 18.55 11%|█ | 4628/41250 [11:11:34<88:57:53, 8.75s/it] {'loss': 0.1814, 'grad_norm': 1.2128883600234985, 'learning_rate': 3.9295707466282486e-05, 'epoch': 1.12} 
11%|█ | 4628/41250 [11:11:34<88:57:53, 8.75s/it][2025-04-25 19:09:17,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:09:17,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.11 | bwd_microstep: 5688.41 | bwd_inner_microstep: 5650.06 | bwd_allreduce_microstep: 38.29 | step_microstep: 18.47 [2025-04-25 19:09:17,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.11 | bwd: 5688.42 | bwd_inner: 5650.06 | bwd_allreduce: 38.31 | step: 18.48 11%|█ | 4629/41250 [11:11:42<88:32:39, 8.70s/it] {'loss': 0.2956, 'grad_norm': 1.8438291549682617, 'learning_rate': 3.929529435101907e-05, 'epoch': 1.12} 11%|█ | 4629/41250 [11:11:42<88:32:39, 8.70s/it][2025-04-25 19:09:26,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:09:26,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.58 | bwd_microstep: 5697.15 | bwd_inner_microstep: 5684.44 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.17 [2025-04-25 19:09:26,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.58 | bwd: 5697.16 | bwd_inner: 5684.44 | bwd_allreduce: 12.68 | step: 18.17 11%|█ | 4630/41250 [11:11:51<88:19:23, 8.68s/it] {'loss': 0.1402, 'grad_norm': 1.2220367193222046, 'learning_rate': 3.929488111680396e-05, 'epoch': 1.12} 11%|█ | 4630/41250 [11:11:51<88:19:23, 8.68s/it][2025-04-25 19:09:34,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:09:34,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.08 | bwd_microstep: 5695.78 | bwd_inner_microstep: 5647.26 | bwd_allreduce_microstep: 48.48 | step_microstep: 18.41 [2025-04-25 19:09:34,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.08 | 
bwd: 5695.79 | bwd_inner: 5647.26 | bwd_allreduce: 48.50 | step: 18.41 11%|█ | 4631/41250 [11:12:00<88:06:01, 8.66s/it] {'loss': 0.091, 'grad_norm': 0.6347318887710571, 'learning_rate': 3.929446776363971e-05, 'epoch': 1.12} 11%|█ | 4631/41250 [11:12:00<88:06:01, 8.66s/it][2025-04-25 19:09:43,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:09:43,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.24 | bwd_microstep: 5742.97 | bwd_inner_microstep: 5658.19 | bwd_allreduce_microstep: 84.73 | step_microstep: 18.58 [2025-04-25 19:09:43,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.24 | bwd: 5742.98 | bwd_inner: 5658.19 | bwd_allreduce: 84.75 | step: 18.58 11%|█ | 4632/41250 [11:12:08<88:05:32, 8.66s/it] {'loss': 0.2822, 'grad_norm': 2.350827217102051, 'learning_rate': 3.929405429152886e-05, 'epoch': 1.12} 11%|█ | 4632/41250 [11:12:08<88:05:32, 8.66s/it][2025-04-25 19:09:52,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:09:52,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.41 | bwd_microstep: 5865.87 | bwd_inner_microstep: 5658.63 | bwd_allreduce_microstep: 207.20 | step_microstep: 18.27 [2025-04-25 19:09:52,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.41 | bwd: 5865.88 | bwd_inner: 5658.62 | bwd_allreduce: 207.22 | step: 18.28 11%|█ | 4633/41250 [11:12:17<88:27:54, 8.70s/it] {'loss': 0.0711, 'grad_norm': 0.7345250248908997, 'learning_rate': 3.9293640700473956e-05, 'epoch': 1.12} 11%|█ | 4633/41250 [11:12:17<88:27:54, 8.70s/it][2025-04-25 19:10:00,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.10 | optimizer_step: 1.03 [2025-04-25 19:10:00,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2824.45 | bwd_microstep: 5791.39 | bwd_inner_microstep: 5652.84 | bwd_allreduce_microstep: 138.50 | step_microstep: 19.44 [2025-04-25 19:10:00,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.45 | bwd: 5791.41 | bwd_inner: 5652.84 | bwd_allreduce: 138.53 | step: 19.44 11%|█ | 4634/41250 [11:12:26<88:28:09, 8.70s/it] {'loss': 0.1859, 'grad_norm': 2.134308338165283, 'learning_rate': 3.929322699047755e-05, 'epoch': 1.12} 11%|█ | 4634/41250 [11:12:26<88:28:09, 8.70s/it][2025-04-25 19:10:09,568] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:10:09,568] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.06 | bwd_microstep: 5702.57 | bwd_inner_microstep: 5659.37 | bwd_allreduce_microstep: 43.16 | step_microstep: 18.55 [2025-04-25 19:10:09,568] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.06 | bwd: 5702.58 | bwd_inner: 5659.37 | bwd_allreduce: 43.17 | step: 18.56 11%|█ | 4635/41250 [11:12:34<88:14:01, 8.68s/it] {'loss': 0.0527, 'grad_norm': 0.35782182216644287, 'learning_rate': 3.92928131615422e-05, 'epoch': 1.12} 11%|█ | 4635/41250 [11:12:34<88:14:01, 8.68s/it][2025-04-25 19:10:18,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.24 | optimizer_step: 0.94 [2025-04-25 19:10:18,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.49 | bwd_microstep: 5707.52 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 59.01 | step_microstep: 19.31 [2025-04-25 19:10:18,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.50 | bwd: 5707.53 | bwd_inner: 5648.45 | bwd_allreduce: 59.03 | step: 19.32 11%|█ | 4636/41250 [11:12:43<88:04:22, 8.66s/it] {'loss': 0.2619, 'grad_norm': 2.636866331100464, 'learning_rate': 3.929239921367045e-05, 'epoch': 1.12} 11%|█ | 4636/41250 [11:12:43<88:04:22, 8.66s/it][2025-04-25 
19:10:26,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 19:10:26,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.99 | bwd_microstep: 5776.95 | bwd_inner_microstep: 5655.19 | bwd_allreduce_microstep: 121.72 | step_microstep: 18.87 [2025-04-25 19:10:26,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.99 | bwd: 5776.96 | bwd_inner: 5655.19 | bwd_allreduce: 121.74 | step: 18.88 11%|█ | 4637/41250 [11:12:52<88:10:12, 8.67s/it] {'loss': 0.0876, 'grad_norm': 0.9262834787368774, 'learning_rate': 3.9291985146864856e-05, 'epoch': 1.12} 11%|█ | 4637/41250 [11:12:52<88:10:12, 8.67s/it][2025-04-25 19:10:35,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 19:10:35,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.46 | bwd_microstep: 5721.42 | bwd_inner_microstep: 5707.96 | bwd_allreduce_microstep: 13.41 | step_microstep: 18.86 [2025-04-25 19:10:35,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.46 | bwd: 5721.43 | bwd_inner: 5707.96 | bwd_allreduce: 13.43 | step: 18.87 11%|█ | 4638/41250 [11:13:00<88:08:42, 8.67s/it] {'loss': 0.1617, 'grad_norm': 1.1464555263519287, 'learning_rate': 3.929157096112796e-05, 'epoch': 1.12} 11%|█ | 4638/41250 [11:13:00<88:08:42, 8.67s/it][2025-04-25 19:10:44,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 19:10:44,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.37 | bwd_microstep: 5919.91 | bwd_inner_microstep: 5711.12 | bwd_allreduce_microstep: 208.62 | step_microstep: 20.21 [2025-04-25 19:10:44,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.37 | bwd: 5919.97 | bwd_inner: 5711.12 | bwd_allreduce: 208.71 
| step: 20.20 11%|█ | 4639/41250 [11:13:09<88:43:09, 8.72s/it] {'loss': 0.1419, 'grad_norm': 1.9256068468093872, 'learning_rate': 3.929115665646233e-05, 'epoch': 1.12} 11%|█ | 4639/41250 [11:13:09<88:43:09, 8.72s/it][2025-04-25 19:10:53,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:10:53,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.12 | bwd_microstep: 5706.70 | bwd_inner_microstep: 5671.90 | bwd_allreduce_microstep: 34.76 | step_microstep: 18.30 [2025-04-25 19:10:53,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.12 | bwd: 5706.72 | bwd_inner: 5671.90 | bwd_allreduce: 34.78 | step: 18.30 11%|█ | 4640/41250 [11:13:18<88:23:54, 8.69s/it] {'loss': 0.1358, 'grad_norm': 2.017378807067871, 'learning_rate': 3.9290742232870514e-05, 'epoch': 1.12} 11%|█ | 4640/41250 [11:13:18<88:23:54, 8.69s/it][2025-04-25 19:11:01,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.97 | optimizer_step: 1.04 [2025-04-25 19:11:01,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.61 | bwd_microstep: 5719.80 | bwd_inner_microstep: 5664.53 | bwd_allreduce_microstep: 55.21 | step_microstep: 18.22 [2025-04-25 19:11:01,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.61 | bwd: 5719.81 | bwd_inner: 5664.53 | bwd_allreduce: 55.24 | step: 18.22 11%|█▏ | 4641/41250 [11:13:26<88:12:47, 8.67s/it] {'loss': 0.2068, 'grad_norm': 1.1889952421188354, 'learning_rate': 3.929032769035507e-05, 'epoch': 1.13} 11%|█▏ | 4641/41250 [11:13:26<88:12:47, 8.67s/it][2025-04-25 19:11:10,359] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-25 19:11:10,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2868.51 | bwd_microstep: 5747.50 | 
bwd_inner_microstep: 5709.14 | bwd_allreduce_microstep: 38.32 | step_microstep: 18.79 [2025-04-25 19:11:10,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2868.51 | bwd: 5747.52 | bwd_inner: 5709.14 | bwd_allreduce: 38.34 | step: 18.79 11%|█▏ | 4642/41250 [11:13:35<88:18:17, 8.68s/it] {'loss': 0.082, 'grad_norm': 1.1911170482635498, 'learning_rate': 3.928991302891854e-05, 'epoch': 1.13} 11%|█▏ | 4642/41250 [11:13:35<88:18:17, 8.68s/it][2025-04-25 19:11:19,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.22 | optimizer_step: 1.02 [2025-04-25 19:11:19,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.91 | bwd_microstep: 5783.29 | bwd_inner_microstep: 5669.82 | bwd_allreduce_microstep: 113.41 | step_microstep: 19.62 [2025-04-25 19:11:19,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.91 | bwd: 5783.31 | bwd_inner: 5669.82 | bwd_allreduce: 113.44 | step: 19.62 11%|█▏ | 4643/41250 [11:13:44<88:21:27, 8.69s/it] {'loss': 0.0595, 'grad_norm': 0.6864256262779236, 'learning_rate': 3.92894982485635e-05, 'epoch': 1.13} 11%|█▏ | 4643/41250 [11:13:44<88:21:27, 8.69s/it][2025-04-25 19:11:27,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-25 19:11:27,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.35 | bwd_microstep: 5708.76 | bwd_inner_microstep: 5672.79 | bwd_allreduce_microstep: 35.92 | step_microstep: 18.64 [2025-04-25 19:11:27,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.35 | bwd: 5708.77 | bwd_inner: 5672.79 | bwd_allreduce: 35.94 | step: 18.64 11%|█▏ | 4644/41250 [11:13:53<88:13:20, 8.68s/it] {'loss': 0.1135, 'grad_norm': 0.820340096950531, 'learning_rate': 3.928908334929249e-05, 'epoch': 1.13} 11%|█▏ | 4644/41250 [11:13:53<88:13:20, 8.68s/it][2025-04-25 19:11:36,346] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 19:11:36,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.69 | bwd_microstep: 5719.30 | bwd_inner_microstep: 5663.95 | bwd_allreduce_microstep: 55.31 | step_microstep: 19.15 [2025-04-25 19:11:36,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.69 | bwd: 5719.32 | bwd_inner: 5663.95 | bwd_allreduce: 55.32 | step: 19.15 11%|█▏ | 4645/41250 [11:14:01<88:06:36, 8.67s/it] {'loss': 0.1886, 'grad_norm': 1.5512456893920898, 'learning_rate': 3.9288668331108074e-05, 'epoch': 1.13} 11%|█▏ | 4645/41250 [11:14:01<88:06:36, 8.67s/it][2025-04-25 19:11:45,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.97 | optimizer_step: 1.03 [2025-04-25 19:11:45,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.38 | bwd_microstep: 5755.23 | bwd_inner_microstep: 5692.33 | bwd_allreduce_microstep: 62.86 | step_microstep: 18.64 [2025-04-25 19:11:45,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.38 | bwd: 5755.25 | bwd_inner: 5692.33 | bwd_allreduce: 62.88 | step: 18.64 11%|█▏ | 4646/41250 [11:14:10<88:12:00, 8.67s/it] {'loss': 0.148, 'grad_norm': 1.3895223140716553, 'learning_rate': 3.9288253194012815e-05, 'epoch': 1.13} 11%|█▏ | 4646/41250 [11:14:10<88:12:00, 8.67s/it][2025-04-25 19:11:53,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:11:53,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.41 | bwd_microstep: 5792.40 | bwd_inner_microstep: 5676.98 | bwd_allreduce_microstep: 115.38 | step_microstep: 18.75 [2025-04-25 19:11:53,752] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.41 | bwd: 5792.42 | bwd_inner: 5676.98 | bwd_allreduce: 115.40 | step: 18.75 11%|█▏ | 4647/41250 
[11:14:19<88:18:03, 8.68s/it] {'loss': 0.0967, 'grad_norm': 0.7746520638465881, 'learning_rate': 3.928783793800927e-05, 'epoch': 1.13} 11%|█▏ | 4647/41250 [11:14:19<88:18:03, 8.68s/it][2025-04-25 19:12:02,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:12:02,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.39 | bwd_microstep: 5898.77 | bwd_inner_microstep: 5712.50 | bwd_allreduce_microstep: 186.23 | step_microstep: 18.74 [2025-04-25 19:12:02,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.39 | bwd: 5898.79 | bwd_inner: 5712.50 | bwd_allreduce: 186.25 | step: 18.74 11%|█▏ | 4648/41250 [11:14:27<88:45:04, 8.73s/it] {'loss': 0.0379, 'grad_norm': 0.3687887489795685, 'learning_rate': 3.92874225631e-05, 'epoch': 1.13} 11%|█▏ | 4648/41250 [11:14:27<88:45:04, 8.73s/it][2025-04-25 19:12:11,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:12:11,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.66 | bwd_microstep: 5770.27 | bwd_inner_microstep: 5696.54 | bwd_allreduce_microstep: 73.68 | step_microstep: 18.68 [2025-04-25 19:12:11,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.66 | bwd: 5770.29 | bwd_inner: 5696.54 | bwd_allreduce: 73.70 | step: 18.68 11%|█▏ | 4649/41250 [11:14:36<88:40:42, 8.72s/it] {'loss': 0.281, 'grad_norm': 2.3898613452911377, 'learning_rate': 3.928700706928756e-05, 'epoch': 1.13} 11%|█▏ | 4649/41250 [11:14:36<88:40:42, 8.72s/it][2025-04-25 19:12:20,015] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.24 | optimizer_step: 0.98 [2025-04-25 19:12:20,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.96 | bwd_microstep: 5808.42 | bwd_inner_microstep: 5659.77 | 
bwd_allreduce_microstep: 148.59 | step_microstep: 19.51 [2025-04-25 19:12:20,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.96 | bwd: 5808.43 | bwd_inner: 5659.77 | bwd_allreduce: 148.61 | step: 19.51 11%|█▏ | 4650/41250 [11:14:45<88:41:13, 8.72s/it] {'loss': 0.3318, 'grad_norm': 2.670694589614868, 'learning_rate': 3.9286591456574516e-05, 'epoch': 1.13} 11%|█▏ | 4650/41250 [11:14:45<88:41:13, 8.72s/it][2025-04-25 19:12:28,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-25 19:12:28,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.68 | bwd_microstep: 5718.90 | bwd_inner_microstep: 5702.92 | bwd_allreduce_microstep: 15.94 | step_microstep: 18.97 [2025-04-25 19:12:28,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.68 | bwd: 5718.92 | bwd_inner: 5702.92 | bwd_allreduce: 15.96 | step: 18.97 11%|█▏ | 4651/41250 [11:14:53<88:28:39, 8.70s/it] {'loss': 0.051, 'grad_norm': 0.660612165927887, 'learning_rate': 3.928617572496342e-05, 'epoch': 1.13} 11%|█▏ | 4651/41250 [11:14:53<88:28:39, 8.70s/it][2025-04-25 19:12:37,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 19:12:37,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.95 | bwd_microstep: 5790.45 | bwd_inner_microstep: 5667.67 | bwd_allreduce_microstep: 122.73 | step_microstep: 19.16 [2025-04-25 19:12:37,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.95 | bwd: 5790.47 | bwd_inner: 5667.67 | bwd_allreduce: 122.76 | step: 19.16 11%|█▏ | 4652/41250 [11:15:02<88:28:53, 8.70s/it] {'loss': 0.0964, 'grad_norm': 1.759534239768982, 'learning_rate': 3.928575987445686e-05, 'epoch': 1.13} 11%|█▏ | 4652/41250 [11:15:02<88:28:53, 8.70s/it][2025-04-25 19:12:46,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 19:12:46,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.16 | bwd_microstep: 5727.95 | bwd_inner_microstep: 5714.81 | bwd_allreduce_microstep: 13.09 | step_microstep: 18.85 [2025-04-25 19:12:46,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.16 | bwd: 5727.96 | bwd_inner: 5714.81 | bwd_allreduce: 13.10 | step: 18.85 11%|█▏ | 4653/41250 [11:15:11<88:21:43, 8.69s/it] {'loss': 0.0418, 'grad_norm': 0.42324793338775635, 'learning_rate': 3.928534390505737e-05, 'epoch': 1.13} 11%|█▏ | 4653/41250 [11:15:11<88:21:43, 8.69s/it][2025-04-25 19:12:54,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:12:54,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.90 | bwd_microstep: 5790.38 | bwd_inner_microstep: 5700.33 | bwd_allreduce_microstep: 89.99 | step_microstep: 18.76 [2025-04-25 19:12:54,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.90 | bwd: 5790.39 | bwd_inner: 5700.33 | bwd_allreduce: 90.01 | step: 18.76 11%|█▏ | 4654/41250 [11:15:20<88:28:10, 8.70s/it] {'loss': 0.2243, 'grad_norm': 1.7683931589126587, 'learning_rate': 3.928492781676754e-05, 'epoch': 1.13} 11%|█▏ | 4654/41250 [11:15:20<88:28:10, 8.70s/it][2025-04-25 19:13:03,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 19:13:03,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.27 | bwd_microstep: 5778.38 | bwd_inner_microstep: 5690.99 | bwd_allreduce_microstep: 87.35 | step_microstep: 18.78 [2025-04-25 19:13:03,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.27 | bwd: 5778.40 | bwd_inner: 5690.99 | bwd_allreduce: 87.37 | step: 18.78 11%|█▏ | 4655/41250 [11:15:28<88:29:17, 8.70s/it] 
{'loss': 0.4991, 'grad_norm': 2.7528560161590576, 'learning_rate': 3.928451160958991e-05, 'epoch': 1.13} 11%|█▏ | 4655/41250 [11:15:28<88:29:17, 8.70s/it][2025-04-25 19:13:12,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:13:12,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.73 | bwd_microstep: 5783.36 | bwd_inner_microstep: 5694.76 | bwd_allreduce_microstep: 88.54 | step_microstep: 18.65 [2025-04-25 19:13:12,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.73 | bwd: 5783.37 | bwd_inner: 5694.76 | bwd_allreduce: 88.56 | step: 18.65 11%|█▏ | 4656/41250 [11:15:37<88:32:17, 8.71s/it] {'loss': 0.1195, 'grad_norm': 1.9477767944335938, 'learning_rate': 3.928409528352707e-05, 'epoch': 1.13} 11%|█▏ | 4656/41250 [11:15:37<88:32:17, 8.71s/it][2025-04-25 19:13:20,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 19:13:20,914] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.37 | bwd_microstep: 5787.75 | bwd_inner_microstep: 5667.08 | bwd_allreduce_microstep: 120.62 | step_microstep: 19.09 [2025-04-25 19:13:20,914] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.37 | bwd: 5787.76 | bwd_inner: 5667.08 | bwd_allreduce: 120.64 | step: 19.10 11%|█▏ | 4657/41250 [11:15:46<88:32:24, 8.71s/it] {'loss': 0.1211, 'grad_norm': 3.1814639568328857, 'learning_rate': 3.9283678838581576e-05, 'epoch': 1.13} 11%|█▏ | 4657/41250 [11:15:46<88:32:24, 8.71s/it][2025-04-25 19:13:29,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-25 19:13:29,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.20 | bwd_microstep: 5800.87 | bwd_inner_microstep: 5658.19 | bwd_allreduce_microstep: 142.63 | 
step_microstep: 19.06 [2025-04-25 19:13:29,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.20 | bwd: 5800.88 | bwd_inner: 5658.19 | bwd_allreduce: 142.65 | step: 19.06 11%|█▏ | 4658/41250 [11:15:54<88:33:43, 8.71s/it] {'loss': 0.1909, 'grad_norm': 2.323002338409424, 'learning_rate': 3.928326227475599e-05, 'epoch': 1.13} 11%|█▏ | 4658/41250 [11:15:54<88:33:43, 8.71s/it][2025-04-25 19:13:38,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:13:38,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.27 | bwd_microstep: 5708.93 | bwd_inner_microstep: 5695.69 | bwd_allreduce_microstep: 13.19 | step_microstep: 18.91 [2025-04-25 19:13:38,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.27 | bwd: 5708.94 | bwd_inner: 5695.69 | bwd_allreduce: 13.21 | step: 18.91 11%|█▏ | 4659/41250 [11:16:03<88:20:17, 8.69s/it] {'loss': 0.1074, 'grad_norm': 1.444057822227478, 'learning_rate': 3.928284559205288e-05, 'epoch': 1.13} 11%|█▏ | 4659/41250 [11:16:03<88:20:17, 8.69s/it][2025-04-25 19:13:46,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 1.13 [2025-04-25 19:13:46,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.08 | bwd_microstep: 5711.74 | bwd_inner_microstep: 5698.95 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.07 [2025-04-25 19:13:46,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.08 | bwd: 5711.75 | bwd_inner: 5698.95 | bwd_allreduce: 12.76 | step: 19.07 11%|█▏ | 4660/41250 [11:16:12<88:10:56, 8.68s/it] {'loss': 0.063, 'grad_norm': 1.1147665977478027, 'learning_rate': 3.9282428790474835e-05, 'epoch': 1.13} 11%|█▏ | 4660/41250 [11:16:12<88:10:56, 8.68s/it][2025-04-25 19:13:55,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | 
optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:13:55,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.36 | bwd_microstep: 5714.06 | bwd_inner_microstep: 5656.46 | bwd_allreduce_microstep: 57.57 | step_microstep: 17.97 [2025-04-25 19:13:55,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.36 | bwd: 5714.07 | bwd_inner: 5656.46 | bwd_allreduce: 57.58 | step: 17.98 11%|█▏ | 4661/41250 [11:16:20<88:02:01, 8.66s/it] {'loss': 0.1138, 'grad_norm': 1.0744119882583618, 'learning_rate': 3.9282011870024394e-05, 'epoch': 1.13} 11%|█▏ | 4661/41250 [11:16:20<88:02:01, 8.66s/it][2025-04-25 19:14:04,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.05 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:14:04,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.24 | bwd_microstep: 5770.93 | bwd_inner_microstep: 5693.08 | bwd_allreduce_microstep: 77.81 | step_microstep: 19.63 [2025-04-25 19:14:04,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.24 | bwd: 5770.94 | bwd_inner: 5693.08 | bwd_allreduce: 77.82 | step: 19.63 11%|█▏ | 4662/41250 [11:16:29<88:09:41, 8.67s/it] {'loss': 0.1716, 'grad_norm': 1.2245436906814575, 'learning_rate': 3.928159483070415e-05, 'epoch': 1.13} 11%|█▏ | 4662/41250 [11:16:29<88:09:41, 8.67s/it][2025-04-25 19:14:13,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 1.01 [2025-04-25 19:14:13,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.45 | bwd_microstep: 5909.79 | bwd_inner_microstep: 5654.62 | bwd_allreduce_microstep: 255.14 | step_microstep: 18.56 [2025-04-25 19:14:13,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.45 | bwd: 5909.81 | bwd_inner: 5654.62 | bwd_allreduce: 255.15 | step: 18.56 11%|█▏ | 4663/41250 [11:16:38<88:37:30, 8.72s/it] {'loss': 0.1044, 'grad_norm': 
1.2515987157821655, 'learning_rate': 3.9281177672516665e-05, 'epoch': 1.13} 11%|█▏ | 4663/41250 [11:16:38<88:37:30, 8.72s/it][2025-04-25 19:14:21,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:14:21,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.76 | bwd_microstep: 5766.92 | bwd_inner_microstep: 5688.61 | bwd_allreduce_microstep: 78.26 | step_microstep: 18.44 [2025-04-25 19:14:21,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.76 | bwd: 5766.93 | bwd_inner: 5688.60 | bwd_allreduce: 78.28 | step: 18.44 11%|█▏ | 4664/41250 [11:16:47<88:36:10, 8.72s/it] {'loss': 0.1079, 'grad_norm': 1.0714143514633179, 'learning_rate': 3.9280760395464507e-05, 'epoch': 1.13} 11%|█▏ | 4664/41250 [11:16:47<88:36:10, 8.72s/it][2025-04-25 19:14:30,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-25 19:14:30,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.16 | bwd_microstep: 5782.52 | bwd_inner_microstep: 5652.07 | bwd_allreduce_microstep: 130.41 | step_microstep: 18.91 [2025-04-25 19:14:30,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.16 | bwd: 5782.53 | bwd_inner: 5652.07 | bwd_allreduce: 130.42 | step: 18.91 11%|█▏ | 4665/41250 [11:16:55<88:32:29, 8.71s/it] {'loss': 0.1349, 'grad_norm': 1.32918381690979, 'learning_rate': 3.928034299955026e-05, 'epoch': 1.13} 11%|█▏ | 4665/41250 [11:16:55<88:32:29, 8.71s/it][2025-04-25 19:14:39,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:14:39,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.29 | bwd_microstep: 5776.64 | bwd_inner_microstep: 5653.96 | bwd_allreduce_microstep: 122.64 | step_microstep: 18.60 [2025-04-25 
19:14:39,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.29 | bwd: 5776.65 | bwd_inner: 5653.95 | bwd_allreduce: 122.66 | step: 18.60 11%|█▏ | 4666/41250 [11:17:04<88:27:00, 8.70s/it] {'loss': 0.1226, 'grad_norm': 1.2820580005645752, 'learning_rate': 3.927992548477649e-05, 'epoch': 1.13} 11%|█▏ | 4666/41250 [11:17:04<88:27:00, 8.70s/it][2025-04-25 19:14:47,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-25 19:14:47,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.57 | bwd_microstep: 5790.63 | bwd_inner_microstep: 5649.85 | bwd_allreduce_microstep: 140.74 | step_microstep: 18.62 [2025-04-25 19:14:47,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.57 | bwd: 5790.64 | bwd_inner: 5649.85 | bwd_allreduce: 140.76 | step: 18.63 11%|█▏ | 4667/41250 [11:17:13<88:26:32, 8.70s/it] {'loss': 0.1994, 'grad_norm': 2.7955243587493896, 'learning_rate': 3.927950785114576e-05, 'epoch': 1.13} 11%|█▏ | 4667/41250 [11:17:13<88:26:32, 8.70s/it][2025-04-25 19:14:56,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:14:56,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.81 | bwd_microstep: 5899.05 | bwd_inner_microstep: 5655.61 | bwd_allreduce_microstep: 243.39 | step_microstep: 18.45 [2025-04-25 19:14:56,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.81 | bwd: 5899.06 | bwd_inner: 5655.61 | bwd_allreduce: 243.41 | step: 18.45 11%|█▏ | 4668/41250 [11:17:22<88:47:50, 8.74s/it] {'loss': 0.2469, 'grad_norm': 3.356201171875, 'learning_rate': 3.927909009866067e-05, 'epoch': 1.13} 11%|█▏ | 4668/41250 [11:17:22<88:47:50, 8.74s/it][2025-04-25 19:15:05,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.94 | optimizer_step: 
0.90 [2025-04-25 19:15:05,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.32 | bwd_microstep: 5686.89 | bwd_inner_microstep: 5660.49 | bwd_allreduce_microstep: 26.36 | step_microstep: 18.37 [2025-04-25 19:15:05,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.32 | bwd: 5686.91 | bwd_inner: 5660.49 | bwd_allreduce: 26.38 | step: 18.37 11%|█▏ | 4669/41250 [11:17:30<88:23:20, 8.70s/it] {'loss': 0.4425, 'grad_norm': 3.2496368885040283, 'learning_rate': 3.927867222732378e-05, 'epoch': 1.13} 11%|█▏ | 4669/41250 [11:17:30<88:23:20, 8.70s/it][2025-04-25 19:15:13,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.69 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:15:13,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.09 | bwd_microstep: 5687.30 | bwd_inner_microstep: 5672.58 | bwd_allreduce_microstep: 14.67 | step_microstep: 18.63 [2025-04-25 19:15:13,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.09 | bwd: 5687.31 | bwd_inner: 5672.58 | bwd_allreduce: 14.69 | step: 18.64 11%|█▏ | 4670/41250 [11:17:39<88:07:46, 8.67s/it] {'loss': 0.084, 'grad_norm': 1.0883643627166748, 'learning_rate': 3.927825423713766e-05, 'epoch': 1.13} 11%|█▏ | 4670/41250 [11:17:39<88:07:46, 8.67s/it][2025-04-25 19:15:22,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:15:22,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.26 | bwd_microstep: 5724.65 | bwd_inner_microstep: 5680.32 | bwd_allreduce_microstep: 44.28 | step_microstep: 18.14 [2025-04-25 19:15:22,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.26 | bwd: 5724.67 | bwd_inner: 5680.32 | bwd_allreduce: 44.30 | step: 18.15 11%|█▏ | 4671/41250 [11:17:47<88:04:18, 8.67s/it] {'loss': 0.0796, 'grad_norm': 1.0362863540649414, 'learning_rate': 
3.927783612810491e-05, 'epoch': 1.13} 11%|█▏ | 4671/41250 [11:17:47<88:04:18, 8.67s/it][2025-04-25 19:15:31,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.04 | optimizer_step: 0.94 [2025-04-25 19:15:31,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.93 | bwd_microstep: 5726.61 | bwd_inner_microstep: 5705.51 | bwd_allreduce_microstep: 21.05 | step_microstep: 18.60 [2025-04-25 19:15:31,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.93 | bwd: 5726.62 | bwd_inner: 5705.51 | bwd_allreduce: 21.07 | step: 18.60 11%|█▏ | 4672/41250 [11:17:56<88:03:35, 8.67s/it] {'loss': 0.0788, 'grad_norm': 1.1241873502731323, 'learning_rate': 3.927741790022808e-05, 'epoch': 1.13} 11%|█▏ | 4672/41250 [11:17:56<88:03:35, 8.67s/it][2025-04-25 19:15:39,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:15:39,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.89 | bwd_microstep: 5741.92 | bwd_inner_microstep: 5691.45 | bwd_allreduce_microstep: 50.43 | step_microstep: 18.05 [2025-04-25 19:15:39,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.89 | bwd: 5741.93 | bwd_inner: 5691.45 | bwd_allreduce: 50.44 | step: 18.05 11%|█▏ | 4673/41250 [11:18:05<88:05:22, 8.67s/it] {'loss': 0.0646, 'grad_norm': 0.9979038834571838, 'learning_rate': 3.927699955350976e-05, 'epoch': 1.13} 11%|█▏ | 4673/41250 [11:18:05<88:05:22, 8.67s/it][2025-04-25 19:15:48,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:15:48,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.07 | bwd_microstep: 5694.46 | bwd_inner_microstep: 5681.88 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.29 [2025-04-25 19:15:48,546] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.07 | bwd: 5694.48 | bwd_inner: 5681.88 | bwd_allreduce: 12.56 | step: 18.29 11%|█▏ | 4674/41250 [11:18:13<87:59:07, 8.66s/it] {'loss': 0.0243, 'grad_norm': 0.3116523325443268, 'learning_rate': 3.927658108795253e-05, 'epoch': 1.13} 11%|█▏ | 4674/41250 [11:18:13<87:59:07, 8.66s/it][2025-04-25 19:15:57,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:15:57,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.17 | bwd_microstep: 5718.67 | bwd_inner_microstep: 5690.01 | bwd_allreduce_microstep: 28.61 | step_microstep: 18.44 [2025-04-25 19:15:57,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.17 | bwd: 5718.68 | bwd_inner: 5690.01 | bwd_allreduce: 28.63 | step: 18.45 11%|█▏ | 4675/41250 [11:18:22<87:57:14, 8.66s/it] {'loss': 0.1892, 'grad_norm': 2.537076711654663, 'learning_rate': 3.927616250355897e-05, 'epoch': 1.13} 11%|█▏ | 4675/41250 [11:18:22<87:57:14, 8.66s/it][2025-04-25 19:16:05,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:16:05,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.58 | bwd_microstep: 5754.16 | bwd_inner_microstep: 5650.12 | bwd_allreduce_microstep: 103.99 | step_microstep: 18.37 [2025-04-25 19:16:05,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.58 | bwd: 5754.17 | bwd_inner: 5650.12 | bwd_allreduce: 104.01 | step: 18.38 11%|█▏ | 4676/41250 [11:18:31<88:01:00, 8.66s/it] {'loss': 0.2068, 'grad_norm': 2.352227210998535, 'learning_rate': 3.927574380033166e-05, 'epoch': 1.13} 11%|█▏ | 4676/41250 [11:18:31<88:01:00, 8.66s/it][2025-04-25 19:16:14,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-25 
19:16:14,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.49 | bwd_microstep: 5704.69 | bwd_inner_microstep: 5690.03 | bwd_allreduce_microstep: 14.62 | step_microstep: 18.39 [2025-04-25 19:16:14,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.49 | bwd: 5704.70 | bwd_inner: 5690.03 | bwd_allreduce: 14.63 | step: 18.39 11%|█▏ | 4677/41250 [11:18:39<87:58:29, 8.66s/it] {'loss': 0.0828, 'grad_norm': 2.859191656112671, 'learning_rate': 3.927532497827318e-05, 'epoch': 1.13} 11%|█▏ | 4677/41250 [11:18:39<87:58:29, 8.66s/it][2025-04-25 19:16:23,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:16:23,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.24 | bwd_microstep: 5726.27 | bwd_inner_microstep: 5694.45 | bwd_allreduce_microstep: 31.78 | step_microstep: 18.38 [2025-04-25 19:16:23,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.24 | bwd: 5726.28 | bwd_inner: 5694.45 | bwd_allreduce: 31.80 | step: 18.38 11%|█▏ | 4678/41250 [11:18:48<88:00:02, 8.66s/it] {'loss': 0.1833, 'grad_norm': 5.555526256561279, 'learning_rate': 3.927490603738611e-05, 'epoch': 1.13} 11%|█▏ | 4678/41250 [11:18:48<88:00:02, 8.66s/it][2025-04-25 19:16:31,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:16:31,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.81 | bwd_microstep: 5680.37 | bwd_inner_microstep: 5646.61 | bwd_allreduce_microstep: 33.72 | step_microstep: 18.41 [2025-04-25 19:16:31,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.81 | bwd: 5680.38 | bwd_inner: 5646.61 | bwd_allreduce: 33.73 | step: 18.42 11%|█▏ | 4679/41250 [11:18:57<87:47:07, 8.64s/it] {'loss': 0.0926, 'grad_norm': 1.6370929479599, 'learning_rate': 3.927448697767304e-05, 'epoch': 
1.13} 11%|█▏ | 4679/41250 [11:18:57<87:47:07, 8.64s/it][2025-04-25 19:16:40,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:16:40,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.78 | bwd_microstep: 5754.29 | bwd_inner_microstep: 5649.63 | bwd_allreduce_microstep: 104.61 | step_microstep: 18.78 [2025-04-25 19:16:40,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.78 | bwd: 5754.30 | bwd_inner: 5649.63 | bwd_allreduce: 104.63 | step: 18.79 11%|█▏ | 4680/41250 [11:19:05<87:53:01, 8.65s/it] {'loss': 0.0992, 'grad_norm': 1.5625085830688477, 'learning_rate': 3.9274067799136555e-05, 'epoch': 1.13} 11%|█▏ | 4680/41250 [11:19:05<87:53:01, 8.65s/it][2025-04-25 19:16:49,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:16:49,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.64 | bwd_microstep: 5762.59 | bwd_inner_microstep: 5652.29 | bwd_allreduce_microstep: 110.26 | step_microstep: 18.43 [2025-04-25 19:16:49,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.64 | bwd: 5762.60 | bwd_inner: 5652.29 | bwd_allreduce: 110.27 | step: 18.43 11%|█▏ | 4681/41250 [11:19:14<87:59:09, 8.66s/it] {'loss': 0.0865, 'grad_norm': 1.8521767854690552, 'learning_rate': 3.927364850177922e-05, 'epoch': 1.13} 11%|█▏ | 4681/41250 [11:19:14<87:59:09, 8.66s/it][2025-04-25 19:16:57,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:16:57,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.71 | bwd_microstep: 5711.39 | bwd_inner_microstep: 5698.60 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.61 [2025-04-25 19:16:57,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd: 2856.71 | bwd: 5711.40 | bwd_inner: 5698.60 | bwd_allreduce: 12.76 | step: 18.62 11%|█▏ | 4682/41250 [11:19:23<87:58:34, 8.66s/it] {'loss': 0.0671, 'grad_norm': 0.8976309299468994, 'learning_rate': 3.927322908560364e-05, 'epoch': 1.14} 11%|█▏ | 4682/41250 [11:19:23<87:58:34, 8.66s/it][2025-04-25 19:17:06,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-25 19:17:06,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.32 | bwd_microstep: 5692.48 | bwd_inner_microstep: 5656.19 | bwd_allreduce_microstep: 36.24 | step_microstep: 18.75 [2025-04-25 19:17:06,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.32 | bwd: 5692.49 | bwd_inner: 5656.19 | bwd_allreduce: 36.26 | step: 18.76 11%|█▏ | 4683/41250 [11:19:31<87:49:46, 8.65s/it] {'loss': 0.1071, 'grad_norm': 1.5411763191223145, 'learning_rate': 3.927280955061239e-05, 'epoch': 1.14} 11%|█▏ | 4683/41250 [11:19:31<87:49:46, 8.65s/it][2025-04-25 19:17:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:17:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.49 | bwd_microstep: 5736.14 | bwd_inner_microstep: 5698.15 | bwd_allreduce_microstep: 37.94 | step_microstep: 18.59 [2025-04-25 19:17:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.49 | bwd: 5736.16 | bwd_inner: 5698.15 | bwd_allreduce: 37.96 | step: 18.60 11%|█▏ | 4684/41250 [11:19:40<87:53:54, 8.65s/it] {'loss': 0.1053, 'grad_norm': 1.2314990758895874, 'learning_rate': 3.927238989680806e-05, 'epoch': 1.14} 11%|█▏ | 4684/41250 [11:19:40<87:53:54, 8.65s/it][2025-04-25 19:17:23,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:17:23,756] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2842.11 | bwd_microstep: 5740.41 | bwd_inner_microstep: 5661.02 | bwd_allreduce_microstep: 79.34 | step_microstep: 18.60 [2025-04-25 19:17:23,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.11 | bwd: 5740.42 | bwd_inner: 5661.02 | bwd_allreduce: 79.36 | step: 18.60 11%|█▏ | 4685/41250 [11:19:49<87:56:05, 8.66s/it] {'loss': 0.0143, 'grad_norm': 0.6147171258926392, 'learning_rate': 3.927197012419324e-05, 'epoch': 1.14} 11%|█▏ | 4685/41250 [11:19:49<87:56:05, 8.66s/it][2025-04-25 19:17:32,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 1.09 [2025-04-25 19:17:32,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.65 | bwd_microstep: 5772.74 | bwd_inner_microstep: 5656.63 | bwd_allreduce_microstep: 116.06 | step_microstep: 19.02 [2025-04-25 19:17:32,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.65 | bwd: 5772.75 | bwd_inner: 5656.63 | bwd_allreduce: 116.08 | step: 19.02 11%|█▏ | 4686/41250 [11:19:57<88:02:33, 8.67s/it] {'loss': 0.041, 'grad_norm': 0.8177539110183716, 'learning_rate': 3.927155023277052e-05, 'epoch': 1.14} 11%|█▏ | 4686/41250 [11:19:57<88:02:33, 8.67s/it][2025-04-25 19:17:41,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:17:41,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.31 | bwd_microstep: 5770.02 | bwd_inner_microstep: 5666.35 | bwd_allreduce_microstep: 103.62 | step_microstep: 18.53 [2025-04-25 19:17:41,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.31 | bwd: 5770.04 | bwd_inner: 5666.35 | bwd_allreduce: 103.64 | step: 18.54 11%|█▏ | 4687/41250 [11:20:06<88:06:34, 8.68s/it] {'loss': 0.0322, 'grad_norm': 0.7651330232620239, 'learning_rate': 3.927113022254247e-05, 'epoch': 1.14} 11%|█▏ | 4687/41250 
[11:20:06<88:06:34, 8.68s/it][2025-04-25 19:17:49,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-25 19:17:49,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.82 | bwd_microstep: 5769.25 | bwd_inner_microstep: 5661.31 | bwd_allreduce_microstep: 107.88 | step_microstep: 18.82 [2025-04-25 19:17:49,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.82 | bwd: 5769.26 | bwd_inner: 5661.31 | bwd_allreduce: 107.91 | step: 18.83 11%|█▏ | 4688/41250 [11:20:15<88:10:14, 8.68s/it] {'loss': 0.1791, 'grad_norm': 1.0371373891830444, 'learning_rate': 3.92707100935117e-05, 'epoch': 1.14} 11%|█▏ | 4688/41250 [11:20:15<88:10:14, 8.68s/it][2025-04-25 19:17:58,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:17:58,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.11 | bwd_microstep: 5709.24 | bwd_inner_microstep: 5675.54 | bwd_allreduce_microstep: 33.65 | step_microstep: 18.69 [2025-04-25 19:17:58,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.11 | bwd: 5709.25 | bwd_inner: 5675.54 | bwd_allreduce: 33.67 | step: 18.69 11%|█▏ | 4689/41250 [11:20:23<88:01:47, 8.67s/it] {'loss': 0.0507, 'grad_norm': 1.0139471292495728, 'learning_rate': 3.92702898456808e-05, 'epoch': 1.14} 11%|█▏ | 4689/41250 [11:20:23<88:01:47, 8.67s/it][2025-04-25 19:18:07,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-25 19:18:07,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.76 | bwd_microstep: 5730.22 | bwd_inner_microstep: 5717.51 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.91 [2025-04-25 19:18:07,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.76 | bwd: 5730.24 | 
bwd_inner: 5717.51 | bwd_allreduce: 12.68 | step: 18.91 11%|█▏ | 4690/41250 [11:20:32<88:01:46, 8.67s/it] {'loss': 0.1213, 'grad_norm': 1.8272289037704468, 'learning_rate': 3.926986947905234e-05, 'epoch': 1.14} 11%|█▏ | 4690/41250 [11:20:32<88:01:46, 8.67s/it][2025-04-25 19:18:15,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.05 | optimizer_step: 0.94 [2025-04-25 19:18:15,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.67 | bwd_microstep: 5716.76 | bwd_inner_microstep: 5703.85 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.99 [2025-04-25 19:18:15,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.67 | bwd: 5716.78 | bwd_inner: 5703.85 | bwd_allreduce: 12.88 | step: 18.99 11%|█▏ | 4691/41250 [11:20:41<87:59:18, 8.66s/it] {'loss': 0.1249, 'grad_norm': 3.2163541316986084, 'learning_rate': 3.926944899362893e-05, 'epoch': 1.14} 11%|█▏ | 4691/41250 [11:20:41<87:59:18, 8.66s/it][2025-04-25 19:18:24,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.18 | optimizer_step: 0.92 [2025-04-25 19:18:24,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.43 | bwd_microstep: 5749.89 | bwd_inner_microstep: 5702.40 | bwd_allreduce_microstep: 47.42 | step_microstep: 18.82 [2025-04-25 19:18:24,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.43 | bwd: 5749.90 | bwd_inner: 5702.40 | bwd_allreduce: 47.45 | step: 18.82 11%|█▏ | 4692/41250 [11:20:49<88:04:47, 8.67s/it] {'loss': 0.1412, 'grad_norm': 3.090867519378662, 'learning_rate': 3.926902838941316e-05, 'epoch': 1.14} 11%|█▏ | 4692/41250 [11:20:49<88:04:47, 8.67s/it][2025-04-25 19:18:33,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 19:18:33,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2856.46 | bwd_microstep: 5724.88 | bwd_inner_microstep: 5712.03 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.02 [2025-04-25 19:18:33,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.46 | bwd: 5724.90 | bwd_inner: 5712.03 | bwd_allreduce: 12.83 | step: 19.02 11%|█▏ | 4693/41250 [11:20:58<88:03:38, 8.67s/it] {'loss': 0.3142, 'grad_norm': 1.9771058559417725, 'learning_rate': 3.9268607666407614e-05, 'epoch': 1.14} 11%|█▏ | 4693/41250 [11:20:58<88:03:38, 8.67s/it][2025-04-25 19:18:42,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.66 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:18:42,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2942.50 | bwd_microstep: 5877.82 | bwd_inner_microstep: 5864.97 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.79 [2025-04-25 19:18:42,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2942.50 | bwd: 5877.83 | bwd_inner: 5864.97 | bwd_allreduce: 12.82 | step: 18.79 11%|█▏ | 4694/41250 [11:21:07<88:46:46, 8.74s/it] {'loss': 0.1584, 'grad_norm': 2.678492784500122, 'learning_rate': 3.92681868246149e-05, 'epoch': 1.14} 11%|█▏ | 4694/41250 [11:21:07<88:46:46, 8.74s/it][2025-04-25 19:18:50,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 0.93 [2025-04-25 19:18:50,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.86 | bwd_microstep: 5765.49 | bwd_inner_microstep: 5706.94 | bwd_allreduce_microstep: 58.51 | step_microstep: 19.20 [2025-04-25 19:18:50,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.86 | bwd: 5765.50 | bwd_inner: 5706.94 | bwd_allreduce: 58.53 | step: 19.20 11%|█▏ | 4695/41250 [11:21:16<88:40:28, 8.73s/it] {'loss': 0.1598, 'grad_norm': 1.9384561777114868, 'learning_rate': 3.92677658640376e-05, 'epoch': 1.14} 11%|█▏ | 4695/41250 [11:21:16<88:40:28, 8.73s/it][2025-04-25 
19:18:59,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-25 19:18:59,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.17 | bwd_microstep: 5890.32 | bwd_inner_microstep: 5712.63 | bwd_allreduce_microstep: 177.63 | step_microstep: 18.96 [2025-04-25 19:18:59,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.17 | bwd: 5890.33 | bwd_inner: 5712.63 | bwd_allreduce: 177.65 | step: 18.96 11%|█▏ | 4696/41250 [11:21:24<88:58:47, 8.76s/it] {'loss': 0.062, 'grad_norm': 1.7465440034866333, 'learning_rate': 3.926734478467831e-05, 'epoch': 1.14} 11%|█▏ | 4696/41250 [11:21:24<88:58:47, 8.76s/it][2025-04-25 19:19:08,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 19:19:08,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.17 | bwd_microstep: 5717.48 | bwd_inner_microstep: 5665.23 | bwd_allreduce_microstep: 52.20 | step_microstep: 19.02 [2025-04-25 19:19:08,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.17 | bwd: 5717.49 | bwd_inner: 5665.23 | bwd_allreduce: 52.22 | step: 19.02 11%|█▏ | 4697/41250 [11:21:33<88:34:41, 8.72s/it] {'loss': 0.1531, 'grad_norm': 1.3437628746032715, 'learning_rate': 3.926692358653963e-05, 'epoch': 1.14} 11%|█▏ | 4697/41250 [11:21:33<88:34:41, 8.72s/it][2025-04-25 19:19:16,961] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:19:16,961] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.26 | bwd_microstep: 5799.95 | bwd_inner_microstep: 5651.35 | bwd_allreduce_microstep: 148.55 | step_microstep: 20.00 [2025-04-25 19:19:16,962] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.26 | bwd: 5799.96 | bwd_inner: 5651.35 | bwd_allreduce: 
148.57 | step: 20.01 11%|█▏ | 4698/41250 [11:21:42<88:33:14, 8.72s/it] {'loss': 0.0835, 'grad_norm': 2.0113463401794434, 'learning_rate': 3.926650226962416e-05, 'epoch': 1.14} 11%|█▏ | 4698/41250 [11:21:42<88:33:14, 8.72s/it][2025-04-25 19:19:25,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:19:25,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.89 | bwd_microstep: 5801.61 | bwd_inner_microstep: 5663.95 | bwd_allreduce_microstep: 137.61 | step_microstep: 18.56 [2025-04-25 19:19:25,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.89 | bwd: 5801.63 | bwd_inner: 5663.95 | bwd_allreduce: 137.63 | step: 18.57 11%|█▏ | 4699/41250 [11:21:50<88:31:00, 8.72s/it] {'loss': 0.3577, 'grad_norm': 4.263206481933594, 'learning_rate': 3.926608083393449e-05, 'epoch': 1.14} 11%|█▏ | 4699/41250 [11:21:50<88:31:00, 8.72s/it][2025-04-25 19:19:34,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:19:34,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.41 | bwd_microstep: 5775.27 | bwd_inner_microstep: 5762.48 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.53 [2025-04-25 19:19:34,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.41 | bwd: 5775.28 | bwd_inner: 5762.48 | bwd_allreduce: 12.77 | step: 18.53 11%|█▏ | 4700/41250 [11:21:59<88:34:19, 8.72s/it] {'loss': 0.2392, 'grad_norm': 2.5362701416015625, 'learning_rate': 3.926565927947322e-05, 'epoch': 1.14} 11%|█▏ | 4700/41250 [11:21:59<88:34:19, 8.72s/it][2025-04-25 19:19:43,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 19:19:43,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.92 | bwd_microstep: 5733.91 | 
bwd_inner_microstep: 5720.63 | bwd_allreduce_microstep: 13.23 | step_microstep: 19.30 [2025-04-25 19:19:43,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.92 | bwd: 5733.92 | bwd_inner: 5720.63 | bwd_allreduce: 13.25 | step: 19.30 11%|█▏ | 4701/41250 [11:22:08<88:24:51, 8.71s/it] {'loss': 0.0993, 'grad_norm': 2.744521141052246, 'learning_rate': 3.926523760624295e-05, 'epoch': 1.14} 11%|█▏ | 4701/41250 [11:22:08<88:24:51, 8.71s/it][2025-04-25 19:19:51,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.95 | optimizer_step: 1.01 [2025-04-25 19:19:51,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.98 | bwd_microstep: 5720.20 | bwd_inner_microstep: 5662.57 | bwd_allreduce_microstep: 57.59 | step_microstep: 18.31 [2025-04-25 19:19:51,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.98 | bwd: 5720.21 | bwd_inner: 5662.56 | bwd_allreduce: 57.61 | step: 18.31 11%|█▏ | 4702/41250 [11:22:17<88:10:42, 8.69s/it] {'loss': 0.0466, 'grad_norm': 0.8221468329429626, 'learning_rate': 3.9264815814246286e-05, 'epoch': 1.14} 11%|█▏ | 4702/41250 [11:22:17<88:10:42, 8.69s/it][2025-04-25 19:20:00,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.03 | optimizer_step: 1.08 [2025-04-25 19:20:00,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.97 | bwd_microstep: 5808.65 | bwd_inner_microstep: 5668.51 | bwd_allreduce_microstep: 140.10 | step_microstep: 19.45 [2025-04-25 19:20:00,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.97 | bwd: 5808.66 | bwd_inner: 5668.50 | bwd_allreduce: 140.11 | step: 19.45 11%|█▏ | 4703/41250 [11:22:25<88:17:50, 8.70s/it] {'loss': 0.0461, 'grad_norm': 0.9115947484970093, 'learning_rate': 3.926439390348581e-05, 'epoch': 1.14} 11%|█▏ | 4703/41250 [11:22:25<88:17:50, 8.70s/it][2025-04-25 19:20:09,093] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:20:09,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.17 | bwd_microstep: 5721.33 | bwd_inner_microstep: 5708.26 | bwd_allreduce_microstep: 13.02 | step_microstep: 18.88 [2025-04-25 19:20:09,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.17 | bwd: 5721.34 | bwd_inner: 5708.26 | bwd_allreduce: 13.04 | step: 18.88 11%|█▏ | 4704/41250 [11:22:34<88:10:00, 8.68s/it] {'loss': 0.0637, 'grad_norm': 1.208533763885498, 'learning_rate': 3.9263971873964146e-05, 'epoch': 1.14} 11%|█▏ | 4704/41250 [11:22:34<88:10:00, 8.68s/it][2025-04-25 19:20:17,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-25 19:20:17,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.34 | bwd_microstep: 5767.33 | bwd_inner_microstep: 5693.41 | bwd_allreduce_microstep: 73.88 | step_microstep: 18.77 [2025-04-25 19:20:17,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.34 | bwd: 5767.35 | bwd_inner: 5693.41 | bwd_allreduce: 73.90 | step: 18.78 11%|█▏ | 4705/41250 [11:22:43<88:12:56, 8.69s/it] {'loss': 0.211, 'grad_norm': 2.211055278778076, 'learning_rate': 3.926354972568388e-05, 'epoch': 1.14} 11%|█▏ | 4705/41250 [11:22:43<88:12:56, 8.69s/it][2025-04-25 19:20:26,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:20:26,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.31 | bwd_microstep: 5757.37 | bwd_inner_microstep: 5696.52 | bwd_allreduce_microstep: 60.81 | step_microstep: 18.67 [2025-04-25 19:20:26,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.31 | bwd: 5757.39 | bwd_inner: 5696.52 | bwd_allreduce: 60.82 | step: 18.67 11%|█▏ | 
4706/41250 [11:22:51<88:11:49, 8.69s/it] {'loss': 0.0815, 'grad_norm': 1.2444218397140503, 'learning_rate': 3.926312745864761e-05, 'epoch': 1.14} 11%|█▏ | 4706/41250 [11:22:51<88:11:49, 8.69s/it][2025-04-25 19:20:35,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-25 19:20:35,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.23 | bwd_microstep: 5784.06 | bwd_inner_microstep: 5725.55 | bwd_allreduce_microstep: 58.46 | step_microstep: 19.29 [2025-04-25 19:20:35,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.23 | bwd: 5784.08 | bwd_inner: 5725.55 | bwd_allreduce: 58.49 | step: 19.28 11%|█▏ | 4707/41250 [11:23:00<88:19:02, 8.70s/it] {'loss': 0.0346, 'grad_norm': 0.5523371696472168, 'learning_rate': 3.9262705072857966e-05, 'epoch': 1.14} 11%|█▏ | 4707/41250 [11:23:00<88:19:02, 8.70s/it][2025-04-25 19:20:43,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.07 | optimizer_step: 1.27 [2025-04-25 19:20:43,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.94 | bwd_microstep: 5742.84 | bwd_inner_microstep: 5663.86 | bwd_allreduce_microstep: 78.94 | step_microstep: 20.06 [2025-04-25 19:20:43,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.94 | bwd: 5742.86 | bwd_inner: 5663.86 | bwd_allreduce: 78.96 | step: 20.06 11%|█▏ | 4708/41250 [11:23:09<88:15:45, 8.70s/it] {'loss': 0.2243, 'grad_norm': 2.9358279705047607, 'learning_rate': 3.9262282568317524e-05, 'epoch': 1.14} 11%|█▏ | 4708/41250 [11:23:09<88:15:45, 8.70s/it][2025-04-25 19:20:52,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.23 | optimizer_step: 0.89 [2025-04-25 19:20:52,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.39 | bwd_microstep: 5720.00 | bwd_inner_microstep: 5648.81 
| bwd_allreduce_microstep: 71.14 | step_microstep: 20.62 [2025-04-25 19:20:52,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.39 | bwd: 5720.02 | bwd_inner: 5648.81 | bwd_allreduce: 71.16 | step: 20.61 11%|█▏ | 4709/41250 [11:23:17<88:05:45, 8.68s/it] {'loss': 0.1946, 'grad_norm': 1.8836102485656738, 'learning_rate': 3.92618599450289e-05, 'epoch': 1.14} 11%|█▏ | 4709/41250 [11:23:17<88:05:45, 8.68s/it][2025-04-25 19:21:01,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:21:01,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.33 | bwd_microstep: 5792.72 | bwd_inner_microstep: 5656.88 | bwd_allreduce_microstep: 135.79 | step_microstep: 18.76 [2025-04-25 19:21:01,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.33 | bwd: 5792.73 | bwd_inner: 5656.88 | bwd_allreduce: 135.81 | step: 18.77 11%|█▏ | 4710/41250 [11:23:26<88:11:42, 8.69s/it] {'loss': 0.0523, 'grad_norm': 1.4962635040283203, 'learning_rate': 3.9261437202994696e-05, 'epoch': 1.14} 11%|█▏ | 4710/41250 [11:23:26<88:11:42, 8.69s/it][2025-04-25 19:21:09,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:21:09,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.05 | bwd_microstep: 5813.29 | bwd_inner_microstep: 5651.33 | bwd_allreduce_microstep: 161.90 | step_microstep: 18.76 [2025-04-25 19:21:09,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.05 | bwd: 5813.30 | bwd_inner: 5651.33 | bwd_allreduce: 161.93 | step: 18.77 11%|█▏ | 4711/41250 [11:23:35<88:18:50, 8.70s/it] {'loss': 0.1227, 'grad_norm': 2.575023651123047, 'learning_rate': 3.926101434221752e-05, 'epoch': 1.14} 11%|█▏ | 4711/41250 [11:23:35<88:18:50, 8.70s/it][2025-04-25 19:21:18,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.02 | optimizer_gradients: 1.04 | optimizer_step: 1.07 [2025-04-25 19:21:18,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.13 | bwd_microstep: 5708.24 | bwd_inner_microstep: 5687.77 | bwd_allreduce_microstep: 20.42 | step_microstep: 19.18 [2025-04-25 19:21:18,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.13 | bwd: 5708.26 | bwd_inner: 5687.77 | bwd_allreduce: 20.45 | step: 19.18 11%|█▏ | 4712/41250 [11:23:43<88:06:39, 8.68s/it] {'loss': 0.1377, 'grad_norm': 3.126865863800049, 'learning_rate': 3.926059136269998e-05, 'epoch': 1.14} 11%|█▏ | 4712/41250 [11:23:43<88:06:39, 8.68s/it][2025-04-25 19:21:27,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 19:21:27,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.65 | bwd_microstep: 5695.24 | bwd_inner_microstep: 5682.47 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.75 [2025-04-25 19:21:27,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.65 | bwd: 5695.25 | bwd_inner: 5682.47 | bwd_allreduce: 12.74 | step: 18.75 11%|█▏ | 4713/41250 [11:23:52<87:56:07, 8.66s/it] {'loss': 0.1651, 'grad_norm': 2.919682025909424, 'learning_rate': 3.926016826444468e-05, 'epoch': 1.14} 11%|█▏ | 4713/41250 [11:23:52<87:56:07, 8.66s/it][2025-04-25 19:21:35,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:21:35,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.33 | bwd_microstep: 5777.94 | bwd_inner_microstep: 5653.02 | bwd_allreduce_microstep: 124.87 | step_microstep: 18.85 [2025-04-25 19:21:35,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.33 | bwd: 5777.95 | bwd_inner: 5653.01 | bwd_allreduce: 124.89 | step: 18.85 11%|█▏ | 4714/41250 [11:24:01<88:00:48, 8.67s/it] 
{'loss': 0.2658, 'grad_norm': 1.7092006206512451, 'learning_rate': 3.925974504745424e-05, 'epoch': 1.14} 11%|█▏ | 4714/41250 [11:24:01<88:00:48, 8.67s/it][2025-04-25 19:21:44,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.05 | optimizer_step: 1.04 [2025-04-25 19:21:44,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.82 | bwd_microstep: 5706.75 | bwd_inner_microstep: 5694.18 | bwd_allreduce_microstep: 12.52 | step_microstep: 19.52 [2025-04-25 19:21:44,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.82 | bwd: 5706.77 | bwd_inner: 5694.18 | bwd_allreduce: 12.55 | step: 19.52 11%|█▏ | 4715/41250 [11:24:09<87:56:41, 8.67s/it] {'loss': 0.0184, 'grad_norm': 0.33914363384246826, 'learning_rate': 3.9259321711731255e-05, 'epoch': 1.14} 11%|█▏ | 4715/41250 [11:24:09<87:56:41, 8.67s/it][2025-04-25 19:21:53,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:21:53,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.87 | bwd_microstep: 5753.13 | bwd_inner_microstep: 5652.21 | bwd_allreduce_microstep: 100.88 | step_microstep: 18.83 [2025-04-25 19:21:53,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.87 | bwd: 5753.15 | bwd_inner: 5652.21 | bwd_allreduce: 100.89 | step: 18.83 11%|█▏ | 4716/41250 [11:24:18<87:55:23, 8.66s/it] {'loss': 0.0658, 'grad_norm': 1.05362069606781, 'learning_rate': 3.9258898257278335e-05, 'epoch': 1.14} 11%|█▏ | 4716/41250 [11:24:18<87:55:23, 8.66s/it][2025-04-25 19:22:01,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 19:22:01,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.62 | bwd_microstep: 5755.92 | bwd_inner_microstep: 5654.97 | bwd_allreduce_microstep: 100.91 | 
step_microstep: 18.62 [2025-04-25 19:22:01,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.62 | bwd: 5755.94 | bwd_inner: 5654.97 | bwd_allreduce: 100.93 | step: 18.62 11%|█▏ | 4717/41250 [11:24:27<87:55:56, 8.66s/it] {'loss': 0.067, 'grad_norm': 1.4624416828155518, 'learning_rate': 3.92584746840981e-05, 'epoch': 1.14} 11%|█▏ | 4717/41250 [11:24:27<87:55:56, 8.66s/it][2025-04-25 19:22:10,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.12 | optimizer_step: 1.00 [2025-04-25 19:22:10,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.73 | bwd_microstep: 5765.51 | bwd_inner_microstep: 5652.93 | bwd_allreduce_microstep: 112.51 | step_microstep: 19.56 [2025-04-25 19:22:10,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.73 | bwd: 5765.52 | bwd_inner: 5652.93 | bwd_allreduce: 112.54 | step: 19.56 11%|█▏ | 4718/41250 [11:24:35<87:58:26, 8.67s/it] {'loss': 0.15, 'grad_norm': 2.750675678253174, 'learning_rate': 3.925805099219315e-05, 'epoch': 1.14} 11%|█▏ | 4718/41250 [11:24:35<87:58:26, 8.67s/it][2025-04-25 19:22:19,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 19:22:19,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.07 | bwd_microstep: 5684.51 | bwd_inner_microstep: 5671.75 | bwd_allreduce_microstep: 12.71 | step_microstep: 19.07 [2025-04-25 19:22:19,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.07 | bwd: 5684.52 | bwd_inner: 5671.75 | bwd_allreduce: 12.73 | step: 19.07 11%|█▏ | 4719/41250 [11:24:44<87:50:10, 8.66s/it] {'loss': 0.0251, 'grad_norm': 1.008902668952942, 'learning_rate': 3.9257627181566116e-05, 'epoch': 1.14} 11%|█▏ | 4719/41250 [11:24:44<87:50:10, 8.66s/it][2025-04-25 19:22:27,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | 
optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:22:27,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.76 | bwd_microstep: 5703.66 | bwd_inner_microstep: 5690.92 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.75 [2025-04-25 19:22:27,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.76 | bwd: 5703.67 | bwd_inner: 5690.92 | bwd_allreduce: 12.71 | step: 18.76 11%|█▏ | 4720/41250 [11:24:53<87:47:57, 8.65s/it] {'loss': 0.0838, 'grad_norm': 2.081171751022339, 'learning_rate': 3.9257203252219594e-05, 'epoch': 1.14} 11%|█▏ | 4720/41250 [11:24:53<87:47:57, 8.65s/it][2025-04-25 19:22:36,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:22:36,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.78 | bwd_microstep: 5675.65 | bwd_inner_microstep: 5641.92 | bwd_allreduce_microstep: 33.69 | step_microstep: 18.37 [2025-04-25 19:22:36,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.78 | bwd: 5675.67 | bwd_inner: 5641.92 | bwd_allreduce: 33.71 | step: 18.37 11%|█▏ | 4721/41250 [11:25:01<87:37:30, 8.64s/it] {'loss': 0.0957, 'grad_norm': 1.8928916454315186, 'learning_rate': 3.925677920415619e-05, 'epoch': 1.14} 11%|█▏ | 4721/41250 [11:25:01<87:37:30, 8.64s/it][2025-04-25 19:22:45,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 1.05 [2025-04-25 19:22:45,100] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.98 | bwd_microstep: 5744.41 | bwd_inner_microstep: 5650.84 | bwd_allreduce_microstep: 93.53 | step_microstep: 18.42 [2025-04-25 19:22:45,100] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.98 | bwd: 5744.43 | bwd_inner: 5650.85 | bwd_allreduce: 93.54 | step: 18.42 11%|█▏ | 4722/41250 [11:25:10<87:40:04, 8.64s/it] {'loss': 0.1285, 'grad_norm': 
2.4342761039733887, 'learning_rate': 3.9256355037378543e-05, 'epoch': 1.14} 11%|█▏ | 4722/41250 [11:25:10<87:40:04, 8.64s/it][2025-04-25 19:22:53,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-25 19:22:53,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.73 | bwd_microstep: 5715.15 | bwd_inner_microstep: 5702.07 | bwd_allreduce_microstep: 13.03 | step_microstep: 19.21 [2025-04-25 19:22:53,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.73 | bwd: 5715.16 | bwd_inner: 5702.07 | bwd_allreduce: 13.05 | step: 19.21 11%|█▏ | 4723/41250 [11:25:19<87:43:37, 8.65s/it] {'loss': 0.0614, 'grad_norm': 1.5568972826004028, 'learning_rate': 3.9255930751889246e-05, 'epoch': 1.14} 11%|█▏ | 4723/41250 [11:25:19<87:43:37, 8.65s/it][2025-04-25 19:23:02,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:23:02,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.72 | bwd_microstep: 5749.61 | bwd_inner_microstep: 5644.30 | bwd_allreduce_microstep: 105.27 | step_microstep: 18.39 [2025-04-25 19:23:02,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.72 | bwd: 5749.62 | bwd_inner: 5644.30 | bwd_allreduce: 105.29 | step: 18.39 11%|█▏ | 4724/41250 [11:25:27<87:45:03, 8.65s/it] {'loss': 0.1055, 'grad_norm': 2.111485004425049, 'learning_rate': 3.925550634769093e-05, 'epoch': 1.15} 11%|█▏ | 4724/41250 [11:25:27<87:45:03, 8.65s/it][2025-04-25 19:23:11,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:23:11,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.70 | bwd_microstep: 5709.74 | bwd_inner_microstep: 5695.05 | bwd_allreduce_microstep: 14.65 | step_microstep: 19.08 [2025-04-25 
19:23:11,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.70 | bwd: 5709.76 | bwd_inner: 5695.05 | bwd_allreduce: 14.67 | step: 19.08 11%|█▏ | 4725/41250 [11:25:36<87:47:26, 8.65s/it] {'loss': 0.0407, 'grad_norm': 1.5377516746520996, 'learning_rate': 3.92550818247862e-05, 'epoch': 1.15} 11%|█▏ | 4725/41250 [11:25:36<87:47:26, 8.65s/it][2025-04-25 19:23:19,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 19:23:19,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.53 | bwd_microstep: 5710.76 | bwd_inner_microstep: 5697.61 | bwd_allreduce_microstep: 13.10 | step_microstep: 19.19 [2025-04-25 19:23:19,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.53 | bwd: 5710.77 | bwd_inner: 5697.61 | bwd_allreduce: 13.12 | step: 19.19 11%|█▏ | 4726/41250 [11:25:45<87:46:28, 8.65s/it] {'loss': 0.2241, 'grad_norm': 2.423675060272217, 'learning_rate': 3.925465718317768e-05, 'epoch': 1.15} 11%|█▏ | 4726/41250 [11:25:45<87:46:28, 8.65s/it][2025-04-25 19:23:28,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 19:23:28,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.25 | bwd_microstep: 5736.32 | bwd_inner_microstep: 5702.08 | bwd_allreduce_microstep: 34.19 | step_microstep: 18.97 [2025-04-25 19:23:28,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.25 | bwd: 5736.33 | bwd_inner: 5702.08 | bwd_allreduce: 34.21 | step: 18.97 11%|█▏ | 4727/41250 [11:25:53<87:51:02, 8.66s/it] {'loss': 0.0094, 'grad_norm': 0.13889364898204803, 'learning_rate': 3.925423242286799e-05, 'epoch': 1.15} 11%|█▏ | 4727/41250 [11:25:53<87:51:02, 8.66s/it][2025-04-25 19:23:37,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.08 
[2025-04-25 19:23:37,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.29 | bwd_microstep: 5719.11 | bwd_inner_microstep: 5706.32 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.21 [2025-04-25 19:23:37,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.29 | bwd: 5719.12 | bwd_inner: 5706.32 | bwd_allreduce: 12.76 | step: 19.21 11%|█▏ | 4728/41250 [11:26:02<87:49:37, 8.66s/it] {'loss': 0.0137, 'grad_norm': 0.38143959641456604, 'learning_rate': 3.925380754385974e-05, 'epoch': 1.15} 11%|█▏ | 4728/41250 [11:26:02<87:49:37, 8.66s/it][2025-04-25 19:23:45,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 19:23:45,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.14 | bwd_microstep: 5741.50 | bwd_inner_microstep: 5694.90 | bwd_allreduce_microstep: 46.55 | step_microstep: 18.91 [2025-04-25 19:23:45,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.14 | bwd: 5741.51 | bwd_inner: 5694.90 | bwd_allreduce: 46.57 | step: 18.91 11%|█▏ | 4729/41250 [11:26:11<87:51:45, 8.66s/it] {'loss': 0.0998, 'grad_norm': 1.8601948022842407, 'learning_rate': 3.925338254615555e-05, 'epoch': 1.15} 11%|█▏ | 4729/41250 [11:26:11<87:51:45, 8.66s/it][2025-04-25 19:23:54,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:23:54,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.03 | bwd_microstep: 5700.02 | bwd_inner_microstep: 5687.14 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.55 [2025-04-25 19:23:54,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.03 | bwd: 5700.04 | bwd_inner: 5687.14 | bwd_allreduce: 12.86 | step: 18.55 11%|█▏ | 4730/41250 [11:26:19<87:45:35, 8.65s/it] {'loss': 0.2843, 'grad_norm': 2.7128095626831055, 'learning_rate': 
3.925295742975805e-05, 'epoch': 1.15} 11%|█▏ | 4730/41250 [11:26:19<87:45:35, 8.65s/it][2025-04-25 19:24:03,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:24:03,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.37 | bwd_microstep: 5764.59 | bwd_inner_microstep: 5661.20 | bwd_allreduce_microstep: 103.35 | step_microstep: 18.91 [2025-04-25 19:24:03,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.37 | bwd: 5764.61 | bwd_inner: 5661.20 | bwd_allreduce: 103.36 | step: 18.91 11%|█▏ | 4731/41250 [11:26:28<87:49:43, 8.66s/it] {'loss': 0.1673, 'grad_norm': 4.155867576599121, 'learning_rate': 3.925253219466985e-05, 'epoch': 1.15} 11%|█▏ | 4731/41250 [11:26:28<87:49:43, 8.66s/it][2025-04-25 19:24:11,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-25 19:24:11,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.77 | bwd_microstep: 5783.79 | bwd_inner_microstep: 5660.77 | bwd_allreduce_microstep: 122.98 | step_microstep: 18.70 [2025-04-25 19:24:11,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.77 | bwd: 5783.80 | bwd_inner: 5660.77 | bwd_allreduce: 122.99 | step: 18.70 11%|█▏ | 4732/41250 [11:26:37<87:55:40, 8.67s/it] {'loss': 0.2152, 'grad_norm': 2.3533174991607666, 'learning_rate': 3.925210684089358e-05, 'epoch': 1.15} 11%|█▏ | 4732/41250 [11:26:37<87:55:40, 8.67s/it][2025-04-25 19:24:20,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 1.07 [2025-04-25 19:24:20,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.11 | bwd_microstep: 5758.95 | bwd_inner_microstep: 5694.53 | bwd_allreduce_microstep: 64.38 | step_microstep: 18.52 [2025-04-25 19:24:20,409] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.11 | bwd: 5758.96 | bwd_inner: 5694.53 | bwd_allreduce: 64.39 | step: 18.53 11%|█▏ | 4733/41250 [11:26:45<87:59:36, 8.67s/it] {'loss': 0.0259, 'grad_norm': 1.0426514148712158, 'learning_rate': 3.925168136843185e-05, 'epoch': 1.15} 11%|█▏ | 4733/41250 [11:26:45<87:59:36, 8.67s/it][2025-04-25 19:24:29,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:24:29,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.04 | bwd_microstep: 5678.27 | bwd_inner_microstep: 5656.00 | bwd_allreduce_microstep: 22.22 | step_microstep: 18.40 [2025-04-25 19:24:29,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.04 | bwd: 5678.28 | bwd_inner: 5656.00 | bwd_allreduce: 22.24 | step: 18.40 11%|█▏ | 4734/41250 [11:26:54<87:47:13, 8.65s/it] {'loss': 0.0591, 'grad_norm': 1.4352678060531616, 'learning_rate': 3.92512557772873e-05, 'epoch': 1.15} 11%|█▏ | 4734/41250 [11:26:54<87:47:13, 8.65s/it][2025-04-25 19:24:37,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-25 19:24:37,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.68 | bwd_microstep: 5740.94 | bwd_inner_microstep: 5704.86 | bwd_allreduce_microstep: 36.04 | step_microstep: 18.70 [2025-04-25 19:24:37,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.68 | bwd: 5740.96 | bwd_inner: 5704.85 | bwd_allreduce: 36.06 | step: 18.71 11%|█▏ | 4735/41250 [11:27:03<87:50:36, 8.66s/it] {'loss': 0.0529, 'grad_norm': 1.2228344678878784, 'learning_rate': 3.925083006746254e-05, 'epoch': 1.15} 11%|█▏ | 4735/41250 [11:27:03<87:50:36, 8.66s/it][2025-04-25 19:24:46,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 
19:24:46,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.25 | bwd_microstep: 5748.02 | bwd_inner_microstep: 5708.98 | bwd_allreduce_microstep: 38.99 | step_microstep: 19.40 [2025-04-25 19:24:46,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.25 | bwd: 5748.04 | bwd_inner: 5708.98 | bwd_allreduce: 39.01 | step: 19.40 11%|█▏ | 4736/41250 [11:27:11<87:55:29, 8.67s/it] {'loss': 0.2383, 'grad_norm': 3.5267081260681152, 'learning_rate': 3.92504042389602e-05, 'epoch': 1.15} 11%|█▏ | 4736/41250 [11:27:11<87:55:29, 8.67s/it][2025-04-25 19:24:55,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 19:24:55,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.73 | bwd_microstep: 5712.59 | bwd_inner_microstep: 5663.22 | bwd_allreduce_microstep: 49.33 | step_microstep: 18.95 [2025-04-25 19:24:55,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.73 | bwd: 5712.61 | bwd_inner: 5663.22 | bwd_allreduce: 49.35 | step: 18.95 11%|█▏ | 4737/41250 [11:27:20<87:48:38, 8.66s/it] {'loss': 0.3074, 'grad_norm': 3.05086088180542, 'learning_rate': 3.92499782917829e-05, 'epoch': 1.15} 11%|█▏ | 4737/41250 [11:27:20<87:48:38, 8.66s/it][2025-04-25 19:25:03,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:25:03,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.80 | bwd_microstep: 5719.65 | bwd_inner_microstep: 5706.84 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.84 [2025-04-25 19:25:03,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.80 | bwd: 5719.66 | bwd_inner: 5706.84 | bwd_allreduce: 12.78 | step: 18.84 11%|█▏ | 4738/41250 [11:27:29<87:50:39, 8.66s/it] {'loss': 0.0562, 'grad_norm': 0.821743905544281, 'learning_rate': 3.924955222593328e-05, 'epoch': 
1.15} 11%|█▏ | 4738/41250 [11:27:29<87:50:39, 8.66s/it][2025-04-25 19:25:12,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-25 19:25:12,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.41 | bwd_microstep: 5761.24 | bwd_inner_microstep: 5720.39 | bwd_allreduce_microstep: 40.81 | step_microstep: 18.65 [2025-04-25 19:25:12,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.41 | bwd: 5761.26 | bwd_inner: 5720.39 | bwd_allreduce: 40.83 | step: 18.65 11%|█▏ | 4739/41250 [11:27:37<87:59:42, 8.68s/it] {'loss': 0.1054, 'grad_norm': 2.19539737701416, 'learning_rate': 3.924912604141395e-05, 'epoch': 1.15} 11%|█▏ | 4739/41250 [11:27:37<87:59:42, 8.68s/it][2025-04-25 19:25:21,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:25:21,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.03 | bwd_microstep: 5791.34 | bwd_inner_microstep: 5667.75 | bwd_allreduce_microstep: 123.55 | step_microstep: 18.44 [2025-04-25 19:25:21,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.03 | bwd: 5791.36 | bwd_inner: 5667.75 | bwd_allreduce: 123.56 | step: 18.44 11%|█▏ | 4740/41250 [11:27:46<88:06:23, 8.69s/it] {'loss': 0.1446, 'grad_norm': 1.9763798713684082, 'learning_rate': 3.9248699738227547e-05, 'epoch': 1.15} 11%|█▏ | 4740/41250 [11:27:46<88:06:23, 8.69s/it][2025-04-25 19:25:29,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:25:29,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.26 | bwd_microstep: 5714.26 | bwd_inner_microstep: 5656.53 | bwd_allreduce_microstep: 57.68 | step_microstep: 18.16 [2025-04-25 19:25:29,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 
2835.26 | bwd: 5714.27 | bwd_inner: 5656.53 | bwd_allreduce: 57.70 | step: 18.17 11%|█▏ | 4741/41250 [11:27:55<87:56:08, 8.67s/it] {'loss': 0.0764, 'grad_norm': 1.2706516981124878, 'learning_rate': 3.9248273316376693e-05, 'epoch': 1.15} 11%|█▏ | 4741/41250 [11:27:55<87:56:08, 8.67s/it][2025-04-25 19:25:38,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:25:38,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.18 | bwd_microstep: 5781.12 | bwd_inner_microstep: 5712.96 | bwd_allreduce_microstep: 68.11 | step_microstep: 18.32 [2025-04-25 19:25:38,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.18 | bwd: 5781.14 | bwd_inner: 5712.96 | bwd_allreduce: 68.13 | step: 18.32 11%|█▏ | 4742/41250 [11:28:03<88:04:56, 8.69s/it] {'loss': 0.0319, 'grad_norm': 0.6237672567367554, 'learning_rate': 3.924784677586402e-05, 'epoch': 1.15} 11%|█▏ | 4742/41250 [11:28:03<88:04:56, 8.69s/it][2025-04-25 19:25:47,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:25:47,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.04 | bwd_microstep: 5758.11 | bwd_inner_microstep: 5721.16 | bwd_allreduce_microstep: 36.91 | step_microstep: 18.31 [2025-04-25 19:25:47,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.04 | bwd: 5758.12 | bwd_inner: 5721.16 | bwd_allreduce: 36.92 | step: 18.32 11%|█▏ | 4743/41250 [11:28:12<88:08:12, 8.69s/it] {'loss': 0.3157, 'grad_norm': 1.6603177785873413, 'learning_rate': 3.924742011669215e-05, 'epoch': 1.15} 11%|█▏ | 4743/41250 [11:28:12<88:08:12, 8.69s/it][2025-04-25 19:25:55,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.95 [2025-04-25 19:25:55,995] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2854.99 | bwd_microstep: 5896.42 | bwd_inner_microstep: 5712.80 | bwd_allreduce_microstep: 183.57 | step_microstep: 18.41 [2025-04-25 19:25:55,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.99 | bwd: 5896.44 | bwd_inner: 5712.80 | bwd_allreduce: 183.59 | step: 18.41 12%|█▏ | 4744/41250 [11:28:21<88:34:28, 8.73s/it] {'loss': 0.1862, 'grad_norm': 1.2356312274932861, 'learning_rate': 3.9246993338863736e-05, 'epoch': 1.15} 12%|█▏ | 4744/41250 [11:28:21<88:34:28, 8.73s/it][2025-04-25 19:26:04,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.22 | optimizer_step: 0.96 [2025-04-25 19:26:04,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.78 | bwd_microstep: 5784.62 | bwd_inner_microstep: 5651.77 | bwd_allreduce_microstep: 132.80 | step_microstep: 18.88 [2025-04-25 19:26:04,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.78 | bwd: 5784.63 | bwd_inner: 5651.77 | bwd_allreduce: 132.82 | step: 18.88 12%|█▏ | 4745/41250 [11:28:30<88:29:53, 8.73s/it] {'loss': 0.2066, 'grad_norm': 3.0010263919830322, 'learning_rate': 3.9246566442381386e-05, 'epoch': 1.15} 12%|█▏ | 4745/41250 [11:28:30<88:29:53, 8.73s/it][2025-04-25 19:26:13,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:26:13,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.82 | bwd_microstep: 5712.66 | bwd_inner_microstep: 5666.92 | bwd_allreduce_microstep: 45.70 | step_microstep: 18.20 [2025-04-25 19:26:13,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.82 | bwd: 5712.67 | bwd_inner: 5666.92 | bwd_allreduce: 45.71 | step: 18.20 12%|█▏ | 4746/41250 [11:28:38<88:12:43, 8.70s/it] {'loss': 0.0712, 'grad_norm': 1.340577244758606, 'learning_rate': 3.924613942724774e-05, 'epoch': 1.15} 12%|█▏ | 4746/41250 [11:28:38<88:12:43, 
8.70s/it][2025-04-25 19:26:22,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:26:22,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2906.02 | bwd_microstep: 5805.18 | bwd_inner_microstep: 5792.59 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.65 [2025-04-25 19:26:22,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2906.02 | bwd: 5805.19 | bwd_inner: 5792.59 | bwd_allreduce: 12.56 | step: 18.65 12%|█▏ | 4747/41250 [11:28:47<88:29:59, 8.73s/it] {'loss': 0.1567, 'grad_norm': 1.9100004434585571, 'learning_rate': 3.924571229346543e-05, 'epoch': 1.15} 12%|█▏ | 4747/41250 [11:28:47<88:29:59, 8.73s/it][2025-04-25 19:26:30,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.95 | optimizer_step: 1.04 [2025-04-25 19:26:30,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.63 | bwd_microstep: 5776.17 | bwd_inner_microstep: 5677.31 | bwd_allreduce_microstep: 98.81 | step_microstep: 18.35 [2025-04-25 19:26:30,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.63 | bwd: 5776.18 | bwd_inner: 5677.30 | bwd_allreduce: 98.83 | step: 18.35 12%|█▏ | 4748/41250 [11:28:56<88:24:43, 8.72s/it] {'loss': 0.1329, 'grad_norm': 3.5723228454589844, 'learning_rate': 3.924528504103709e-05, 'epoch': 1.15} 12%|█▏ | 4748/41250 [11:28:56<88:24:43, 8.72s/it][2025-04-25 19:26:39,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:26:39,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.33 | bwd_microstep: 6063.64 | bwd_inner_microstep: 5667.23 | bwd_allreduce_microstep: 396.37 | step_microstep: 18.38 [2025-04-25 19:26:39,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.33 | bwd: 6063.66 | bwd_inner: 5667.23 | 
bwd_allreduce: 396.39 | step: 18.38 12%|█▏ | 4749/41250 [11:29:05<89:13:39, 8.80s/it] {'loss': 0.0249, 'grad_norm': 0.5316081047058105, 'learning_rate': 3.9244857669965355e-05, 'epoch': 1.15} 12%|█▏ | 4749/41250 [11:29:05<89:13:39, 8.80s/it][2025-04-25 19:26:48,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 19:26:48,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.01 | bwd_microstep: 5758.10 | bwd_inner_microstep: 5706.31 | bwd_allreduce_microstep: 51.75 | step_microstep: 18.29 [2025-04-25 19:26:48,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.01 | bwd: 5758.11 | bwd_inner: 5706.31 | bwd_allreduce: 51.77 | step: 18.29 12%|█▏ | 4750/41250 [11:29:13<88:53:35, 8.77s/it] {'loss': 0.0714, 'grad_norm': 1.1207773685455322, 'learning_rate': 3.9244430180252855e-05, 'epoch': 1.15} 12%|█▏ | 4750/41250 [11:29:13<88:53:35, 8.77s/it][2025-04-25 19:26:57,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:26:57,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.54 | bwd_microstep: 5774.48 | bwd_inner_microstep: 5657.55 | bwd_allreduce_microstep: 116.88 | step_microstep: 18.03 [2025-04-25 19:26:57,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.54 | bwd: 5774.49 | bwd_inner: 5657.55 | bwd_allreduce: 116.90 | step: 18.03 12%|█▏ | 4751/41250 [11:29:22<88:42:39, 8.75s/it] {'loss': 0.1861, 'grad_norm': 1.5320042371749878, 'learning_rate': 3.9244002571902224e-05, 'epoch': 1.15} 12%|█▏ | 4751/41250 [11:29:22<88:42:39, 8.75s/it][2025-04-25 19:27:05,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 19:27:05,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.28 | 
bwd_microstep: 5770.22 | bwd_inner_microstep: 5668.98 | bwd_allreduce_microstep: 101.18 | step_microstep: 19.32 [2025-04-25 19:27:05,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.28 | bwd: 5770.23 | bwd_inner: 5668.98 | bwd_allreduce: 101.21 | step: 19.32 12%|█▏ | 4752/41250 [11:29:31<88:34:02, 8.74s/it] {'loss': 0.2852, 'grad_norm': 1.9850375652313232, 'learning_rate': 3.924357484491611e-05, 'epoch': 1.15} 12%|█▏ | 4752/41250 [11:29:31<88:34:02, 8.74s/it][2025-04-25 19:27:14,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 19:27:14,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.09 | bwd_microstep: 5714.10 | bwd_inner_microstep: 5701.36 | bwd_allreduce_microstep: 12.70 | step_microstep: 17.90 [2025-04-25 19:27:14,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.09 | bwd: 5714.11 | bwd_inner: 5701.36 | bwd_allreduce: 12.71 | step: 17.90 12%|█▏ | 4753/41250 [11:29:39<88:19:26, 8.71s/it] {'loss': 0.0784, 'grad_norm': 1.2855380773544312, 'learning_rate': 3.924314699929714e-05, 'epoch': 1.15} 12%|█▏ | 4753/41250 [11:29:39<88:19:26, 8.71s/it][2025-04-25 19:27:23,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:27:23,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.50 | bwd_microstep: 5784.61 | bwd_inner_microstep: 5684.60 | bwd_allreduce_microstep: 99.97 | step_microstep: 18.10 [2025-04-25 19:27:23,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.50 | bwd: 5784.62 | bwd_inner: 5684.60 | bwd_allreduce: 99.98 | step: 18.11 12%|█▏ | 4754/41250 [11:29:48<88:19:36, 8.71s/it] {'loss': 0.101, 'grad_norm': 1.9069010019302368, 'learning_rate': 3.924271903504795e-05, 'epoch': 1.15} 12%|█▏ | 4754/41250 [11:29:48<88:19:36, 8.71s/it][2025-04-25 19:27:31,948] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:27:31,949] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.16 | bwd_microstep: 5709.08 | bwd_inner_microstep: 5696.48 | bwd_allreduce_microstep: 12.56 | step_microstep: 18.22 [2025-04-25 19:27:31,949] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.16 | bwd: 5709.10 | bwd_inner: 5696.48 | bwd_allreduce: 12.58 | step: 18.22 12%|█▏ | 4755/41250 [11:29:57<88:07:44, 8.69s/it] {'loss': 0.3026, 'grad_norm': 3.2761895656585693, 'learning_rate': 3.9242290952171184e-05, 'epoch': 1.15} 12%|█▏ | 4755/41250 [11:29:57<88:07:44, 8.69s/it][2025-04-25 19:27:40,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:27:40,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.12 | bwd_microstep: 5768.68 | bwd_inner_microstep: 5671.77 | bwd_allreduce_microstep: 96.87 | step_microstep: 18.54 [2025-04-25 19:27:40,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.12 | bwd: 5768.69 | bwd_inner: 5671.77 | bwd_allreduce: 96.88 | step: 18.54 12%|█▏ | 4756/41250 [11:30:05<88:05:11, 8.69s/it] {'loss': 0.0705, 'grad_norm': 0.6696234345436096, 'learning_rate': 3.924186275066948e-05, 'epoch': 1.15} 12%|█▏ | 4756/41250 [11:30:05<88:05:11, 8.69s/it][2025-04-25 19:27:49,318] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:27:49,319] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.00 | bwd_microstep: 5778.19 | bwd_inner_microstep: 5662.61 | bwd_allreduce_microstep: 115.54 | step_microstep: 18.31 [2025-04-25 19:27:49,319] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.00 | bwd: 5778.21 | bwd_inner: 5662.61 | bwd_allreduce: 115.56 | step: 
18.32 12%|█▏ | 4757/41250 [11:30:14<88:05:14, 8.69s/it] {'loss': 0.3643, 'grad_norm': 2.593266248703003, 'learning_rate': 3.924143443054548e-05, 'epoch': 1.15} 12%|█▏ | 4757/41250 [11:30:14<88:05:14, 8.69s/it][2025-04-25 19:27:58,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:27:58,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.99 | bwd_microstep: 5786.94 | bwd_inner_microstep: 5648.22 | bwd_allreduce_microstep: 138.67 | step_microstep: 18.95 [2025-04-25 19:27:58,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.99 | bwd: 5786.95 | bwd_inner: 5648.22 | bwd_allreduce: 138.69 | step: 18.95 12%|█▏ | 4758/41250 [11:30:23<88:07:33, 8.69s/it] {'loss': 0.0113, 'grad_norm': 0.1715356558561325, 'learning_rate': 3.9241005991801814e-05, 'epoch': 1.15} 12%|█▏ | 4758/41250 [11:30:23<88:07:33, 8.69s/it][2025-04-25 19:28:06,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.23 | optimizer_step: 0.99 [2025-04-25 19:28:06,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.62 | bwd_microstep: 5738.65 | bwd_inner_microstep: 5679.72 | bwd_allreduce_microstep: 58.87 | step_microstep: 19.39 [2025-04-25 19:28:06,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.62 | bwd: 5738.66 | bwd_inner: 5679.72 | bwd_allreduce: 58.90 | step: 19.39 12%|█▏ | 4759/41250 [11:30:32<88:03:31, 8.69s/it] {'loss': 0.0945, 'grad_norm': 0.9567952752113342, 'learning_rate': 3.9240577434441134e-05, 'epoch': 1.15} 12%|█▏ | 4759/41250 [11:30:32<88:03:31, 8.69s/it][2025-04-25 19:28:15,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.08 | optimizer_step: 0.95 [2025-04-25 19:28:15,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.52 | bwd_microstep: 5759.65 | 
bwd_inner_microstep: 5695.62 | bwd_allreduce_microstep: 63.98 | step_microstep: 19.93 [2025-04-25 19:28:15,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.52 | bwd: 5759.67 | bwd_inner: 5695.62 | bwd_allreduce: 64.01 | step: 19.93 12%|█▏ | 4760/41250 [11:30:40<88:04:02, 8.69s/it] {'loss': 0.0971, 'grad_norm': 1.3594484329223633, 'learning_rate': 3.924014875846608e-05, 'epoch': 1.15} 12%|█▏ | 4760/41250 [11:30:40<88:04:02, 8.69s/it][2025-04-25 19:28:24,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.23 | optimizer_step: 0.90 [2025-04-25 19:28:24,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.61 | bwd_microstep: 5773.20 | bwd_inner_microstep: 5648.61 | bwd_allreduce_microstep: 124.53 | step_microstep: 19.45 [2025-04-25 19:28:24,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.61 | bwd: 5773.22 | bwd_inner: 5648.61 | bwd_allreduce: 124.55 | step: 19.45 12%|█▏ | 4761/41250 [11:30:49<88:03:47, 8.69s/it] {'loss': 0.0922, 'grad_norm': 1.2038213014602661, 'learning_rate': 3.923971996387929e-05, 'epoch': 1.15} 12%|█▏ | 4761/41250 [11:30:49<88:03:47, 8.69s/it][2025-04-25 19:28:32,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 19:28:32,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.11 | bwd_microstep: 5777.74 | bwd_inner_microstep: 5653.70 | bwd_allreduce_microstep: 124.00 | step_microstep: 19.24 [2025-04-25 19:28:32,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.11 | bwd: 5777.76 | bwd_inner: 5653.70 | bwd_allreduce: 124.01 | step: 19.24 12%|█▏ | 4762/41250 [11:30:58<88:03:42, 8.69s/it] {'loss': 0.052, 'grad_norm': 1.1598602533340454, 'learning_rate': 3.923929105068341e-05, 'epoch': 1.15} 12%|█▏ | 4762/41250 [11:30:58<88:03:42, 8.69s/it][2025-04-25 19:28:41,444] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-25 19:28:41,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.72 | bwd_microstep: 5771.24 | bwd_inner_microstep: 5651.03 | bwd_allreduce_microstep: 120.16 | step_microstep: 18.67 [2025-04-25 19:28:41,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.72 | bwd: 5771.25 | bwd_inner: 5651.03 | bwd_allreduce: 120.18 | step: 18.67 12%|█▏ | 4763/41250 [11:31:06<88:02:50, 8.69s/it] {'loss': 0.0938, 'grad_norm': 2.278381109237671, 'learning_rate': 3.923886201888109e-05, 'epoch': 1.15} 12%|█▏ | 4763/41250 [11:31:06<88:02:50, 8.69s/it][2025-04-25 19:28:50,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 19:28:50,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.55 | bwd_microstep: 5714.00 | bwd_inner_microstep: 5701.05 | bwd_allreduce_microstep: 12.90 | step_microstep: 19.00 [2025-04-25 19:28:50,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.55 | bwd: 5714.02 | bwd_inner: 5701.05 | bwd_allreduce: 12.92 | step: 19.01 12%|█▏ | 4764/41250 [11:31:15<87:55:53, 8.68s/it] {'loss': 0.2118, 'grad_norm': 2.749673843383789, 'learning_rate': 3.923843286847497e-05, 'epoch': 1.15} 12%|█▏ | 4764/41250 [11:31:15<87:55:53, 8.68s/it][2025-04-25 19:28:58,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-25 19:28:58,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.85 | bwd_microstep: 5672.17 | bwd_inner_microstep: 5658.83 | bwd_allreduce_microstep: 13.28 | step_microstep: 19.26 [2025-04-25 19:28:58,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.85 | bwd: 5672.19 | bwd_inner: 5658.83 | bwd_allreduce: 13.31 | step: 19.26 12%|█▏ 
| 4765/41250 [11:31:24<87:41:06, 8.65s/it] {'loss': 0.0794, 'grad_norm': 1.0415256023406982, 'learning_rate': 3.923800359946769e-05, 'epoch': 1.16} 12%|█▏ | 4765/41250 [11:31:24<87:41:06, 8.65s/it][2025-04-25 19:29:07,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 19:29:07,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2930.91 | bwd_microstep: 5874.25 | bwd_inner_microstep: 5861.58 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.85 [2025-04-25 19:29:07,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2930.91 | bwd: 5874.27 | bwd_inner: 5861.58 | bwd_allreduce: 12.64 | step: 18.86 12%|█▏ | 4766/41250 [11:31:32<88:23:53, 8.72s/it] {'loss': 0.2483, 'grad_norm': 1.9342526197433472, 'learning_rate': 3.92375742118619e-05, 'epoch': 1.16} 12%|█▏ | 4766/41250 [11:31:32<88:23:53, 8.72s/it][2025-04-25 19:29:16,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 1.03 [2025-04-25 19:29:16,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.91 | bwd_microstep: 5769.59 | bwd_inner_microstep: 5756.86 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.41 [2025-04-25 19:29:16,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.91 | bwd: 5769.61 | bwd_inner: 5756.86 | bwd_allreduce: 12.70 | step: 19.41 12%|█▏ | 4767/41250 [11:31:41<88:26:44, 8.73s/it] {'loss': 0.0547, 'grad_norm': 0.6261408925056458, 'learning_rate': 3.923714470566026e-05, 'epoch': 1.16} 12%|█▏ | 4767/41250 [11:31:41<88:26:44, 8.73s/it][2025-04-25 19:29:25,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:29:25,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.10 | bwd_microstep: 5760.82 | bwd_inner_microstep: 5654.29 | 
bwd_allreduce_microstep: 106.48 | step_microstep: 18.51 [2025-04-25 19:29:25,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.10 | bwd: 5760.83 | bwd_inner: 5654.29 | bwd_allreduce: 106.50 | step: 18.51 12%|█▏ | 4768/41250 [11:31:50<88:18:53, 8.71s/it] {'loss': 0.0724, 'grad_norm': 1.1256295442581177, 'learning_rate': 3.923671508086539e-05, 'epoch': 1.16} 12%|█▏ | 4768/41250 [11:31:50<88:18:53, 8.71s/it][2025-04-25 19:29:33,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:29:33,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.70 | bwd_microstep: 5699.25 | bwd_inner_microstep: 5651.46 | bwd_allreduce_microstep: 47.74 | step_microstep: 18.56 [2025-04-25 19:29:33,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.70 | bwd: 5699.26 | bwd_inner: 5651.46 | bwd_allreduce: 47.76 | step: 18.57 12%|█▏ | 4769/41250 [11:31:58<88:01:41, 8.69s/it] {'loss': 0.3306, 'grad_norm': 2.228577136993408, 'learning_rate': 3.923628533747997e-05, 'epoch': 1.16} 12%|█▏ | 4769/41250 [11:31:58<88:01:41, 8.69s/it][2025-04-25 19:29:42,375] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 19:29:42,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.64 | bwd_microstep: 5779.40 | bwd_inner_microstep: 5766.53 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.69 [2025-04-25 19:29:42,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.64 | bwd: 5779.42 | bwd_inner: 5766.53 | bwd_allreduce: 12.84 | step: 18.69 12%|█▏ | 4770/41250 [11:32:07<88:13:41, 8.71s/it] {'loss': 0.1992, 'grad_norm': 1.445123314857483, 'learning_rate': 3.923585547550663e-05, 'epoch': 1.16} 12%|█▏ | 4770/41250 [11:32:07<88:13:41, 8.71s/it][2025-04-25 19:29:50,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.10 | optimizer_gradients: 1.19 | optimizer_step: 0.98 [2025-04-25 19:29:50,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.04 | bwd_microstep: 5666.62 | bwd_inner_microstep: 5653.27 | bwd_allreduce_microstep: 13.29 | step_microstep: 19.19 [2025-04-25 19:29:50,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.04 | bwd: 5666.63 | bwd_inner: 5653.27 | bwd_allreduce: 13.31 | step: 19.19 12%|█▏ | 4771/41250 [11:32:16<87:52:56, 8.67s/it] {'loss': 0.0365, 'grad_norm': 0.7106852531433105, 'learning_rate': 3.9235425494948025e-05, 'epoch': 1.16} 12%|█▏ | 4771/41250 [11:32:16<87:52:56, 8.67s/it][2025-04-25 19:29:59,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-25 19:29:59,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.51 | bwd_microstep: 5700.60 | bwd_inner_microstep: 5639.52 | bwd_allreduce_microstep: 61.03 | step_microstep: 19.35 [2025-04-25 19:29:59,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.51 | bwd: 5700.61 | bwd_inner: 5639.52 | bwd_allreduce: 61.05 | step: 19.35 12%|█▏ | 4772/41250 [11:32:24<87:41:22, 8.65s/it] {'loss': 0.1574, 'grad_norm': 2.663306713104248, 'learning_rate': 3.9234995395806806e-05, 'epoch': 1.16} 12%|█▏ | 4772/41250 [11:32:24<87:41:22, 8.65s/it][2025-04-25 19:30:08,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:30:08,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.64 | bwd_microstep: 5740.13 | bwd_inner_microstep: 5648.99 | bwd_allreduce_microstep: 91.09 | step_microstep: 18.48 [2025-04-25 19:30:08,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.64 | bwd: 5740.15 | bwd_inner: 5648.99 | bwd_allreduce: 91.11 | step: 18.49 12%|█▏ | 4773/41250 [11:32:33<87:41:51, 8.66s/it] 
{'loss': 0.1464, 'grad_norm': 2.180725336074829, 'learning_rate': 3.923456517808562e-05, 'epoch': 1.16} 12%|█▏ | 4773/41250 [11:32:33<87:41:51, 8.66s/it][2025-04-25 19:30:16,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-25 19:30:16,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.95 | bwd_microstep: 5685.39 | bwd_inner_microstep: 5644.36 | bwd_allreduce_microstep: 40.97 | step_microstep: 19.20 [2025-04-25 19:30:16,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.95 | bwd: 5685.40 | bwd_inner: 5644.36 | bwd_allreduce: 40.99 | step: 19.20 12%|█▏ | 4774/41250 [11:32:42<87:30:54, 8.64s/it] {'loss': 0.1231, 'grad_norm': 2.4635796546936035, 'learning_rate': 3.923413484178713e-05, 'epoch': 1.16} 12%|█▏ | 4774/41250 [11:32:42<87:30:54, 8.64s/it][2025-04-25 19:30:25,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-25 19:30:25,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.10 | bwd_microstep: 5897.30 | bwd_inner_microstep: 5884.72 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.72 [2025-04-25 19:30:25,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.10 | bwd: 5897.31 | bwd_inner: 5884.72 | bwd_allreduce: 12.55 | step: 18.72 12%|█▏ | 4775/41250 [11:32:50<88:02:03, 8.69s/it] {'loss': 0.1626, 'grad_norm': 3.3246028423309326, 'learning_rate': 3.9233704386913965e-05, 'epoch': 1.16} 12%|█▏ | 4775/41250 [11:32:50<88:02:03, 8.69s/it][2025-04-25 19:30:34,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.04 | optimizer_step: 1.16 [2025-04-25 19:30:34,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.69 | bwd_microstep: 5735.39 | bwd_inner_microstep: 5688.01 | bwd_allreduce_microstep: 47.32 | 
step_microstep: 19.53 [2025-04-25 19:30:34,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.69 | bwd: 5735.41 | bwd_inner: 5688.01 | bwd_allreduce: 47.35 | step: 19.53 12%|█▏ | 4776/41250 [11:32:59<87:57:09, 8.68s/it] {'loss': 0.102, 'grad_norm': 3.5901575088500977, 'learning_rate': 3.923327381346881e-05, 'epoch': 1.16} 12%|█▏ | 4776/41250 [11:32:59<87:57:09, 8.68s/it][2025-04-25 19:30:42,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:30:42,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.45 | bwd_microstep: 5687.26 | bwd_inner_microstep: 5674.45 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.79 [2025-04-25 19:30:42,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.45 | bwd: 5687.27 | bwd_inner: 5674.45 | bwd_allreduce: 12.78 | step: 18.79 12%|█▏ | 4777/41250 [11:33:08<87:44:49, 8.66s/it] {'loss': 0.0133, 'grad_norm': 0.27499011158943176, 'learning_rate': 3.9232843121454305e-05, 'epoch': 1.16} 12%|█▏ | 4777/41250 [11:33:08<87:44:49, 8.66s/it][2025-04-25 19:30:51,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:30:51,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.63 | bwd_microstep: 5726.87 | bwd_inner_microstep: 5647.19 | bwd_allreduce_microstep: 79.63 | step_microstep: 18.47 [2025-04-25 19:30:51,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.63 | bwd: 5726.88 | bwd_inner: 5647.19 | bwd_allreduce: 79.64 | step: 18.47 12%|█▏ | 4778/41250 [11:33:16<87:41:14, 8.66s/it] {'loss': 0.1359, 'grad_norm': 1.062434196472168, 'learning_rate': 3.92324123108731e-05, 'epoch': 1.16} 12%|█▏ | 4778/41250 [11:33:16<87:41:14, 8.66s/it][2025-04-25 19:31:00,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | 
optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:31:00,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.63 | bwd_microstep: 5745.50 | bwd_inner_microstep: 5691.73 | bwd_allreduce_microstep: 53.73 | step_microstep: 18.47 [2025-04-25 19:31:00,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.63 | bwd: 5745.51 | bwd_inner: 5691.73 | bwd_allreduce: 53.75 | step: 18.47 12%|█▏ | 4779/41250 [11:33:25<87:45:48, 8.66s/it] {'loss': 0.2815, 'grad_norm': 3.4740922451019287, 'learning_rate': 3.923198138172786e-05, 'epoch': 1.16} 12%|█▏ | 4779/41250 [11:33:25<87:45:48, 8.66s/it][2025-04-25 19:31:08,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:31:08,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.81 | bwd_microstep: 5744.75 | bwd_inner_microstep: 5686.32 | bwd_allreduce_microstep: 58.39 | step_microstep: 18.30 [2025-04-25 19:31:08,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.81 | bwd: 5744.76 | bwd_inner: 5686.32 | bwd_allreduce: 58.40 | step: 18.30 12%|█▏ | 4780/41250 [11:33:34<87:49:00, 8.67s/it] {'loss': 0.2158, 'grad_norm': 2.32328200340271, 'learning_rate': 3.923155033402124e-05, 'epoch': 1.16} 12%|█▏ | 4780/41250 [11:33:34<87:49:00, 8.67s/it][2025-04-25 19:31:17,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-25 19:31:17,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.65 | bwd_microstep: 5748.38 | bwd_inner_microstep: 5696.98 | bwd_allreduce_microstep: 51.35 | step_microstep: 18.50 [2025-04-25 19:31:17,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.65 | bwd: 5748.39 | bwd_inner: 5696.98 | bwd_allreduce: 51.37 | step: 18.50 12%|█▏ | 4781/41250 [11:33:42<87:51:12, 8.67s/it] {'loss': 0.1004, 'grad_norm': 
2.3112552165985107, 'learning_rate': 3.923111916775588e-05, 'epoch': 1.16} 12%|█▏ | 4781/41250 [11:33:42<87:51:12, 8.67s/it][2025-04-25 19:31:26,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:31:26,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.92 | bwd_microstep: 5696.87 | bwd_inner_microstep: 5657.35 | bwd_allreduce_microstep: 39.47 | step_microstep: 18.79 [2025-04-25 19:31:26,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.92 | bwd: 5696.88 | bwd_inner: 5657.35 | bwd_allreduce: 39.49 | step: 18.79 12%|█▏ | 4782/41250 [11:33:51<87:40:59, 8.66s/it] {'loss': 0.1679, 'grad_norm': 3.6094658374786377, 'learning_rate': 3.923068788293447e-05, 'epoch': 1.16} 12%|█▏ | 4782/41250 [11:33:51<87:40:59, 8.66s/it][2025-04-25 19:31:34,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 19:31:34,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.07 | bwd_microstep: 5719.49 | bwd_inner_microstep: 5706.82 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.51 [2025-04-25 19:31:34,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.07 | bwd: 5719.51 | bwd_inner: 5706.82 | bwd_allreduce: 12.65 | step: 18.51 12%|█▏ | 4783/41250 [11:34:00<87:38:47, 8.65s/it] {'loss': 0.0194, 'grad_norm': 0.4176330268383026, 'learning_rate': 3.923025647955964e-05, 'epoch': 1.16} 12%|█▏ | 4783/41250 [11:34:00<87:38:47, 8.65s/it][2025-04-25 19:31:43,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:31:43,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.08 | bwd_microstep: 5713.75 | bwd_inner_microstep: 5672.48 | bwd_allreduce_microstep: 41.22 | step_microstep: 18.59 [2025-04-25 
19:31:43,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.09 | bwd: 5713.77 | bwd_inner: 5672.48 | bwd_allreduce: 41.24 | step: 18.60 12%|█▏ | 4784/41250 [11:34:08<87:33:57, 8.64s/it] {'loss': 0.2624, 'grad_norm': 1.9751349687576294, 'learning_rate': 3.922982495763407e-05, 'epoch': 1.16} 12%|█▏ | 4784/41250 [11:34:08<87:33:57, 8.64s/it][2025-04-25 19:31:52,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:31:52,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.99 | bwd_microstep: 5779.79 | bwd_inner_microstep: 5665.14 | bwd_allreduce_microstep: 114.60 | step_microstep: 18.63 [2025-04-25 19:31:52,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.99 | bwd: 5779.80 | bwd_inner: 5665.14 | bwd_allreduce: 114.62 | step: 18.63 12%|█▏ | 4785/41250 [11:34:17<87:42:22, 8.66s/it] {'loss': 0.1135, 'grad_norm': 1.2706512212753296, 'learning_rate': 3.92293933171604e-05, 'epoch': 1.16} 12%|█▏ | 4785/41250 [11:34:17<87:42:22, 8.66s/it][2025-04-25 19:32:00,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.26 | optimizer_step: 1.03 [2025-04-25 19:32:00,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.45 | bwd_microstep: 5765.76 | bwd_inner_microstep: 5700.75 | bwd_allreduce_microstep: 64.95 | step_microstep: 19.87 [2025-04-25 19:32:00,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.45 | bwd: 5765.78 | bwd_inner: 5700.75 | bwd_allreduce: 64.98 | step: 19.87 12%|█▏ | 4786/41250 [11:34:26<87:49:19, 8.67s/it] {'loss': 0.3513, 'grad_norm': 2.152869939804077, 'learning_rate': 3.9228961558141315e-05, 'epoch': 1.16} 12%|█▏ | 4786/41250 [11:34:26<87:49:19, 8.67s/it][2025-04-25 19:32:09,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 
0.90 [2025-04-25 19:32:09,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.06 | bwd_microstep: 5695.19 | bwd_inner_microstep: 5682.32 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.43 [2025-04-25 19:32:09,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.06 | bwd: 5695.20 | bwd_inner: 5682.32 | bwd_allreduce: 12.84 | step: 18.43 12%|█▏ | 4787/41250 [11:34:34<87:44:22, 8.66s/it] {'loss': 0.1473, 'grad_norm': 2.2794017791748047, 'learning_rate': 3.922852968057946e-05, 'epoch': 1.16} 12%|█▏ | 4787/41250 [11:34:34<87:44:22, 8.66s/it][2025-04-25 19:32:18,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:32:18,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.09 | bwd_microstep: 5704.53 | bwd_inner_microstep: 5650.55 | bwd_allreduce_microstep: 53.93 | step_microstep: 18.20 [2025-04-25 19:32:18,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.09 | bwd: 5704.54 | bwd_inner: 5650.55 | bwd_allreduce: 53.95 | step: 18.21 12%|█▏ | 4788/41250 [11:34:43<87:35:50, 8.65s/it] {'loss': 0.2584, 'grad_norm': 2.2258410453796387, 'learning_rate': 3.9228097684477496e-05, 'epoch': 1.16} 12%|█▏ | 4788/41250 [11:34:43<87:35:50, 8.65s/it][2025-04-25 19:32:26,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:32:26,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.32 | bwd_microstep: 5755.90 | bwd_inner_microstep: 5700.37 | bwd_allreduce_microstep: 55.49 | step_microstep: 18.81 [2025-04-25 19:32:26,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.32 | bwd: 5755.91 | bwd_inner: 5700.37 | bwd_allreduce: 55.51 | step: 18.81 12%|█▏ | 4789/41250 [11:34:52<87:43:48, 8.66s/it] {'loss': 0.1411, 'grad_norm': 2.5132060050964355, 'learning_rate': 
3.9227665569838096e-05, 'epoch': 1.16} 12%|█▏ | 4789/41250 [11:34:52<87:43:48, 8.66s/it][2025-04-25 19:32:35,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 19:32:35,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.56 | bwd_microstep: 5731.63 | bwd_inner_microstep: 5718.95 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.66 [2025-04-25 19:32:35,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.56 | bwd: 5731.64 | bwd_inner: 5718.94 | bwd_allreduce: 12.66 | step: 18.66 12%|█▏ | 4790/41250 [11:35:00<87:45:25, 8.66s/it] {'loss': 0.0674, 'grad_norm': 3.9907150268554688, 'learning_rate': 3.9227233336663924e-05, 'epoch': 1.16} 12%|█▏ | 4790/41250 [11:35:00<87:45:25, 8.66s/it][2025-04-25 19:32:44,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:32:44,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.22 | bwd_microstep: 5720.19 | bwd_inner_microstep: 5665.75 | bwd_allreduce_microstep: 54.39 | step_microstep: 18.36 [2025-04-25 19:32:44,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.23 | bwd: 5720.20 | bwd_inner: 5665.75 | bwd_allreduce: 54.41 | step: 18.36 12%|█▏ | 4791/41250 [11:35:09<87:41:12, 8.66s/it] {'loss': 0.1275, 'grad_norm': 1.7730860710144043, 'learning_rate': 3.9226800984957635e-05, 'epoch': 1.16} 12%|█▏ | 4791/41250 [11:35:09<87:41:12, 8.66s/it][2025-04-25 19:32:52,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:32:52,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.23 | bwd_microstep: 5752.48 | bwd_inner_microstep: 5700.59 | bwd_allreduce_microstep: 51.84 | step_microstep: 18.53 [2025-04-25 19:32:52,838] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.23 | bwd: 5752.49 | bwd_inner: 5700.59 | bwd_allreduce: 51.86 | step: 18.54 12%|█▏ | 4792/41250 [11:35:18<87:45:53, 8.67s/it] {'loss': 0.0633, 'grad_norm': 1.1670159101486206, 'learning_rate': 3.92263685147219e-05, 'epoch': 1.16} 12%|█▏ | 4792/41250 [11:35:18<87:45:53, 8.67s/it][2025-04-25 19:33:01,511] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:33:01,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.02 | bwd_microstep: 5730.80 | bwd_inner_microstep: 5717.98 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.26 [2025-04-25 19:33:01,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.02 | bwd: 5730.81 | bwd_inner: 5717.98 | bwd_allreduce: 12.79 | step: 18.26 12%|█▏ | 4793/41250 [11:35:26<87:47:16, 8.67s/it] {'loss': 0.1332, 'grad_norm': 1.3106619119644165, 'learning_rate': 3.9225935925959385e-05, 'epoch': 1.16} 12%|█▏ | 4793/41250 [11:35:26<87:47:16, 8.67s/it][2025-04-25 19:33:10,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:33:10,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.60 | bwd_microstep: 5953.28 | bwd_inner_microstep: 5662.67 | bwd_allreduce_microstep: 290.56 | step_microstep: 18.24 [2025-04-25 19:33:10,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.60 | bwd: 5953.29 | bwd_inner: 5662.67 | bwd_allreduce: 290.58 | step: 18.24 12%|█▏ | 4794/41250 [11:35:35<88:23:58, 8.73s/it] {'loss': 0.2348, 'grad_norm': 1.7338823080062866, 'learning_rate': 3.922550321867276e-05, 'epoch': 1.16} 12%|█▏ | 4794/41250 [11:35:35<88:23:58, 8.73s/it][2025-04-25 19:33:19,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-25 
19:33:19,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.27 | bwd_microstep: 5785.48 | bwd_inner_microstep: 5695.12 | bwd_allreduce_microstep: 90.32 | step_microstep: 18.41 [2025-04-25 19:33:19,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.27 | bwd: 5785.49 | bwd_inner: 5695.12 | bwd_allreduce: 90.33 | step: 18.41 12%|█▏ | 4795/41250 [11:35:44<88:23:37, 8.73s/it] {'loss': 0.1991, 'grad_norm': 2.3131675720214844, 'learning_rate': 3.922507039286468e-05, 'epoch': 1.16} 12%|█▏ | 4795/41250 [11:35:44<88:23:37, 8.73s/it][2025-04-25 19:33:28,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:33:28,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.30 | bwd_microstep: 6039.86 | bwd_inner_microstep: 5711.07 | bwd_allreduce_microstep: 328.74 | step_microstep: 18.40 [2025-04-25 19:33:28,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.31 | bwd: 6039.87 | bwd_inner: 5711.07 | bwd_allreduce: 328.76 | step: 18.40 12%|█▏ | 4796/41250 [11:35:53<89:10:25, 8.81s/it] {'loss': 0.1804, 'grad_norm': 2.366947889328003, 'learning_rate': 3.922463744853783e-05, 'epoch': 1.16} 12%|█▏ | 4796/41250 [11:35:53<89:10:25, 8.81s/it][2025-04-25 19:33:36,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:33:36,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.73 | bwd_microstep: 5760.08 | bwd_inner_microstep: 5701.36 | bwd_allreduce_microstep: 58.68 | step_microstep: 17.98 [2025-04-25 19:33:36,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.73 | bwd: 5760.09 | bwd_inner: 5701.36 | bwd_allreduce: 58.70 | step: 17.99 12%|█▏ | 4797/41250 [11:36:02<88:51:34, 8.78s/it] {'loss': 0.1266, 'grad_norm': 1.287366509437561, 'learning_rate': 3.922420438569487e-05, 
'epoch': 1.16} 12%|█▏ | 4797/41250 [11:36:02<88:51:34, 8.78s/it][2025-04-25 19:33:45,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:33:45,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.45 | bwd_microstep: 5722.45 | bwd_inner_microstep: 5702.33 | bwd_allreduce_microstep: 20.08 | step_microstep: 18.17 [2025-04-25 19:33:45,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.45 | bwd: 5722.46 | bwd_inner: 5702.33 | bwd_allreduce: 20.09 | step: 18.18 12%|█▏ | 4798/41250 [11:36:10<88:28:31, 8.74s/it] {'loss': 0.2097, 'grad_norm': 2.1113126277923584, 'learning_rate': 3.9223771204338475e-05, 'epoch': 1.16} 12%|█▏ | 4798/41250 [11:36:10<88:28:31, 8.74s/it][2025-04-25 19:33:54,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-25 19:33:54,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.16 | bwd_microstep: 5771.90 | bwd_inner_microstep: 5672.30 | bwd_allreduce_microstep: 99.56 | step_microstep: 18.89 [2025-04-25 19:33:54,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.16 | bwd: 5771.92 | bwd_inner: 5672.30 | bwd_allreduce: 99.58 | step: 18.90 12%|█▏ | 4799/41250 [11:36:19<88:21:04, 8.73s/it] {'loss': 0.1576, 'grad_norm': 2.3674511909484863, 'learning_rate': 3.9223337904471304e-05, 'epoch': 1.16} 12%|█▏ | 4799/41250 [11:36:19<88:21:04, 8.73s/it][2025-04-25 19:34:02,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:34:02,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.37 | bwd_microstep: 5782.23 | bwd_inner_microstep: 5670.61 | bwd_allreduce_microstep: 111.58 | step_microstep: 18.78 [2025-04-25 19:34:02,851] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2835.37 | bwd: 5782.24 | bwd_inner: 5670.61 | bwd_allreduce: 111.59 | step: 18.79 12%|█▏ | 4800/41250 [11:36:28<88:16:27, 8.72s/it] {'loss': 0.2322, 'grad_norm': 3.1812644004821777, 'learning_rate': 3.9222904486096045e-05, 'epoch': 1.16} 12%|█▏ | 4800/41250 [11:36:28<88:16:27, 8.72s/it][2025-04-25 19:34:11,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 19:34:11,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.22 | bwd_microstep: 5794.23 | bwd_inner_microstep: 5659.96 | bwd_allreduce_microstep: 134.23 | step_microstep: 18.32 [2025-04-25 19:34:11,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.22 | bwd: 5794.24 | bwd_inner: 5659.96 | bwd_allreduce: 134.24 | step: 18.32 12%|█▏ | 4801/41250 [11:36:36<88:14:29, 8.72s/it] {'loss': 0.1573, 'grad_norm': 1.598960041999817, 'learning_rate': 3.922247094921535e-05, 'epoch': 1.16} 12%|█▏ | 4801/41250 [11:36:36<88:14:29, 8.72s/it][2025-04-25 19:34:20,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:34:20,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.14 | bwd_microstep: 5776.64 | bwd_inner_microstep: 5666.40 | bwd_allreduce_microstep: 110.20 | step_microstep: 18.51 [2025-04-25 19:34:20,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.14 | bwd: 5776.65 | bwd_inner: 5666.40 | bwd_allreduce: 110.22 | step: 18.51 12%|█▏ | 4802/41250 [11:36:45<88:13:22, 8.71s/it] {'loss': 0.0427, 'grad_norm': 0.7067662477493286, 'learning_rate': 3.9222037293831914e-05, 'epoch': 1.16} 12%|█▏ | 4802/41250 [11:36:45<88:13:22, 8.71s/it][2025-04-25 19:34:29,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.97 | optimizer_step: 1.03 [2025-04-25 19:34:29,018] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2894.33 | bwd_microstep: 5771.20 | bwd_inner_microstep: 5758.59 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.20 [2025-04-25 19:34:29,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2894.33 | bwd: 5771.21 | bwd_inner: 5758.58 | bwd_allreduce: 12.58 | step: 18.20 12%|█▏ | 4803/41250 [11:36:54<88:19:36, 8.72s/it] {'loss': 0.1516, 'grad_norm': 1.2229222059249878, 'learning_rate': 3.922160351994839e-05, 'epoch': 1.16} 12%|█▏ | 4803/41250 [11:36:54<88:19:36, 8.72s/it][2025-04-25 19:34:37,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 1.03 [2025-04-25 19:34:37,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.79 | bwd_microstep: 5715.63 | bwd_inner_microstep: 5646.41 | bwd_allreduce_microstep: 69.18 | step_microstep: 18.35 [2025-04-25 19:34:37,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.79 | bwd: 5715.65 | bwd_inner: 5646.41 | bwd_allreduce: 69.19 | step: 18.36 12%|█▏ | 4804/41250 [11:37:02<88:02:24, 8.70s/it] {'loss': 0.0917, 'grad_norm': 0.8582670092582703, 'learning_rate': 3.922116962756746e-05, 'epoch': 1.16} 12%|█▏ | 4804/41250 [11:37:02<88:02:24, 8.70s/it][2025-04-25 19:34:46,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:34:46,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.96 | bwd_microstep: 5766.13 | bwd_inner_microstep: 5651.49 | bwd_allreduce_microstep: 114.60 | step_microstep: 18.30 [2025-04-25 19:34:46,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.96 | bwd: 5766.14 | bwd_inner: 5651.49 | bwd_allreduce: 114.61 | step: 18.31 12%|█▏ | 4805/41250 [11:37:11<87:59:17, 8.69s/it] {'loss': 0.1215, 'grad_norm': 2.3662259578704834, 'learning_rate': 3.922073561669181e-05, 'epoch': 1.16} 12%|█▏ 
| 4805/41250 [11:37:11<87:59:17, 8.69s/it][2025-04-25 19:34:54,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:34:54,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.88 | bwd_microstep: 5686.03 | bwd_inner_microstep: 5661.01 | bwd_allreduce_microstep: 24.97 | step_microstep: 18.55 [2025-04-25 19:34:54,948] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.88 | bwd: 5686.04 | bwd_inner: 5661.01 | bwd_allreduce: 24.99 | step: 18.55 12%|█▏ | 4806/41250 [11:37:20<87:45:49, 8.67s/it] {'loss': 0.0417, 'grad_norm': 0.6139357686042786, 'learning_rate': 3.922030148732409e-05, 'epoch': 1.17} 12%|█▏ | 4806/41250 [11:37:20<87:45:49, 8.67s/it][2025-04-25 19:35:03,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:35:03,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.70 | bwd_microstep: 5700.31 | bwd_inner_microstep: 5687.54 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.49 [2025-04-25 19:35:03,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.70 | bwd: 5700.33 | bwd_inner: 5687.54 | bwd_allreduce: 12.75 | step: 18.50 12%|█▏ | 4807/41250 [11:37:28<87:41:19, 8.66s/it] {'loss': 0.0823, 'grad_norm': 0.7478948831558228, 'learning_rate': 3.9219867239467005e-05, 'epoch': 1.17} 12%|█▏ | 4807/41250 [11:37:28<87:41:19, 8.66s/it][2025-04-25 19:35:12,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:35:12,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.60 | bwd_microstep: 5860.90 | bwd_inner_microstep: 5698.32 | bwd_allreduce_microstep: 162.55 | step_microstep: 18.42 [2025-04-25 19:35:12,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.60 | 
bwd: 5860.92 | bwd_inner: 5698.32 | bwd_allreduce: 162.56 | step: 18.42 12%|█▏ | 4808/41250 [11:37:37<88:07:57, 8.71s/it] {'loss': 0.2969, 'grad_norm': 1.8237119913101196, 'learning_rate': 3.9219432873123216e-05, 'epoch': 1.17} 12%|█▏ | 4808/41250 [11:37:37<88:07:57, 8.71s/it][2025-04-25 19:35:21,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:35:21,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.16 | bwd_microstep: 5718.95 | bwd_inner_microstep: 5704.16 | bwd_allreduce_microstep: 14.74 | step_microstep: 18.56 [2025-04-25 19:35:21,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.16 | bwd: 5718.96 | bwd_inner: 5704.16 | bwd_allreduce: 14.76 | step: 18.57 12%|█▏ | 4809/41250 [11:37:46<88:00:42, 8.69s/it] {'loss': 0.1527, 'grad_norm': 1.5175973176956177, 'learning_rate': 3.92189983882954e-05, 'epoch': 1.17} 12%|█▏ | 4809/41250 [11:37:46<88:00:42, 8.69s/it][2025-04-25 19:35:29,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:35:29,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.11 | bwd_microstep: 5720.47 | bwd_inner_microstep: 5707.75 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.46 [2025-04-25 19:35:29,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.11 | bwd: 5720.49 | bwd_inner: 5707.75 | bwd_allreduce: 12.70 | step: 18.47 12%|█▏ | 4810/41250 [11:37:55<87:55:34, 8.69s/it] {'loss': 0.0665, 'grad_norm': 0.8502180576324463, 'learning_rate': 3.921856378498624e-05, 'epoch': 1.17} 12%|█▏ | 4810/41250 [11:37:55<87:55:34, 8.69s/it][2025-04-25 19:35:38,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.88 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:35:38,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2826.18 | bwd_microstep: 5738.48 | bwd_inner_microstep: 5649.93 | bwd_allreduce_microstep: 88.51 | step_microstep: 17.65 [2025-04-25 19:35:38,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.18 | bwd: 5738.50 | bwd_inner: 5649.93 | bwd_allreduce: 88.53 | step: 17.65 12%|█▏ | 4811/41250 [11:38:03<87:48:18, 8.67s/it] {'loss': 0.16, 'grad_norm': 2.1533284187316895, 'learning_rate': 3.921812906319842e-05, 'epoch': 1.17} 12%|█▏ | 4811/41250 [11:38:03<87:48:18, 8.67s/it][2025-04-25 19:35:47,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:35:47,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.89 | bwd_microstep: 5715.85 | bwd_inner_microstep: 5703.20 | bwd_allreduce_microstep: 12.61 | step_microstep: 17.91 [2025-04-25 19:35:47,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.89 | bwd: 5715.87 | bwd_inner: 5703.20 | bwd_allreduce: 12.62 | step: 17.92 12%|█▏ | 4812/41250 [11:38:12<87:44:25, 8.67s/it] {'loss': 0.1914, 'grad_norm': 2.0669641494750977, 'learning_rate': 3.9217694222934605e-05, 'epoch': 1.17} 12%|█▏ | 4812/41250 [11:38:12<87:44:25, 8.67s/it][2025-04-25 19:35:55,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:35:55,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.50 | bwd_microstep: 5715.40 | bwd_inner_microstep: 5702.74 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.40 [2025-04-25 19:35:55,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.50 | bwd: 5715.42 | bwd_inner: 5702.74 | bwd_allreduce: 12.63 | step: 18.40 12%|█▏ | 4813/41250 [11:38:21<87:41:48, 8.66s/it] {'loss': 0.1071, 'grad_norm': 1.2013497352600098, 'learning_rate': 3.9217259264197494e-05, 'epoch': 1.17} 12%|█▏ | 4813/41250 [11:38:21<87:41:48, 
8.66s/it][2025-04-25 19:36:04,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 1.00 [2025-04-25 19:36:04,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.84 | bwd_microstep: 5767.81 | bwd_inner_microstep: 5656.37 | bwd_allreduce_microstep: 111.39 | step_microstep: 18.80 [2025-04-25 19:36:04,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.84 | bwd: 5767.83 | bwd_inner: 5656.37 | bwd_allreduce: 111.42 | step: 18.80 12%|█▏ | 4814/41250 [11:38:29<87:43:48, 8.67s/it] {'loss': 0.0578, 'grad_norm': 1.226884365081787, 'learning_rate': 3.921682418698975e-05, 'epoch': 1.17} 12%|█▏ | 4814/41250 [11:38:29<87:43:48, 8.67s/it][2025-04-25 19:36:13,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:36:13,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.10 | bwd_microstep: 5746.52 | bwd_inner_microstep: 5662.63 | bwd_allreduce_microstep: 83.84 | step_microstep: 18.20 [2025-04-25 19:36:13,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.10 | bwd: 5746.53 | bwd_inner: 5662.63 | bwd_allreduce: 83.86 | step: 18.21 12%|█▏ | 4815/41250 [11:38:38<87:44:17, 8.67s/it] {'loss': 0.0907, 'grad_norm': 0.9660002589225769, 'learning_rate': 3.921638899131407e-05, 'epoch': 1.17} 12%|█▏ | 4815/41250 [11:38:38<87:44:17, 8.67s/it][2025-04-25 19:36:21,715] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:36:21,715] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.83 | bwd_microstep: 5733.58 | bwd_inner_microstep: 5704.79 | bwd_allreduce_microstep: 28.74 | step_microstep: 18.32 [2025-04-25 19:36:21,715] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.83 | bwd: 5733.59 | bwd_inner: 5704.79 | 
bwd_allreduce: 28.76 | step: 18.32 12%|█▏ | 4816/41250 [11:38:47<87:44:58, 8.67s/it] {'loss': 0.1045, 'grad_norm': 2.243445634841919, 'learning_rate': 3.921595367717313e-05, 'epoch': 1.17} 12%|█▏ | 4816/41250 [11:38:47<87:44:58, 8.67s/it][2025-04-25 19:36:30,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:36:30,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.97 | bwd_microstep: 5716.10 | bwd_inner_microstep: 5703.20 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.64 [2025-04-25 19:36:30,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.97 | bwd: 5716.11 | bwd_inner: 5703.20 | bwd_allreduce: 12.87 | step: 18.65 12%|█▏ | 4817/41250 [11:38:55<87:44:11, 8.67s/it] {'loss': 0.1509, 'grad_norm': 1.8264247179031372, 'learning_rate': 3.9215518244569623e-05, 'epoch': 1.17} 12%|█▏ | 4817/41250 [11:38:55<87:44:11, 8.67s/it][2025-04-25 19:36:39,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-25 19:36:39,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.91 | bwd_microstep: 5750.29 | bwd_inner_microstep: 5649.98 | bwd_allreduce_microstep: 100.27 | step_microstep: 18.46 [2025-04-25 19:36:39,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.91 | bwd: 5750.30 | bwd_inner: 5649.98 | bwd_allreduce: 100.29 | step: 18.47 12%|█▏ | 4818/41250 [11:39:04<87:44:01, 8.67s/it] {'loss': 0.0819, 'grad_norm': 1.0606857538223267, 'learning_rate': 3.9215082693506214e-05, 'epoch': 1.17} 12%|█▏ | 4818/41250 [11:39:04<87:44:01, 8.67s/it][2025-04-25 19:36:47,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:36:47,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.88 | 
bwd_microstep: 5736.99 | bwd_inner_microstep: 5693.62 | bwd_allreduce_microstep: 43.33 | step_microstep: 18.47 [2025-04-25 19:36:47,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.88 | bwd: 5737.01 | bwd_inner: 5693.62 | bwd_allreduce: 43.35 | step: 18.48 12%|█▏ | 4819/41250 [11:39:13<87:45:05, 8.67s/it] {'loss': 0.2346, 'grad_norm': 1.4530978202819824, 'learning_rate': 3.9214647023985606e-05, 'epoch': 1.17} 12%|█▏ | 4819/41250 [11:39:13<87:45:05, 8.67s/it][2025-04-25 19:36:56,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.94 | optimizer_step: 0.93 [2025-04-25 19:36:56,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.40 | bwd_microstep: 5727.85 | bwd_inner_microstep: 5696.60 | bwd_allreduce_microstep: 31.21 | step_microstep: 18.36 [2025-04-25 19:36:56,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.40 | bwd: 5727.86 | bwd_inner: 5696.60 | bwd_allreduce: 31.23 | step: 18.36 12%|█▏ | 4820/41250 [11:39:21<87:43:27, 8.67s/it] {'loss': 0.0998, 'grad_norm': 0.998527467250824, 'learning_rate': 3.9214211236010475e-05, 'epoch': 1.17} 12%|█▏ | 4820/41250 [11:39:21<87:43:27, 8.67s/it][2025-04-25 19:37:05,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:37:05,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.23 | bwd_microstep: 5730.79 | bwd_inner_microstep: 5681.63 | bwd_allreduce_microstep: 49.12 | step_microstep: 18.61 [2025-04-25 19:37:05,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.23 | bwd: 5730.81 | bwd_inner: 5681.63 | bwd_allreduce: 49.13 | step: 18.61 12%|█▏ | 4821/41250 [11:39:30<87:43:18, 8.67s/it] {'loss': 0.0722, 'grad_norm': 1.5825917720794678, 'learning_rate': 3.9213775329583514e-05, 'epoch': 1.17} 12%|█▏ | 4821/41250 [11:39:30<87:43:18, 8.67s/it][2025-04-25 19:37:13,660] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:37:13,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.60 | bwd_microstep: 5682.23 | bwd_inner_microstep: 5642.31 | bwd_allreduce_microstep: 39.87 | step_microstep: 18.50 [2025-04-25 19:37:13,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.60 | bwd: 5682.24 | bwd_inner: 5642.31 | bwd_allreduce: 39.89 | step: 18.50 12%|█▏ | 4822/41250 [11:39:38<87:31:01, 8.65s/it] {'loss': 0.2085, 'grad_norm': 3.083127737045288, 'learning_rate': 3.921333930470741e-05, 'epoch': 1.17} 12%|█▏ | 4822/41250 [11:39:38<87:31:01, 8.65s/it][2025-04-25 19:37:22,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 19:37:22,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.78 | bwd_microstep: 5727.09 | bwd_inner_microstep: 5695.92 | bwd_allreduce_microstep: 31.13 | step_microstep: 18.72 [2025-04-25 19:37:22,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.78 | bwd: 5727.10 | bwd_inner: 5695.92 | bwd_allreduce: 31.15 | step: 18.73 12%|█▏ | 4823/41250 [11:39:47<87:34:04, 8.65s/it] {'loss': 0.0607, 'grad_norm': 0.8865305781364441, 'learning_rate': 3.921290316138485e-05, 'epoch': 1.17} 12%|█▏ | 4823/41250 [11:39:47<87:34:04, 8.65s/it][2025-04-25 19:37:30,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-25 19:37:30,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.20 | bwd_microstep: 5679.18 | bwd_inner_microstep: 5652.20 | bwd_allreduce_microstep: 26.94 | step_microstep: 18.55 [2025-04-25 19:37:30,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.20 | bwd: 5679.19 | bwd_inner: 5652.20 | bwd_allreduce: 26.95 | step: 18.56 
12%|█▏ | 4824/41250 [11:39:56<87:23:23, 8.64s/it] {'loss': 0.1122, 'grad_norm': 1.6111036539077759, 'learning_rate': 3.9212466899618506e-05, 'epoch': 1.17} 12%|█▏ | 4824/41250 [11:39:56<87:23:23, 8.64s/it][2025-04-25 19:37:39,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.99 | optimizer_step: 1.06 [2025-04-25 19:37:39,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.46 | bwd_microstep: 5672.13 | bwd_inner_microstep: 5659.46 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.76 [2025-04-25 19:37:39,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.46 | bwd: 5672.15 | bwd_inner: 5659.46 | bwd_allreduce: 12.65 | step: 18.76 12%|█▏ | 4825/41250 [11:40:04<87:16:09, 8.63s/it] {'loss': 0.1643, 'grad_norm': 1.485959768295288, 'learning_rate': 3.92120305194111e-05, 'epoch': 1.17} 12%|█▏ | 4825/41250 [11:40:04<87:16:09, 8.63s/it][2025-04-25 19:37:48,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.10 | optimizer_step: 1.03 [2025-04-25 19:37:48,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.40 | bwd_microstep: 5759.78 | bwd_inner_microstep: 5652.96 | bwd_allreduce_microstep: 106.75 | step_microstep: 20.20 [2025-04-25 19:37:48,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.40 | bwd: 5759.79 | bwd_inner: 5652.96 | bwd_allreduce: 106.78 | step: 20.20 12%|█▏ | 4826/41250 [11:40:13<87:25:04, 8.64s/it] {'loss': 0.1013, 'grad_norm': 1.5348538160324097, 'learning_rate': 3.921159402076529e-05, 'epoch': 1.17} 12%|█▏ | 4826/41250 [11:40:13<87:25:04, 8.64s/it][2025-04-25 19:37:56,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 19:37:56,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.66 | bwd_microstep: 5728.55 | bwd_inner_microstep: 
5681.74 | bwd_allreduce_microstep: 46.75 | step_microstep: 19.16 [2025-04-25 19:37:56,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.66 | bwd: 5728.56 | bwd_inner: 5681.74 | bwd_allreduce: 46.78 | step: 19.16 12%|█▏ | 4827/41250 [11:40:22<87:29:06, 8.65s/it] {'loss': 0.1106, 'grad_norm': 1.5222660303115845, 'learning_rate': 3.921115740368379e-05, 'epoch': 1.17} 12%|█▏ | 4827/41250 [11:40:22<87:29:06, 8.65s/it][2025-04-25 19:38:05,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.97 | optimizer_step: 1.08 [2025-04-25 19:38:05,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.64 | bwd_microstep: 5714.50 | bwd_inner_microstep: 5688.96 | bwd_allreduce_microstep: 25.50 | step_microstep: 18.43 [2025-04-25 19:38:05,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.64 | bwd: 5714.51 | bwd_inner: 5688.96 | bwd_allreduce: 25.51 | step: 18.44 12%|█▏ | 4828/41250 [11:40:30<87:29:32, 8.65s/it] {'loss': 0.2548, 'grad_norm': 3.8109006881713867, 'learning_rate': 3.921072066816928e-05, 'epoch': 1.17} 12%|█▏ | 4828/41250 [11:40:30<87:29:32, 8.65s/it][2025-04-25 19:38:14,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:38:14,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.36 | bwd_microstep: 5787.41 | bwd_inner_microstep: 5774.60 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.71 [2025-04-25 19:38:14,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.36 | bwd: 5787.42 | bwd_inner: 5774.60 | bwd_allreduce: 12.79 | step: 18.71 12%|█▏ | 4829/41250 [11:40:39<87:49:43, 8.68s/it] {'loss': 0.1384, 'grad_norm': 3.3414554595947266, 'learning_rate': 3.9210283814224454e-05, 'epoch': 1.17} 12%|█▏ | 4829/41250 [11:40:39<87:49:43, 8.68s/it][2025-04-25 19:38:22,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-25 19:38:22,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.35 | bwd_microstep: 5697.84 | bwd_inner_microstep: 5685.10 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.73 [2025-04-25 19:38:22,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.35 | bwd: 5697.86 | bwd_inner: 5685.10 | bwd_allreduce: 12.72 | step: 18.73 12%|█▏ | 4830/41250 [11:40:48<87:38:54, 8.66s/it] {'loss': 0.1172, 'grad_norm': 4.202263832092285, 'learning_rate': 3.920984684185202e-05, 'epoch': 1.17} 12%|█▏ | 4830/41250 [11:40:48<87:38:54, 8.66s/it][2025-04-25 19:38:31,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-25 19:38:31,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.87 | bwd_microstep: 5685.93 | bwd_inner_microstep: 5638.66 | bwd_allreduce_microstep: 47.22 | step_microstep: 18.74 [2025-04-25 19:38:31,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.87 | bwd: 5685.94 | bwd_inner: 5638.66 | bwd_allreduce: 47.24 | step: 18.74 12%|█▏ | 4831/41250 [11:40:56<87:28:09, 8.65s/it] {'loss': 0.1225, 'grad_norm': 2.9106147289276123, 'learning_rate': 3.9209409751054643e-05, 'epoch': 1.17} 12%|█▏ | 4831/41250 [11:40:56<87:28:09, 8.65s/it][2025-04-25 19:38:40,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:38:40,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.52 | bwd_microstep: 5681.41 | bwd_inner_microstep: 5637.52 | bwd_allreduce_microstep: 43.85 | step_microstep: 18.68 [2025-04-25 19:38:40,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.52 | bwd: 5681.43 | bwd_inner: 5637.52 | bwd_allreduce: 43.87 | step: 18.68 12%|█▏ | 4832/41250 [11:41:05<87:17:21, 8.63s/it] 
{'loss': 0.182, 'grad_norm': 1.7207306623458862, 'learning_rate': 3.920897254183504e-05, 'epoch': 1.17} 12%|█▏ | 4832/41250 [11:41:05<87:17:21, 8.63s/it][2025-04-25 19:38:48,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-25 19:38:48,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.37 | bwd_microstep: 5880.28 | bwd_inner_microstep: 5650.43 | bwd_allreduce_microstep: 229.81 | step_microstep: 18.86 [2025-04-25 19:38:48,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.37 | bwd: 5880.29 | bwd_inner: 5650.43 | bwd_allreduce: 229.82 | step: 18.86 12%|█▏ | 4833/41250 [11:41:14<87:46:16, 8.68s/it] {'loss': 0.0788, 'grad_norm': 1.4711214303970337, 'learning_rate': 3.9208535214195894e-05, 'epoch': 1.17} 12%|█▏ | 4833/41250 [11:41:14<87:46:16, 8.68s/it][2025-04-25 19:38:57,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:38:57,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.35 | bwd_microstep: 5691.23 | bwd_inner_microstep: 5678.33 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.58 [2025-04-25 19:38:57,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.35 | bwd: 5691.25 | bwd_inner: 5678.33 | bwd_allreduce: 12.87 | step: 18.59 12%|█▏ | 4834/41250 [11:41:22<87:36:54, 8.66s/it] {'loss': 0.1298, 'grad_norm': 1.7749028205871582, 'learning_rate': 3.920809776813991e-05, 'epoch': 1.17} 12%|█▏ | 4834/41250 [11:41:22<87:36:54, 8.66s/it][2025-04-25 19:39:06,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 19:39:06,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.51 | bwd_microstep: 5736.14 | bwd_inner_microstep: 5688.47 | bwd_allreduce_microstep: 47.62 | 
step_microstep: 19.07 [2025-04-25 19:39:06,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.51 | bwd: 5736.15 | bwd_inner: 5688.47 | bwd_allreduce: 47.64 | step: 19.08 12%|█▏ | 4835/41250 [11:41:31<87:38:34, 8.66s/it] {'loss': 0.1857, 'grad_norm': 1.4377646446228027, 'learning_rate': 3.920766020366979e-05, 'epoch': 1.17} 12%|█▏ | 4835/41250 [11:41:31<87:38:34, 8.66s/it][2025-04-25 19:39:14,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 1.05 [2025-04-25 19:39:14,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.41 | bwd_microstep: 5673.35 | bwd_inner_microstep: 5655.52 | bwd_allreduce_microstep: 17.78 | step_microstep: 18.52 [2025-04-25 19:39:14,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.41 | bwd: 5673.36 | bwd_inner: 5655.52 | bwd_allreduce: 17.80 | step: 18.52 12%|█▏ | 4836/41250 [11:41:40<87:27:14, 8.65s/it] {'loss': 0.1312, 'grad_norm': 1.7172436714172363, 'learning_rate': 3.9207222520788206e-05, 'epoch': 1.17} 12%|█▏ | 4836/41250 [11:41:40<87:27:14, 8.65s/it][2025-04-25 19:39:23,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:39:23,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.14 | bwd_microstep: 5743.80 | bwd_inner_microstep: 5689.88 | bwd_allreduce_microstep: 53.88 | step_microstep: 18.35 [2025-04-25 19:39:23,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.14 | bwd: 5743.82 | bwd_inner: 5689.88 | bwd_allreduce: 53.90 | step: 18.35 12%|█▏ | 4837/41250 [11:41:48<87:32:00, 8.65s/it] {'loss': 0.3176, 'grad_norm': 4.809261322021484, 'learning_rate': 3.920678471949789e-05, 'epoch': 1.17} 12%|█▏ | 4837/41250 [11:41:48<87:32:00, 8.65s/it][2025-04-25 19:39:32,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | 
optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:39:32,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.19 | bwd_microstep: 5747.70 | bwd_inner_microstep: 5684.71 | bwd_allreduce_microstep: 62.94 | step_microstep: 18.48 [2025-04-25 19:39:32,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.19 | bwd: 5747.71 | bwd_inner: 5684.71 | bwd_allreduce: 62.95 | step: 18.48 12%|█▏ | 4838/41250 [11:41:57<87:39:13, 8.67s/it] {'loss': 0.1239, 'grad_norm': 2.1053736209869385, 'learning_rate': 3.920634679980151e-05, 'epoch': 1.17} 12%|█▏ | 4838/41250 [11:41:57<87:39:13, 8.67s/it][2025-04-25 19:39:41,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:39:41,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2942.59 | bwd_microstep: 5870.59 | bwd_inner_microstep: 5857.74 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.56 [2025-04-25 19:39:41,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2942.59 | bwd: 5870.60 | bwd_inner: 5857.73 | bwd_allreduce: 12.82 | step: 18.56 12%|█▏ | 4839/41250 [11:42:06<88:21:40, 8.74s/it] {'loss': 0.1901, 'grad_norm': 1.5719406604766846, 'learning_rate': 3.920590876170179e-05, 'epoch': 1.17} 12%|█▏ | 4839/41250 [11:42:06<88:21:40, 8.74s/it][2025-04-25 19:39:49,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 19:39:49,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.57 | bwd_microstep: 5724.96 | bwd_inner_microstep: 5712.26 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.62 [2025-04-25 19:39:49,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.57 | bwd: 5724.97 | bwd_inner: 5712.26 | bwd_allreduce: 12.67 | step: 18.63 12%|█▏ | 4840/41250 [11:42:15<88:08:55, 8.72s/it] {'loss': 0.2201, 'grad_norm': 
3.573864221572876, 'learning_rate': 3.920547060520141e-05, 'epoch': 1.17} 12%|█▏ | 4840/41250 [11:42:15<88:08:55, 8.72s/it][2025-04-25 19:39:58,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 19:39:58,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.95 | bwd_microstep: 5731.68 | bwd_inner_microstep: 5711.16 | bwd_allreduce_microstep: 20.48 | step_microstep: 18.56 [2025-04-25 19:39:58,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.95 | bwd: 5731.70 | bwd_inner: 5711.16 | bwd_allreduce: 20.50 | step: 18.56 12%|█▏ | 4841/41250 [11:42:23<88:01:55, 8.70s/it] {'loss': 0.1699, 'grad_norm': 1.4992388486862183, 'learning_rate': 3.920503233030309e-05, 'epoch': 1.17} 12%|█▏ | 4841/41250 [11:42:23<88:01:55, 8.70s/it][2025-04-25 19:40:07,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:40:07,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2940.57 | bwd_microstep: 5899.36 | bwd_inner_microstep: 5886.45 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.55 [2025-04-25 19:40:07,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2940.57 | bwd: 5899.37 | bwd_inner: 5886.45 | bwd_allreduce: 12.88 | step: 18.55 12%|█▏ | 4842/41250 [11:42:32<88:42:19, 8.77s/it] {'loss': 0.1389, 'grad_norm': 2.3708906173706055, 'learning_rate': 3.920459393700952e-05, 'epoch': 1.17} 12%|█▏ | 4842/41250 [11:42:32<88:42:19, 8.77s/it][2025-04-25 19:40:15,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:40:15,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.36 | bwd_microstep: 5730.86 | bwd_inner_microstep: 5718.03 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.31 [2025-04-25 
19:40:15,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.36 | bwd: 5730.88 | bwd_inner: 5718.03 | bwd_allreduce: 12.81 | step: 18.31 12%|█▏ | 4843/41250 [11:42:41<88:23:52, 8.74s/it] {'loss': 0.3231, 'grad_norm': 3.4561588764190674, 'learning_rate': 3.92041554253234e-05, 'epoch': 1.17} 12%|█▏ | 4843/41250 [11:42:41<88:23:52, 8.74s/it][2025-04-25 19:40:24,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:40:24,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.49 | bwd_microstep: 5730.32 | bwd_inner_microstep: 5717.51 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.28 [2025-04-25 19:40:24,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.49 | bwd: 5730.34 | bwd_inner: 5717.51 | bwd_allreduce: 12.78 | step: 18.28 12%|█▏ | 4844/41250 [11:42:49<88:11:53, 8.72s/it] {'loss': 0.1212, 'grad_norm': 1.8072365522384644, 'learning_rate': 3.920371679524745e-05, 'epoch': 1.17} 12%|█▏ | 4844/41250 [11:42:49<88:11:53, 8.72s/it][2025-04-25 19:40:33,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:40:33,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.37 | bwd_microstep: 5765.49 | bwd_inner_microstep: 5664.58 | bwd_allreduce_microstep: 100.86 | step_microstep: 18.47 [2025-04-25 19:40:33,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.37 | bwd: 5765.50 | bwd_inner: 5664.58 | bwd_allreduce: 100.88 | step: 18.48 12%|█▏ | 4845/41250 [11:42:58<88:07:30, 8.71s/it] {'loss': 0.0644, 'grad_norm': 1.0230131149291992, 'learning_rate': 3.920327804678436e-05, 'epoch': 1.17} 12%|█▏ | 4845/41250 [11:42:58<88:07:30, 8.71s/it][2025-04-25 19:40:42,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 
0.89 [2025-04-25 19:40:42,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.03 | bwd_microstep: 5751.64 | bwd_inner_microstep: 5696.45 | bwd_allreduce_microstep: 55.15 | step_microstep: 18.24 [2025-04-25 19:40:42,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.03 | bwd: 5751.65 | bwd_inner: 5696.45 | bwd_allreduce: 55.16 | step: 18.24 12%|█▏ | 4846/41250 [11:43:07<88:03:15, 8.71s/it] {'loss': 0.1704, 'grad_norm': 2.146838426589966, 'learning_rate': 3.920283917993683e-05, 'epoch': 1.17} 12%|█▏ | 4846/41250 [11:43:07<88:03:15, 8.71s/it][2025-04-25 19:40:50,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 0.99 [2025-04-25 19:40:50,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.73 | bwd_microstep: 5880.93 | bwd_inner_microstep: 5712.15 | bwd_allreduce_microstep: 168.72 | step_microstep: 19.07 [2025-04-25 19:40:50,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.73 | bwd: 5880.94 | bwd_inner: 5712.15 | bwd_allreduce: 168.74 | step: 19.08 12%|█▏ | 4847/41250 [11:43:16<88:25:54, 8.75s/it] {'loss': 0.2343, 'grad_norm': 2.5090417861938477, 'learning_rate': 3.920240019470758e-05, 'epoch': 1.18} 12%|█▏ | 4847/41250 [11:43:16<88:25:54, 8.75s/it][2025-04-25 19:40:59,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:40:59,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.80 | bwd_microstep: 5725.55 | bwd_inner_microstep: 5668.28 | bwd_allreduce_microstep: 57.23 | step_microstep: 18.26 [2025-04-25 19:40:59,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.80 | bwd: 5725.56 | bwd_inner: 5668.28 | bwd_allreduce: 57.25 | step: 18.27 12%|█▏ | 4848/41250 [11:43:24<88:08:10, 8.72s/it] {'loss': 0.184, 'grad_norm': 2.1636621952056885, 'learning_rate': 
3.920196109109931e-05, 'epoch': 1.18} 12%|█▏ | 4848/41250 [11:43:24<88:08:10, 8.72s/it][2025-04-25 19:41:08,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:41:08,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.74 | bwd_microstep: 5751.83 | bwd_inner_microstep: 5692.99 | bwd_allreduce_microstep: 58.79 | step_microstep: 18.15 [2025-04-25 19:41:08,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.74 | bwd: 5751.84 | bwd_inner: 5692.99 | bwd_allreduce: 58.81 | step: 18.15 12%|█▏ | 4849/41250 [11:43:33<88:03:09, 8.71s/it] {'loss': 0.0439, 'grad_norm': 1.6585344076156616, 'learning_rate': 3.920152186911472e-05, 'epoch': 1.18} 12%|█▏ | 4849/41250 [11:43:33<88:03:09, 8.71s/it][2025-04-25 19:41:16,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 1.12 [2025-04-25 19:41:16,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.78 | bwd_microstep: 5722.75 | bwd_inner_microstep: 5709.15 | bwd_allreduce_microstep: 13.55 | step_microstep: 19.31 [2025-04-25 19:41:16,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.78 | bwd: 5722.77 | bwd_inner: 5709.15 | bwd_allreduce: 13.57 | step: 19.31 12%|█▏ | 4850/41250 [11:43:42<87:55:59, 8.70s/it] {'loss': 0.0435, 'grad_norm': 1.4097739458084106, 'learning_rate': 3.920108252875653e-05, 'epoch': 1.18} 12%|█▏ | 4850/41250 [11:43:42<87:55:59, 8.70s/it][2025-04-25 19:41:25,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:41:25,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.07 | bwd_microstep: 5763.84 | bwd_inner_microstep: 5713.34 | bwd_allreduce_microstep: 50.46 | step_microstep: 18.76 [2025-04-25 19:41:25,597] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.07 | bwd: 5763.86 | bwd_inner: 5713.34 | bwd_allreduce: 50.48 | step: 18.76 12%|█▏ | 4851/41250 [11:43:50<87:57:41, 8.70s/it] {'loss': 0.0588, 'grad_norm': 0.9105182886123657, 'learning_rate': 3.920064307002744e-05, 'epoch': 1.18} 12%|█▏ | 4851/41250 [11:43:50<87:57:41, 8.70s/it][2025-04-25 19:41:34,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:41:34,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.53 | bwd_microstep: 5724.10 | bwd_inner_microstep: 5668.29 | bwd_allreduce_microstep: 55.76 | step_microstep: 18.66 [2025-04-25 19:41:34,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.53 | bwd: 5724.11 | bwd_inner: 5668.29 | bwd_allreduce: 55.78 | step: 18.66 12%|█▏ | 4852/41250 [11:43:59<87:48:37, 8.69s/it] {'loss': 0.1146, 'grad_norm': 1.9405802488327026, 'learning_rate': 3.920020349293017e-05, 'epoch': 1.18} 12%|█▏ | 4852/41250 [11:43:59<87:48:37, 8.69s/it][2025-04-25 19:41:42,944] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 19:41:42,944] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.71 | bwd_microstep: 5774.02 | bwd_inner_microstep: 5666.09 | bwd_allreduce_microstep: 107.89 | step_microstep: 18.53 [2025-04-25 19:41:42,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.71 | bwd: 5774.03 | bwd_inner: 5666.09 | bwd_allreduce: 107.91 | step: 18.53 12%|█▏ | 4853/41250 [11:44:08<87:50:41, 8.69s/it] {'loss': 0.1169, 'grad_norm': 0.988203227519989, 'learning_rate': 3.919976379746742e-05, 'epoch': 1.18} 12%|█▏ | 4853/41250 [11:44:08<87:50:41, 8.69s/it][2025-04-25 19:41:51,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 1.04 [2025-04-25 
19:41:51,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.70 | bwd_microstep: 5763.98 | bwd_inner_microstep: 5711.24 | bwd_allreduce_microstep: 52.69 | step_microstep: 18.58 [2025-04-25 19:41:51,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.70 | bwd: 5763.99 | bwd_inner: 5711.24 | bwd_allreduce: 52.71 | step: 18.58 12%|█▏ | 4854/41250 [11:44:16<87:52:26, 8.69s/it] {'loss': 0.1423, 'grad_norm': 2.924574613571167, 'learning_rate': 3.91993239836419e-05, 'epoch': 1.18} 12%|█▏ | 4854/41250 [11:44:16<87:52:26, 8.69s/it][2025-04-25 19:42:00,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:42:00,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.44 | bwd_microstep: 6021.68 | bwd_inner_microstep: 5705.87 | bwd_allreduce_microstep: 315.77 | step_microstep: 18.24 [2025-04-25 19:42:00,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.44 | bwd: 6021.69 | bwd_inner: 5705.87 | bwd_allreduce: 315.79 | step: 18.24 12%|█▏ | 4855/41250 [11:44:25<88:42:20, 8.77s/it] {'loss': 0.0649, 'grad_norm': 1.0636889934539795, 'learning_rate': 3.919888405145632e-05, 'epoch': 1.18} 12%|█▏ | 4855/41250 [11:44:25<88:42:20, 8.77s/it][2025-04-25 19:42:09,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-25 19:42:09,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.61 | bwd_microstep: 5769.92 | bwd_inner_microstep: 5715.07 | bwd_allreduce_microstep: 54.80 | step_microstep: 18.54 [2025-04-25 19:42:09,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.61 | bwd: 5769.94 | bwd_inner: 5715.07 | bwd_allreduce: 54.82 | step: 18.54 12%|█▏ | 4856/41250 [11:44:34<88:31:08, 8.76s/it] {'loss': 0.119, 'grad_norm': 1.9209041595458984, 'learning_rate': 3.91984440009134e-05, 
'epoch': 1.18} 12%|█▏ | 4856/41250 [11:44:34<88:31:08, 8.76s/it][2025-04-25 19:42:18,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 19:42:18,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.36 | bwd_microstep: 5791.44 | bwd_inner_microstep: 5660.92 | bwd_allreduce_microstep: 130.47 | step_microstep: 18.90 [2025-04-25 19:42:18,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.36 | bwd: 5791.46 | bwd_inner: 5660.92 | bwd_allreduce: 130.49 | step: 18.90 12%|█▏ | 4857/41250 [11:44:43<88:23:30, 8.74s/it] {'loss': 0.2512, 'grad_norm': 2.472579002380371, 'learning_rate': 3.919800383201585e-05, 'epoch': 1.18} 12%|█▏ | 4857/41250 [11:44:43<88:23:30, 8.74s/it][2025-04-25 19:42:26,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-25 19:42:26,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.09 | bwd_microstep: 5904.08 | bwd_inner_microstep: 5639.81 | bwd_allreduce_microstep: 264.22 | step_microstep: 19.57 [2025-04-25 19:42:26,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.09 | bwd: 5904.10 | bwd_inner: 5639.81 | bwd_allreduce: 264.24 | step: 19.58 12%|█▏ | 4858/41250 [11:44:52<88:35:01, 8.76s/it] {'loss': 0.2018, 'grad_norm': 2.426089286804199, 'learning_rate': 3.9197563544766385e-05, 'epoch': 1.18} 12%|█▏ | 4858/41250 [11:44:52<88:35:01, 8.76s/it][2025-04-25 19:42:35,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.98 | optimizer_step: 0.92 [2025-04-25 19:42:35,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.72 | bwd_microstep: 5763.60 | bwd_inner_microstep: 5699.74 | bwd_allreduce_microstep: 63.83 | step_microstep: 18.60 [2025-04-25 19:42:35,544] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2850.72 | bwd: 5763.62 | bwd_inner: 5699.73 | bwd_allreduce: 63.84 | step: 18.61 12%|█▏ | 4859/41250 [11:45:00<88:22:44, 8.74s/it] {'loss': 0.1143, 'grad_norm': 1.447676420211792, 'learning_rate': 3.919712313916771e-05, 'epoch': 1.18} 12%|█▏ | 4859/41250 [11:45:00<88:22:44, 8.74s/it][2025-04-25 19:42:44,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:42:44,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2779.41 | bwd_microstep: 5827.43 | bwd_inner_microstep: 5568.73 | bwd_allreduce_microstep: 258.66 | step_microstep: 18.48 [2025-04-25 19:42:44,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2779.41 | bwd: 5827.45 | bwd_inner: 5568.73 | bwd_allreduce: 258.68 | step: 18.48 12%|█▏ | 4860/41250 [11:45:09<88:12:35, 8.73s/it] {'loss': 0.1222, 'grad_norm': 2.007612466812134, 'learning_rate': 3.919668261522255e-05, 'epoch': 1.18} 12%|█▏ | 4860/41250 [11:45:09<88:12:35, 8.73s/it][2025-04-25 19:42:53,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:42:53,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2936.39 | bwd_microstep: 5875.68 | bwd_inner_microstep: 5863.01 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.43 [2025-04-25 19:42:53,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2936.39 | bwd: 5875.70 | bwd_inner: 5863.01 | bwd_allreduce: 12.65 | step: 18.43 12%|█▏ | 4861/41250 [11:45:18<88:43:23, 8.78s/it] {'loss': 0.0912, 'grad_norm': 2.141772508621216, 'learning_rate': 3.9196241972933615e-05, 'epoch': 1.18} 12%|█▏ | 4861/41250 [11:45:18<88:43:23, 8.78s/it][2025-04-25 19:43:01,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-25 19:43:01,827] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2847.25 | bwd_microstep: 5769.50 | bwd_inner_microstep: 5694.45 | bwd_allreduce_microstep: 75.00 | step_microstep: 19.31 [2025-04-25 19:43:01,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.25 | bwd: 5769.51 | bwd_inner: 5694.45 | bwd_allreduce: 75.02 | step: 19.31 12%|█▏ | 4862/41250 [11:45:27<88:29:11, 8.75s/it] {'loss': 0.3301, 'grad_norm': 3.4105751514434814, 'learning_rate': 3.919580121230362e-05, 'epoch': 1.18} 12%|█▏ | 4862/41250 [11:45:27<88:29:11, 8.75s/it][2025-04-25 19:43:10,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 19:43:10,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.95 | bwd_microstep: 5699.19 | bwd_inner_microstep: 5678.47 | bwd_allreduce_microstep: 20.68 | step_microstep: 18.49 [2025-04-25 19:43:10,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.95 | bwd: 5699.20 | bwd_inner: 5678.47 | bwd_allreduce: 20.70 | step: 18.49 12%|█▏ | 4863/41250 [11:45:35<88:05:06, 8.71s/it] {'loss': 0.3549, 'grad_norm': 4.196229934692383, 'learning_rate': 3.919536033333529e-05, 'epoch': 1.18} 12%|█▏ | 4863/41250 [11:45:35<88:05:06, 8.71s/it][2025-04-25 19:43:19,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:43:19,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.22 | bwd_microstep: 5753.24 | bwd_inner_microstep: 5661.19 | bwd_allreduce_microstep: 92.01 | step_microstep: 18.55 [2025-04-25 19:43:19,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.22 | bwd: 5753.26 | bwd_inner: 5661.19 | bwd_allreduce: 92.03 | step: 18.55 12%|█▏ | 4864/41250 [11:45:44<87:55:14, 8.70s/it] {'loss': 0.2785, 'grad_norm': 1.8126400709152222, 'learning_rate': 3.9194919336031345e-05, 'epoch': 1.18} 12%|█▏ | 4864/41250 
[11:45:44<87:55:14, 8.70s/it][2025-04-25 19:43:27,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:43:27,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.41 | bwd_microstep: 5794.60 | bwd_inner_microstep: 5630.39 | bwd_allreduce_microstep: 164.17 | step_microstep: 18.51 [2025-04-25 19:43:27,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.41 | bwd: 5794.61 | bwd_inner: 5630.39 | bwd_allreduce: 164.19 | step: 18.52 12%|█▏ | 4865/41250 [11:45:53<87:54:54, 8.70s/it] {'loss': 0.1567, 'grad_norm': 1.360508680343628, 'learning_rate': 3.919447822039449e-05, 'epoch': 1.18} 12%|█▏ | 4865/41250 [11:45:53<87:54:54, 8.70s/it][2025-04-25 19:43:36,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:43:36,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.09 | bwd_microstep: 5704.15 | bwd_inner_microstep: 5691.48 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.67 [2025-04-25 19:43:36,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.09 | bwd: 5704.16 | bwd_inner: 5691.48 | bwd_allreduce: 12.64 | step: 18.67 12%|█▏ | 4866/41250 [11:46:01<87:42:58, 8.68s/it] {'loss': 0.0555, 'grad_norm': 1.3309224843978882, 'learning_rate': 3.9194036986427455e-05, 'epoch': 1.18} 12%|█▏ | 4866/41250 [11:46:01<87:42:58, 8.68s/it][2025-04-25 19:43:45,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:43:45,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.40 | bwd_microstep: 5703.57 | bwd_inner_microstep: 5690.70 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.59 [2025-04-25 19:43:45,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.40 | bwd: 5703.59 | 
bwd_inner: 5690.70 | bwd_allreduce: 12.85 | step: 18.59 12%|█▏ | 4867/41250 [11:46:10<87:35:16, 8.67s/it] {'loss': 0.4712, 'grad_norm': 4.319969177246094, 'learning_rate': 3.919359563413296e-05, 'epoch': 1.18} 12%|█▏ | 4867/41250 [11:46:10<87:35:16, 8.67s/it][2025-04-25 19:43:53,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:43:53,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.92 | bwd_microstep: 5680.60 | bwd_inner_microstep: 5656.29 | bwd_allreduce_microstep: 24.27 | step_microstep: 18.56 [2025-04-25 19:43:53,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.92 | bwd: 5680.61 | bwd_inner: 5656.29 | bwd_allreduce: 24.28 | step: 18.56 12%|█▏ | 4868/41250 [11:46:18<87:21:37, 8.64s/it] {'loss': 0.1471, 'grad_norm': 1.6069906949996948, 'learning_rate': 3.919315416351372e-05, 'epoch': 1.18} 12%|█▏ | 4868/41250 [11:46:19<87:21:37, 8.64s/it][2025-04-25 19:44:02,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 19:44:02,323] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.99 | bwd_microstep: 5711.45 | bwd_inner_microstep: 5698.62 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.05 [2025-04-25 19:44:02,323] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.99 | bwd: 5711.46 | bwd_inner: 5698.62 | bwd_allreduce: 12.80 | step: 19.06 12%|█▏ | 4869/41250 [11:46:27<87:22:18, 8.65s/it] {'loss': 0.1987, 'grad_norm': 1.5827598571777344, 'learning_rate': 3.919271257457246e-05, 'epoch': 1.18} 12%|█▏ | 4869/41250 [11:46:27<87:22:18, 8.65s/it][2025-04-25 19:44:10,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 19:44:10,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2856.01 | bwd_microstep: 5711.57 | bwd_inner_microstep: 5698.83 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.96 [2025-04-25 19:44:10,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.01 | bwd: 5711.59 | bwd_inner: 5698.83 | bwd_allreduce: 12.72 | step: 18.96 12%|█▏ | 4870/41250 [11:46:36<87:23:10, 8.65s/it] {'loss': 0.2467, 'grad_norm': 3.1520044803619385, 'learning_rate': 3.9192270867311904e-05, 'epoch': 1.18} 12%|█▏ | 4870/41250 [11:46:36<87:23:10, 8.65s/it][2025-04-25 19:44:19,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:44:19,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.02 | bwd_microstep: 5691.15 | bwd_inner_microstep: 5678.51 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.92 [2025-04-25 19:44:19,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.02 | bwd: 5691.16 | bwd_inner: 5678.51 | bwd_allreduce: 12.61 | step: 18.93 12%|█▏ | 4871/41250 [11:46:44<87:18:23, 8.64s/it] {'loss': 0.1575, 'grad_norm': 1.5772602558135986, 'learning_rate': 3.919182904173477e-05, 'epoch': 1.18} 12%|█▏ | 4871/41250 [11:46:44<87:18:23, 8.64s/it][2025-04-25 19:44:28,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.12 | optimizer_step: 0.94 [2025-04-25 19:44:28,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.31 | bwd_microstep: 5707.51 | bwd_inner_microstep: 5694.38 | bwd_allreduce_microstep: 13.09 | step_microstep: 19.02 [2025-04-25 19:44:28,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.31 | bwd: 5707.53 | bwd_inner: 5694.38 | bwd_allreduce: 13.11 | step: 19.03 12%|█▏ | 4872/41250 [11:46:53<87:17:59, 8.64s/it] {'loss': 0.1882, 'grad_norm': 3.8629677295684814, 'learning_rate': 3.919138709784379e-05, 'epoch': 1.18} 12%|█▏ | 4872/41250 [11:46:53<87:17:59, 8.64s/it][2025-04-25 
19:44:36,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 19:44:36,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.22 | bwd_microstep: 5750.03 | bwd_inner_microstep: 5639.53 | bwd_allreduce_microstep: 110.45 | step_microstep: 18.74 [2025-04-25 19:44:36,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.22 | bwd: 5750.04 | bwd_inner: 5639.53 | bwd_allreduce: 110.47 | step: 18.74 12%|█▏ | 4873/41250 [11:47:02<87:21:12, 8.64s/it] {'loss': 0.1097, 'grad_norm': 1.1184332370758057, 'learning_rate': 3.919094503564168e-05, 'epoch': 1.18} 12%|█▏ | 4873/41250 [11:47:02<87:21:12, 8.64s/it][2025-04-25 19:44:45,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.04 | optimizer_step: 0.97 [2025-04-25 19:44:45,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.91 | bwd_microstep: 5736.96 | bwd_inner_microstep: 5651.85 | bwd_allreduce_microstep: 85.07 | step_microstep: 19.19 [2025-04-25 19:44:45,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.91 | bwd: 5736.98 | bwd_inner: 5651.85 | bwd_allreduce: 85.09 | step: 19.20 12%|█▏ | 4874/41250 [11:47:10<87:22:02, 8.65s/it] {'loss': 0.1172, 'grad_norm': 0.8755242228507996, 'learning_rate': 3.919050285513117e-05, 'epoch': 1.18} 12%|█▏ | 4874/41250 [11:47:10<87:22:02, 8.65s/it][2025-04-25 19:44:54,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:44:54,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.87 | bwd_microstep: 5677.91 | bwd_inner_microstep: 5638.09 | bwd_allreduce_microstep: 39.78 | step_microstep: 18.50 [2025-04-25 19:44:54,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.87 | bwd: 5677.92 | bwd_inner: 5638.09 | bwd_allreduce: 
39.79 | step: 18.50 12%|█▏ | 4875/41250 [11:47:19<87:10:50, 8.63s/it] {'loss': 0.0412, 'grad_norm': 0.736634373664856, 'learning_rate': 3.919006055631498e-05, 'epoch': 1.18} 12%|█▏ | 4875/41250 [11:47:19<87:10:50, 8.63s/it][2025-04-25 19:45:02,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:45:02,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.40 | bwd_microstep: 5717.19 | bwd_inner_microstep: 5704.51 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.67 [2025-04-25 19:45:02,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.40 | bwd: 5717.21 | bwd_inner: 5704.51 | bwd_allreduce: 12.65 | step: 18.67 12%|█▏ | 4876/41250 [11:47:28<87:15:36, 8.64s/it] {'loss': 0.146, 'grad_norm': 1.6597696542739868, 'learning_rate': 3.918961813919584e-05, 'epoch': 1.18} 12%|█▏ | 4876/41250 [11:47:28<87:15:36, 8.64s/it][2025-04-25 19:45:11,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-25 19:45:11,359] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.46 | bwd_microstep: 5665.25 | bwd_inner_microstep: 5652.34 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.81 [2025-04-25 19:45:11,359] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.46 | bwd: 5665.27 | bwd_inner: 5652.34 | bwd_allreduce: 12.88 | step: 18.81 12%|█▏ | 4877/41250 [11:47:36<87:04:40, 8.62s/it] {'loss': 0.2474, 'grad_norm': 2.1404333114624023, 'learning_rate': 3.918917560377648e-05, 'epoch': 1.18} 12%|█▏ | 4877/41250 [11:47:36<87:04:40, 8.62s/it][2025-04-25 19:45:19,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 19:45:19,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.72 | bwd_microstep: 5701.28 | 
bwd_inner_microstep: 5688.43 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.87 [2025-04-25 19:45:19,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.72 | bwd: 5701.29 | bwd_inner: 5688.43 | bwd_allreduce: 12.82 | step: 18.87 12%|█▏ | 4878/41250 [11:47:45<87:05:54, 8.62s/it] {'loss': 0.0482, 'grad_norm': 1.0074851512908936, 'learning_rate': 3.918873295005963e-05, 'epoch': 1.18} 12%|█▏ | 4878/41250 [11:47:45<87:05:54, 8.62s/it][2025-04-25 19:45:28,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-25 19:45:28,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.71 | bwd_microstep: 5739.30 | bwd_inner_microstep: 5641.95 | bwd_allreduce_microstep: 97.30 | step_microstep: 19.45 [2025-04-25 19:45:28,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.71 | bwd: 5739.32 | bwd_inner: 5641.95 | bwd_allreduce: 97.32 | step: 19.45 12%|█▏ | 4879/41250 [11:47:53<87:10:24, 8.63s/it] {'loss': 0.0431, 'grad_norm': 0.6852734088897705, 'learning_rate': 3.918829017804802e-05, 'epoch': 1.18} 12%|█▏ | 4879/41250 [11:47:53<87:10:24, 8.63s/it][2025-04-25 19:45:37,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:45:37,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.35 | bwd_microstep: 5739.05 | bwd_inner_microstep: 5644.03 | bwd_allreduce_microstep: 94.98 | step_microstep: 18.97 [2025-04-25 19:45:37,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.35 | bwd: 5739.07 | bwd_inner: 5644.03 | bwd_allreduce: 94.99 | step: 18.97 12%|█▏ | 4880/41250 [11:48:02<87:13:52, 8.63s/it] {'loss': 0.1842, 'grad_norm': 1.0001435279846191, 'learning_rate': 3.918784728774436e-05, 'epoch': 1.18} 12%|█▏ | 4880/41250 [11:48:02<87:13:52, 8.63s/it][2025-04-25 19:45:45,861] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 19:45:45,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.70 | bwd_microstep: 5670.33 | bwd_inner_microstep: 5656.19 | bwd_allreduce_microstep: 14.10 | step_microstep: 18.71 [2025-04-25 19:45:45,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.70 | bwd: 5670.34 | bwd_inner: 5656.19 | bwd_allreduce: 14.11 | step: 18.71 12%|█▏ | 4881/41250 [11:48:11<87:04:03, 8.62s/it] {'loss': 0.0695, 'grad_norm': 1.0219918489456177, 'learning_rate': 3.918740427915141e-05, 'epoch': 1.18} 12%|█▏ | 4881/41250 [11:48:11<87:04:03, 8.62s/it][2025-04-25 19:45:54,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:45:54,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.91 | bwd_microstep: 5740.45 | bwd_inner_microstep: 5636.77 | bwd_allreduce_microstep: 103.64 | step_microstep: 18.64 [2025-04-25 19:45:54,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.91 | bwd: 5740.47 | bwd_inner: 5636.77 | bwd_allreduce: 103.66 | step: 18.64 12%|█▏ | 4882/41250 [11:48:19<87:08:21, 8.63s/it] {'loss': 0.2087, 'grad_norm': 2.4843990802764893, 'learning_rate': 3.918696115227188e-05, 'epoch': 1.18} 12%|█▏ | 4882/41250 [11:48:19<87:08:21, 8.63s/it][2025-04-25 19:46:03,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-25 19:46:03,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.77 | bwd_microstep: 5744.44 | bwd_inner_microstep: 5648.22 | bwd_allreduce_microstep: 96.18 | step_microstep: 18.61 [2025-04-25 19:46:03,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.77 | bwd: 5744.46 | bwd_inner: 5648.22 | bwd_allreduce: 96.20 | step: 18.61 
12%|█▏ | 4883/41250 [11:48:28<87:13:42, 8.63s/it] {'loss': 0.1691, 'grad_norm': 2.2570009231567383, 'learning_rate': 3.9186517907108517e-05, 'epoch': 1.18} 12%|█▏ | 4883/41250 [11:48:28<87:13:42, 8.63s/it][2025-04-25 19:46:11,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:46:11,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.95 | bwd_microstep: 5710.73 | bwd_inner_microstep: 5697.82 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.79 [2025-04-25 19:46:11,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.95 | bwd: 5710.74 | bwd_inner: 5697.82 | bwd_allreduce: 12.87 | step: 18.80 12%|█▏ | 4884/41250 [11:48:37<87:17:26, 8.64s/it] {'loss': 0.2307, 'grad_norm': 2.4534101486206055, 'learning_rate': 3.9186074543664034e-05, 'epoch': 1.18} 12%|█▏ | 4884/41250 [11:48:37<87:17:26, 8.64s/it][2025-04-25 19:46:20,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 19:46:20,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.15 | bwd_microstep: 5733.35 | bwd_inner_microstep: 5644.96 | bwd_allreduce_microstep: 88.35 | step_microstep: 18.70 [2025-04-25 19:46:20,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.15 | bwd: 5733.36 | bwd_inner: 5644.96 | bwd_allreduce: 88.36 | step: 18.70 12%|█▏ | 4885/41250 [11:48:45<87:19:48, 8.65s/it] {'loss': 0.1183, 'grad_norm': 1.3105171918869019, 'learning_rate': 3.918563106194118e-05, 'epoch': 1.18} 12%|█▏ | 4885/41250 [11:48:45<87:19:48, 8.65s/it][2025-04-25 19:46:29,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:46:29,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.30 | bwd_microstep: 5678.91 | bwd_inner_microstep: 
5656.78 | bwd_allreduce_microstep: 22.09 | step_microstep: 18.53 [2025-04-25 19:46:29,064] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.30 | bwd: 5678.93 | bwd_inner: 5656.78 | bwd_allreduce: 22.11 | step: 18.54 12%|█▏ | 4886/41250 [11:48:54<87:09:52, 8.63s/it] {'loss': 0.2375, 'grad_norm': 3.244279623031616, 'learning_rate': 3.918518746194268e-05, 'epoch': 1.18} 12%|█▏ | 4886/41250 [11:48:54<87:09:52, 8.63s/it][2025-04-25 19:46:37,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:46:37,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.12 | bwd_microstep: 5733.31 | bwd_inner_microstep: 5691.24 | bwd_allreduce_microstep: 42.04 | step_microstep: 18.51 [2025-04-25 19:46:37,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.12 | bwd: 5733.33 | bwd_inner: 5691.24 | bwd_allreduce: 42.05 | step: 18.52 12%|█▏ | 4887/41250 [11:49:03<87:17:44, 8.64s/it] {'loss': 0.0878, 'grad_norm': 1.354318618774414, 'learning_rate': 3.9184743743671274e-05, 'epoch': 1.18} 12%|█▏ | 4887/41250 [11:49:03<87:17:44, 8.64s/it][2025-04-25 19:46:46,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 19:46:46,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.77 | bwd_microstep: 5708.49 | bwd_inner_microstep: 5695.91 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.50 [2025-04-25 19:46:46,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.77 | bwd: 5708.50 | bwd_inner: 5695.91 | bwd_allreduce: 12.55 | step: 18.51 12%|█▏ | 4888/41250 [11:49:11<87:20:57, 8.65s/it] {'loss': 0.0717, 'grad_norm': 1.2301052808761597, 'learning_rate': 3.9184299907129705e-05, 'epoch': 1.18} 12%|█▏ | 4888/41250 [11:49:11<87:20:57, 8.65s/it][2025-04-25 19:46:55,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 19:46:55,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.03 | bwd_microstep: 5765.22 | bwd_inner_microstep: 5650.54 | bwd_allreduce_microstep: 114.63 | step_microstep: 18.73 [2025-04-25 19:46:55,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.03 | bwd: 5765.23 | bwd_inner: 5650.54 | bwd_allreduce: 114.65 | step: 18.73 12%|█▏ | 4889/41250 [11:49:20<87:25:25, 8.66s/it] {'loss': 0.2746, 'grad_norm': 3.211171865463257, 'learning_rate': 3.9183855952320694e-05, 'epoch': 1.19} 12%|█▏ | 4889/41250 [11:49:20<87:25:25, 8.66s/it][2025-04-25 19:47:03,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 19:47:03,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.85 | bwd_microstep: 5772.65 | bwd_inner_microstep: 5691.28 | bwd_allreduce_microstep: 81.33 | step_microstep: 18.83 [2025-04-25 19:47:03,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.85 | bwd: 5772.66 | bwd_inner: 5691.28 | bwd_allreduce: 81.35 | step: 18.84 12%|█▏ | 4890/41250 [11:49:29<87:34:19, 8.67s/it] {'loss': 0.0428, 'grad_norm': 1.417708396911621, 'learning_rate': 3.9183411879246984e-05, 'epoch': 1.19} 12%|█▏ | 4890/41250 [11:49:29<87:34:19, 8.67s/it][2025-04-25 19:47:12,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:47:12,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.99 | bwd_microstep: 5860.27 | bwd_inner_microstep: 5700.78 | bwd_allreduce_microstep: 159.44 | step_microstep: 18.53 [2025-04-25 19:47:12,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.99 | bwd: 5860.28 | bwd_inner: 5700.78 | bwd_allreduce: 159.46 | step: 18.53 12%|█▏ | 4891/41250 [11:49:37<87:57:38, 
8.71s/it] {'loss': 0.081, 'grad_norm': 1.002457857131958, 'learning_rate': 3.918296768791131e-05, 'epoch': 1.19} 12%|█▏ | 4891/41250 [11:49:37<87:57:38, 8.71s/it][2025-04-25 19:47:21,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 19:47:21,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.91 | bwd_microstep: 5763.40 | bwd_inner_microstep: 5665.98 | bwd_allreduce_microstep: 97.37 | step_microstep: 18.73 [2025-04-25 19:47:21,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.92 | bwd: 5763.41 | bwd_inner: 5665.98 | bwd_allreduce: 97.39 | step: 18.74 12%|█▏ | 4892/41250 [11:49:46<87:53:49, 8.70s/it] {'loss': 0.1733, 'grad_norm': 2.23846435546875, 'learning_rate': 3.918252337831641e-05, 'epoch': 1.19} 12%|█▏ | 4892/41250 [11:49:46<87:53:49, 8.70s/it][2025-04-25 19:47:29,935] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 19:47:29,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.30 | bwd_microstep: 5733.90 | bwd_inner_microstep: 5720.90 | bwd_allreduce_microstep: 12.96 | step_microstep: 19.11 [2025-04-25 19:47:29,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.30 | bwd: 5733.91 | bwd_inner: 5720.90 | bwd_allreduce: 12.97 | step: 19.11 12%|█▏ | 4893/41250 [11:49:55<87:48:06, 8.69s/it] {'loss': 0.2027, 'grad_norm': 2.3698770999908447, 'learning_rate': 3.9182078950465034e-05, 'epoch': 1.19} 12%|█▏ | 4893/41250 [11:49:55<87:48:06, 8.69s/it][2025-04-25 19:47:38,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 19:47:38,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.70 | bwd_microstep: 5709.62 | bwd_inner_microstep: 5696.81 | bwd_allreduce_microstep: 12.76 | 
step_microstep: 19.01 [2025-04-25 19:47:38,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.70 | bwd: 5709.64 | bwd_inner: 5696.81 | bwd_allreduce: 12.78 | step: 19.02 12%|█▏ | 4894/41250 [11:50:03<87:38:09, 8.68s/it] {'loss': 0.1133, 'grad_norm': 1.3070080280303955, 'learning_rate': 3.918163440435992e-05, 'epoch': 1.19} 12%|█▏ | 4894/41250 [11:50:03<87:38:09, 8.68s/it][2025-04-25 19:47:47,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-25 19:47:47,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.83 | bwd_microstep: 5716.55 | bwd_inner_microstep: 5659.11 | bwd_allreduce_microstep: 57.39 | step_microstep: 19.10 [2025-04-25 19:47:47,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.83 | bwd: 5716.56 | bwd_inner: 5659.11 | bwd_allreduce: 57.41 | step: 19.10 12%|█▏ | 4895/41250 [11:50:12<87:29:26, 8.66s/it] {'loss': 0.2731, 'grad_norm': 3.3088297843933105, 'learning_rate': 3.918118974000379e-05, 'epoch': 1.19} 12%|█▏ | 4895/41250 [11:50:12<87:29:26, 8.66s/it][2025-04-25 19:47:55,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:47:55,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.61 | bwd_microstep: 5724.09 | bwd_inner_microstep: 5711.23 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.84 [2025-04-25 19:47:55,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.61 | bwd: 5724.11 | bwd_inner: 5711.23 | bwd_allreduce: 12.83 | step: 18.84 12%|█▏ | 4896/41250 [11:50:21<87:28:45, 8.66s/it] {'loss': 0.1675, 'grad_norm': 0.9338627457618713, 'learning_rate': 3.9180744957399405e-05, 'epoch': 1.19} 12%|█▏ | 4896/41250 [11:50:21<87:28:45, 8.66s/it][2025-04-25 19:48:04,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:48:04,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.92 | bwd_microstep: 5757.06 | bwd_inner_microstep: 5705.94 | bwd_allreduce_microstep: 51.07 | step_microstep: 18.56 [2025-04-25 19:48:04,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.92 | bwd: 5757.07 | bwd_inner: 5705.94 | bwd_allreduce: 51.09 | step: 18.57 12%|█▏ | 4897/41250 [11:50:29<87:33:44, 8.67s/it] {'loss': 0.1017, 'grad_norm': 2.3074493408203125, 'learning_rate': 3.918030005654949e-05, 'epoch': 1.19} 12%|█▏ | 4897/41250 [11:50:29<87:33:44, 8.67s/it][2025-04-25 19:48:13,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 19:48:13,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.01 | bwd_microstep: 5719.22 | bwd_inner_microstep: 5669.19 | bwd_allreduce_microstep: 49.99 | step_microstep: 19.06 [2025-04-25 19:48:13,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.01 | bwd: 5719.23 | bwd_inner: 5669.19 | bwd_allreduce: 50.00 | step: 19.07 12%|█▏ | 4898/41250 [11:50:38<87:27:40, 8.66s/it] {'loss': 0.0113, 'grad_norm': 0.308078795671463, 'learning_rate': 3.917985503745682e-05, 'epoch': 1.19} 12%|█▏ | 4898/41250 [11:50:38<87:27:40, 8.66s/it][2025-04-25 19:48:21,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.22 | optimizer_step: 0.92 [2025-04-25 19:48:21,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.74 | bwd_microstep: 5753.35 | bwd_inner_microstep: 5717.36 | bwd_allreduce_microstep: 35.93 | step_microstep: 19.39 [2025-04-25 19:48:21,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.74 | bwd: 5753.36 | bwd_inner: 5717.36 | bwd_allreduce: 35.95 | step: 19.40 12%|█▏ | 4899/41250 [11:50:47<87:33:11, 8.67s/it] {'loss': 0.1515, 'grad_norm': 
1.2012324333190918, 'learning_rate': 3.9179409900124096e-05, 'epoch': 1.19} 12%|█▏ | 4899/41250 [11:50:47<87:33:11, 8.67s/it][2025-04-25 19:48:30,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 19:48:30,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.13 | bwd_microstep: 5795.73 | bwd_inner_microstep: 5668.36 | bwd_allreduce_microstep: 127.32 | step_microstep: 19.06 [2025-04-25 19:48:30,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.13 | bwd: 5795.74 | bwd_inner: 5668.36 | bwd_allreduce: 127.34 | step: 19.06 12%|█▏ | 4900/41250 [11:50:55<87:39:54, 8.68s/it] {'loss': 0.126, 'grad_norm': 2.362168073654175, 'learning_rate': 3.917896464455409e-05, 'epoch': 1.19} 12%|█▏ | 4900/41250 [11:50:55<87:39:54, 8.68s/it][2025-04-25 19:48:39,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:48:39,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.88 | bwd_microstep: 5723.16 | bwd_inner_microstep: 5670.29 | bwd_allreduce_microstep: 52.82 | step_microstep: 18.79 [2025-04-25 19:48:39,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.88 | bwd: 5723.17 | bwd_inner: 5670.29 | bwd_allreduce: 52.84 | step: 18.79 12%|█▏ | 4901/41250 [11:51:04<87:32:44, 8.67s/it] {'loss': 0.1805, 'grad_norm': 2.7314343452453613, 'learning_rate': 3.917851927074954e-05, 'epoch': 1.19} 12%|█▏ | 4901/41250 [11:51:04<87:32:44, 8.67s/it][2025-04-25 19:48:47,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:48:47,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.54 | bwd_microstep: 5802.10 | bwd_inner_microstep: 5663.96 | bwd_allreduce_microstep: 138.10 | step_microstep: 18.84 [2025-04-25 
19:48:47,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.54 | bwd: 5802.12 | bwd_inner: 5663.96 | bwd_allreduce: 138.12 | step: 18.84 12%|█▏ | 4902/41250 [11:51:13<87:41:09, 8.68s/it] {'loss': 0.0999, 'grad_norm': 2.5192489624023438, 'learning_rate': 3.917807377871319e-05, 'epoch': 1.19} 12%|█▏ | 4902/41250 [11:51:13<87:41:09, 8.68s/it][2025-04-25 19:48:56,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 19:48:56,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.41 | bwd_microstep: 5787.71 | bwd_inner_microstep: 5656.44 | bwd_allreduce_microstep: 131.22 | step_microstep: 19.05 [2025-04-25 19:48:56,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.41 | bwd: 5787.72 | bwd_inner: 5656.44 | bwd_allreduce: 131.24 | step: 19.06 12%|█▏ | 4903/41250 [11:51:21<87:44:24, 8.69s/it] {'loss': 0.0951, 'grad_norm': 1.5211549997329712, 'learning_rate': 3.917762816844779e-05, 'epoch': 1.19} 12%|█▏ | 4903/41250 [11:51:21<87:44:24, 8.69s/it][2025-04-25 19:49:05,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:49:05,318] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.52 | bwd_microstep: 5729.27 | bwd_inner_microstep: 5661.24 | bwd_allreduce_microstep: 67.98 | step_microstep: 19.06 [2025-04-25 19:49:05,318] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.52 | bwd: 5729.28 | bwd_inner: 5661.24 | bwd_allreduce: 68.00 | step: 19.06 12%|█▏ | 4904/41250 [11:51:30<87:37:37, 8.68s/it] {'loss': 0.0529, 'grad_norm': 0.7636629939079285, 'learning_rate': 3.9177182439956074e-05, 'epoch': 1.19} 12%|█▏ | 4904/41250 [11:51:30<87:37:37, 8.68s/it][2025-04-25 19:49:14,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 
0.90 [2025-04-25 19:49:14,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.87 | bwd_microstep: 5872.92 | bwd_inner_microstep: 5715.32 | bwd_allreduce_microstep: 157.55 | step_microstep: 19.05 [2025-04-25 19:49:14,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.87 | bwd: 5872.93 | bwd_inner: 5715.32 | bwd_allreduce: 157.57 | step: 19.05 12%|█▏ | 4905/41250 [11:51:39<88:01:39, 8.72s/it] {'loss': 0.2894, 'grad_norm': 7.451047897338867, 'learning_rate': 3.917673659324081e-05, 'epoch': 1.19} 12%|█▏ | 4905/41250 [11:51:39<88:01:39, 8.72s/it][2025-04-25 19:49:22,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-25 19:49:22,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.37 | bwd_microstep: 5778.42 | bwd_inner_microstep: 5692.84 | bwd_allreduce_microstep: 85.54 | step_microstep: 18.52 [2025-04-25 19:49:22,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.37 | bwd: 5778.44 | bwd_inner: 5692.84 | bwd_allreduce: 85.56 | step: 18.53 12%|█▏ | 4906/41250 [11:51:48<87:59:39, 8.72s/it] {'loss': 0.0939, 'grad_norm': 0.8279299139976501, 'learning_rate': 3.917629062830473e-05, 'epoch': 1.19} 12%|█▏ | 4906/41250 [11:51:48<87:59:39, 8.72s/it][2025-04-25 19:49:31,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 19:49:31,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.85 | bwd_microstep: 5776.23 | bwd_inner_microstep: 5647.08 | bwd_allreduce_microstep: 129.11 | step_microstep: 19.00 [2025-04-25 19:49:31,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.85 | bwd: 5776.25 | bwd_inner: 5647.08 | bwd_allreduce: 129.13 | step: 19.00 12%|█▏ | 4907/41250 [11:51:56<87:54:49, 8.71s/it] {'loss': 0.3709, 'grad_norm': 6.58607292175293, 'learning_rate': 
3.9175844545150593e-05, 'epoch': 1.19} 12%|█▏ | 4907/41250 [11:51:56<87:54:49, 8.71s/it][2025-04-25 19:49:40,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 19:49:40,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.90 | bwd_microstep: 5720.83 | bwd_inner_microstep: 5707.83 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.67 [2025-04-25 19:49:40,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.90 | bwd: 5720.84 | bwd_inner: 5707.83 | bwd_allreduce: 12.97 | step: 18.67 12%|█▏ | 4908/41250 [11:52:05<87:46:11, 8.69s/it] {'loss': 0.0356, 'grad_norm': 0.37198176980018616, 'learning_rate': 3.917539834378115e-05, 'epoch': 1.19} 12%|█▏ | 4908/41250 [11:52:05<87:46:11, 8.69s/it][2025-04-25 19:49:48,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 19:49:48,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.27 | bwd_microstep: 5808.64 | bwd_inner_microstep: 5654.10 | bwd_allreduce_microstep: 154.49 | step_microstep: 18.71 [2025-04-25 19:49:48,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.27 | bwd: 5808.65 | bwd_inner: 5654.10 | bwd_allreduce: 154.50 | step: 18.71 12%|█▏ | 4909/41250 [11:52:14<87:49:56, 8.70s/it] {'loss': 0.1397, 'grad_norm': 2.2528231143951416, 'learning_rate': 3.917495202419914e-05, 'epoch': 1.19} 12%|█▏ | 4909/41250 [11:52:14<87:49:56, 8.70s/it][2025-04-25 19:49:57,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:49:57,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.77 | bwd_microstep: 5727.05 | bwd_inner_microstep: 5664.86 | bwd_allreduce_microstep: 62.14 | step_microstep: 18.74 [2025-04-25 19:49:57,545] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.77 | bwd: 5727.06 | bwd_inner: 5664.86 | bwd_allreduce: 62.16 | step: 18.74 12%|█▏ | 4910/41250 [11:52:22<87:38:19, 8.68s/it] {'loss': 0.2077, 'grad_norm': 1.0547384023666382, 'learning_rate': 3.917450558640733e-05, 'epoch': 1.19} 12%|█▏ | 4910/41250 [11:52:22<87:38:19, 8.68s/it][2025-04-25 19:50:06,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:50:06,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.92 | bwd_microstep: 5765.83 | bwd_inner_microstep: 5665.06 | bwd_allreduce_microstep: 100.72 | step_microstep: 18.45 [2025-04-25 19:50:06,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.92 | bwd: 5765.84 | bwd_inner: 5665.06 | bwd_allreduce: 100.74 | step: 18.45 12%|█▏ | 4911/41250 [11:52:31<87:38:03, 8.68s/it] {'loss': 0.0883, 'grad_norm': 1.2049721479415894, 'learning_rate': 3.917405903040846e-05, 'epoch': 1.19} 12%|█▏ | 4911/41250 [11:52:31<87:38:03, 8.68s/it][2025-04-25 19:50:14,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 19:50:14,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.65 | bwd_microstep: 5768.13 | bwd_inner_microstep: 5689.63 | bwd_allreduce_microstep: 78.45 | step_microstep: 19.19 [2025-04-25 19:50:14,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.65 | bwd: 5768.15 | bwd_inner: 5689.63 | bwd_allreduce: 78.47 | step: 19.19 12%|█▏ | 4912/41250 [11:52:40<87:41:17, 8.69s/it] {'loss': 0.3934, 'grad_norm': 4.98158597946167, 'learning_rate': 3.9173612356205286e-05, 'epoch': 1.19} 12%|█▏ | 4912/41250 [11:52:40<87:41:17, 8.69s/it][2025-04-25 19:50:23,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 
19:50:23,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.66 | bwd_microstep: 5711.66 | bwd_inner_microstep: 5658.21 | bwd_allreduce_microstep: 53.40 | step_microstep: 18.58 [2025-04-25 19:50:23,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.66 | bwd: 5711.67 | bwd_inner: 5658.21 | bwd_allreduce: 53.42 | step: 18.58 12%|█▏ | 4913/41250 [11:52:48<87:29:39, 8.67s/it] {'loss': 0.3064, 'grad_norm': 3.220306873321533, 'learning_rate': 3.917316556380056e-05, 'epoch': 1.19} 12%|█▏ | 4913/41250 [11:52:48<87:29:39, 8.67s/it][2025-04-25 19:50:32,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:50:32,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.81 | bwd_microstep: 5788.37 | bwd_inner_microstep: 5775.59 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.57 [2025-04-25 19:50:32,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.81 | bwd: 5788.38 | bwd_inner: 5775.59 | bwd_allreduce: 12.75 | step: 18.57 12%|█▏ | 4914/41250 [11:52:57<87:45:51, 8.70s/it] {'loss': 0.1036, 'grad_norm': 1.3132442235946655, 'learning_rate': 3.9172718653197055e-05, 'epoch': 1.19} 12%|█▏ | 4914/41250 [11:52:57<87:45:51, 8.70s/it][2025-04-25 19:50:40,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:50:40,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.52 | bwd_microstep: 5698.18 | bwd_inner_microstep: 5654.26 | bwd_allreduce_microstep: 43.88 | step_microstep: 18.77 [2025-04-25 19:50:40,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.52 | bwd: 5698.20 | bwd_inner: 5654.26 | bwd_allreduce: 43.90 | step: 18.77 12%|█▏ | 4915/41250 [11:53:06<87:30:33, 8.67s/it] {'loss': 0.0947, 'grad_norm': 1.7092684507369995, 'learning_rate': 3.91722716243975e-05, 
'epoch': 1.19} 12%|█▏ | 4915/41250 [11:53:06<87:30:33, 8.67s/it][2025-04-25 19:50:49,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 19:50:49,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.31 | bwd_microstep: 5739.07 | bwd_inner_microstep: 5702.39 | bwd_allreduce_microstep: 36.63 | step_microstep: 19.16 [2025-04-25 19:50:49,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.31 | bwd: 5739.08 | bwd_inner: 5702.39 | bwd_allreduce: 36.65 | step: 19.16 12%|█▏ | 4916/41250 [11:53:14<87:31:19, 8.67s/it] {'loss': 0.122, 'grad_norm': 2.2786474227905273, 'learning_rate': 3.917182447740466e-05, 'epoch': 1.19} 12%|█▏ | 4916/41250 [11:53:14<87:31:19, 8.67s/it][2025-04-25 19:50:58,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 19:50:58,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.32 | bwd_microstep: 5691.58 | bwd_inner_microstep: 5657.71 | bwd_allreduce_microstep: 33.81 | step_microstep: 19.12 [2025-04-25 19:50:58,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.32 | bwd: 5691.59 | bwd_inner: 5657.71 | bwd_allreduce: 33.84 | step: 19.12 12%|█▏ | 4917/41250 [11:53:23<87:18:10, 8.65s/it] {'loss': 0.2152, 'grad_norm': 2.255427598953247, 'learning_rate': 3.91713772122213e-05, 'epoch': 1.19} 12%|█▏ | 4917/41250 [11:53:23<87:18:10, 8.65s/it][2025-04-25 19:51:06,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:51:06,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.83 | bwd_microstep: 5879.07 | bwd_inner_microstep: 5688.41 | bwd_allreduce_microstep: 190.62 | step_microstep: 18.39 [2025-04-25 19:51:06,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2840.84 | bwd: 5879.09 | bwd_inner: 5688.41 | bwd_allreduce: 190.63 | step: 18.39 12%|█▏ | 4918/41250 [11:53:32<87:45:02, 8.69s/it] {'loss': 0.1935, 'grad_norm': 1.3475937843322754, 'learning_rate': 3.917092982885016e-05, 'epoch': 1.19} 12%|█▏ | 4918/41250 [11:53:32<87:45:02, 8.69s/it][2025-04-25 19:51:15,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:51:15,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.17 | bwd_microstep: 5706.78 | bwd_inner_microstep: 5694.09 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.86 [2025-04-25 19:51:15,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.17 | bwd: 5706.79 | bwd_inner: 5694.09 | bwd_allreduce: 12.66 | step: 18.86 12%|█▏ | 4919/41250 [11:53:40<87:34:02, 8.68s/it] {'loss': 0.0497, 'grad_norm': 1.1688164472579956, 'learning_rate': 3.9170482327294017e-05, 'epoch': 1.19} 12%|█▏ | 4919/41250 [11:53:40<87:34:02, 8.68s/it][2025-04-25 19:51:24,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.00 | optimizer_step: 1.05 [2025-04-25 19:51:24,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.02 | bwd_microstep: 5760.00 | bwd_inner_microstep: 5688.50 | bwd_allreduce_microstep: 71.46 | step_microstep: 18.62 [2025-04-25 19:51:24,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.02 | bwd: 5760.01 | bwd_inner: 5688.50 | bwd_allreduce: 71.48 | step: 18.62 12%|█▏ | 4920/41250 [11:53:49<87:36:58, 8.68s/it] {'loss': 0.0609, 'grad_norm': 0.6372210383415222, 'learning_rate': 3.9170034707555616e-05, 'epoch': 1.19} 12%|█▏ | 4920/41250 [11:53:49<87:36:58, 8.68s/it][2025-04-25 19:51:32,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.35 | optimizer_step: 1.03 [2025-04-25 19:51:32,974] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2852.65 | bwd_microstep: 5712.98 | bwd_inner_microstep: 5700.25 | bwd_allreduce_microstep: 12.68 | step_microstep: 20.69 [2025-04-25 19:51:32,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.65 | bwd: 5712.99 | bwd_inner: 5700.25 | bwd_allreduce: 12.70 | step: 20.70 12%|█▏ | 4921/41250 [11:53:58<87:31:18, 8.67s/it] {'loss': 0.1985, 'grad_norm': 1.504049301147461, 'learning_rate': 3.916958696963773e-05, 'epoch': 1.19} 12%|█▏ | 4921/41250 [11:53:58<87:31:18, 8.67s/it][2025-04-25 19:51:41,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:51:41,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.27 | bwd_microstep: 5760.29 | bwd_inner_microstep: 5702.77 | bwd_allreduce_microstep: 57.48 | step_microstep: 19.14 [2025-04-25 19:51:41,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.27 | bwd: 5760.31 | bwd_inner: 5702.77 | bwd_allreduce: 57.50 | step: 19.14 12%|█▏ | 4922/41250 [11:54:06<87:34:28, 8.68s/it] {'loss': 0.1123, 'grad_norm': 2.293336868286133, 'learning_rate': 3.91691391135431e-05, 'epoch': 1.19} 12%|█▏ | 4922/41250 [11:54:06<87:34:28, 8.68s/it][2025-04-25 19:51:50,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:51:50,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.74 | bwd_microstep: 5765.42 | bwd_inner_microstep: 5655.87 | bwd_allreduce_microstep: 109.50 | step_microstep: 18.89 [2025-04-25 19:51:50,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.75 | bwd: 5765.44 | bwd_inner: 5655.87 | bwd_allreduce: 109.52 | step: 18.89 12%|█▏ | 4923/41250 [11:54:15<87:34:16, 8.68s/it] {'loss': 0.0815, 'grad_norm': 1.3378359079360962, 'learning_rate': 3.9168691139274495e-05, 'epoch': 1.19} 12%|█▏ | 4923/41250 
[11:54:15<87:34:16, 8.68s/it][2025-04-25 19:51:58,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:51:58,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.73 | bwd_microstep: 5750.05 | bwd_inner_microstep: 5635.33 | bwd_allreduce_microstep: 114.67 | step_microstep: 18.60 [2025-04-25 19:51:58,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.73 | bwd: 5750.06 | bwd_inner: 5635.33 | bwd_allreduce: 114.69 | step: 18.61 12%|█▏ | 4924/41250 [11:54:24<87:29:32, 8.67s/it] {'loss': 0.0467, 'grad_norm': 0.7870063781738281, 'learning_rate': 3.916824304683469e-05, 'epoch': 1.19} 12%|█▏ | 4924/41250 [11:54:24<87:29:32, 8.67s/it][2025-04-25 19:52:07,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:52:07,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.72 | bwd_microstep: 5739.28 | bwd_inner_microstep: 5698.71 | bwd_allreduce_microstep: 40.52 | step_microstep: 18.62 [2025-04-25 19:52:07,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.72 | bwd: 5739.29 | bwd_inner: 5698.71 | bwd_allreduce: 40.54 | step: 18.62 12%|█▏ | 4925/41250 [11:54:32<87:29:50, 8.67s/it] {'loss': 0.2576, 'grad_norm': 2.001943826675415, 'learning_rate': 3.916779483622644e-05, 'epoch': 1.19} 12%|█▏ | 4925/41250 [11:54:32<87:29:50, 8.67s/it][2025-04-25 19:52:16,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.23 | optimizer_step: 1.11 [2025-04-25 19:52:16,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.92 | bwd_microstep: 5863.08 | bwd_inner_microstep: 5645.62 | bwd_allreduce_microstep: 217.40 | step_microstep: 20.16 [2025-04-25 19:52:16,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.92 | bwd: 5863.10 | 
bwd_inner: 5645.62 | bwd_allreduce: 217.43 | step: 20.16 12%|█▏ | 4926/41250 [11:54:41<87:47:51, 8.70s/it] {'loss': 0.1806, 'grad_norm': 2.6412672996520996, 'learning_rate': 3.916734650745249e-05, 'epoch': 1.19} 12%|█▏ | 4926/41250 [11:54:41<87:47:51, 8.70s/it][2025-04-25 19:52:25,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:52:25,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.37 | bwd_microstep: 5726.92 | bwd_inner_microstep: 5685.06 | bwd_allreduce_microstep: 41.82 | step_microstep: 18.02 [2025-04-25 19:52:25,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.37 | bwd: 5726.93 | bwd_inner: 5685.06 | bwd_allreduce: 41.84 | step: 18.02 12%|█▏ | 4927/41250 [11:54:50<87:39:12, 8.69s/it] {'loss': 0.1825, 'grad_norm': 2.1910688877105713, 'learning_rate': 3.9166898060515626e-05, 'epoch': 1.19} 12%|█▏ | 4927/41250 [11:54:50<87:39:12, 8.69s/it][2025-04-25 19:52:33,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:52:33,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.88 | bwd_microstep: 5721.57 | bwd_inner_microstep: 5691.29 | bwd_allreduce_microstep: 30.24 | step_microstep: 18.81 [2025-04-25 19:52:33,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.88 | bwd: 5721.59 | bwd_inner: 5691.29 | bwd_allreduce: 30.26 | step: 18.81 12%|█▏ | 4928/41250 [11:54:59<87:32:53, 8.68s/it] {'loss': 0.0725, 'grad_norm': 1.0185788869857788, 'learning_rate': 3.9166449495418606e-05, 'epoch': 1.19} 12%|█▏ | 4928/41250 [11:54:59<87:32:53, 8.68s/it][2025-04-25 19:52:42,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.91 [2025-04-25 19:52:42,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2828.13 | bwd_microstep: 5671.86 | bwd_inner_microstep: 5642.00 | bwd_allreduce_microstep: 29.83 | step_microstep: 18.38 [2025-04-25 19:52:42,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.13 | bwd: 5671.88 | bwd_inner: 5642.00 | bwd_allreduce: 29.84 | step: 18.39 12%|█▏ | 4929/41250 [11:55:07<87:15:38, 8.65s/it] {'loss': 0.3219, 'grad_norm': 3.0213778018951416, 'learning_rate': 3.9166000812164196e-05, 'epoch': 1.19} 12%|█▏ | 4929/41250 [11:55:07<87:15:38, 8.65s/it][2025-04-25 19:52:50,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:52:50,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.29 | bwd_microstep: 5686.67 | bwd_inner_microstep: 5643.32 | bwd_allreduce_microstep: 43.30 | step_microstep: 18.60 [2025-04-25 19:52:50,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.30 | bwd: 5686.68 | bwd_inner: 5643.32 | bwd_allreduce: 43.32 | step: 18.60 12%|█▏ | 4930/41250 [11:55:16<87:05:22, 8.63s/it] {'loss': 0.0907, 'grad_norm': 1.049283742904663, 'learning_rate': 3.916555201075516e-05, 'epoch': 1.2} 12%|█▏ | 4930/41250 [11:55:16<87:05:22, 8.63s/it][2025-04-25 19:52:59,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:52:59,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.66 | bwd_microstep: 5712.79 | bwd_inner_microstep: 5700.08 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.34 [2025-04-25 19:52:59,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.66 | bwd: 5712.80 | bwd_inner: 5700.08 | bwd_allreduce: 12.69 | step: 18.34 12%|█▏ | 4931/41250 [11:55:24<87:07:29, 8.64s/it] {'loss': 0.0839, 'grad_norm': 1.096662998199463, 'learning_rate': 3.916510309119427e-05, 'epoch': 1.2} 12%|█▏ | 4931/41250 [11:55:24<87:07:29, 
8.64s/it][2025-04-25 19:53:08,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 19:53:08,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.80 | bwd_microstep: 5722.52 | bwd_inner_microstep: 5697.11 | bwd_allreduce_microstep: 25.36 | step_microstep: 18.69 [2025-04-25 19:53:08,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.80 | bwd: 5722.53 | bwd_inner: 5697.11 | bwd_allreduce: 25.37 | step: 18.69 12%|█▏ | 4932/41250 [11:55:33<87:10:17, 8.64s/it] {'loss': 0.2855, 'grad_norm': 2.4025378227233887, 'learning_rate': 3.916465405348428e-05, 'epoch': 1.2} 12%|█▏ | 4932/41250 [11:55:33<87:10:17, 8.64s/it][2025-04-25 19:53:16,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.04 | optimizer_step: 0.97 [2025-04-25 19:53:16,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.23 | bwd_microstep: 5760.14 | bwd_inner_microstep: 5655.44 | bwd_allreduce_microstep: 104.65 | step_microstep: 19.36 [2025-04-25 19:53:16,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.23 | bwd: 5760.15 | bwd_inner: 5655.44 | bwd_allreduce: 104.67 | step: 19.36 12%|█▏ | 4933/41250 [11:55:42<87:16:14, 8.65s/it] {'loss': 0.0647, 'grad_norm': 1.373764157295227, 'learning_rate': 3.916420489762797e-05, 'epoch': 1.2} 12%|█▏ | 4933/41250 [11:55:42<87:16:14, 8.65s/it][2025-04-25 19:53:25,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:53:25,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.39 | bwd_microstep: 5742.04 | bwd_inner_microstep: 5689.95 | bwd_allreduce_microstep: 52.05 | step_microstep: 18.77 [2025-04-25 19:53:25,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.39 | bwd: 5742.05 | bwd_inner: 5689.95 | 
bwd_allreduce: 52.07 | step: 18.77 12%|█▏ | 4934/41250 [11:55:50<87:20:41, 8.66s/it] {'loss': 0.1918, 'grad_norm': 1.9404672384262085, 'learning_rate': 3.9163755623628105e-05, 'epoch': 1.2} 12%|█▏ | 4934/41250 [11:55:50<87:20:41, 8.66s/it][2025-04-25 19:53:34,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:53:34,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.00 | bwd_microstep: 5740.42 | bwd_inner_microstep: 5654.75 | bwd_allreduce_microstep: 85.63 | step_microstep: 18.85 [2025-04-25 19:53:34,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.00 | bwd: 5740.44 | bwd_inner: 5654.75 | bwd_allreduce: 85.65 | step: 18.85 12%|█▏ | 4935/41250 [11:55:59<87:19:52, 8.66s/it] {'loss': 0.0986, 'grad_norm': 1.6045273542404175, 'learning_rate': 3.916330623148746e-05, 'epoch': 1.2} 12%|█▏ | 4935/41250 [11:55:59<87:19:52, 8.66s/it][2025-04-25 19:53:42,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 19:53:42,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.75 | bwd_microstep: 5704.98 | bwd_inner_microstep: 5656.64 | bwd_allreduce_microstep: 48.30 | step_microstep: 18.65 [2025-04-25 19:53:42,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.75 | bwd: 5704.99 | bwd_inner: 5656.64 | bwd_allreduce: 48.31 | step: 18.66 12%|█▏ | 4936/41250 [11:56:08<87:11:53, 8.64s/it] {'loss': 0.1489, 'grad_norm': 1.9522737264633179, 'learning_rate': 3.91628567212088e-05, 'epoch': 1.2} 12%|█▏ | 4936/41250 [11:56:08<87:11:53, 8.64s/it][2025-04-25 19:53:51,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:53:51,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.12 | bwd_microstep: 
5753.81 | bwd_inner_microstep: 5708.33 | bwd_allreduce_microstep: 45.43 | step_microstep: 18.56 [2025-04-25 19:53:51,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.12 | bwd: 5753.82 | bwd_inner: 5708.33 | bwd_allreduce: 45.45 | step: 18.56 12%|█▏ | 4937/41250 [11:56:16<87:19:19, 8.66s/it] {'loss': 0.323, 'grad_norm': 2.4790031909942627, 'learning_rate': 3.91624070927949e-05, 'epoch': 1.2} 12%|█▏ | 4937/41250 [11:56:16<87:19:19, 8.66s/it][2025-04-25 19:54:00,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 19:54:00,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.99 | bwd_microstep: 5756.11 | bwd_inner_microstep: 5707.82 | bwd_allreduce_microstep: 48.24 | step_microstep: 18.87 [2025-04-25 19:54:00,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.99 | bwd: 5756.13 | bwd_inner: 5707.82 | bwd_allreduce: 48.26 | step: 18.88 12%|█▏ | 4938/41250 [11:56:25<87:25:48, 8.67s/it] {'loss': 0.1678, 'grad_norm': 1.9026670455932617, 'learning_rate': 3.916195734624853e-05, 'epoch': 1.2} 12%|█▏ | 4938/41250 [11:56:25<87:25:48, 8.67s/it][2025-04-25 19:54:08,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 19:54:08,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.51 | bwd_microstep: 5790.53 | bwd_inner_microstep: 5646.92 | bwd_allreduce_microstep: 143.56 | step_microstep: 19.05 [2025-04-25 19:54:08,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.51 | bwd: 5790.54 | bwd_inner: 5646.92 | bwd_allreduce: 143.58 | step: 19.05 12%|█▏ | 4939/41250 [11:56:34<87:32:19, 8.68s/it] {'loss': 0.1773, 'grad_norm': 2.023768663406372, 'learning_rate': 3.916150748157245e-05, 'epoch': 1.2} 12%|█▏ | 4939/41250 [11:56:34<87:32:19, 8.68s/it][2025-04-25 19:54:17,642] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:54:17,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.44 | bwd_microstep: 5776.20 | bwd_inner_microstep: 5699.36 | bwd_allreduce_microstep: 76.79 | step_microstep: 18.42 [2025-04-25 19:54:17,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.44 | bwd: 5776.21 | bwd_inner: 5699.36 | bwd_allreduce: 76.81 | step: 18.42 12%|█▏ | 4940/41250 [11:56:42<87:38:45, 8.69s/it] {'loss': 0.1089, 'grad_norm': 1.6138650178909302, 'learning_rate': 3.916105749876946e-05, 'epoch': 1.2} 12%|█▏ | 4940/41250 [11:56:42<87:38:45, 8.69s/it][2025-04-25 19:54:26,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 19:54:26,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.78 | bwd_microstep: 5725.15 | bwd_inner_microstep: 5712.58 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.24 [2025-04-25 19:54:26,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.78 | bwd: 5725.16 | bwd_inner: 5712.58 | bwd_allreduce: 12.54 | step: 18.24 12%|█▏ | 4941/41250 [11:56:51<87:32:54, 8.68s/it] {'loss': 0.2104, 'grad_norm': 2.7858588695526123, 'learning_rate': 3.916060739784231e-05, 'epoch': 1.2} 12%|█▏ | 4941/41250 [11:56:51<87:32:54, 8.68s/it][2025-04-25 19:54:34,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 19:54:34,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.03 | bwd_microstep: 5748.06 | bwd_inner_microstep: 5697.46 | bwd_allreduce_microstep: 50.55 | step_microstep: 18.52 [2025-04-25 19:54:34,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.03 | bwd: 5748.07 | bwd_inner: 5697.46 | bwd_allreduce: 50.57 | step: 18.52 12%|█▏ | 
4942/41250 [11:57:00<87:34:21, 8.68s/it] {'loss': 0.042, 'grad_norm': 0.705148458480835, 'learning_rate': 3.916015717879379e-05, 'epoch': 1.2} 12%|█▏ | 4942/41250 [11:57:00<87:34:21, 8.68s/it][2025-04-25 19:54:43,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 19:54:43,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.94 | bwd_microstep: 5719.65 | bwd_inner_microstep: 5706.85 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.80 [2025-04-25 19:54:43,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.94 | bwd: 5719.66 | bwd_inner: 5706.85 | bwd_allreduce: 12.77 | step: 18.80 12%|█▏ | 4943/41250 [11:57:08<87:31:08, 8.68s/it] {'loss': 0.1445, 'grad_norm': 1.4438657760620117, 'learning_rate': 3.915970684162666e-05, 'epoch': 1.2} 12%|█▏ | 4943/41250 [11:57:08<87:31:08, 8.68s/it][2025-04-25 19:54:52,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 0.97 [2025-04-25 19:54:52,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.43 | bwd_microstep: 5772.69 | bwd_inner_microstep: 5661.82 | bwd_allreduce_microstep: 110.82 | step_microstep: 18.96 [2025-04-25 19:54:52,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.43 | bwd: 5772.71 | bwd_inner: 5661.82 | bwd_allreduce: 110.84 | step: 18.96 12%|█▏ | 4944/41250 [11:57:17<87:33:58, 8.68s/it] {'loss': 0.0735, 'grad_norm': 2.4272286891937256, 'learning_rate': 3.915925638634372e-05, 'epoch': 1.2} 12%|█▏ | 4944/41250 [11:57:17<87:33:58, 8.68s/it][2025-04-25 19:55:01,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 19:55:01,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2868.84 | bwd_microstep: 5725.57 | bwd_inner_microstep: 5712.66 | 
bwd_allreduce_microstep: 12.87 | step_microstep: 18.57 [2025-04-25 19:55:01,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2868.84 | bwd: 5725.58 | bwd_inner: 5712.65 | bwd_allreduce: 12.88 | step: 18.58 12%|█▏ | 4945/41250 [11:57:26<87:33:18, 8.68s/it] {'loss': 0.4038, 'grad_norm': 3.6571130752563477, 'learning_rate': 3.915880581294772e-05, 'epoch': 1.2} 12%|█▏ | 4945/41250 [11:57:26<87:33:18, 8.68s/it][2025-04-25 19:55:09,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.05 | optimizer_step: 1.15 [2025-04-25 19:55:09,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.92 | bwd_microstep: 5761.53 | bwd_inner_microstep: 5718.85 | bwd_allreduce_microstep: 42.63 | step_microstep: 19.44 [2025-04-25 19:55:09,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.92 | bwd: 5761.54 | bwd_inner: 5718.85 | bwd_allreduce: 42.65 | step: 19.45 12%|█▏ | 4946/41250 [11:57:35<87:36:34, 8.69s/it] {'loss': 0.1219, 'grad_norm': 1.824683666229248, 'learning_rate': 3.9158355121441456e-05, 'epoch': 1.2} 12%|█▏ | 4946/41250 [11:57:35<87:36:34, 8.69s/it][2025-04-25 19:55:18,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 19:55:18,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.39 | bwd_microstep: 5696.46 | bwd_inner_microstep: 5665.53 | bwd_allreduce_microstep: 30.89 | step_microstep: 18.63 [2025-04-25 19:55:18,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.39 | bwd: 5696.48 | bwd_inner: 5665.53 | bwd_allreduce: 30.90 | step: 18.63 12%|█▏ | 4947/41250 [11:57:43<87:26:18, 8.67s/it] {'loss': 0.1348, 'grad_norm': 2.3260419368743896, 'learning_rate': 3.91579043118277e-05, 'epoch': 1.2} 12%|█▏ | 4947/41250 [11:57:43<87:26:18, 8.67s/it][2025-04-25 19:55:26,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:55:26,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.58 | bwd_microstep: 5705.86 | bwd_inner_microstep: 5668.57 | bwd_allreduce_microstep: 37.25 | step_microstep: 18.61 [2025-04-25 19:55:26,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.58 | bwd: 5705.88 | bwd_inner: 5668.57 | bwd_allreduce: 37.27 | step: 18.62 12%|█▏ | 4948/41250 [11:57:52<87:19:51, 8.66s/it] {'loss': 0.0833, 'grad_norm': 1.9073394536972046, 'learning_rate': 3.915745338410923e-05, 'epoch': 1.2} 12%|█▏ | 4948/41250 [11:57:52<87:19:51, 8.66s/it][2025-04-25 19:55:35,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 19:55:35,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.83 | bwd_microstep: 5745.38 | bwd_inner_microstep: 5709.61 | bwd_allreduce_microstep: 35.73 | step_microstep: 18.46 [2025-04-25 19:55:35,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.83 | bwd: 5745.40 | bwd_inner: 5709.61 | bwd_allreduce: 35.75 | step: 18.46 12%|█▏ | 4949/41250 [11:58:01<87:26:51, 8.67s/it] {'loss': 0.1511, 'grad_norm': 3.4950976371765137, 'learning_rate': 3.9157002338288835e-05, 'epoch': 1.2} 12%|█▏ | 4949/41250 [11:58:01<87:26:51, 8.67s/it][2025-04-25 19:55:44,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:55:44,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.30 | bwd_microstep: 5705.35 | bwd_inner_microstep: 5663.22 | bwd_allreduce_microstep: 42.09 | step_microstep: 18.26 [2025-04-25 19:55:44,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.30 | bwd: 5705.36 | bwd_inner: 5663.22 | bwd_allreduce: 42.11 | step: 18.26 12%|█▏ | 4950/41250 [11:58:09<87:20:32, 8.66s/it] 
{'loss': 0.2942, 'grad_norm': 4.529632091522217, 'learning_rate': 3.9156551174369284e-05, 'epoch': 1.2} 12%|█▏ | 4950/41250 [11:58:09<87:20:32, 8.66s/it][2025-04-25 19:55:53,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:55:53,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.51 | bwd_microstep: 5790.13 | bwd_inner_microstep: 5673.42 | bwd_allreduce_microstep: 116.67 | step_microstep: 18.62 [2025-04-25 19:55:53,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.51 | bwd: 5790.14 | bwd_inner: 5673.42 | bwd_allreduce: 116.68 | step: 18.62 12%|█▏ | 4951/41250 [11:58:18<87:30:37, 8.68s/it] {'loss': 0.141, 'grad_norm': 2.012799024581909, 'learning_rate': 3.9156099892353364e-05, 'epoch': 1.2} 12%|█▏ | 4951/41250 [11:58:18<87:30:37, 8.68s/it][2025-04-25 19:56:01,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 19:56:01,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.82 | bwd_microstep: 5802.24 | bwd_inner_microstep: 5661.47 | bwd_allreduce_microstep: 140.73 | step_microstep: 18.65 [2025-04-25 19:56:01,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.82 | bwd: 5802.26 | bwd_inner: 5661.47 | bwd_allreduce: 140.75 | step: 18.65 12%|█▏ | 4952/41250 [11:58:27<87:36:27, 8.69s/it] {'loss': 0.0398, 'grad_norm': 0.586517870426178, 'learning_rate': 3.915564849224385e-05, 'epoch': 1.2} 12%|█▏ | 4952/41250 [11:58:27<87:36:27, 8.69s/it][2025-04-25 19:56:10,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.11 | optimizer_step: 1.05 [2025-04-25 19:56:10,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.79 | bwd_microstep: 5738.13 | bwd_inner_microstep: 5724.37 | bwd_allreduce_microstep: 13.71 | 
step_microstep: 19.97 [2025-04-25 19:56:10,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.79 | bwd: 5738.15 | bwd_inner: 5724.37 | bwd_allreduce: 13.73 | step: 19.97 12%|█▏ | 4953/41250 [11:58:35<87:34:29, 8.69s/it] {'loss': 0.0417, 'grad_norm': 0.5316907167434692, 'learning_rate': 3.915519697404354e-05, 'epoch': 1.2} 12%|█▏ | 4953/41250 [11:58:35<87:34:29, 8.69s/it][2025-04-25 19:56:19,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 19:56:19,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.61 | bwd_microstep: 5754.56 | bwd_inner_microstep: 5716.79 | bwd_allreduce_microstep: 37.72 | step_microstep: 18.53 [2025-04-25 19:56:19,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.61 | bwd: 5754.57 | bwd_inner: 5716.79 | bwd_allreduce: 37.74 | step: 18.53 12%|█▏ | 4954/41250 [11:58:44<87:35:22, 8.69s/it] {'loss': 0.1121, 'grad_norm': 0.9589769244194031, 'learning_rate': 3.9154745337755204e-05, 'epoch': 1.2} 12%|█▏ | 4954/41250 [11:58:44<87:35:22, 8.69s/it][2025-04-25 19:56:27,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 19:56:27,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.60 | bwd_microstep: 5774.15 | bwd_inner_microstep: 5704.82 | bwd_allreduce_microstep: 69.29 | step_microstep: 18.66 [2025-04-25 19:56:27,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.60 | bwd: 5774.17 | bwd_inner: 5704.82 | bwd_allreduce: 69.31 | step: 18.66 12%|█▏ | 4955/41250 [11:58:53<87:39:40, 8.69s/it] {'loss': 0.118, 'grad_norm': 1.6285144090652466, 'learning_rate': 3.915429358338163e-05, 'epoch': 1.2} 12%|█▏ | 4955/41250 [11:58:53<87:39:40, 8.69s/it][2025-04-25 19:56:36,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 19:56:36,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.31 | bwd_microstep: 5780.61 | bwd_inner_microstep: 5709.09 | bwd_allreduce_microstep: 71.48 | step_microstep: 19.00 [2025-04-25 19:56:36,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.31 | bwd: 5780.63 | bwd_inner: 5709.09 | bwd_allreduce: 71.49 | step: 19.00 12%|█▏ | 4956/41250 [11:59:01<87:44:49, 8.70s/it] {'loss': 0.1578, 'grad_norm': 1.5781985521316528, 'learning_rate': 3.9153841710925605e-05, 'epoch': 1.2} 12%|█▏ | 4956/41250 [11:59:01<87:44:49, 8.70s/it][2025-04-25 19:56:45,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:56:45,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.04 | bwd_microstep: 5771.63 | bwd_inner_microstep: 5715.88 | bwd_allreduce_microstep: 55.70 | step_microstep: 18.80 [2025-04-25 19:56:45,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.04 | bwd: 5771.64 | bwd_inner: 5715.88 | bwd_allreduce: 55.72 | step: 18.80 12%|█▏ | 4957/41250 [11:59:10<87:45:33, 8.71s/it] {'loss': 0.094, 'grad_norm': 1.3534042835235596, 'learning_rate': 3.9153389720389914e-05, 'epoch': 1.2} 12%|█▏ | 4957/41250 [11:59:10<87:45:33, 8.71s/it][2025-04-25 19:56:53,962] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:56:53,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.43 | bwd_microstep: 5733.46 | bwd_inner_microstep: 5703.07 | bwd_allreduce_microstep: 30.34 | step_microstep: 19.04 [2025-04-25 19:56:53,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.43 | bwd: 5733.48 | bwd_inner: 5703.07 | bwd_allreduce: 30.36 | step: 19.04 12%|█▏ | 4958/41250 [11:59:19<87:41:13, 8.70s/it] {'loss': 0.0997, 'grad_norm': 
2.0155587196350098, 'learning_rate': 3.9152937611777336e-05, 'epoch': 1.2} 12%|█▏ | 4958/41250 [11:59:19<87:41:13, 8.70s/it][2025-04-25 19:57:02,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 19:57:02,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.38 | bwd_microstep: 5784.31 | bwd_inner_microstep: 5666.40 | bwd_allreduce_microstep: 117.87 | step_microstep: 18.69 [2025-04-25 19:57:02,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.38 | bwd: 5784.33 | bwd_inner: 5666.40 | bwd_allreduce: 117.89 | step: 18.69 12%|█▏ | 4959/41250 [11:59:27<87:42:01, 8.70s/it] {'loss': 0.1237, 'grad_norm': 2.0113131999969482, 'learning_rate': 3.9152485385090675e-05, 'epoch': 1.2} 12%|█▏ | 4959/41250 [11:59:27<87:42:01, 8.70s/it][2025-04-25 19:57:11,341] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 19:57:11,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.41 | bwd_microstep: 5739.34 | bwd_inner_microstep: 5705.69 | bwd_allreduce_microstep: 33.60 | step_microstep: 19.49 [2025-04-25 19:57:11,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.41 | bwd: 5739.35 | bwd_inner: 5705.69 | bwd_allreduce: 33.61 | step: 19.49 12%|█▏ | 4960/41250 [11:59:36<87:37:54, 8.69s/it] {'loss': 0.085, 'grad_norm': 1.701134204864502, 'learning_rate': 3.91520330403327e-05, 'epoch': 1.2} 12%|█▏ | 4960/41250 [11:59:36<87:37:54, 8.69s/it][2025-04-25 19:57:19,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 19:57:19,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.19 | bwd_microstep: 5715.79 | bwd_inner_microstep: 5660.93 | bwd_allreduce_microstep: 54.81 | step_microstep: 19.53 [2025-04-25 
19:57:19,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.19 | bwd: 5715.80 | bwd_inner: 5660.93 | bwd_allreduce: 54.83 | step: 19.53 12%|█▏ | 4961/41250 [11:59:45<87:26:54, 8.68s/it] {'loss': 0.1712, 'grad_norm': 3.7515008449554443, 'learning_rate': 3.9151580577506215e-05, 'epoch': 1.2} 12%|█▏ | 4961/41250 [11:59:45<87:26:54, 8.68s/it][2025-04-25 19:57:28,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.04 | optimizer_step: 0.96 [2025-04-25 19:57:28,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.58 | bwd_microstep: 5706.61 | bwd_inner_microstep: 5639.40 | bwd_allreduce_microstep: 67.17 | step_microstep: 19.38 [2025-04-25 19:57:28,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.58 | bwd: 5706.62 | bwd_inner: 5639.40 | bwd_allreduce: 67.18 | step: 19.38 12%|█▏ | 4962/41250 [11:59:53<87:15:49, 8.66s/it] {'loss': 0.2344, 'grad_norm': 2.0149145126342773, 'learning_rate': 3.9151127996614e-05, 'epoch': 1.2} 12%|█▏ | 4962/41250 [11:59:53<87:15:49, 8.66s/it][2025-04-25 19:57:37,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:57:37,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.83 | bwd_microstep: 5757.27 | bwd_inner_microstep: 5653.40 | bwd_allreduce_microstep: 103.83 | step_microstep: 18.86 [2025-04-25 19:57:37,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.83 | bwd: 5757.29 | bwd_inner: 5653.40 | bwd_allreduce: 103.85 | step: 18.86 12%|█▏ | 4963/41250 [12:00:02<87:17:44, 8.66s/it] {'loss': 0.0366, 'grad_norm': 0.455233633518219, 'learning_rate': 3.9150675297658846e-05, 'epoch': 1.2} 12%|█▏ | 4963/41250 [12:00:02<87:17:44, 8.66s/it][2025-04-25 19:57:46,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.90 
[2025-04-25 19:57:46,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.41 | bwd_microstep: 5864.73 | bwd_inner_microstep: 5716.95 | bwd_allreduce_microstep: 147.73 | step_microstep: 19.02 [2025-04-25 19:57:46,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.42 | bwd: 5864.75 | bwd_inner: 5716.95 | bwd_allreduce: 147.75 | step: 19.02 12%|█▏ | 4964/41250 [12:00:11<87:43:36, 8.70s/it] {'loss': 0.1173, 'grad_norm': 1.497464895248413, 'learning_rate': 3.915022248064355e-05, 'epoch': 1.2} 12%|█▏ | 4964/41250 [12:00:11<87:43:36, 8.70s/it][2025-04-25 19:57:54,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 19:57:54,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.68 | bwd_microstep: 5703.42 | bwd_inner_microstep: 5690.54 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.84 [2025-04-25 19:57:54,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.68 | bwd: 5703.43 | bwd_inner: 5690.54 | bwd_allreduce: 12.85 | step: 18.84 12%|█▏ | 4965/41250 [12:00:20<87:30:17, 8.68s/it] {'loss': 0.0818, 'grad_norm': 1.3105581998825073, 'learning_rate': 3.914976954557089e-05, 'epoch': 1.2} 12%|█▏ | 4965/41250 [12:00:20<87:30:17, 8.68s/it][2025-04-25 19:58:03,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:58:03,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.57 | bwd_microstep: 5791.94 | bwd_inner_microstep: 5644.42 | bwd_allreduce_microstep: 147.48 | step_microstep: 18.85 [2025-04-25 19:58:03,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.57 | bwd: 5791.96 | bwd_inner: 5644.42 | bwd_allreduce: 147.50 | step: 18.85 12%|█▏ | 4966/41250 [12:00:28<87:32:57, 8.69s/it] {'loss': 0.1748, 'grad_norm': 2.1041455268859863, 'learning_rate': 
3.9149316492443677e-05, 'epoch': 1.2} 12%|█▏ | 4966/41250 [12:00:28<87:32:57, 8.69s/it][2025-04-25 19:58:12,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:58:12,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.08 | bwd_microstep: 5783.18 | bwd_inner_microstep: 5650.14 | bwd_allreduce_microstep: 132.99 | step_microstep: 19.01 [2025-04-25 19:58:12,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.08 | bwd: 5783.19 | bwd_inner: 5650.14 | bwd_allreduce: 133.01 | step: 19.02 12%|█▏ | 4967/41250 [12:00:37<87:33:55, 8.69s/it] {'loss': 0.2085, 'grad_norm': 2.15913462638855, 'learning_rate': 3.914886332126469e-05, 'epoch': 1.2} 12%|█▏ | 4967/41250 [12:00:37<87:33:55, 8.69s/it][2025-04-25 19:58:20,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 19:58:20,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.79 | bwd_microstep: 5728.72 | bwd_inner_microstep: 5699.62 | bwd_allreduce_microstep: 29.05 | step_microstep: 18.88 [2025-04-25 19:58:20,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.79 | bwd: 5728.73 | bwd_inner: 5699.62 | bwd_allreduce: 29.07 | step: 18.88 12%|█▏ | 4968/41250 [12:00:46<87:29:18, 8.68s/it] {'loss': 0.0871, 'grad_norm': 1.7109946012496948, 'learning_rate': 3.9148410032036724e-05, 'epoch': 1.2} 12%|█▏ | 4968/41250 [12:00:46<87:29:18, 8.68s/it][2025-04-25 19:58:29,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 19:58:29,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.77 | bwd_microstep: 5739.65 | bwd_inner_microstep: 5686.44 | bwd_allreduce_microstep: 53.17 | step_microstep: 19.04 [2025-04-25 19:58:29,419] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.77 | bwd: 5739.67 | bwd_inner: 5686.44 | bwd_allreduce: 53.18 | step: 19.04 12%|█▏ | 4969/41250 [12:00:54<87:27:27, 8.68s/it] {'loss': 0.1493, 'grad_norm': 2.4113035202026367, 'learning_rate': 3.914795662476258e-05, 'epoch': 1.2} 12%|█▏ | 4969/41250 [12:00:54<87:27:27, 8.68s/it][2025-04-25 19:58:38,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-25 19:58:38,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.37 | bwd_microstep: 6010.47 | bwd_inner_microstep: 5633.80 | bwd_allreduce_microstep: 376.62 | step_microstep: 19.21 [2025-04-25 19:58:38,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.37 | bwd: 6010.48 | bwd_inner: 5633.80 | bwd_allreduce: 376.64 | step: 19.21 12%|█▏ | 4970/41250 [12:01:03<88:11:48, 8.75s/it] {'loss': 0.065, 'grad_norm': 1.0255277156829834, 'learning_rate': 3.9147503099445055e-05, 'epoch': 1.2} 12%|█▏ | 4970/41250 [12:01:03<88:11:48, 8.75s/it][2025-04-25 19:58:46,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.07 | optimizer_step: 0.97 [2025-04-25 19:58:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.73 | bwd_microstep: 5687.59 | bwd_inner_microstep: 5637.91 | bwd_allreduce_microstep: 49.62 | step_microstep: 19.30 [2025-04-25 19:58:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.73 | bwd: 5687.61 | bwd_inner: 5637.91 | bwd_allreduce: 49.65 | step: 19.30 12%|█▏ | 4971/41250 [12:01:12<87:43:07, 8.70s/it] {'loss': 0.0825, 'grad_norm': 3.303624153137207, 'learning_rate': 3.914704945608693e-05, 'epoch': 1.21} 12%|█▏ | 4971/41250 [12:01:12<87:43:07, 8.70s/it][2025-04-25 19:58:55,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 
19:58:55,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.96 | bwd_microstep: 5693.55 | bwd_inner_microstep: 5680.75 | bwd_allreduce_microstep: 12.75 | step_microstep: 19.24 [2025-04-25 19:58:55,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.96 | bwd: 5693.56 | bwd_inner: 5680.75 | bwd_allreduce: 12.77 | step: 19.24 12%|█▏ | 4972/41250 [12:01:20<87:27:21, 8.68s/it] {'loss': 0.0361, 'grad_norm': 0.6159043312072754, 'learning_rate': 3.914659569469102e-05, 'epoch': 1.21} 12%|█▏ | 4972/41250 [12:01:20<87:27:21, 8.68s/it][2025-04-25 19:59:04,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:59:04,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.59 | bwd_microstep: 5734.50 | bwd_inner_microstep: 5671.05 | bwd_allreduce_microstep: 63.40 | step_microstep: 18.57 [2025-04-25 19:59:04,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.59 | bwd: 5734.51 | bwd_inner: 5671.05 | bwd_allreduce: 63.42 | step: 18.58 12%|█▏ | 4973/41250 [12:01:29<87:23:10, 8.67s/it] {'loss': 0.1209, 'grad_norm': 2.3566412925720215, 'learning_rate': 3.9146141815260104e-05, 'epoch': 1.21} 12%|█▏ | 4973/41250 [12:01:29<87:23:10, 8.67s/it][2025-04-25 19:59:12,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 19:59:12,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.60 | bwd_microstep: 5693.76 | bwd_inner_microstep: 5628.00 | bwd_allreduce_microstep: 65.71 | step_microstep: 18.42 [2025-04-25 19:59:12,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.60 | bwd: 5693.77 | bwd_inner: 5628.00 | bwd_allreduce: 65.73 | step: 18.43 12%|█▏ | 4974/41250 [12:01:38<87:09:44, 8.65s/it] {'loss': 0.2752, 'grad_norm': 3.125633716583252, 'learning_rate': 3.914568781779699e-05, 
'epoch': 1.21} 12%|█▏ | 4974/41250 [12:01:38<87:09:44, 8.65s/it][2025-04-25 19:59:21,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 19:59:21,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.56 | bwd_microstep: 5698.00 | bwd_inner_microstep: 5629.31 | bwd_allreduce_microstep: 68.64 | step_microstep: 18.63 [2025-04-25 19:59:21,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.56 | bwd: 5698.01 | bwd_inner: 5629.31 | bwd_allreduce: 68.66 | step: 18.63 12%|█▏ | 4975/41250 [12:01:46<87:00:57, 8.64s/it] {'loss': 0.1183, 'grad_norm': 1.8595918416976929, 'learning_rate': 3.914523370230448e-05, 'epoch': 1.21} 12%|█▏ | 4975/41250 [12:01:46<87:00:57, 8.64s/it][2025-04-25 19:59:29,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.97 [2025-04-25 19:59:29,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.70 | bwd_microstep: 5659.99 | bwd_inner_microstep: 5645.24 | bwd_allreduce_microstep: 14.70 | step_microstep: 18.71 [2025-04-25 19:59:29,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.70 | bwd: 5660.01 | bwd_inner: 5645.24 | bwd_allreduce: 14.72 | step: 18.72 12%|█▏ | 4976/41250 [12:01:55<86:50:51, 8.62s/it] {'loss': 0.1315, 'grad_norm': 2.796468734741211, 'learning_rate': 3.914477946878537e-05, 'epoch': 1.21} 12%|█▏ | 4976/41250 [12:01:55<86:50:51, 8.62s/it][2025-04-25 19:59:38,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 19:59:38,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.47 | bwd_microstep: 5726.15 | bwd_inner_microstep: 5673.35 | bwd_allreduce_microstep: 52.75 | step_microstep: 18.37 [2025-04-25 19:59:38,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2838.47 | bwd: 5726.16 | bwd_inner: 5673.35 | bwd_allreduce: 52.77 | step: 18.37 12%|█▏ | 4977/41250 [12:02:03<86:55:50, 8.63s/it] {'loss': 0.1904, 'grad_norm': 3.0751256942749023, 'learning_rate': 3.9144325117242464e-05, 'epoch': 1.21} 12%|█▏ | 4977/41250 [12:02:03<86:55:50, 8.63s/it][2025-04-25 19:59:47,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 19:59:47,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.78 | bwd_microstep: 5750.26 | bwd_inner_microstep: 5633.75 | bwd_allreduce_microstep: 116.46 | step_microstep: 18.50 [2025-04-25 19:59:47,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.78 | bwd: 5750.27 | bwd_inner: 5633.75 | bwd_allreduce: 116.47 | step: 18.50 12%|█▏ | 4978/41250 [12:02:12<87:00:27, 8.64s/it] {'loss': 0.1271, 'grad_norm': 1.2477110624313354, 'learning_rate': 3.914387064767856e-05, 'epoch': 1.21} 12%|█▏ | 4978/41250 [12:02:12<87:00:27, 8.64s/it][2025-04-25 19:59:55,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-25 19:59:55,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.98 | bwd_microstep: 5695.81 | bwd_inner_microstep: 5683.03 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.61 [2025-04-25 19:59:55,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.98 | bwd: 5695.83 | bwd_inner: 5683.03 | bwd_allreduce: 12.75 | step: 18.61 12%|█▏ | 4979/41250 [12:02:21<86:58:12, 8.63s/it] {'loss': 0.0895, 'grad_norm': 1.662016749382019, 'learning_rate': 3.914341606009645e-05, 'epoch': 1.21} 12%|█▏ | 4979/41250 [12:02:21<86:58:12, 8.63s/it][2025-04-25 20:00:04,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:00:04,579] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2840.15 | bwd_microstep: 5736.08 | bwd_inner_microstep: 5678.55 | bwd_allreduce_microstep: 57.48 | step_microstep: 18.50 [2025-04-25 20:00:04,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.15 | bwd: 5736.09 | bwd_inner: 5678.55 | bwd_allreduce: 57.50 | step: 18.50 12%|█▏ | 4980/41250 [12:02:29<87:03:24, 8.64s/it] {'loss': 0.0362, 'grad_norm': 0.7744361758232117, 'learning_rate': 3.9142961354498956e-05, 'epoch': 1.21} 12%|█▏ | 4980/41250 [12:02:29<87:03:24, 8.64s/it][2025-04-25 20:00:13,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 20:00:13,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.31 | bwd_microstep: 5718.39 | bwd_inner_microstep: 5688.34 | bwd_allreduce_microstep: 30.01 | step_microstep: 18.97 [2025-04-25 20:00:13,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.32 | bwd: 5718.40 | bwd_inner: 5688.34 | bwd_allreduce: 30.03 | step: 18.98 12%|█▏ | 4981/41250 [12:02:38<87:04:54, 8.64s/it] {'loss': 0.1609, 'grad_norm': 1.3515477180480957, 'learning_rate': 3.914250653088886e-05, 'epoch': 1.21} 12%|█▏ | 4981/41250 [12:02:38<87:04:54, 8.64s/it][2025-04-25 20:00:21,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:00:21,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.70 | bwd_microstep: 5694.25 | bwd_inner_microstep: 5681.44 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.60 [2025-04-25 20:00:21,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.71 | bwd: 5694.26 | bwd_inner: 5681.44 | bwd_allreduce: 12.78 | step: 18.61 12%|█▏ | 4982/41250 [12:02:47<87:01:09, 8.64s/it] {'loss': 0.1592, 'grad_norm': 3.1764118671417236, 'learning_rate': 3.9142051589268975e-05, 'epoch': 1.21} 12%|█▏ | 4982/41250 
[12:02:47<87:01:09, 8.64s/it][2025-04-25 20:00:30,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:00:30,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.41 | bwd_microstep: 5741.96 | bwd_inner_microstep: 5683.91 | bwd_allreduce_microstep: 58.00 | step_microstep: 18.61 [2025-04-25 20:00:30,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.41 | bwd: 5741.97 | bwd_inner: 5683.91 | bwd_allreduce: 58.02 | step: 18.61 12%|█▏ | 4983/41250 [12:02:55<87:09:55, 8.65s/it] {'loss': 0.1383, 'grad_norm': 1.1897492408752441, 'learning_rate': 3.914159652964211e-05, 'epoch': 1.21} 12%|█▏ | 4983/41250 [12:02:55<87:09:55, 8.65s/it][2025-04-25 20:00:39,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 20:00:39,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.87 | bwd_microstep: 5702.69 | bwd_inner_microstep: 5662.48 | bwd_allreduce_microstep: 40.16 | step_microstep: 19.27 [2025-04-25 20:00:39,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.87 | bwd: 5702.71 | bwd_inner: 5662.48 | bwd_allreduce: 40.18 | step: 19.28 12%|█▏ | 4984/41250 [12:03:04<87:03:35, 8.64s/it] {'loss': 0.1093, 'grad_norm': 1.878277063369751, 'learning_rate': 3.914114135201108e-05, 'epoch': 1.21} 12%|█▏ | 4984/41250 [12:03:04<87:03:35, 8.64s/it][2025-04-25 20:00:47,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 20:00:47,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.47 | bwd_microstep: 5775.57 | bwd_inner_microstep: 5659.03 | bwd_allreduce_microstep: 116.50 | step_microstep: 18.81 [2025-04-25 20:00:47,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.47 | bwd: 5775.59 | 
bwd_inner: 5659.03 | bwd_allreduce: 116.52 | step: 18.81 12%|█▏ | 4985/41250 [12:03:13<87:11:17, 8.66s/it] {'loss': 0.1417, 'grad_norm': 1.152449369430542, 'learning_rate': 3.914068605637867e-05, 'epoch': 1.21} 12%|█▏ | 4985/41250 [12:03:13<87:11:17, 8.66s/it][2025-04-25 20:00:56,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:00:56,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.16 | bwd_microstep: 5888.26 | bwd_inner_microstep: 5660.86 | bwd_allreduce_microstep: 227.35 | step_microstep: 18.39 [2025-04-25 20:00:56,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.16 | bwd: 5888.28 | bwd_inner: 5660.86 | bwd_allreduce: 227.37 | step: 18.39 12%|█▏ | 4986/41250 [12:03:21<87:37:13, 8.70s/it] {'loss': 0.0798, 'grad_norm': 1.2755616903305054, 'learning_rate': 3.914023064274769e-05, 'epoch': 1.21} 12%|█▏ | 4986/41250 [12:03:21<87:37:13, 8.70s/it][2025-04-25 20:01:05,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:01:05,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.87 | bwd_microstep: 5715.10 | bwd_inner_microstep: 5702.36 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.80 [2025-04-25 20:01:05,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.87 | bwd: 5715.11 | bwd_inner: 5702.36 | bwd_allreduce: 12.71 | step: 18.80 12%|█▏ | 4987/41250 [12:03:30<87:28:00, 8.68s/it] {'loss': 0.0179, 'grad_norm': 0.3066762089729309, 'learning_rate': 3.913977511112095e-05, 'epoch': 1.21} 12%|█▏ | 4987/41250 [12:03:30<87:28:00, 8.68s/it][2025-04-25 20:01:14,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 20:01:14,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2837.89 | bwd_microstep: 5787.81 | bwd_inner_microstep: 5652.15 | bwd_allreduce_microstep: 135.62 | step_microstep: 19.16 [2025-04-25 20:01:14,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.89 | bwd: 5787.82 | bwd_inner: 5652.15 | bwd_allreduce: 135.64 | step: 19.16 12%|█▏ | 4988/41250 [12:03:39<87:32:37, 8.69s/it] {'loss': 0.099, 'grad_norm': 1.5418002605438232, 'learning_rate': 3.913931946150127e-05, 'epoch': 1.21} 12%|█▏ | 4988/41250 [12:03:39<87:32:37, 8.69s/it][2025-04-25 20:01:22,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 20:01:22,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.39 | bwd_microstep: 5717.83 | bwd_inner_microstep: 5704.49 | bwd_allreduce_microstep: 13.29 | step_microstep: 19.26 [2025-04-25 20:01:22,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.39 | bwd: 5717.84 | bwd_inner: 5704.49 | bwd_allreduce: 13.31 | step: 19.27 12%|█▏ | 4989/41250 [12:03:47<87:25:12, 8.68s/it] {'loss': 0.1826, 'grad_norm': 2.231804132461548, 'learning_rate': 3.913886369389145e-05, 'epoch': 1.21} 12%|█▏ | 4989/41250 [12:03:47<87:25:12, 8.68s/it][2025-04-25 20:01:31,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:01:31,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.10 | bwd_microstep: 5776.20 | bwd_inner_microstep: 5763.39 | bwd_allreduce_microstep: 12.77 | step_microstep: 19.33 [2025-04-25 20:01:31,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.10 | bwd: 5776.22 | bwd_inner: 5763.39 | bwd_allreduce: 12.79 | step: 19.34 12%|█▏ | 4990/41250 [12:03:56<87:37:04, 8.70s/it] {'loss': 0.4217, 'grad_norm': 2.3078975677490234, 'learning_rate': 3.913840780829429e-05, 'epoch': 1.21} 12%|█▏ | 4990/41250 [12:03:56<87:37:04, 8.70s/it][2025-04-25 
20:01:40,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-25 20:01:40,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.32 | bwd_microstep: 5776.70 | bwd_inner_microstep: 5657.88 | bwd_allreduce_microstep: 118.77 | step_microstep: 19.68 [2025-04-25 20:01:40,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.32 | bwd: 5776.71 | bwd_inner: 5657.88 | bwd_allreduce: 118.79 | step: 19.68 12%|█▏ | 4991/41250 [12:04:05<87:34:51, 8.70s/it] {'loss': 0.1023, 'grad_norm': 1.7703533172607422, 'learning_rate': 3.913795180471262e-05, 'epoch': 1.21} 12%|█▏ | 4991/41250 [12:04:05<87:34:51, 8.70s/it][2025-04-25 20:01:48,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:01:48,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.45 | bwd_microstep: 5723.64 | bwd_inner_microstep: 5649.19 | bwd_allreduce_microstep: 74.40 | step_microstep: 18.99 [2025-04-25 20:01:48,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.45 | bwd: 5723.65 | bwd_inner: 5649.19 | bwd_allreduce: 74.42 | step: 18.99 12%|█▏ | 4992/41250 [12:04:14<87:23:11, 8.68s/it] {'loss': 0.0859, 'grad_norm': 1.9050469398498535, 'learning_rate': 3.9137495683149245e-05, 'epoch': 1.21} 12%|█▏ | 4992/41250 [12:04:14<87:23:11, 8.68s/it][2025-04-25 20:01:57,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.19 | optimizer_step: 0.94 [2025-04-25 20:01:57,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.84 | bwd_microstep: 5710.28 | bwd_inner_microstep: 5697.13 | bwd_allreduce_microstep: 13.09 | step_microstep: 19.21 [2025-04-25 20:01:57,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.84 | bwd: 5710.30 | bwd_inner: 5697.13 | bwd_allreduce: 
13.12 | step: 19.21 12%|█▏ | 4993/41250 [12:04:22<87:19:04, 8.67s/it] {'loss': 0.1149, 'grad_norm': 1.2638007402420044, 'learning_rate': 3.9137039443606965e-05, 'epoch': 1.21} 12%|█▏ | 4993/41250 [12:04:22<87:19:04, 8.67s/it][2025-04-25 20:02:06,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 20:02:06,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.20 | bwd_microstep: 5725.77 | bwd_inner_microstep: 5712.93 | bwd_allreduce_microstep: 12.79 | step_microstep: 19.31 [2025-04-25 20:02:06,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.21 | bwd: 5725.78 | bwd_inner: 5712.93 | bwd_allreduce: 12.81 | step: 19.31 12%|█▏ | 4994/41250 [12:04:31<87:18:04, 8.67s/it] {'loss': 0.1194, 'grad_norm': 1.9572725296020508, 'learning_rate': 3.91365830860886e-05, 'epoch': 1.21} 12%|█▏ | 4994/41250 [12:04:31<87:18:04, 8.67s/it][2025-04-25 20:02:14,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 20:02:14,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.67 | bwd_microstep: 5784.70 | bwd_inner_microstep: 5708.35 | bwd_allreduce_microstep: 76.30 | step_microstep: 18.94 [2025-04-25 20:02:14,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.67 | bwd: 5784.71 | bwd_inner: 5708.35 | bwd_allreduce: 76.32 | step: 18.94 12%|█▏ | 4995/41250 [12:04:40<87:26:55, 8.68s/it] {'loss': 0.0806, 'grad_norm': 1.2304291725158691, 'learning_rate': 3.9136126610596965e-05, 'epoch': 1.21} 12%|█▏ | 4995/41250 [12:04:40<87:26:55, 8.68s/it][2025-04-25 20:02:23,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-25 20:02:23,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.09 | bwd_microstep: 5783.00 | 
bwd_inner_microstep: 5666.45 | bwd_allreduce_microstep: 116.51 | step_microstep: 19.04 [2025-04-25 20:02:23,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.09 | bwd: 5783.02 | bwd_inner: 5666.45 | bwd_allreduce: 116.53 | step: 19.04 12%|█▏ | 4996/41250 [12:04:48<87:29:02, 8.69s/it] {'loss': 0.1497, 'grad_norm': 2.648470640182495, 'learning_rate': 3.913567001713488e-05, 'epoch': 1.21} 12%|█▏ | 4996/41250 [12:04:48<87:29:02, 8.69s/it][2025-04-25 20:02:32,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 20:02:32,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.72 | bwd_microstep: 5724.55 | bwd_inner_microstep: 5657.91 | bwd_allreduce_microstep: 66.59 | step_microstep: 18.76 [2025-04-25 20:02:32,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.72 | bwd: 5724.57 | bwd_inner: 5657.91 | bwd_allreduce: 66.61 | step: 18.76 12%|█▏ | 4997/41250 [12:04:57<87:20:25, 8.67s/it] {'loss': 0.0517, 'grad_norm': 1.6863369941711426, 'learning_rate': 3.913521330570515e-05, 'epoch': 1.21} 12%|█▏ | 4997/41250 [12:04:57<87:20:25, 8.67s/it][2025-04-25 20:02:40,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.12 | optimizer_step: 0.91 [2025-04-25 20:02:40,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.73 | bwd_microstep: 5772.65 | bwd_inner_microstep: 5721.28 | bwd_allreduce_microstep: 51.33 | step_microstep: 18.90 [2025-04-25 20:02:40,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.73 | bwd: 5772.67 | bwd_inner: 5721.28 | bwd_allreduce: 51.35 | step: 18.90 12%|█▏ | 4998/41250 [12:05:06<87:27:15, 8.68s/it] {'loss': 0.1823, 'grad_norm': 3.343055248260498, 'learning_rate': 3.913475647631059e-05, 'epoch': 1.21} 12%|█▏ | 4998/41250 [12:05:06<87:27:15, 8.68s/it][2025-04-25 20:02:49,511] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:02:49,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.12 | bwd_microstep: 5773.08 | bwd_inner_microstep: 5720.31 | bwd_allreduce_microstep: 52.72 | step_microstep: 18.33 [2025-04-25 20:02:49,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.12 | bwd: 5773.09 | bwd_inner: 5720.31 | bwd_allreduce: 52.74 | step: 18.34 12%|█▏ | 4999/41250 [12:05:14<87:31:34, 8.69s/it] {'loss': 0.5073, 'grad_norm': 4.116705894470215, 'learning_rate': 3.913429952895402e-05, 'epoch': 1.21} 12%|█▏ | 4999/41250 [12:05:14<87:31:34, 8.69s/it][2025-04-25 20:02:58,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.07 | optimizer_step: 1.14 [2025-04-25 20:02:58,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.34 | bwd_microstep: 5784.49 | bwd_inner_microstep: 5660.70 | bwd_allreduce_microstep: 123.74 | step_microstep: 19.62 [2025-04-25 20:02:58,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.34 | bwd: 5784.51 | bwd_inner: 5660.70 | bwd_allreduce: 123.76 | step: 19.62 12%|█▏ | 5000/41250 [12:05:23<87:32:27, 8.69s/it] {'loss': 0.0483, 'grad_norm': 1.5798704624176025, 'learning_rate': 3.9133842463638266e-05, 'epoch': 1.21} 12%|█▏ | 5000/41250 [12:05:23<87:32:27, 8.69s/it][2025-04-25 20:03:06,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:03:06,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.29 | bwd_microstep: 5712.37 | bwd_inner_microstep: 5699.65 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.71 [2025-04-25 20:03:06,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.29 | bwd: 5712.39 | bwd_inner: 5699.65 | bwd_allreduce: 12.70 | step: 18.71 
12%|█▏ | 5001/41250 [12:05:32<87:23:49, 8.68s/it] {'loss': 0.049, 'grad_norm': 1.3240474462509155, 'learning_rate': 3.913338528036612e-05, 'epoch': 1.21} 12%|█▏ | 5001/41250 [12:05:32<87:23:49, 8.68s/it][2025-04-25 20:03:15,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:03:15,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.35 | bwd_microstep: 5733.40 | bwd_inner_microstep: 5667.25 | bwd_allreduce_microstep: 66.10 | step_microstep: 18.54 [2025-04-25 20:03:15,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.35 | bwd: 5733.41 | bwd_inner: 5667.25 | bwd_allreduce: 66.12 | step: 18.54 12%|█▏ | 5002/41250 [12:05:40<87:18:18, 8.67s/it] {'loss': 0.0841, 'grad_norm': 2.360732078552246, 'learning_rate': 3.913292797914043e-05, 'epoch': 1.21} 12%|█▏ | 5002/41250 [12:05:40<87:18:18, 8.67s/it][2025-04-25 20:03:24,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:03:24,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.74 | bwd_microstep: 5736.35 | bwd_inner_microstep: 5710.85 | bwd_allreduce_microstep: 25.46 | step_microstep: 18.89 [2025-04-25 20:03:24,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.74 | bwd: 5736.37 | bwd_inner: 5710.85 | bwd_allreduce: 25.47 | step: 18.89 12%|█▏ | 5003/41250 [12:05:49<87:20:01, 8.67s/it] {'loss': 0.0258, 'grad_norm': 0.7353870868682861, 'learning_rate': 3.9132470559964e-05, 'epoch': 1.21} 12%|█▏ | 5003/41250 [12:05:49<87:20:01, 8.67s/it][2025-04-25 20:03:32,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 20:03:32,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.05 | bwd_microstep: 5798.24 | bwd_inner_microstep: 
5658.36 | bwd_allreduce_microstep: 139.82 | step_microstep: 18.87 [2025-04-25 20:03:32,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.04 | bwd: 5798.25 | bwd_inner: 5658.36 | bwd_allreduce: 139.84 | step: 18.88 12%|█▏ | 5004/41250 [12:05:58<87:27:22, 8.69s/it] {'loss': 0.1154, 'grad_norm': 3.8901925086975098, 'learning_rate': 3.9132013022839654e-05, 'epoch': 1.21} 12%|█▏ | 5004/41250 [12:05:58<87:27:22, 8.69s/it][2025-04-25 20:03:41,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.04 | optimizer_step: 1.10 [2025-04-25 20:03:41,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.69 | bwd_microstep: 5781.74 | bwd_inner_microstep: 5650.00 | bwd_allreduce_microstep: 131.69 | step_microstep: 19.49 [2025-04-25 20:03:41,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.69 | bwd: 5781.75 | bwd_inner: 5650.00 | bwd_allreduce: 131.71 | step: 19.50 12%|█▏ | 5005/41250 [12:06:06<87:27:38, 8.69s/it] {'loss': 0.2919, 'grad_norm': 4.008839130401611, 'learning_rate': 3.9131555367770206e-05, 'epoch': 1.21} 12%|█▏ | 5005/41250 [12:06:06<87:27:38, 8.69s/it][2025-04-25 20:03:50,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:03:50,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.35 | bwd_microstep: 5762.06 | bwd_inner_microstep: 5701.18 | bwd_allreduce_microstep: 60.84 | step_microstep: 18.77 [2025-04-25 20:03:50,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.35 | bwd: 5762.08 | bwd_inner: 5701.18 | bwd_allreduce: 60.85 | step: 18.77 12%|█▏ | 5006/41250 [12:06:15<87:29:55, 8.69s/it] {'loss': 0.1804, 'grad_norm': 4.946050643920898, 'learning_rate': 3.913109759475848e-05, 'epoch': 1.21} 12%|█▏ | 5006/41250 [12:06:15<87:29:55, 8.69s/it][2025-04-25 20:03:59,000] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.07 | optimizer_step: 1.09 [2025-04-25 20:03:59,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.72 | bwd_microstep: 5793.41 | bwd_inner_microstep: 5660.49 | bwd_allreduce_microstep: 132.87 | step_microstep: 19.24 [2025-04-25 20:03:59,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.72 | bwd: 5793.42 | bwd_inner: 5660.49 | bwd_allreduce: 132.89 | step: 19.25 12%|█▏ | 5007/41250 [12:06:24<87:32:57, 8.70s/it] {'loss': 0.0791, 'grad_norm': 1.5028494596481323, 'learning_rate': 3.913063970380731e-05, 'epoch': 1.21} 12%|█▏ | 5007/41250 [12:06:24<87:32:57, 8.70s/it][2025-04-25 20:04:07,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:04:07,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.88 | bwd_microstep: 5746.05 | bwd_inner_microstep: 5697.73 | bwd_allreduce_microstep: 48.28 | step_microstep: 18.93 [2025-04-25 20:04:07,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.88 | bwd: 5746.06 | bwd_inner: 5697.73 | bwd_allreduce: 48.29 | step: 18.94 12%|█▏ | 5008/41250 [12:06:33<87:29:51, 8.69s/it] {'loss': 0.1946, 'grad_norm': 2.1430089473724365, 'learning_rate': 3.91301816949195e-05, 'epoch': 1.21} 12%|█▏ | 5008/41250 [12:06:33<87:29:51, 8.69s/it][2025-04-25 20:04:16,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 20:04:16,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.41 | bwd_microstep: 5849.93 | bwd_inner_microstep: 5700.53 | bwd_allreduce_microstep: 149.35 | step_microstep: 18.97 [2025-04-25 20:04:16,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.41 | bwd: 5849.95 | bwd_inner: 5700.53 | bwd_allreduce: 149.37 | step: 18.97 12%|█▏ | 5009/41250 [12:06:41<87:45:47, 
8.72s/it] {'loss': 0.058, 'grad_norm': 0.7106497883796692, 'learning_rate': 3.912972356809789e-05, 'epoch': 1.21} 12%|█▏ | 5009/41250 [12:06:41<87:45:47, 8.72s/it][2025-04-25 20:04:25,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:04:25,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.39 | bwd_microstep: 5761.80 | bwd_inner_microstep: 5703.05 | bwd_allreduce_microstep: 58.70 | step_microstep: 18.94 [2025-04-25 20:04:25,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.39 | bwd: 5761.81 | bwd_inner: 5703.05 | bwd_allreduce: 58.72 | step: 18.94 12%|█▏ | 5010/41250 [12:06:50<87:41:12, 8.71s/it] {'loss': 0.1867, 'grad_norm': 1.5591517686843872, 'learning_rate': 3.912926532334529e-05, 'epoch': 1.21} 12%|█▏ | 5010/41250 [12:06:50<87:41:12, 8.71s/it][2025-04-25 20:04:33,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:04:33,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.54 | bwd_microstep: 5737.35 | bwd_inner_microstep: 5677.38 | bwd_allreduce_microstep: 59.93 | step_microstep: 18.70 [2025-04-25 20:04:33,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.54 | bwd: 5737.37 | bwd_inner: 5677.38 | bwd_allreduce: 59.95 | step: 18.70 12%|█▏ | 5011/41250 [12:06:59<87:33:32, 8.70s/it] {'loss': 0.1329, 'grad_norm': 1.522217869758606, 'learning_rate': 3.912880696066453e-05, 'epoch': 1.21} 12%|█▏ | 5011/41250 [12:06:59<87:33:32, 8.70s/it][2025-04-25 20:04:42,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-25 20:04:42,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.94 | bwd_microstep: 5705.51 | bwd_inner_microstep: 5692.86 | bwd_allreduce_microstep: 12.60 | 
step_microstep: 19.23 [2025-04-25 20:04:42,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.94 | bwd: 5705.52 | bwd_inner: 5692.86 | bwd_allreduce: 12.62 | step: 19.23 12%|█▏ | 5012/41250 [12:07:07<87:22:36, 8.68s/it] {'loss': 0.1252, 'grad_norm': 1.374525785446167, 'learning_rate': 3.9128348480058444e-05, 'epoch': 1.22} 12%|█▏ | 5012/41250 [12:07:07<87:22:36, 8.68s/it][2025-04-25 20:04:51,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 20:04:51,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.70 | bwd_microstep: 5738.35 | bwd_inner_microstep: 5687.69 | bwd_allreduce_microstep: 50.62 | step_microstep: 18.55 [2025-04-25 20:04:51,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.70 | bwd: 5738.37 | bwd_inner: 5687.69 | bwd_allreduce: 50.64 | step: 18.56 12%|█▏ | 5013/41250 [12:07:16<87:21:45, 8.68s/it] {'loss': 0.3143, 'grad_norm': 2.638021230697632, 'learning_rate': 3.912788988152985e-05, 'epoch': 1.22} 12%|█▏ | 5013/41250 [12:07:16<87:21:45, 8.68s/it][2025-04-25 20:04:59,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 20:04:59,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.02 | bwd_microstep: 5774.64 | bwd_inner_microstep: 5657.53 | bwd_allreduce_microstep: 117.07 | step_microstep: 18.83 [2025-04-25 20:04:59,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.02 | bwd: 5774.65 | bwd_inner: 5657.53 | bwd_allreduce: 117.09 | step: 18.83 12%|█▏ | 5014/41250 [12:07:25<87:26:06, 8.69s/it] {'loss': 0.4135, 'grad_norm': 6.624649524688721, 'learning_rate': 3.912743116508158e-05, 'epoch': 1.22} 12%|█▏ | 5014/41250 [12:07:25<87:26:06, 8.69s/it][2025-04-25 20:05:08,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | 
optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:05:08,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.90 | bwd_microstep: 5686.61 | bwd_inner_microstep: 5643.14 | bwd_allreduce_microstep: 43.43 | step_microstep: 18.62 [2025-04-25 20:05:08,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.90 | bwd: 5686.62 | bwd_inner: 5643.14 | bwd_allreduce: 43.44 | step: 18.63 12%|█▏ | 5015/41250 [12:07:33<87:10:42, 8.66s/it] {'loss': 0.1049, 'grad_norm': 1.9475866556167603, 'learning_rate': 3.9126972330716455e-05, 'epoch': 1.22} 12%|█▏ | 5015/41250 [12:07:33<87:10:42, 8.66s/it][2025-04-25 20:05:17,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:05:17,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.49 | bwd_microstep: 5691.07 | bwd_inner_microstep: 5657.91 | bwd_allreduce_microstep: 33.11 | step_microstep: 19.13 [2025-04-25 20:05:17,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.49 | bwd: 5691.08 | bwd_inner: 5657.91 | bwd_allreduce: 33.13 | step: 19.13 12%|█▏ | 5016/41250 [12:07:42<87:02:45, 8.65s/it] {'loss': 0.1095, 'grad_norm': 1.510125994682312, 'learning_rate': 3.912651337843731e-05, 'epoch': 1.22} 12%|█▏ | 5016/41250 [12:07:42<87:02:45, 8.65s/it][2025-04-25 20:05:25,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-25 20:05:25,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.88 | bwd_microstep: 5780.36 | bwd_inner_microstep: 5649.58 | bwd_allreduce_microstep: 130.73 | step_microstep: 18.50 [2025-04-25 20:05:25,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.88 | bwd: 5780.37 | bwd_inner: 5649.58 | bwd_allreduce: 130.74 | step: 18.50 12%|█▏ | 5017/41250 [12:07:51<87:12:33, 8.66s/it] {'loss': 0.1416, 'grad_norm': 
1.1436816453933716, 'learning_rate': 3.912605430824697e-05, 'epoch': 1.22} 12%|█▏ | 5017/41250 [12:07:51<87:12:33, 8.66s/it][2025-04-25 20:05:34,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 20:05:34,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.29 | bwd_microstep: 5709.31 | bwd_inner_microstep: 5696.59 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.18 [2025-04-25 20:05:34,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.29 | bwd: 5709.32 | bwd_inner: 5696.59 | bwd_allreduce: 12.70 | step: 19.18 12%|█▏ | 5018/41250 [12:07:59<87:09:44, 8.66s/it] {'loss': 0.2378, 'grad_norm': 2.4052534103393555, 'learning_rate': 3.9125595120148266e-05, 'epoch': 1.22} 12%|█▏ | 5018/41250 [12:07:59<87:09:44, 8.66s/it][2025-04-25 20:05:43,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 20:05:43,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.75 | bwd_microstep: 5703.80 | bwd_inner_microstep: 5690.87 | bwd_allreduce_microstep: 12.89 | step_microstep: 18.89 [2025-04-25 20:05:43,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.75 | bwd: 5703.82 | bwd_inner: 5690.87 | bwd_allreduce: 12.91 | step: 18.89 12%|█▏ | 5019/41250 [12:08:08<87:05:08, 8.65s/it] {'loss': 0.1368, 'grad_norm': 1.9468337297439575, 'learning_rate': 3.912513581414403e-05, 'epoch': 1.22} 12%|█▏ | 5019/41250 [12:08:08<87:05:08, 8.65s/it][2025-04-25 20:05:51,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 20:05:51,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.37 | bwd_microstep: 5715.49 | bwd_inner_microstep: 5702.45 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.68 [2025-04-25 
20:05:51,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.37 | bwd: 5715.51 | bwd_inner: 5702.45 | bwd_allreduce: 13.01 | step: 18.68 12%|█▏ | 5020/41250 [12:08:17<87:05:47, 8.65s/it] {'loss': 0.1255, 'grad_norm': 1.4539841413497925, 'learning_rate': 3.91246763902371e-05, 'epoch': 1.22} 12%|█▏ | 5020/41250 [12:08:17<87:05:47, 8.65s/it][2025-04-25 20:06:00,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:06:00,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.52 | bwd_microstep: 5728.00 | bwd_inner_microstep: 5686.58 | bwd_allreduce_microstep: 41.37 | step_microstep: 18.58 [2025-04-25 20:06:00,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.52 | bwd: 5728.01 | bwd_inner: 5686.58 | bwd_allreduce: 41.39 | step: 18.59 12%|█▏ | 5021/41250 [12:08:25<87:08:04, 8.66s/it] {'loss': 0.0984, 'grad_norm': 2.749279737472534, 'learning_rate': 3.912421684843029e-05, 'epoch': 1.22} 12%|█▏ | 5021/41250 [12:08:25<87:08:04, 8.66s/it][2025-04-25 20:06:09,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 20:06:09,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.82 | bwd_microstep: 5718.02 | bwd_inner_microstep: 5705.21 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.67 [2025-04-25 20:06:09,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.82 | bwd: 5718.04 | bwd_inner: 5705.21 | bwd_allreduce: 12.79 | step: 18.67 12%|█▏ | 5022/41250 [12:08:34<87:08:51, 8.66s/it] {'loss': 0.1086, 'grad_norm': 1.2631206512451172, 'learning_rate': 3.912375718872646e-05, 'epoch': 1.22} 12%|█▏ | 5022/41250 [12:08:34<87:08:51, 8.66s/it][2025-04-25 20:06:17,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.89 
[2025-04-25 20:06:17,635] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.15 | bwd_microstep: 5680.03 | bwd_inner_microstep: 5640.27 | bwd_allreduce_microstep: 39.71 | step_microstep: 18.61 [2025-04-25 20:06:17,635] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.15 | bwd: 5680.04 | bwd_inner: 5640.27 | bwd_allreduce: 39.73 | step: 18.61 12%|█▏ | 5023/41250 [12:08:42<86:56:44, 8.64s/it] {'loss': 0.1003, 'grad_norm': 1.817197322845459, 'learning_rate': 3.912329741112841e-05, 'epoch': 1.22} 12%|█▏ | 5023/41250 [12:08:42<86:56:44, 8.64s/it][2025-04-25 20:06:26,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:06:26,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.54 | bwd_microstep: 5667.42 | bwd_inner_microstep: 5642.49 | bwd_allreduce_microstep: 24.89 | step_microstep: 18.29 [2025-04-25 20:06:26,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.54 | bwd: 5667.43 | bwd_inner: 5642.49 | bwd_allreduce: 24.90 | step: 18.29 12%|█▏ | 5024/41250 [12:08:51<86:47:35, 8.63s/it] {'loss': 0.1737, 'grad_norm': 4.090746879577637, 'learning_rate': 3.912283751563901e-05, 'epoch': 1.22} 12%|█▏ | 5024/41250 [12:08:51<86:47:35, 8.63s/it][2025-04-25 20:06:34,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:06:34,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.01 | bwd_microstep: 5715.08 | bwd_inner_microstep: 5695.36 | bwd_allreduce_microstep: 19.68 | step_microstep: 18.62 [2025-04-25 20:06:34,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.01 | bwd: 5715.09 | bwd_inner: 5695.36 | bwd_allreduce: 19.69 | step: 18.63 12%|█▏ | 5025/41250 [12:09:00<86:51:36, 8.63s/it] {'loss': 0.1028, 'grad_norm': 1.5080643892288208, 'learning_rate': 
3.912237750226107e-05, 'epoch': 1.22} 12%|█▏ | 5025/41250 [12:09:00<86:51:36, 8.63s/it][2025-04-25 20:06:43,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:06:43,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.11 | bwd_microstep: 5700.89 | bwd_inner_microstep: 5688.19 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.50 [2025-04-25 20:06:43,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.11 | bwd: 5700.91 | bwd_inner: 5688.19 | bwd_allreduce: 12.68 | step: 18.50 12%|█▏ | 5026/41250 [12:09:08<86:53:07, 8.63s/it] {'loss': 0.0487, 'grad_norm': 0.6209538578987122, 'learning_rate': 3.912191737099743e-05, 'epoch': 1.22} 12%|█▏ | 5026/41250 [12:09:08<86:53:07, 8.63s/it][2025-04-25 20:06:52,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:06:52,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.98 | bwd_microstep: 5715.80 | bwd_inner_microstep: 5702.95 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.70 [2025-04-25 20:06:52,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.98 | bwd: 5715.82 | bwd_inner: 5702.95 | bwd_allreduce: 12.83 | step: 18.70 12%|█▏ | 5027/41250 [12:09:17<86:59:58, 8.65s/it] {'loss': 0.0862, 'grad_norm': 0.8333643078804016, 'learning_rate': 3.912145712185094e-05, 'epoch': 1.22} 12%|█▏ | 5027/41250 [12:09:17<86:59:58, 8.65s/it][2025-04-25 20:07:00,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 20:07:00,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.99 | bwd_microstep: 5747.48 | bwd_inner_microstep: 5644.78 | bwd_allreduce_microstep: 102.65 | step_microstep: 18.36 [2025-04-25 20:07:00,855] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.99 | bwd: 5747.49 | bwd_inner: 5644.78 | bwd_allreduce: 102.67 | step: 18.36 12%|█▏ | 5028/41250 [12:09:26<87:03:27, 8.65s/it] {'loss': 0.1501, 'grad_norm': 1.395484209060669, 'learning_rate': 3.912099675482442e-05, 'epoch': 1.22} 12%|█▏ | 5028/41250 [12:09:26<87:03:27, 8.65s/it][2025-04-25 20:07:09,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:07:09,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.44 | bwd_microstep: 5673.11 | bwd_inner_microstep: 5654.19 | bwd_allreduce_microstep: 18.88 | step_microstep: 18.26 [2025-04-25 20:07:09,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.44 | bwd: 5673.12 | bwd_inner: 5654.19 | bwd_allreduce: 18.89 | step: 18.26 12%|█▏ | 5029/41250 [12:09:34<86:52:36, 8.63s/it] {'loss': 0.1443, 'grad_norm': 1.3053386211395264, 'learning_rate': 3.912053626992072e-05, 'epoch': 1.22} 12%|█▏ | 5029/41250 [12:09:34<86:52:36, 8.63s/it][2025-04-25 20:07:18,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:07:18,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.13 | bwd_microstep: 5754.42 | bwd_inner_microstep: 5696.53 | bwd_allreduce_microstep: 57.84 | step_microstep: 18.51 [2025-04-25 20:07:18,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.13 | bwd: 5754.43 | bwd_inner: 5696.53 | bwd_allreduce: 57.86 | step: 18.51 12%|█▏ | 5030/41250 [12:09:43<87:04:10, 8.65s/it] {'loss': 0.1345, 'grad_norm': 1.6842046976089478, 'learning_rate': 3.912007566714267e-05, 'epoch': 1.22} 12%|█▏ | 5030/41250 [12:09:43<87:04:10, 8.65s/it][2025-04-25 20:07:26,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 
20:07:26,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.06 | bwd_microstep: 5857.69 | bwd_inner_microstep: 5652.70 | bwd_allreduce_microstep: 204.95 | step_microstep: 19.04 [2025-04-25 20:07:26,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.06 | bwd: 5857.70 | bwd_inner: 5652.70 | bwd_allreduce: 204.96 | step: 19.04 12%|█▏ | 5031/41250 [12:09:52<87:25:49, 8.69s/it] {'loss': 0.0773, 'grad_norm': 1.2687104940414429, 'learning_rate': 3.911961494649312e-05, 'epoch': 1.22} 12%|█▏ | 5031/41250 [12:09:52<87:25:49, 8.69s/it][2025-04-25 20:07:35,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:07:35,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.66 | bwd_microstep: 5692.62 | bwd_inner_microstep: 5658.86 | bwd_allreduce_microstep: 33.72 | step_microstep: 18.50 [2025-04-25 20:07:35,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.66 | bwd: 5692.63 | bwd_inner: 5658.86 | bwd_allreduce: 33.74 | step: 18.50 12%|█▏ | 5032/41250 [12:10:00<87:09:32, 8.66s/it] {'loss': 0.0645, 'grad_norm': 0.9208755493164062, 'learning_rate': 3.91191541079749e-05, 'epoch': 1.22} 12%|█▏ | 5032/41250 [12:10:00<87:09:32, 8.66s/it][2025-04-25 20:07:44,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:07:44,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.54 | bwd_microstep: 5698.92 | bwd_inner_microstep: 5686.35 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.65 [2025-04-25 20:07:44,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.54 | bwd: 5698.93 | bwd_inner: 5686.35 | bwd_allreduce: 12.54 | step: 18.65 12%|█▏ | 5033/41250 [12:10:09<87:05:42, 8.66s/it] {'loss': 0.1784, 'grad_norm': 2.3678932189941406, 'learning_rate': 3.911869315159086e-05, 
'epoch': 1.22} 12%|█▏ | 5033/41250 [12:10:09<87:05:42, 8.66s/it][2025-04-25 20:07:52,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 20:07:52,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.11 | bwd_microstep: 5763.74 | bwd_inner_microstep: 5651.67 | bwd_allreduce_microstep: 112.02 | step_microstep: 18.42 [2025-04-25 20:07:52,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.11 | bwd: 5763.75 | bwd_inner: 5651.67 | bwd_allreduce: 112.04 | step: 18.43 12%|█▏ | 5034/41250 [12:10:18<87:10:57, 8.67s/it] {'loss': 0.1842, 'grad_norm': 1.895382046699524, 'learning_rate': 3.9118232077343834e-05, 'epoch': 1.22} 12%|█▏ | 5034/41250 [12:10:18<87:10:57, 8.67s/it][2025-04-25 20:08:01,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:08:01,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.95 | bwd_microstep: 5702.67 | bwd_inner_microstep: 5690.15 | bwd_allreduce_microstep: 12.48 | step_microstep: 18.64 [2025-04-25 20:08:01,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.95 | bwd: 5702.68 | bwd_inner: 5690.15 | bwd_allreduce: 12.49 | step: 18.64 12%|█▏ | 5035/41250 [12:10:26<87:06:13, 8.66s/it] {'loss': 0.0506, 'grad_norm': 0.5842990279197693, 'learning_rate': 3.9117770885236665e-05, 'epoch': 1.22} 12%|█▏ | 5035/41250 [12:10:26<87:06:13, 8.66s/it][2025-04-25 20:08:10,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 1.03 [2025-04-25 20:08:10,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.34 | bwd_microstep: 5755.73 | bwd_inner_microstep: 5641.43 | bwd_allreduce_microstep: 114.26 | step_microstep: 18.53 [2025-04-25 20:08:10,158] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2826.34 | bwd: 5755.74 | bwd_inner: 5641.43 | bwd_allreduce: 114.27 | step: 18.54 12%|█▏ | 5036/41250 [12:10:35<87:07:01, 8.66s/it] {'loss': 0.079, 'grad_norm': 1.7772780656814575, 'learning_rate': 3.911730957527221e-05, 'epoch': 1.22} 12%|█▏ | 5036/41250 [12:10:35<87:07:01, 8.66s/it][2025-04-25 20:08:18,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 20:08:18,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.27 | bwd_microstep: 5789.60 | bwd_inner_microstep: 5776.85 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.25 [2025-04-25 20:08:18,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.27 | bwd: 5789.62 | bwd_inner: 5776.85 | bwd_allreduce: 12.72 | step: 18.25 12%|█▏ | 5037/41250 [12:10:44<87:24:57, 8.69s/it] {'loss': 0.0952, 'grad_norm': 2.0197103023529053, 'learning_rate': 3.911684814745329e-05, 'epoch': 1.22} 12%|█▏ | 5037/41250 [12:10:44<87:24:57, 8.69s/it][2025-04-25 20:08:27,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:08:27,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.68 | bwd_microstep: 5746.62 | bwd_inner_microstep: 5657.93 | bwd_allreduce_microstep: 88.64 | step_microstep: 18.59 [2025-04-25 20:08:27,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.68 | bwd: 5746.63 | bwd_inner: 5657.93 | bwd_allreduce: 88.66 | step: 18.59 12%|█▏ | 5038/41250 [12:10:52<87:20:32, 8.68s/it] {'loss': 0.1246, 'grad_norm': 2.08027982711792, 'learning_rate': 3.911638660178276e-05, 'epoch': 1.22} 12%|█▏ | 5038/41250 [12:10:52<87:20:32, 8.68s/it][2025-04-25 20:08:36,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:08:36,365] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2865.38 | bwd_microstep: 5826.95 | bwd_inner_microstep: 5705.79 | bwd_allreduce_microstep: 121.11 | step_microstep: 18.69 [2025-04-25 20:08:36,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.38 | bwd: 5826.96 | bwd_inner: 5705.79 | bwd_allreduce: 121.13 | step: 18.69 12%|█▏ | 5039/41250 [12:11:01<87:38:05, 8.71s/it] {'loss': 0.221, 'grad_norm': 4.040012359619141, 'learning_rate': 3.911592493826348e-05, 'epoch': 1.22} 12%|█▏ | 5039/41250 [12:11:01<87:38:05, 8.71s/it][2025-04-25 20:08:44,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:08:44,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.34 | bwd_microstep: 5714.95 | bwd_inner_microstep: 5651.90 | bwd_allreduce_microstep: 62.99 | step_microstep: 19.02 [2025-04-25 20:08:44,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.34 | bwd: 5714.96 | bwd_inner: 5651.90 | bwd_allreduce: 63.01 | step: 19.02 12%|█▏ | 5040/41250 [12:11:10<87:22:39, 8.69s/it] {'loss': 0.0541, 'grad_norm': 0.9043728113174438, 'learning_rate': 3.911546315689827e-05, 'epoch': 1.22} 12%|█▏ | 5040/41250 [12:11:10<87:22:39, 8.69s/it][2025-04-25 20:08:53,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-25 20:08:53,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.16 | bwd_microstep: 5769.68 | bwd_inner_microstep: 5721.74 | bwd_allreduce_microstep: 47.89 | step_microstep: 19.32 [2025-04-25 20:08:53,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.16 | bwd: 5769.69 | bwd_inner: 5721.74 | bwd_allreduce: 47.91 | step: 19.32 12%|█▏ | 5041/41250 [12:11:19<87:26:27, 8.69s/it] {'loss': 0.0546, 'grad_norm': 1.4566926956176758, 'learning_rate': 3.911500125768999e-05, 'epoch': 1.22} 12%|█▏ | 5041/41250 
[12:11:19<87:26:27, 8.69s/it][2025-04-25 20:09:02,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:09:02,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.15 | bwd_microstep: 5772.15 | bwd_inner_microstep: 5717.88 | bwd_allreduce_microstep: 54.22 | step_microstep: 18.55 [2025-04-25 20:09:02,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.15 | bwd: 5772.16 | bwd_inner: 5717.88 | bwd_allreduce: 54.24 | step: 18.55 12%|█▏ | 5042/41250 [12:11:27<87:29:03, 8.70s/it] {'loss': 0.1122, 'grad_norm': 1.566741943359375, 'learning_rate': 3.911453924064149e-05, 'epoch': 1.22} 12%|█▏ | 5042/41250 [12:11:27<87:29:03, 8.70s/it][2025-04-25 20:09:11,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 20:09:11,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.96 | bwd_microstep: 5735.69 | bwd_inner_microstep: 5722.78 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.04 [2025-04-25 20:09:11,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.96 | bwd: 5735.71 | bwd_inner: 5722.78 | bwd_allreduce: 12.88 | step: 19.04 12%|█▏ | 5043/41250 [12:11:36<87:25:43, 8.69s/it] {'loss': 0.3892, 'grad_norm': 4.925854206085205, 'learning_rate': 3.911407710575562e-05, 'epoch': 1.22} 12%|█▏ | 5043/41250 [12:11:36<87:25:43, 8.69s/it][2025-04-25 20:09:19,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.06 | optimizer_step: 0.90 [2025-04-25 20:09:19,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.15 | bwd_microstep: 5768.42 | bwd_inner_microstep: 5669.35 | bwd_allreduce_microstep: 99.02 | step_microstep: 18.75 [2025-04-25 20:09:19,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.15 | bwd: 5768.43 | 
bwd_inner: 5669.35 | bwd_allreduce: 99.04 | step: 18.75 12%|█▏ | 5044/41250 [12:11:45<87:24:36, 8.69s/it] {'loss': 0.4229, 'grad_norm': 2.192835569381714, 'learning_rate': 3.911361485303522e-05, 'epoch': 1.22} 12%|█▏ | 5044/41250 [12:11:45<87:24:36, 8.69s/it][2025-04-25 20:09:28,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:09:28,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.59 | bwd_microstep: 5716.75 | bwd_inner_microstep: 5703.79 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.77 [2025-04-25 20:09:28,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.59 | bwd: 5716.76 | bwd_inner: 5703.79 | bwd_allreduce: 12.93 | step: 18.77 12%|█▏ | 5045/41250 [12:11:53<87:17:43, 8.68s/it] {'loss': 0.0715, 'grad_norm': 1.2645682096481323, 'learning_rate': 3.9113152482483145e-05, 'epoch': 1.22} 12%|█▏ | 5045/41250 [12:11:53<87:17:43, 8.68s/it][2025-04-25 20:09:37,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:09:37,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.54 | bwd_microstep: 5718.11 | bwd_inner_microstep: 5705.44 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.65 [2025-04-25 20:09:37,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.54 | bwd: 5718.13 | bwd_inner: 5705.44 | bwd_allreduce: 12.65 | step: 18.65 12%|█▏ | 5046/41250 [12:12:02<87:13:28, 8.67s/it] {'loss': 0.0781, 'grad_norm': 1.0807932615280151, 'learning_rate': 3.911268999410224e-05, 'epoch': 1.22} 12%|█▏ | 5046/41250 [12:12:02<87:13:28, 8.67s/it][2025-04-25 20:09:45,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 20:09:45,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2849.16 | bwd_microstep: 5767.39 | bwd_inner_microstep: 5703.48 | bwd_allreduce_microstep: 63.86 | step_microstep: 18.83 [2025-04-25 20:09:45,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.16 | bwd: 5767.41 | bwd_inner: 5703.48 | bwd_allreduce: 63.88 | step: 18.83 12%|█▏ | 5047/41250 [12:12:11<87:17:51, 8.68s/it] {'loss': 0.3425, 'grad_norm': 5.316107749938965, 'learning_rate': 3.9112227387895366e-05, 'epoch': 1.22} 12%|█▏ | 5047/41250 [12:12:11<87:17:51, 8.68s/it][2025-04-25 20:09:54,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:09:54,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.05 | bwd_microstep: 5780.40 | bwd_inner_microstep: 5656.04 | bwd_allreduce_microstep: 124.31 | step_microstep: 18.74 [2025-04-25 20:09:54,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.05 | bwd: 5780.41 | bwd_inner: 5656.04 | bwd_allreduce: 124.33 | step: 18.74 12%|█▏ | 5048/41250 [12:12:19<87:19:45, 8.68s/it] {'loss': 0.1066, 'grad_norm': 1.2257190942764282, 'learning_rate': 3.911176466386537e-05, 'epoch': 1.22} 12%|█▏ | 5048/41250 [12:12:19<87:19:45, 8.68s/it][2025-04-25 20:10:03,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.12 | optimizer_step: 1.03 [2025-04-25 20:10:03,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.28 | bwd_microstep: 5781.73 | bwd_inner_microstep: 5719.23 | bwd_allreduce_microstep: 62.44 | step_microstep: 19.97 [2025-04-25 20:10:03,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.28 | bwd: 5781.75 | bwd_inner: 5719.23 | bwd_allreduce: 62.47 | step: 19.98 12%|█▏ | 5049/41250 [12:12:28<87:27:07, 8.70s/it] {'loss': 0.0202, 'grad_norm': 0.5248527526855469, 'learning_rate': 3.91113018220151e-05, 'epoch': 1.22} 12%|█▏ | 5049/41250 [12:12:28<87:27:07, 8.70s/it][2025-04-25 
20:10:12,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:10:12,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2903.23 | bwd_microstep: 5851.62 | bwd_inner_microstep: 5791.87 | bwd_allreduce_microstep: 59.70 | step_microstep: 18.81 [2025-04-25 20:10:12,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2903.23 | bwd: 5851.63 | bwd_inner: 5791.87 | bwd_allreduce: 59.72 | step: 18.81 12%|█▏ | 5050/41250 [12:12:37<87:52:27, 8.74s/it] {'loss': 0.2238, 'grad_norm': 1.9877865314483643, 'learning_rate': 3.9110838862347414e-05, 'epoch': 1.22} 12%|█▏ | 5050/41250 [12:12:37<87:52:27, 8.74s/it][2025-04-25 20:10:20,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 20:10:20,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.40 | bwd_microstep: 5754.95 | bwd_inner_microstep: 5715.22 | bwd_allreduce_microstep: 39.69 | step_microstep: 18.98 [2025-04-25 20:10:20,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.40 | bwd: 5754.96 | bwd_inner: 5715.21 | bwd_allreduce: 39.71 | step: 18.98 12%|█▏ | 5051/41250 [12:12:46<87:44:47, 8.73s/it] {'loss': 0.0549, 'grad_norm': 0.9897903203964233, 'learning_rate': 3.911037578486516e-05, 'epoch': 1.22} 12%|█▏ | 5051/41250 [12:12:46<87:44:47, 8.73s/it][2025-04-25 20:10:29,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:10:29,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.82 | bwd_microstep: 5774.23 | bwd_inner_microstep: 5669.81 | bwd_allreduce_microstep: 104.37 | step_microstep: 19.32 [2025-04-25 20:10:29,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.82 | bwd: 5774.24 | bwd_inner: 5669.81 | bwd_allreduce: 
104.39 | step: 19.33 12%|█▏ | 5052/41250 [12:12:54<87:37:56, 8.72s/it] {'loss': 0.1498, 'grad_norm': 2.4899444580078125, 'learning_rate': 3.910991258957121e-05, 'epoch': 1.22} 12%|█▏ | 5052/41250 [12:12:54<87:37:56, 8.72s/it][2025-04-25 20:10:38,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:10:38,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.36 | bwd_microstep: 5732.87 | bwd_inner_microstep: 5719.95 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.91 [2025-04-25 20:10:38,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.36 | bwd: 5732.88 | bwd_inner: 5719.95 | bwd_allreduce: 12.88 | step: 18.91 12%|█▏ | 5053/41250 [12:13:03<87:30:01, 8.70s/it] {'loss': 0.1505, 'grad_norm': 1.8523670434951782, 'learning_rate': 3.910944927646839e-05, 'epoch': 1.22} 12%|█▏ | 5053/41250 [12:13:03<87:30:01, 8.70s/it][2025-04-25 20:10:46,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-25 20:10:46,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.02 | bwd_microstep: 5709.27 | bwd_inner_microstep: 5658.43 | bwd_allreduce_microstep: 50.80 | step_microstep: 19.04 [2025-04-25 20:10:46,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.02 | bwd: 5709.29 | bwd_inner: 5658.43 | bwd_allreduce: 50.82 | step: 19.04 12%|█▏ | 5054/41250 [12:13:12<87:17:35, 8.68s/it] {'loss': 0.1403, 'grad_norm': 2.547337293624878, 'learning_rate': 3.910898584555959e-05, 'epoch': 1.23} 12%|█▏ | 5054/41250 [12:13:12<87:17:35, 8.68s/it][2025-04-25 20:10:55,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:10:55,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.04 | bwd_microstep: 5755.51 | 
bwd_inner_microstep: 5658.68 | bwd_allreduce_microstep: 96.78 | step_microstep: 18.58 [2025-04-25 20:10:55,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.04 | bwd: 5755.52 | bwd_inner: 5658.68 | bwd_allreduce: 96.80 | step: 18.59 12%|█▏ | 5055/41250 [12:13:20<87:16:04, 8.68s/it] {'loss': 0.2557, 'grad_norm': 3.742323160171509, 'learning_rate': 3.910852229684764e-05, 'epoch': 1.23} 12%|█▏ | 5055/41250 [12:13:20<87:16:04, 8.68s/it][2025-04-25 20:11:04,035] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:11:04,035] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.09 | bwd_microstep: 5695.55 | bwd_inner_microstep: 5665.42 | bwd_allreduce_microstep: 30.09 | step_microstep: 18.75 [2025-04-25 20:11:04,035] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.09 | bwd: 5695.57 | bwd_inner: 5665.42 | bwd_allreduce: 30.11 | step: 18.75 12%|█▏ | 5056/41250 [12:13:29<87:05:43, 8.66s/it] {'loss': 0.032, 'grad_norm': 0.4658140242099762, 'learning_rate': 3.910805863033542e-05, 'epoch': 1.23} 12%|█▏ | 5056/41250 [12:13:29<87:05:43, 8.66s/it][2025-04-25 20:11:12,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:11:12,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.29 | bwd_microstep: 5701.40 | bwd_inner_microstep: 5661.72 | bwd_allreduce_microstep: 39.63 | step_microstep: 18.48 [2025-04-25 20:11:12,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.29 | bwd: 5701.41 | bwd_inner: 5661.72 | bwd_allreduce: 39.65 | step: 18.48 12%|█▏ | 5057/41250 [12:13:37<86:58:31, 8.65s/it] {'loss': 0.0369, 'grad_norm': 0.5797690749168396, 'learning_rate': 3.910759484602577e-05, 'epoch': 1.23} 12%|█▏ | 5057/41250 [12:13:37<86:58:31, 8.65s/it][2025-04-25 20:11:21,302] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:11:21,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.90 | bwd_microstep: 5708.32 | bwd_inner_microstep: 5695.55 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.61 [2025-04-25 20:11:21,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.90 | bwd: 5708.33 | bwd_inner: 5695.55 | bwd_allreduce: 12.75 | step: 18.61 12%|█▏ | 5058/41250 [12:13:46<86:57:06, 8.65s/it] {'loss': 0.0862, 'grad_norm': 0.8855921030044556, 'learning_rate': 3.910713094392156e-05, 'epoch': 1.23} 12%|█▏ | 5058/41250 [12:13:46<86:57:06, 8.65s/it][2025-04-25 20:11:30,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 20:11:30,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.86 | bwd_microstep: 5782.57 | bwd_inner_microstep: 5769.70 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.11 [2025-04-25 20:11:30,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.86 | bwd: 5782.58 | bwd_inner: 5769.70 | bwd_allreduce: 12.84 | step: 19.11 12%|█▏ | 5059/41250 [12:13:55<87:15:23, 8.68s/it] {'loss': 0.3588, 'grad_norm': 2.986276626586914, 'learning_rate': 3.910666692402564e-05, 'epoch': 1.23} 12%|█▏ | 5059/41250 [12:13:55<87:15:23, 8.68s/it][2025-04-25 20:11:38,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 20:11:38,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.80 | bwd_microstep: 5695.59 | bwd_inner_microstep: 5662.43 | bwd_allreduce_microstep: 33.11 | step_microstep: 18.95 [2025-04-25 20:11:38,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.80 | bwd: 5695.61 | bwd_inner: 5662.43 | bwd_allreduce: 33.13 | step: 18.95 12%|█▏ | 5060/41250 
[12:14:04<87:05:49, 8.66s/it] {'loss': 0.1269, 'grad_norm': 1.4574915170669556, 'learning_rate': 3.910620278634088e-05, 'epoch': 1.23} 12%|█▏ | 5060/41250 [12:14:04<87:05:49, 8.66s/it][2025-04-25 20:11:47,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:11:47,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.07 | bwd_microstep: 5746.58 | bwd_inner_microstep: 5710.98 | bwd_allreduce_microstep: 35.55 | step_microstep: 18.62 [2025-04-25 20:11:47,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.07 | bwd: 5746.59 | bwd_inner: 5710.98 | bwd_allreduce: 35.57 | step: 18.62 12%|█▏ | 5061/41250 [12:14:12<87:09:07, 8.67s/it] {'loss': 0.0477, 'grad_norm': 1.215200424194336, 'learning_rate': 3.910573853087014e-05, 'epoch': 1.23} 12%|█▏ | 5061/41250 [12:14:12<87:09:07, 8.67s/it][2025-04-25 20:11:56,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-25 20:11:56,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.46 | bwd_microstep: 5726.43 | bwd_inner_microstep: 5678.62 | bwd_allreduce_microstep: 47.77 | step_microstep: 18.55 [2025-04-25 20:11:56,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.46 | bwd: 5726.45 | bwd_inner: 5678.62 | bwd_allreduce: 47.79 | step: 18.55 12%|█▏ | 5062/41250 [12:14:21<87:07:48, 8.67s/it] {'loss': 0.0256, 'grad_norm': 0.6524646282196045, 'learning_rate': 3.910527415761627e-05, 'epoch': 1.23} 12%|█▏ | 5062/41250 [12:14:21<87:07:48, 8.67s/it][2025-04-25 20:12:04,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:12:04,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.90 | bwd_microstep: 5687.03 | bwd_inner_microstep: 5649.73 | 
bwd_allreduce_microstep: 37.26 | step_microstep: 18.64 [2025-04-25 20:12:04,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.91 | bwd: 5687.04 | bwd_inner: 5649.73 | bwd_allreduce: 37.27 | step: 18.64 12%|█▏ | 5063/41250 [12:14:29<86:54:49, 8.65s/it] {'loss': 0.0624, 'grad_norm': 1.566728115081787, 'learning_rate': 3.9104809666582156e-05, 'epoch': 1.23} 12%|█▏ | 5063/41250 [12:14:29<86:54:49, 8.65s/it][2025-04-25 20:12:13,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:12:13,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.14 | bwd_microstep: 5686.70 | bwd_inner_microstep: 5660.04 | bwd_allreduce_microstep: 26.61 | step_microstep: 18.46 [2025-04-25 20:12:13,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.14 | bwd: 5686.71 | bwd_inner: 5660.04 | bwd_allreduce: 26.63 | step: 18.46 12%|█▏ | 5064/41250 [12:14:38<86:46:45, 8.63s/it] {'loss': 0.0184, 'grad_norm': 0.3900707960128784, 'learning_rate': 3.910434505777064e-05, 'epoch': 1.23} 12%|█▏ | 5064/41250 [12:14:38<86:46:45, 8.63s/it][2025-04-25 20:12:21,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 20:12:21,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.66 | bwd_microstep: 5742.36 | bwd_inner_microstep: 5684.24 | bwd_allreduce_microstep: 58.08 | step_microstep: 19.02 [2025-04-25 20:12:21,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.66 | bwd: 5742.37 | bwd_inner: 5684.24 | bwd_allreduce: 58.09 | step: 19.02 12%|█▏ | 5065/41250 [12:14:47<86:54:01, 8.65s/it] {'loss': 0.0311, 'grad_norm': 0.7081826329231262, 'learning_rate': 3.91038803311846e-05, 'epoch': 1.23} 12%|█▏ | 5065/41250 [12:14:47<86:54:01, 8.65s/it][2025-04-25 20:12:30,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:12:30,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.50 | bwd_microstep: 5753.66 | bwd_inner_microstep: 5640.64 | bwd_allreduce_microstep: 112.97 | step_microstep: 18.90 [2025-04-25 20:12:30,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.50 | bwd: 5753.67 | bwd_inner: 5640.64 | bwd_allreduce: 112.99 | step: 18.90 12%|█▏ | 5066/41250 [12:14:55<86:57:07, 8.65s/it] {'loss': 0.1676, 'grad_norm': 3.1262905597686768, 'learning_rate': 3.910341548682689e-05, 'epoch': 1.23} 12%|█▏ | 5066/41250 [12:14:55<86:57:07, 8.65s/it][2025-04-25 20:12:39,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.97 | optimizer_step: 0.98 [2025-04-25 20:12:39,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.07 | bwd_microstep: 5879.16 | bwd_inner_microstep: 5638.96 | bwd_allreduce_microstep: 240.15 | step_microstep: 18.37 [2025-04-25 20:12:39,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.07 | bwd: 5879.17 | bwd_inner: 5638.96 | bwd_allreduce: 240.17 | step: 18.37 12%|█▏ | 5067/41250 [12:15:04<87:23:59, 8.70s/it] {'loss': 0.1829, 'grad_norm': 2.9240429401397705, 'learning_rate': 3.910295052470038e-05, 'epoch': 1.23} 12%|█▏ | 5067/41250 [12:15:04<87:23:59, 8.70s/it][2025-04-25 20:12:48,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:12:48,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.61 | bwd_microstep: 5748.94 | bwd_inner_microstep: 5645.30 | bwd_allreduce_microstep: 103.60 | step_microstep: 18.47 [2025-04-25 20:12:48,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.61 | bwd: 5748.96 | bwd_inner: 5645.29 | bwd_allreduce: 103.62 | step: 18.48 12%|█▏ | 5068/41250 [12:15:13<87:17:42, 
8.69s/it] {'loss': 0.2031, 'grad_norm': 1.4885188341140747, 'learning_rate': 3.910248544480794e-05, 'epoch': 1.23} 12%|█▏ | 5068/41250 [12:15:13<87:17:42, 8.69s/it][2025-04-25 20:12:56,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-25 20:12:56,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.83 | bwd_microstep: 5714.71 | bwd_inner_microstep: 5701.85 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.96 [2025-04-25 20:12:56,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.83 | bwd: 5714.72 | bwd_inner: 5701.85 | bwd_allreduce: 12.83 | step: 18.96 12%|█▏ | 5069/41250 [12:15:22<87:11:52, 8.68s/it] {'loss': 0.0427, 'grad_norm': 1.332484245300293, 'learning_rate': 3.910202024715244e-05, 'epoch': 1.23} 12%|█▏ | 5069/41250 [12:15:22<87:11:52, 8.68s/it][2025-04-25 20:13:05,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 20:13:05,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.74 | bwd_microstep: 5753.44 | bwd_inner_microstep: 5635.68 | bwd_allreduce_microstep: 117.71 | step_microstep: 18.75 [2025-04-25 20:13:05,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.74 | bwd: 5753.45 | bwd_inner: 5635.68 | bwd_allreduce: 117.73 | step: 18.75 12%|█▏ | 5070/41250 [12:15:30<87:10:43, 8.67s/it] {'loss': 0.1721, 'grad_norm': 1.5071417093276978, 'learning_rate': 3.9101554931736736e-05, 'epoch': 1.23} 12%|█▏ | 5070/41250 [12:15:30<87:10:43, 8.67s/it][2025-04-25 20:13:14,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-25 20:13:14,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.21 | bwd_microstep: 5749.86 | bwd_inner_microstep: 5643.28 | bwd_allreduce_microstep: 
106.52 | step_microstep: 18.92 [2025-04-25 20:13:14,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.21 | bwd: 5749.87 | bwd_inner: 5643.28 | bwd_allreduce: 106.55 | step: 18.93 12%|█▏ | 5071/41250 [12:15:39<87:08:50, 8.67s/it] {'loss': 0.128, 'grad_norm': 1.9287441968917847, 'learning_rate': 3.91010894985637e-05, 'epoch': 1.23} 12%|█▏ | 5071/41250 [12:15:39<87:08:50, 8.67s/it][2025-04-25 20:13:22,669] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-25 20:13:22,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.61 | bwd_microstep: 5704.03 | bwd_inner_microstep: 5691.21 | bwd_allreduce_microstep: 12.76 | step_microstep: 19.48 [2025-04-25 20:13:22,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.61 | bwd: 5704.04 | bwd_inner: 5691.21 | bwd_allreduce: 12.79 | step: 19.48 12%|█▏ | 5072/41250 [12:15:47<87:05:30, 8.67s/it] {'loss': 0.2056, 'grad_norm': 1.568760633468628, 'learning_rate': 3.910062394763621e-05, 'epoch': 1.23} 12%|█▏ | 5072/41250 [12:15:47<87:05:30, 8.67s/it][2025-04-25 20:13:31,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:13:31,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.93 | bwd_microstep: 5715.47 | bwd_inner_microstep: 5644.39 | bwd_allreduce_microstep: 71.03 | step_microstep: 18.48 [2025-04-25 20:13:31,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.93 | bwd: 5715.49 | bwd_inner: 5644.39 | bwd_allreduce: 71.05 | step: 18.48 12%|█▏ | 5073/41250 [12:15:56<87:01:08, 8.66s/it] {'loss': 0.3266, 'grad_norm': 4.317389965057373, 'learning_rate': 3.910015827895713e-05, 'epoch': 1.23} 12%|█▏ | 5073/41250 [12:15:56<87:01:08, 8.66s/it][2025-04-25 20:13:39,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | 
optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-25 20:13:39,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.47 | bwd_microstep: 5717.00 | bwd_inner_microstep: 5650.26 | bwd_allreduce_microstep: 66.69 | step_microstep: 19.47 [2025-04-25 20:13:39,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.47 | bwd: 5717.02 | bwd_inner: 5650.26 | bwd_allreduce: 66.71 | step: 19.47 12%|█▏ | 5074/41250 [12:16:05<86:57:43, 8.65s/it] {'loss': 0.007, 'grad_norm': 0.10062289237976074, 'learning_rate': 3.909969249252933e-05, 'epoch': 1.23} 12%|█▏ | 5074/41250 [12:16:05<86:57:43, 8.65s/it][2025-04-25 20:13:48,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.04 | optimizer_step: 0.95 [2025-04-25 20:13:48,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.19 | bwd_microstep: 5700.42 | bwd_inner_microstep: 5682.22 | bwd_allreduce_microstep: 18.16 | step_microstep: 19.18 [2025-04-25 20:13:48,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.19 | bwd: 5700.44 | bwd_inner: 5682.22 | bwd_allreduce: 18.18 | step: 19.18 12%|█▏ | 5075/41250 [12:16:13<86:55:47, 8.65s/it] {'loss': 0.0641, 'grad_norm': 1.4348504543304443, 'learning_rate': 3.9099226588355676e-05, 'epoch': 1.23} 12%|█▏ | 5075/41250 [12:16:13<86:55:47, 8.65s/it][2025-04-25 20:13:57,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.06 | optimizer_step: 1.15 [2025-04-25 20:13:57,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.56 | bwd_microstep: 5698.30 | bwd_inner_microstep: 5685.72 | bwd_allreduce_microstep: 12.53 | step_microstep: 19.34 [2025-04-25 20:13:57,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.56 | bwd: 5698.31 | bwd_inner: 5685.72 | bwd_allreduce: 12.55 | step: 19.34 12%|█▏ | 5076/41250 [12:16:22<86:52:41, 8.65s/it] {'loss': 0.1838, 'grad_norm': 
3.48087739944458, 'learning_rate': 3.9098760566439056e-05, 'epoch': 1.23} 12%|█▏ | 5076/41250 [12:16:22<86:52:41, 8.65s/it][2025-04-25 20:14:05,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 1.06 [2025-04-25 20:14:05,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.02 | bwd_microstep: 5673.69 | bwd_inner_microstep: 5654.89 | bwd_allreduce_microstep: 18.75 | step_microstep: 18.92 [2025-04-25 20:14:05,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.02 | bwd: 5673.70 | bwd_inner: 5654.89 | bwd_allreduce: 18.77 | step: 18.92 12%|█▏ | 5077/41250 [12:16:31<86:43:34, 8.63s/it] {'loss': 0.0892, 'grad_norm': 1.8518404960632324, 'learning_rate': 3.909829442678232e-05, 'epoch': 1.23} 12%|█▏ | 5077/41250 [12:16:31<86:43:34, 8.63s/it][2025-04-25 20:14:14,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:14:14,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.24 | bwd_microstep: 5699.40 | bwd_inner_microstep: 5686.55 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.52 [2025-04-25 20:14:14,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.24 | bwd: 5699.41 | bwd_inner: 5686.55 | bwd_allreduce: 12.81 | step: 18.52 12%|█▏ | 5078/41250 [12:16:39<86:46:03, 8.64s/it] {'loss': 0.189, 'grad_norm': 2.3064677715301514, 'learning_rate': 3.9097828169388375e-05, 'epoch': 1.23} 12%|█▏ | 5078/41250 [12:16:39<86:46:03, 8.64s/it][2025-04-25 20:14:23,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-25 20:14:23,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.05 | bwd_microstep: 5703.86 | bwd_inner_microstep: 5691.11 | bwd_allreduce_microstep: 12.71 | step_microstep: 19.02 [2025-04-25 
20:14:23,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.05 | bwd: 5703.87 | bwd_inner: 5691.11 | bwd_allreduce: 12.73 | step: 19.02 12%|█▏ | 5079/41250 [12:16:48<86:45:24, 8.63s/it] {'loss': 0.0588, 'grad_norm': 1.8209543228149414, 'learning_rate': 3.9097361794260067e-05, 'epoch': 1.23} 12%|█▏ | 5079/41250 [12:16:48<86:45:24, 8.63s/it][2025-04-25 20:14:31,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-25 20:14:31,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.97 | bwd_microstep: 5670.50 | bwd_inner_microstep: 5653.38 | bwd_allreduce_microstep: 17.07 | step_microstep: 18.75 [2025-04-25 20:14:31,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.97 | bwd: 5670.51 | bwd_inner: 5653.38 | bwd_allreduce: 17.09 | step: 18.76 12%|█▏ | 5080/41250 [12:16:57<86:38:48, 8.62s/it] {'loss': 0.2444, 'grad_norm': 3.145698308944702, 'learning_rate': 3.909689530140028e-05, 'epoch': 1.23} 12%|█▏ | 5080/41250 [12:16:57<86:38:48, 8.62s/it][2025-04-25 20:14:40,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:14:40,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.91 | bwd_microstep: 5745.15 | bwd_inner_microstep: 5704.24 | bwd_allreduce_microstep: 40.87 | step_microstep: 18.70 [2025-04-25 20:14:40,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.91 | bwd: 5745.17 | bwd_inner: 5704.24 | bwd_allreduce: 40.88 | step: 18.70 12%|█▏ | 5081/41250 [12:17:05<86:49:40, 8.64s/it] {'loss': 0.0343, 'grad_norm': 0.49042725563049316, 'learning_rate': 3.9096428690811886e-05, 'epoch': 1.23} 12%|█▏ | 5081/41250 [12:17:05<86:49:40, 8.64s/it][2025-04-25 20:14:49,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 
0.90 [2025-04-25 20:14:49,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.25 | bwd_microstep: 5788.74 | bwd_inner_microstep: 5775.84 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.36 [2025-04-25 20:14:49,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.25 | bwd: 5788.75 | bwd_inner: 5775.84 | bwd_allreduce: 12.87 | step: 18.36 12%|█▏ | 5082/41250 [12:17:14<87:11:31, 8.68s/it] {'loss': 0.1004, 'grad_norm': 1.0388877391815186, 'learning_rate': 3.909596196249776e-05, 'epoch': 1.23} 12%|█▏ | 5082/41250 [12:17:14<87:11:31, 8.68s/it][2025-04-25 20:14:57,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 1.13 [2025-04-25 20:14:57,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.37 | bwd_microstep: 5706.95 | bwd_inner_microstep: 5694.05 | bwd_allreduce_microstep: 12.85 | step_microstep: 19.13 [2025-04-25 20:14:57,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.37 | bwd: 5706.97 | bwd_inner: 5694.05 | bwd_allreduce: 12.87 | step: 19.13 12%|█▏ | 5083/41250 [12:17:23<87:06:07, 8.67s/it] {'loss': 0.1147, 'grad_norm': 2.022648334503174, 'learning_rate': 3.9095495116460794e-05, 'epoch': 1.23} 12%|█▏ | 5083/41250 [12:17:23<87:06:07, 8.67s/it][2025-04-25 20:15:06,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 20:15:06,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.14 | bwd_microstep: 5774.76 | bwd_inner_microstep: 5640.09 | bwd_allreduce_microstep: 134.62 | step_microstep: 18.64 [2025-04-25 20:15:06,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.14 | bwd: 5774.78 | bwd_inner: 5640.09 | bwd_allreduce: 134.64 | step: 18.65 12%|█▏ | 5084/41250 [12:17:31<87:07:41, 8.67s/it] {'loss': 0.2314, 'grad_norm': 1.267189383506775, 'learning_rate': 
3.909502815270385e-05, 'epoch': 1.23} 12%|█▏ | 5084/41250 [12:17:31<87:07:41, 8.67s/it][2025-04-25 20:15:15,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:15:15,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.52 | bwd_microstep: 5701.61 | bwd_inner_microstep: 5651.38 | bwd_allreduce_microstep: 50.18 | step_microstep: 18.68 [2025-04-25 20:15:15,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.52 | bwd: 5701.62 | bwd_inner: 5651.38 | bwd_allreduce: 50.20 | step: 18.68 12%|█▏ | 5085/41250 [12:17:40<86:56:29, 8.65s/it] {'loss': 0.0941, 'grad_norm': 6.133732318878174, 'learning_rate': 3.909456107122982e-05, 'epoch': 1.23} 12%|█▏ | 5085/41250 [12:17:40<86:56:29, 8.65s/it][2025-04-25 20:15:23,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:15:23,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.09 | bwd_microstep: 5736.84 | bwd_inner_microstep: 5695.03 | bwd_allreduce_microstep: 41.77 | step_microstep: 18.59 [2025-04-25 20:15:23,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.09 | bwd: 5736.85 | bwd_inner: 5695.03 | bwd_allreduce: 41.78 | step: 18.59 12%|█▏ | 5086/41250 [12:17:49<86:58:44, 8.66s/it] {'loss': 0.113, 'grad_norm': 1.5671069622039795, 'learning_rate': 3.909409387204158e-05, 'epoch': 1.23} 12%|█▏ | 5086/41250 [12:17:49<86:58:44, 8.66s/it][2025-04-25 20:15:32,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:15:32,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.77 | bwd_microstep: 5718.65 | bwd_inner_microstep: 5705.87 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.44 [2025-04-25 20:15:32,428] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.77 | bwd: 5718.66 | bwd_inner: 5705.87 | bwd_allreduce: 12.75 | step: 18.45 12%|█▏ | 5087/41250 [12:17:57<86:59:22, 8.66s/it] {'loss': 0.0259, 'grad_norm': 0.4037093222141266, 'learning_rate': 3.9093626555142e-05, 'epoch': 1.23} 12%|█▏ | 5087/41250 [12:17:57<86:59:22, 8.66s/it][2025-04-25 20:15:41,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:15:41,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.73 | bwd_microstep: 5768.88 | bwd_inner_microstep: 5702.71 | bwd_allreduce_microstep: 66.13 | step_microstep: 18.89 [2025-04-25 20:15:41,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.73 | bwd: 5768.90 | bwd_inner: 5702.71 | bwd_allreduce: 66.15 | step: 18.90 12%|█▏ | 5088/41250 [12:18:06<87:08:02, 8.67s/it] {'loss': 0.0772, 'grad_norm': 1.7295770645141602, 'learning_rate': 3.909315912053397e-05, 'epoch': 1.23} 12%|█▏ | 5088/41250 [12:18:06<87:08:02, 8.67s/it][2025-04-25 20:15:49,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:15:49,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.33 | bwd_microstep: 5737.13 | bwd_inner_microstep: 5718.22 | bwd_allreduce_microstep: 18.86 | step_microstep: 18.58 [2025-04-25 20:15:49,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.33 | bwd: 5737.14 | bwd_inner: 5718.22 | bwd_allreduce: 18.88 | step: 18.59 12%|█▏ | 5089/41250 [12:18:15<87:10:08, 8.68s/it] {'loss': 0.0548, 'grad_norm': 1.396491289138794, 'learning_rate': 3.9092691568220365e-05, 'epoch': 1.23} 12%|█▏ | 5089/41250 [12:18:15<87:10:08, 8.68s/it][2025-04-25 20:15:58,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 
20:15:58,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.42 | bwd_microstep: 5694.50 | bwd_inner_microstep: 5658.37 | bwd_allreduce_microstep: 36.08 | step_microstep: 18.93 [2025-04-25 20:15:58,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.41 | bwd: 5694.52 | bwd_inner: 5658.37 | bwd_allreduce: 36.10 | step: 18.94 12%|█▏ | 5090/41250 [12:18:23<86:59:40, 8.66s/it] {'loss': 0.321, 'grad_norm': 4.7440571784973145, 'learning_rate': 3.909222389820408e-05, 'epoch': 1.23} 12%|█▏ | 5090/41250 [12:18:23<86:59:40, 8.66s/it][2025-04-25 20:16:07,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 1.09 [2025-04-25 20:16:07,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.19 | bwd_microstep: 5890.92 | bwd_inner_microstep: 5692.18 | bwd_allreduce_microstep: 198.69 | step_microstep: 18.55 [2025-04-25 20:16:07,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.19 | bwd: 5890.93 | bwd_inner: 5692.18 | bwd_allreduce: 198.71 | step: 18.56 12%|█▏ | 5091/41250 [12:18:32<87:29:35, 8.71s/it] {'loss': 0.2472, 'grad_norm': 3.061587333679199, 'learning_rate': 3.909175611048798e-05, 'epoch': 1.23} 12%|█▏ | 5091/41250 [12:18:32<87:29:35, 8.71s/it][2025-04-25 20:16:15,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.06 [2025-04-25 20:16:15,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.68 | bwd_microstep: 5702.26 | bwd_inner_microstep: 5666.37 | bwd_allreduce_microstep: 35.84 | step_microstep: 18.61 [2025-04-25 20:16:15,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.69 | bwd: 5702.27 | bwd_inner: 5666.37 | bwd_allreduce: 35.86 | step: 18.61 12%|█▏ | 5092/41250 [12:18:41<87:13:25, 8.68s/it] {'loss': 0.0473, 'grad_norm': 1.5806008577346802, 'learning_rate': 3.909128820507497e-05, 
'epoch': 1.23} 12%|█▏ | 5092/41250 [12:18:41<87:13:25, 8.68s/it][2025-04-25 20:16:24,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:16:24,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.08 | bwd_microstep: 5874.80 | bwd_inner_microstep: 5708.37 | bwd_allreduce_microstep: 166.37 | step_microstep: 18.57 [2025-04-25 20:16:24,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.08 | bwd: 5874.81 | bwd_inner: 5708.37 | bwd_allreduce: 166.39 | step: 18.57 12%|█▏ | 5093/41250 [12:18:50<87:38:39, 8.73s/it] {'loss': 0.0711, 'grad_norm': 1.129448652267456, 'learning_rate': 3.9090820181967915e-05, 'epoch': 1.23} 12%|█▏ | 5093/41250 [12:18:50<87:38:39, 8.73s/it][2025-04-25 20:16:33,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:16:33,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.68 | bwd_microstep: 5746.21 | bwd_inner_microstep: 5713.99 | bwd_allreduce_microstep: 32.18 | step_microstep: 18.76 [2025-04-25 20:16:33,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.68 | bwd: 5746.22 | bwd_inner: 5713.99 | bwd_allreduce: 32.19 | step: 18.76 12%|█▏ | 5094/41250 [12:18:58<87:33:10, 8.72s/it] {'loss': 0.1173, 'grad_norm': 2.590569019317627, 'learning_rate': 3.909035204116971e-05, 'epoch': 1.23} 12%|█▏ | 5094/41250 [12:18:58<87:33:10, 8.72s/it][2025-04-25 20:16:42,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:16:42,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.00 | bwd_microstep: 5786.63 | bwd_inner_microstep: 5678.67 | bwd_allreduce_microstep: 107.92 | step_microstep: 18.46 [2025-04-25 20:16:42,127] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2843.00 | bwd: 5786.65 | bwd_inner: 5678.67 | bwd_allreduce: 107.94 | step: 18.46 12%|█▏ | 5095/41250 [12:19:07<87:32:03, 8.72s/it] {'loss': 0.2681, 'grad_norm': 3.9895949363708496, 'learning_rate': 3.908988378268324e-05, 'epoch': 1.24} 12%|█▏ | 5095/41250 [12:19:07<87:32:03, 8.72s/it][2025-04-25 20:16:50,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:16:50,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.77 | bwd_microstep: 5742.95 | bwd_inner_microstep: 5717.32 | bwd_allreduce_microstep: 25.59 | step_microstep: 18.98 [2025-04-25 20:16:50,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.77 | bwd: 5742.96 | bwd_inner: 5717.32 | bwd_allreduce: 25.60 | step: 18.98 12%|█▏ | 5096/41250 [12:19:16<87:27:10, 8.71s/it] {'loss': 0.2662, 'grad_norm': 2.6057276725769043, 'learning_rate': 3.9089415406511386e-05, 'epoch': 1.24} 12%|█▏ | 5096/41250 [12:19:16<87:27:10, 8.71s/it][2025-04-25 20:16:59,465] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:16:59,465] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.42 | bwd_microstep: 5714.16 | bwd_inner_microstep: 5701.31 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.64 [2025-04-25 20:16:59,465] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.42 | bwd: 5714.17 | bwd_inner: 5701.31 | bwd_allreduce: 12.82 | step: 18.64 12%|█▏ | 5097/41250 [12:19:24<87:16:54, 8.69s/it] {'loss': 0.1282, 'grad_norm': 1.8050233125686646, 'learning_rate': 3.908894691265705e-05, 'epoch': 1.24} 12%|█▏ | 5097/41250 [12:19:24<87:16:54, 8.69s/it][2025-04-25 20:17:08,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.14 | optimizer_step: 1.04 [2025-04-25 20:17:08,123] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.88 | bwd_microstep: 5722.93 | bwd_inner_microstep: 5708.46 | bwd_allreduce_microstep: 14.41 | step_microstep: 19.90 [2025-04-25 20:17:08,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.89 | bwd: 5722.95 | bwd_inner: 5708.46 | bwd_allreduce: 14.44 | step: 19.90 12%|█▏ | 5098/41250 [12:19:33<87:10:01, 8.68s/it] {'loss': 0.0152, 'grad_norm': 0.23927578330039978, 'learning_rate': 3.90884783011231e-05, 'epoch': 1.24} 12%|█▏ | 5098/41250 [12:19:33<87:10:01, 8.68s/it][2025-04-25 20:17:16,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:17:16,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.09 | bwd_microstep: 5721.17 | bwd_inner_microstep: 5708.43 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.54 [2025-04-25 20:17:16,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.09 | bwd: 5721.19 | bwd_inner: 5708.43 | bwd_allreduce: 12.72 | step: 18.54 12%|█▏ | 5099/41250 [12:19:42<87:06:32, 8.67s/it] {'loss': 0.1491, 'grad_norm': 1.816884994506836, 'learning_rate': 3.908800957191245e-05, 'epoch': 1.24} 12%|█▏ | 5099/41250 [12:19:42<87:06:32, 8.67s/it][2025-04-25 20:17:25,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:17:25,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.08 | bwd_microstep: 5709.58 | bwd_inner_microstep: 5666.36 | bwd_allreduce_microstep: 43.18 | step_microstep: 18.82 [2025-04-25 20:17:25,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.08 | bwd: 5709.59 | bwd_inner: 5666.36 | bwd_allreduce: 43.19 | step: 18.83 12%|█▏ | 5100/41250 [12:19:50<86:57:21, 8.66s/it] {'loss': 0.126, 'grad_norm': 2.330949068069458, 'learning_rate': 3.908754072502797e-05, 'epoch': 1.24} 12%|█▏ | 
5100/41250 [12:19:50<86:57:21, 8.66s/it][2025-04-25 20:17:34,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:17:34,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.18 | bwd_microstep: 5923.10 | bwd_inner_microstep: 5698.02 | bwd_allreduce_microstep: 225.03 | step_microstep: 18.60 [2025-04-25 20:17:34,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.18 | bwd: 5923.11 | bwd_inner: 5698.02 | bwd_allreduce: 225.05 | step: 18.61 12%|█▏ | 5101/41250 [12:19:59<87:33:02, 8.72s/it] {'loss': 0.0881, 'grad_norm': 1.3635625839233398, 'learning_rate': 3.908707176047255e-05, 'epoch': 1.24} 12%|█▏ | 5101/41250 [12:19:59<87:33:02, 8.72s/it][2025-04-25 20:17:43,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.11 | optimizer_step: 0.93 [2025-04-25 20:17:43,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2937.18 | bwd_microstep: 5936.55 | bwd_inner_microstep: 5873.01 | bwd_allreduce_microstep: 63.49 | step_microstep: 18.89 [2025-04-25 20:17:43,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2937.18 | bwd: 5936.56 | bwd_inner: 5873.01 | bwd_allreduce: 63.50 | step: 18.89 12%|█▏ | 5102/41250 [12:20:08<88:16:27, 8.79s/it] {'loss': 0.219, 'grad_norm': 4.669386386871338, 'learning_rate': 3.9086602678249095e-05, 'epoch': 1.24} 12%|█▏ | 5102/41250 [12:20:08<88:16:27, 8.79s/it][2025-04-25 20:17:51,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-25 20:17:51,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.87 | bwd_microstep: 5729.07 | bwd_inner_microstep: 5669.38 | bwd_allreduce_microstep: 59.65 | step_microstep: 18.91 [2025-04-25 20:17:51,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.87 | bwd: 
5729.09 | bwd_inner: 5669.37 | bwd_allreduce: 59.67 | step: 18.91 12%|█▏ | 5103/41250 [12:20:17<87:51:32, 8.75s/it] {'loss': 0.067, 'grad_norm': 1.1736118793487549, 'learning_rate': 3.9086133478360483e-05, 'epoch': 1.24} 12%|█▏ | 5103/41250 [12:20:17<87:51:32, 8.75s/it][2025-04-25 20:18:00,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:18:00,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.47 | bwd_microstep: 5719.05 | bwd_inner_microstep: 5660.62 | bwd_allreduce_microstep: 58.39 | step_microstep: 18.58 [2025-04-25 20:18:00,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.47 | bwd: 5719.06 | bwd_inner: 5660.62 | bwd_allreduce: 58.41 | step: 18.58 12%|█▏ | 5104/41250 [12:20:25<87:29:47, 8.71s/it] {'loss': 0.187, 'grad_norm': 1.2914249897003174, 'learning_rate': 3.908566416080962e-05, 'epoch': 1.24} 12%|█▏ | 5104/41250 [12:20:25<87:29:47, 8.71s/it][2025-04-25 20:18:09,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:18:09,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.85 | bwd_microstep: 5717.57 | bwd_inner_microstep: 5704.86 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.54 [2025-04-25 20:18:09,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.85 | bwd: 5717.58 | bwd_inner: 5704.86 | bwd_allreduce: 12.67 | step: 18.55 12%|█▏ | 5105/41250 [12:20:34<87:18:20, 8.70s/it] {'loss': 0.0144, 'grad_norm': 0.3197064697742462, 'learning_rate': 3.908519472559938e-05, 'epoch': 1.24} 12%|█▏ | 5105/41250 [12:20:34<87:18:20, 8.70s/it][2025-04-25 20:18:17,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:18:17,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2825.79 | bwd_microstep: 5776.00 | bwd_inner_microstep: 5648.98 | bwd_allreduce_microstep: 126.97 | step_microstep: 18.74 [2025-04-25 20:18:17,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.79 | bwd: 5776.02 | bwd_inner: 5648.98 | bwd_allreduce: 127.00 | step: 18.74 12%|█▏ | 5106/41250 [12:20:43<87:16:26, 8.69s/it] {'loss': 0.5619, 'grad_norm': 2.822614908218384, 'learning_rate': 3.908472517273267e-05, 'epoch': 1.24} 12%|█▏ | 5106/41250 [12:20:43<87:16:26, 8.69s/it][2025-04-25 20:18:26,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:18:26,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.91 | bwd_microstep: 5727.57 | bwd_inner_microstep: 5713.69 | bwd_allreduce_microstep: 13.84 | step_microstep: 18.79 [2025-04-25 20:18:26,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.91 | bwd: 5727.58 | bwd_inner: 5713.69 | bwd_allreduce: 13.86 | step: 18.79 12%|█▏ | 5107/41250 [12:20:51<87:14:00, 8.69s/it] {'loss': 0.2245, 'grad_norm': 2.068852186203003, 'learning_rate': 3.908425550221239e-05, 'epoch': 1.24} 12%|█▏ | 5107/41250 [12:20:51<87:14:00, 8.69s/it][2025-04-25 20:18:35,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-25 20:18:35,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.52 | bwd_microstep: 5862.17 | bwd_inner_microstep: 5659.37 | bwd_allreduce_microstep: 202.75 | step_microstep: 19.06 [2025-04-25 20:18:35,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.52 | bwd: 5862.18 | bwd_inner: 5659.37 | bwd_allreduce: 202.77 | step: 19.07 12%|█▏ | 5108/41250 [12:21:00<87:31:16, 8.72s/it] {'loss': 0.2123, 'grad_norm': 1.4290481805801392, 'learning_rate': 3.908378571404142e-05, 'epoch': 1.24} 12%|█▏ | 5108/41250 [12:21:00<87:31:16, 
8.72s/it][2025-04-25 20:18:44,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 20:18:44,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.69 | bwd_microstep: 5888.86 | bwd_inner_microstep: 5647.86 | bwd_allreduce_microstep: 240.94 | step_microstep: 18.94 [2025-04-25 20:18:44,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.70 | bwd: 5888.87 | bwd_inner: 5647.86 | bwd_allreduce: 240.96 | step: 18.94 12%|█▏ | 5109/41250 [12:21:09<87:48:18, 8.75s/it] {'loss': 0.2894, 'grad_norm': 3.7952444553375244, 'learning_rate': 3.908331580822268e-05, 'epoch': 1.24} 12%|█▏ | 5109/41250 [12:21:09<87:48:18, 8.75s/it][2025-04-25 20:18:52,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:18:52,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.62 | bwd_microstep: 5728.95 | bwd_inner_microstep: 5716.16 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.72 [2025-04-25 20:18:52,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.62 | bwd: 5728.96 | bwd_inner: 5716.16 | bwd_allreduce: 12.77 | step: 18.73 12%|█▏ | 5110/41250 [12:21:18<87:36:02, 8.73s/it] {'loss': 0.0263, 'grad_norm': 0.49451249837875366, 'learning_rate': 3.908284578475903e-05, 'epoch': 1.24} 12%|█▏ | 5110/41250 [12:21:18<87:36:02, 8.73s/it][2025-04-25 20:19:01,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:19:01,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.70 | bwd_microstep: 5754.47 | bwd_inner_microstep: 5644.59 | bwd_allreduce_microstep: 109.83 | step_microstep: 18.86 [2025-04-25 20:19:01,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.70 | bwd: 5754.48 | bwd_inner: 
5644.59 | bwd_allreduce: 109.85 | step: 18.87 12%|█▏ | 5111/41250 [12:21:26<87:27:55, 8.71s/it] {'loss': 0.4182, 'grad_norm': 3.0642518997192383, 'learning_rate': 3.90823756436534e-05, 'epoch': 1.24} 12%|█▏ | 5111/41250 [12:21:26<87:27:55, 8.71s/it][2025-04-25 20:19:10,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:19:10,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.07 | bwd_microstep: 5709.07 | bwd_inner_microstep: 5696.28 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.76 [2025-04-25 20:19:10,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.07 | bwd: 5709.08 | bwd_inner: 5696.28 | bwd_allreduce: 12.75 | step: 18.77 12%|█▏ | 5112/41250 [12:21:35<87:17:33, 8.70s/it] {'loss': 0.1697, 'grad_norm': 2.1752779483795166, 'learning_rate': 3.908190538490869e-05, 'epoch': 1.24} 12%|█▏ | 5112/41250 [12:21:35<87:17:33, 8.70s/it][2025-04-25 20:19:18,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:19:18,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.59 | bwd_microstep: 5732.61 | bwd_inner_microstep: 5688.58 | bwd_allreduce_microstep: 43.99 | step_microstep: 18.20 [2025-04-25 20:19:18,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.59 | bwd: 5732.62 | bwd_inner: 5688.58 | bwd_allreduce: 44.01 | step: 18.21 12%|█▏ | 5113/41250 [12:21:44<87:14:31, 8.69s/it] {'loss': 0.0634, 'grad_norm': 1.166725516319275, 'learning_rate': 3.908143500852776e-05, 'epoch': 1.24} 12%|█▏ | 5113/41250 [12:21:44<87:14:31, 8.69s/it][2025-04-25 20:19:27,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 20:19:27,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.40 | 
bwd_microstep: 5763.10 | bwd_inner_microstep: 5693.88 | bwd_allreduce_microstep: 69.17 | step_microstep: 18.44 [2025-04-25 20:19:27,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.40 | bwd: 5763.12 | bwd_inner: 5693.88 | bwd_allreduce: 69.19 | step: 18.44 12%|█▏ | 5114/41250 [12:21:52<87:15:51, 8.69s/it] {'loss': 0.1411, 'grad_norm': 1.2683814764022827, 'learning_rate': 3.908096451451356e-05, 'epoch': 1.24} 12%|█▏ | 5114/41250 [12:21:52<87:15:51, 8.69s/it][2025-04-25 20:19:36,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 20:19:36,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.34 | bwd_microstep: 5735.42 | bwd_inner_microstep: 5689.16 | bwd_allreduce_microstep: 46.22 | step_microstep: 18.68 [2025-04-25 20:19:36,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.34 | bwd: 5735.44 | bwd_inner: 5689.16 | bwd_allreduce: 46.24 | step: 18.68 12%|█▏ | 5115/41250 [12:22:01<87:12:28, 8.69s/it] {'loss': 0.0524, 'grad_norm': 0.840408444404602, 'learning_rate': 3.908049390286896e-05, 'epoch': 1.24} 12%|█▏ | 5115/41250 [12:22:01<87:12:28, 8.69s/it][2025-04-25 20:19:44,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:19:44,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.07 | bwd_microstep: 5700.68 | bwd_inner_microstep: 5649.10 | bwd_allreduce_microstep: 51.53 | step_microstep: 18.70 [2025-04-25 20:19:44,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.08 | bwd: 5700.69 | bwd_inner: 5649.10 | bwd_allreduce: 51.55 | step: 18.70 12%|█▏ | 5116/41250 [12:22:10<86:59:12, 8.67s/it] {'loss': 0.2198, 'grad_norm': 2.6080141067504883, 'learning_rate': 3.908002317359686e-05, 'epoch': 1.24} 12%|█▏ | 5116/41250 [12:22:10<86:59:12, 8.67s/it][2025-04-25 20:19:53,418] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:19:53,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.89 | bwd_microstep: 5683.75 | bwd_inner_microstep: 5647.94 | bwd_allreduce_microstep: 35.77 | step_microstep: 18.66 [2025-04-25 20:19:53,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.90 | bwd: 5683.76 | bwd_inner: 5647.94 | bwd_allreduce: 35.78 | step: 18.67 12%|█▏ | 5117/41250 [12:22:18<86:47:54, 8.65s/it] {'loss': 0.1167, 'grad_norm': 1.8331241607666016, 'learning_rate': 3.9079552326700174e-05, 'epoch': 1.24} 12%|█▏ | 5117/41250 [12:22:18<86:47:54, 8.65s/it][2025-04-25 20:20:02,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.20 | optimizer_step: 1.02 [2025-04-25 20:20:02,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.93 | bwd_microstep: 5757.93 | bwd_inner_microstep: 5654.01 | bwd_allreduce_microstep: 103.88 | step_microstep: 19.64 [2025-04-25 20:20:02,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.93 | bwd: 5757.94 | bwd_inner: 5654.01 | bwd_allreduce: 103.90 | step: 19.64 12%|█▏ | 5118/41250 [12:22:27<86:52:12, 8.66s/it] {'loss': 0.1716, 'grad_norm': 4.635616302490234, 'learning_rate': 3.90790813621818e-05, 'epoch': 1.24} 12%|█▏ | 5118/41250 [12:22:27<86:52:12, 8.66s/it][2025-04-25 20:20:10,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 20:20:10,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.97 | bwd_microstep: 5713.61 | bwd_inner_microstep: 5701.10 | bwd_allreduce_microstep: 12.47 | step_microstep: 18.81 [2025-04-25 20:20:10,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.97 | bwd: 5713.62 | bwd_inner: 5701.10 | bwd_allreduce: 12.49 | step: 18.81 12%|█▏ 
| 5119/41250 [12:22:36<86:52:31, 8.66s/it] {'loss': 0.2781, 'grad_norm': 2.351172685623169, 'learning_rate': 3.907861028004465e-05, 'epoch': 1.24} 12%|█▏ | 5119/41250 [12:22:36<86:52:31, 8.66s/it][2025-04-25 20:20:19,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 20:20:19,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.64 | bwd_microstep: 5738.82 | bwd_inner_microstep: 5684.31 | bwd_allreduce_microstep: 54.47 | step_microstep: 18.55 [2025-04-25 20:20:19,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.64 | bwd: 5738.84 | bwd_inner: 5684.31 | bwd_allreduce: 54.49 | step: 18.55 12%|█▏ | 5120/41250 [12:22:44<86:54:28, 8.66s/it] {'loss': 0.1146, 'grad_norm': 1.5039328336715698, 'learning_rate': 3.907813908029161e-05, 'epoch': 1.24} 12%|█▏ | 5120/41250 [12:22:44<86:54:28, 8.66s/it][2025-04-25 20:20:28,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 20:20:28,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.77 | bwd_microstep: 5762.71 | bwd_inner_microstep: 5699.02 | bwd_allreduce_microstep: 63.65 | step_microstep: 19.08 [2025-04-25 20:20:28,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.77 | bwd: 5762.73 | bwd_inner: 5699.02 | bwd_allreduce: 63.67 | step: 19.08 12%|█▏ | 5121/41250 [12:22:53<87:02:18, 8.67s/it] {'loss': 0.1867, 'grad_norm': 1.7193186283111572, 'learning_rate': 3.90776677629256e-05, 'epoch': 1.24} 12%|█▏ | 5121/41250 [12:22:53<87:02:18, 8.67s/it][2025-04-25 20:20:36,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 20:20:36,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.99 | bwd_microstep: 5710.28 | bwd_inner_microstep: 5697.90 | 
bwd_allreduce_microstep: 12.33 | step_microstep: 18.55 [2025-04-25 20:20:36,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.99 | bwd: 5710.29 | bwd_inner: 5697.90 | bwd_allreduce: 12.35 | step: 18.55 12%|█▏ | 5122/41250 [12:23:02<86:56:18, 8.66s/it] {'loss': 0.1994, 'grad_norm': 2.149970769882202, 'learning_rate': 3.907719632794952e-05, 'epoch': 1.24} 12%|█▏ | 5122/41250 [12:23:02<86:56:18, 8.66s/it][2025-04-25 20:20:45,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:20:45,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.47 | bwd_microstep: 5682.97 | bwd_inner_microstep: 5645.39 | bwd_allreduce_microstep: 37.53 | step_microstep: 18.72 [2025-04-25 20:20:45,355] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.47 | bwd: 5682.98 | bwd_inner: 5645.39 | bwd_allreduce: 37.55 | step: 18.72 12%|█▏ | 5123/41250 [12:23:10<86:43:20, 8.64s/it] {'loss': 0.1035, 'grad_norm': 1.6354224681854248, 'learning_rate': 3.907672477536627e-05, 'epoch': 1.24} 12%|█▏ | 5123/41250 [12:23:10<86:43:20, 8.64s/it][2025-04-25 20:20:54,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:20:54,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.34 | bwd_microstep: 5781.16 | bwd_inner_microstep: 5768.60 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.54 [2025-04-25 20:20:54,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.34 | bwd: 5781.18 | bwd_inner: 5768.60 | bwd_allreduce: 12.54 | step: 18.54 12%|█▏ | 5124/41250 [12:23:19<87:02:05, 8.67s/it] {'loss': 0.1726, 'grad_norm': 3.536432981491089, 'learning_rate': 3.907625310517878e-05, 'epoch': 1.24} 12%|█▏ | 5124/41250 [12:23:19<87:02:05, 8.67s/it][2025-04-25 20:21:02,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.40 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:21:02,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.47 | bwd_microstep: 5681.99 | bwd_inner_microstep: 5646.61 | bwd_allreduce_microstep: 35.34 | step_microstep: 18.39 [2025-04-25 20:21:02,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.47 | bwd: 5682.00 | bwd_inner: 5646.61 | bwd_allreduce: 35.35 | step: 18.39 12%|█▏ | 5125/41250 [12:23:28<86:46:06, 8.65s/it] {'loss': 0.3302, 'grad_norm': 2.0984888076782227, 'learning_rate': 3.907578131738993e-05, 'epoch': 1.24} 12%|█▏ | 5125/41250 [12:23:28<86:46:06, 8.65s/it][2025-04-25 20:21:11,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:21:11,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.16 | bwd_microstep: 5720.94 | bwd_inner_microstep: 5707.93 | bwd_allreduce_microstep: 12.97 | step_microstep: 18.99 [2025-04-25 20:21:11,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.16 | bwd: 5720.96 | bwd_inner: 5707.93 | bwd_allreduce: 12.99 | step: 18.99 12%|█▏ | 5126/41250 [12:23:36<86:48:17, 8.65s/it] {'loss': 0.2026, 'grad_norm': 1.7363612651824951, 'learning_rate': 3.907530941200264e-05, 'epoch': 1.24} 12%|█▏ | 5126/41250 [12:23:36<86:48:17, 8.65s/it][2025-04-25 20:21:19,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 20:21:19,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.26 | bwd_microstep: 5659.54 | bwd_inner_microstep: 5646.75 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.03 [2025-04-25 20:21:19,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.26 | bwd: 5659.55 | bwd_inner: 5646.75 | bwd_allreduce: 12.76 | step: 19.04 12%|█▏ | 5127/41250 [12:23:45<86:34:25, 8.63s/it] 
{'loss': 0.0358, 'grad_norm': 3.837252378463745, 'learning_rate': 3.907483738901982e-05, 'epoch': 1.24} 12%|█▏ | 5127/41250 [12:23:45<86:34:25, 8.63s/it][2025-04-25 20:21:28,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.94 [2025-04-25 20:21:28,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.97 | bwd_microstep: 5817.29 | bwd_inner_microstep: 5690.36 | bwd_allreduce_microstep: 126.88 | step_microstep: 19.21 [2025-04-25 20:21:28,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.97 | bwd: 5817.30 | bwd_inner: 5690.36 | bwd_allreduce: 126.90 | step: 19.21 12%|█▏ | 5128/41250 [12:23:54<86:58:16, 8.67s/it] {'loss': 0.1428, 'grad_norm': 1.5710395574569702, 'learning_rate': 3.907436524844438e-05, 'epoch': 1.24} 12%|█▏ | 5128/41250 [12:23:54<86:58:16, 8.67s/it][2025-04-25 20:21:37,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:21:37,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.23 | bwd_microstep: 5817.42 | bwd_inner_microstep: 5699.91 | bwd_allreduce_microstep: 117.47 | step_microstep: 18.88 [2025-04-25 20:21:37,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.23 | bwd: 5817.44 | bwd_inner: 5699.91 | bwd_allreduce: 117.48 | step: 18.88 12%|█▏ | 5129/41250 [12:24:02<87:16:28, 8.70s/it] {'loss': 0.1586, 'grad_norm': 1.6313564777374268, 'learning_rate': 3.907389299027923e-05, 'epoch': 1.24} 12%|█▏ | 5129/41250 [12:24:02<87:16:28, 8.70s/it][2025-04-25 20:21:46,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 20:21:46,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.19 | bwd_microstep: 5738.65 | bwd_inner_microstep: 5653.40 | bwd_allreduce_microstep: 85.19 | 
step_microstep: 18.70 [2025-04-25 20:21:46,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.19 | bwd: 5738.66 | bwd_inner: 5653.40 | bwd_allreduce: 85.21 | step: 18.71 12%|█▏ | 5130/41250 [12:24:11<87:07:15, 8.68s/it] {'loss': 0.1257, 'grad_norm': 1.4172769784927368, 'learning_rate': 3.907342061452729e-05, 'epoch': 1.24} 12%|█▏ | 5130/41250 [12:24:11<87:07:15, 8.68s/it][2025-04-25 20:21:54,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-25 20:21:54,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.08 | bwd_microstep: 5672.92 | bwd_inner_microstep: 5652.66 | bwd_allreduce_microstep: 20.21 | step_microstep: 19.23 [2025-04-25 20:21:54,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.08 | bwd: 5672.93 | bwd_inner: 5652.66 | bwd_allreduce: 20.23 | step: 19.23 12%|█▏ | 5131/41250 [12:24:20<86:50:45, 8.66s/it] {'loss': 0.1122, 'grad_norm': 1.4044033288955688, 'learning_rate': 3.9072948121191456e-05, 'epoch': 1.24} 12%|█▏ | 5131/41250 [12:24:20<86:50:45, 8.66s/it][2025-04-25 20:22:03,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:22:03,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2929.79 | bwd_microstep: 5865.90 | bwd_inner_microstep: 5852.97 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.78 [2025-04-25 20:22:03,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2929.79 | bwd: 5865.91 | bwd_inner: 5852.97 | bwd_allreduce: 12.90 | step: 18.79 12%|█▏ | 5132/41250 [12:24:28<87:30:37, 8.72s/it] {'loss': 0.2958, 'grad_norm': 2.7824742794036865, 'learning_rate': 3.907247551027465e-05, 'epoch': 1.24} 12%|█▏ | 5132/41250 [12:24:28<87:30:37, 8.72s/it][2025-04-25 20:22:12,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | 
optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 20:22:12,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.11 | bwd_microstep: 5682.06 | bwd_inner_microstep: 5669.53 | bwd_allreduce_microstep: 12.48 | step_microstep: 18.61 [2025-04-25 20:22:12,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.11 | bwd: 5682.07 | bwd_inner: 5669.53 | bwd_allreduce: 12.50 | step: 18.61 12%|█▏ | 5133/41250 [12:24:37<87:10:19, 8.69s/it] {'loss': 0.1446, 'grad_norm': 2.38338565826416, 'learning_rate': 3.907200278177978e-05, 'epoch': 1.24} 12%|█▏ | 5133/41250 [12:24:37<87:10:19, 8.69s/it][2025-04-25 20:22:20,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:22:20,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.00 | bwd_microstep: 5717.46 | bwd_inner_microstep: 5702.10 | bwd_allreduce_microstep: 15.31 | step_microstep: 18.66 [2025-04-25 20:22:20,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.00 | bwd: 5717.47 | bwd_inner: 5702.10 | bwd_allreduce: 15.33 | step: 18.67 12%|█▏ | 5134/41250 [12:24:46<87:01:26, 8.67s/it] {'loss': 0.0646, 'grad_norm': 0.7593385577201843, 'learning_rate': 3.907152993570976e-05, 'epoch': 1.24} 12%|█▏ | 5134/41250 [12:24:46<87:01:26, 8.67s/it][2025-04-25 20:22:29,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:22:29,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.82 | bwd_microstep: 5717.96 | bwd_inner_microstep: 5699.18 | bwd_allreduce_microstep: 18.72 | step_microstep: 19.00 [2025-04-25 20:22:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.83 | bwd: 5717.97 | bwd_inner: 5699.18 | bwd_allreduce: 18.74 | step: 19.00 12%|█▏ | 5135/41250 [12:24:54<86:57:08, 8.67s/it] {'loss': 0.1776, 'grad_norm': 
1.68631112575531, 'learning_rate': 3.907105697206752e-05, 'epoch': 1.24} 12%|█▏ | 5135/41250 [12:24:54<86:57:08, 8.67s/it][2025-04-25 20:22:38,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.13 [2025-04-25 20:22:38,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.23 | bwd_microstep: 5700.13 | bwd_inner_microstep: 5687.41 | bwd_allreduce_microstep: 12.67 | step_microstep: 19.11 [2025-04-25 20:22:38,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.23 | bwd: 5700.14 | bwd_inner: 5687.41 | bwd_allreduce: 12.69 | step: 19.11 12%|█▏ | 5136/41250 [12:25:03<86:50:57, 8.66s/it] {'loss': 0.1131, 'grad_norm': 2.1146373748779297, 'learning_rate': 3.9070583890855964e-05, 'epoch': 1.25} 12%|█▏ | 5136/41250 [12:25:03<86:50:57, 8.66s/it][2025-04-25 20:22:46,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 20:22:46,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.87 | bwd_microstep: 5699.46 | bwd_inner_microstep: 5665.03 | bwd_allreduce_microstep: 34.39 | step_microstep: 18.92 [2025-04-25 20:22:46,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.87 | bwd: 5699.47 | bwd_inner: 5665.03 | bwd_allreduce: 34.41 | step: 18.92 12%|█▏ | 5137/41250 [12:25:12<86:42:47, 8.64s/it] {'loss': 0.0915, 'grad_norm': 0.976996123790741, 'learning_rate': 3.9070110692078e-05, 'epoch': 1.25} 12%|█▏ | 5137/41250 [12:25:12<86:42:47, 8.64s/it][2025-04-25 20:22:55,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:22:55,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.19 | bwd_microstep: 5763.66 | bwd_inner_microstep: 5649.06 | bwd_allreduce_microstep: 114.55 | step_microstep: 18.98 [2025-04-25 
20:22:55,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.19 | bwd: 5763.68 | bwd_inner: 5649.06 | bwd_allreduce: 114.58 | step: 18.99 12%|█▏ | 5138/41250 [12:25:20<86:48:51, 8.65s/it] {'loss': 0.2241, 'grad_norm': 2.3338494300842285, 'learning_rate': 3.906963737573656e-05, 'epoch': 1.25} 12%|█▏ | 5138/41250 [12:25:20<86:48:51, 8.65s/it][2025-04-25 20:23:04,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:23:04,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.20 | bwd_microstep: 5742.39 | bwd_inner_microstep: 5701.73 | bwd_allreduce_microstep: 40.61 | step_microstep: 18.55 [2025-04-25 20:23:04,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.20 | bwd: 5742.41 | bwd_inner: 5701.73 | bwd_allreduce: 40.63 | step: 18.55 12%|█▏ | 5139/41250 [12:25:29<86:53:32, 8.66s/it] {'loss': 0.1431, 'grad_norm': 1.5251471996307373, 'learning_rate': 3.9069163941834556e-05, 'epoch': 1.25} 12%|█▏ | 5139/41250 [12:25:29<86:53:32, 8.66s/it][2025-04-25 20:23:12,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 20:23:12,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.43 | bwd_microstep: 5715.75 | bwd_inner_microstep: 5703.07 | bwd_allreduce_microstep: 12.64 | step_microstep: 19.21 [2025-04-25 20:23:12,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.43 | bwd: 5715.77 | bwd_inner: 5703.07 | bwd_allreduce: 12.66 | step: 19.21 12%|█▏ | 5140/41250 [12:25:38<86:52:03, 8.66s/it] {'loss': 0.1738, 'grad_norm': 1.2734259366989136, 'learning_rate': 3.906869039037491e-05, 'epoch': 1.25} 12%|█▏ | 5140/41250 [12:25:38<86:52:03, 8.66s/it][2025-04-25 20:23:21,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-25 20:23:21,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.23 | bwd_microstep: 5788.05 | bwd_inner_microstep: 5661.96 | bwd_allreduce_microstep: 126.04 | step_microstep: 18.77 [2025-04-25 20:23:21,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.23 | bwd: 5788.06 | bwd_inner: 5661.96 | bwd_allreduce: 126.06 | step: 18.78 12%|█▏ | 5141/41250 [12:25:46<87:01:28, 8.68s/it] {'loss': 0.0951, 'grad_norm': 1.9146206378936768, 'learning_rate': 3.9068216721360536e-05, 'epoch': 1.25} 12%|█▏ | 5141/41250 [12:25:46<87:01:28, 8.68s/it][2025-04-25 20:23:30,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:23:30,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.93 | bwd_microstep: 5715.93 | bwd_inner_microstep: 5665.81 | bwd_allreduce_microstep: 50.07 | step_microstep: 18.69 [2025-04-25 20:23:30,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.93 | bwd: 5715.94 | bwd_inner: 5665.81 | bwd_allreduce: 50.09 | step: 18.69 12%|█▏ | 5142/41250 [12:25:55<86:52:48, 8.66s/it] {'loss': 0.0466, 'grad_norm': 0.6211170554161072, 'learning_rate': 3.906774293479435e-05, 'epoch': 1.25} 12%|█▏ | 5142/41250 [12:25:55<86:52:48, 8.66s/it][2025-04-25 20:23:38,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:23:38,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.91 | bwd_microstep: 5781.14 | bwd_inner_microstep: 5659.65 | bwd_allreduce_microstep: 121.44 | step_microstep: 18.79 [2025-04-25 20:23:38,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.91 | bwd: 5781.16 | bwd_inner: 5659.65 | bwd_allreduce: 121.46 | step: 18.79 12%|█▏ | 5143/41250 [12:26:04<86:57:38, 8.67s/it] {'loss': 0.088, 'grad_norm': 1.0007144212722778, 'learning_rate': 
3.906726903067929e-05, 'epoch': 1.25} 12%|█▏ | 5143/41250 [12:26:04<86:57:38, 8.67s/it][2025-04-25 20:23:47,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 20:23:47,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.90 | bwd_microstep: 5739.64 | bwd_inner_microstep: 5704.44 | bwd_allreduce_microstep: 35.15 | step_microstep: 18.87 [2025-04-25 20:23:47,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.90 | bwd: 5739.66 | bwd_inner: 5704.44 | bwd_allreduce: 35.17 | step: 18.87 12%|█▏ | 5144/41250 [12:26:12<86:59:31, 8.67s/it] {'loss': 0.0586, 'grad_norm': 0.705538272857666, 'learning_rate': 3.906679500901826e-05, 'epoch': 1.25} 12%|█▏ | 5144/41250 [12:26:12<86:59:31, 8.67s/it][2025-04-25 20:23:56,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 1.13 [2025-04-25 20:23:56,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.83 | bwd_microstep: 5784.90 | bwd_inner_microstep: 5661.79 | bwd_allreduce_microstep: 123.06 | step_microstep: 19.48 [2025-04-25 20:23:56,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.84 | bwd: 5784.92 | bwd_inner: 5661.79 | bwd_allreduce: 123.08 | step: 19.48 12%|█▏ | 5145/41250 [12:26:21<87:04:47, 8.68s/it] {'loss': 0.0786, 'grad_norm': 0.97065269947052, 'learning_rate': 3.906632086981419e-05, 'epoch': 1.25} 12%|█▏ | 5145/41250 [12:26:21<87:04:47, 8.68s/it][2025-04-25 20:24:04,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:24:04,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.06 | bwd_microstep: 5809.11 | bwd_inner_microstep: 5672.81 | bwd_allreduce_microstep: 136.26 | step_microstep: 18.98 [2025-04-25 20:24:04,872] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.06 | bwd: 5809.12 | bwd_inner: 5672.81 | bwd_allreduce: 136.28 | step: 18.98 12%|█▏ | 5146/41250 [12:26:30<87:11:27, 8.69s/it] {'loss': 0.1352, 'grad_norm': 1.9754246473312378, 'learning_rate': 3.906584661307e-05, 'epoch': 1.25} 12%|█▏ | 5146/41250 [12:26:30<87:11:27, 8.69s/it][2025-04-25 20:24:13,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:24:13,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.42 | bwd_microstep: 5803.12 | bwd_inner_microstep: 5662.71 | bwd_allreduce_microstep: 140.37 | step_microstep: 18.85 [2025-04-25 20:24:13,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.42 | bwd: 5803.13 | bwd_inner: 5662.71 | bwd_allreduce: 140.38 | step: 18.85 12%|█▏ | 5147/41250 [12:26:38<87:15:18, 8.70s/it] {'loss': 0.1775, 'grad_norm': 1.4949736595153809, 'learning_rate': 3.906537223878863e-05, 'epoch': 1.25} 12%|█▏ | 5147/41250 [12:26:38<87:15:18, 8.70s/it][2025-04-25 20:24:22,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 20:24:22,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.87 | bwd_microstep: 5776.35 | bwd_inner_microstep: 5664.81 | bwd_allreduce_microstep: 111.50 | step_microstep: 18.92 [2025-04-25 20:24:22,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.87 | bwd: 5776.36 | bwd_inner: 5664.80 | bwd_allreduce: 111.52 | step: 18.92 12%|█▏ | 5148/41250 [12:26:47<87:15:15, 8.70s/it] {'loss': 0.1774, 'grad_norm': 1.8159130811691284, 'learning_rate': 3.906489774697298e-05, 'epoch': 1.25} 12%|█▏ | 5148/41250 [12:26:47<87:15:15, 8.70s/it][2025-04-25 20:24:30,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 
20:24:30,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.19 | bwd_microstep: 5768.61 | bwd_inner_microstep: 5714.74 | bwd_allreduce_microstep: 53.82 | step_microstep: 18.74 [2025-04-25 20:24:30,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.19 | bwd: 5768.62 | bwd_inner: 5714.74 | bwd_allreduce: 53.84 | step: 18.74 12%|█▏ | 5149/41250 [12:26:56<87:16:34, 8.70s/it] {'loss': 0.1902, 'grad_norm': 1.4925401210784912, 'learning_rate': 3.9064423137625975e-05, 'epoch': 1.25} 12%|█▏ | 5149/41250 [12:26:56<87:16:34, 8.70s/it][2025-04-25 20:24:39,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:24:39,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.67 | bwd_microstep: 5795.28 | bwd_inner_microstep: 5690.01 | bwd_allreduce_microstep: 105.23 | step_microstep: 18.97 [2025-04-25 20:24:39,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.67 | bwd: 5795.30 | bwd_inner: 5690.01 | bwd_allreduce: 105.25 | step: 18.97 12%|█▏ | 5150/41250 [12:27:05<87:21:25, 8.71s/it] {'loss': 0.1051, 'grad_norm': 1.2674483060836792, 'learning_rate': 3.906394841075056e-05, 'epoch': 1.25} 12%|█▏ | 5150/41250 [12:27:05<87:21:25, 8.71s/it][2025-04-25 20:24:48,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-25 20:24:48,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.52 | bwd_microstep: 5721.92 | bwd_inner_microstep: 5648.60 | bwd_allreduce_microstep: 73.27 | step_microstep: 18.91 [2025-04-25 20:24:48,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.52 | bwd: 5721.94 | bwd_inner: 5648.60 | bwd_allreduce: 73.29 | step: 18.91 12%|█▏ | 5151/41250 [12:27:13<87:06:25, 8.69s/it] {'loss': 0.1267, 'grad_norm': 1.634439468383789, 'learning_rate': 3.9063473566349657e-05, 
'epoch': 1.25} 12%|█▏ | 5151/41250 [12:27:13<87:06:25, 8.69s/it][2025-04-25 20:24:57,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:24:57,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2890.99 | bwd_microstep: 5785.51 | bwd_inner_microstep: 5772.82 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.70 [2025-04-25 20:24:57,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2890.99 | bwd: 5785.52 | bwd_inner: 5772.82 | bwd_allreduce: 12.66 | step: 18.70 12%|█▏ | 5152/41250 [12:27:22<87:18:56, 8.71s/it] {'loss': 0.1445, 'grad_norm': 1.6852047443389893, 'learning_rate': 3.906299860442619e-05, 'epoch': 1.25} 12%|█▏ | 5152/41250 [12:27:22<87:18:56, 8.71s/it][2025-04-25 20:25:05,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:25:05,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.21 | bwd_microstep: 5787.80 | bwd_inner_microstep: 5661.46 | bwd_allreduce_microstep: 126.29 | step_microstep: 18.59 [2025-04-25 20:25:05,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.21 | bwd: 5787.81 | bwd_inner: 5661.46 | bwd_allreduce: 126.31 | step: 18.59 12%|█▏ | 5153/41250 [12:27:31<87:18:59, 8.71s/it] {'loss': 0.1909, 'grad_norm': 1.799399733543396, 'learning_rate': 3.9062523524983076e-05, 'epoch': 1.25} 12%|█▏ | 5153/41250 [12:27:31<87:18:59, 8.71s/it][2025-04-25 20:25:14,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:25:14,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.70 | bwd_microstep: 5760.18 | bwd_inner_microstep: 5700.35 | bwd_allreduce_microstep: 59.78 | step_microstep: 18.63 [2025-04-25 20:25:14,520] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2850.70 | bwd: 5760.19 | bwd_inner: 5700.35 | bwd_allreduce: 59.80 | step: 18.63 12%|█▏ | 5154/41250 [12:27:39<87:17:04, 8.71s/it] {'loss': 0.1917, 'grad_norm': 2.320650577545166, 'learning_rate': 3.906204832802326e-05, 'epoch': 1.25} 12%|█▏ | 5154/41250 [12:27:39<87:17:04, 8.71s/it][2025-04-25 20:25:23,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 20:25:23,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.83 | bwd_microstep: 5753.28 | bwd_inner_microstep: 5696.68 | bwd_allreduce_microstep: 56.54 | step_microstep: 19.23 [2025-04-25 20:25:23,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.83 | bwd: 5753.30 | bwd_inner: 5696.68 | bwd_allreduce: 56.56 | step: 19.23 12%|█▏ | 5155/41250 [12:27:48<87:14:20, 8.70s/it] {'loss': 0.0364, 'grad_norm': 0.6395145654678345, 'learning_rate': 3.906157301354966e-05, 'epoch': 1.25} 12%|█▏ | 5155/41250 [12:27:48<87:14:20, 8.70s/it][2025-04-25 20:25:31,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:25:31,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.06 | bwd_microstep: 5732.37 | bwd_inner_microstep: 5719.45 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.95 [2025-04-25 20:25:31,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.06 | bwd: 5732.38 | bwd_inner: 5719.45 | bwd_allreduce: 12.90 | step: 18.96 12%|█▏ | 5156/41250 [12:27:57<87:09:27, 8.69s/it] {'loss': 0.1876, 'grad_norm': 1.9100866317749023, 'learning_rate': 3.9061097581565215e-05, 'epoch': 1.25} 12%|█▏ | 5156/41250 [12:27:57<87:09:27, 8.69s/it][2025-04-25 20:25:40,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 20:25:40,584] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2852.13 | bwd_microstep: 5759.20 | bwd_inner_microstep: 5696.40 | bwd_allreduce_microstep: 62.75 | step_microstep: 19.21 [2025-04-25 20:25:40,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.13 | bwd: 5759.22 | bwd_inner: 5696.40 | bwd_allreduce: 62.77 | step: 19.21 13%|█▎ | 5157/41250 [12:28:05<87:09:56, 8.69s/it] {'loss': 0.0335, 'grad_norm': 0.8751997947692871, 'learning_rate': 3.906062203207285e-05, 'epoch': 1.25} 13%|█▎ | 5157/41250 [12:28:05<87:09:56, 8.69s/it][2025-04-25 20:25:49,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:25:49,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.04 | bwd_microstep: 5741.94 | bwd_inner_microstep: 5697.60 | bwd_allreduce_microstep: 44.29 | step_microstep: 19.28 [2025-04-25 20:25:49,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.04 | bwd: 5741.95 | bwd_inner: 5697.60 | bwd_allreduce: 44.30 | step: 19.27 13%|█▎ | 5158/41250 [12:28:14<87:07:37, 8.69s/it] {'loss': 0.143, 'grad_norm': 1.7243084907531738, 'learning_rate': 3.906014636507551e-05, 'epoch': 1.25} 13%|█▎ | 5158/41250 [12:28:14<87:07:37, 8.69s/it][2025-04-25 20:25:58,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 20:25:58,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.01 | bwd_microstep: 5876.28 | bwd_inner_microstep: 5658.14 | bwd_allreduce_microstep: 218.10 | step_microstep: 18.73 [2025-04-25 20:25:58,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.01 | bwd: 5876.30 | bwd_inner: 5658.14 | bwd_allreduce: 218.11 | step: 18.73 13%|█▎ | 5159/41250 [12:28:23<87:24:29, 8.72s/it] {'loss': 0.0296, 'grad_norm': 0.5002350807189941, 'learning_rate': 3.905967058057611e-05, 'epoch': 1.25} 13%|█▎ | 5159/41250 
[12:28:23<87:24:29, 8.72s/it][2025-04-25 20:26:06,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:26:06,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.44 | bwd_microstep: 5693.78 | bwd_inner_microstep: 5662.20 | bwd_allreduce_microstep: 31.54 | step_microstep: 18.91 [2025-04-25 20:26:06,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.45 | bwd: 5693.79 | bwd_inner: 5662.20 | bwd_allreduce: 31.55 | step: 18.91 13%|█▎ | 5160/41250 [12:28:31<87:04:48, 8.69s/it] {'loss': 0.0175, 'grad_norm': 0.47299832105636597, 'learning_rate': 3.9059194678577587e-05, 'epoch': 1.25} 13%|█▎ | 5160/41250 [12:28:31<87:04:48, 8.69s/it][2025-04-25 20:26:15,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.25 | optimizer_step: 1.03 [2025-04-25 20:26:15,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.70 | bwd_microstep: 5764.05 | bwd_inner_microstep: 5692.40 | bwd_allreduce_microstep: 71.60 | step_microstep: 19.92 [2025-04-25 20:26:15,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.70 | bwd: 5764.07 | bwd_inner: 5692.40 | bwd_allreduce: 71.63 | step: 19.92 13%|█▎ | 5161/41250 [12:28:40<87:05:47, 8.69s/it] {'loss': 0.2249, 'grad_norm': 1.955040693283081, 'learning_rate': 3.905871865908288e-05, 'epoch': 1.25} 13%|█▎ | 5161/41250 [12:28:40<87:05:47, 8.69s/it][2025-04-25 20:26:24,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 20:26:24,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.65 | bwd_microstep: 5718.21 | bwd_inner_microstep: 5705.38 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.93 [2025-04-25 20:26:24,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.65 | bwd: 5718.22 | 
bwd_inner: 5705.38 | bwd_allreduce: 12.80 | step: 18.93 13%|█▎ | 5162/41250 [12:28:49<87:01:23, 8.68s/it] {'loss': 0.0437, 'grad_norm': 1.9246015548706055, 'learning_rate': 3.905824252209492e-05, 'epoch': 1.25} 13%|█▎ | 5162/41250 [12:28:49<87:01:23, 8.68s/it][2025-04-25 20:26:32,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:26:32,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.47 | bwd_microstep: 5762.61 | bwd_inner_microstep: 5653.40 | bwd_allreduce_microstep: 109.17 | step_microstep: 18.56 [2025-04-25 20:26:32,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.47 | bwd: 5762.63 | bwd_inner: 5653.40 | bwd_allreduce: 109.18 | step: 18.56 13%|█▎ | 5163/41250 [12:28:58<87:02:10, 8.68s/it] {'loss': 0.3352, 'grad_norm': 1.823768138885498, 'learning_rate': 3.905776626761664e-05, 'epoch': 1.25} 13%|█▎ | 5163/41250 [12:28:58<87:02:10, 8.68s/it][2025-04-25 20:26:41,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.03 | optimizer_step: 1.09 [2025-04-25 20:26:41,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.39 | bwd_microstep: 5753.76 | bwd_inner_microstep: 5643.03 | bwd_allreduce_microstep: 110.68 | step_microstep: 19.01 [2025-04-25 20:26:41,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.39 | bwd: 5753.78 | bwd_inner: 5643.03 | bwd_allreduce: 110.70 | step: 19.01 13%|█▎ | 5164/41250 [12:29:06<87:00:42, 8.68s/it] {'loss': 0.2047, 'grad_norm': 1.6656275987625122, 'learning_rate': 3.9057289895650985e-05, 'epoch': 1.25} 13%|█▎ | 5164/41250 [12:29:06<87:00:42, 8.68s/it][2025-04-25 20:26:50,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:26:50,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2824.13 | bwd_microstep: 5769.77 | bwd_inner_microstep: 5647.73 | bwd_allreduce_microstep: 122.00 | step_microstep: 18.58 [2025-04-25 20:26:50,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.13 | bwd: 5769.79 | bwd_inner: 5647.73 | bwd_allreduce: 122.01 | step: 18.58 13%|█▎ | 5165/41250 [12:29:15<87:00:16, 8.68s/it] {'loss': 0.0548, 'grad_norm': 0.9107530117034912, 'learning_rate': 3.905681340620088e-05, 'epoch': 1.25} 13%|█▎ | 5165/41250 [12:29:15<87:00:16, 8.68s/it][2025-04-25 20:26:58,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.11 | optimizer_step: 1.21 [2025-04-25 20:26:58,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.72 | bwd_microstep: 5791.34 | bwd_inner_microstep: 5652.38 | bwd_allreduce_microstep: 138.90 | step_microstep: 19.74 [2025-04-25 20:26:58,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.72 | bwd: 5791.36 | bwd_inner: 5652.38 | bwd_allreduce: 138.93 | step: 19.74 13%|█▎ | 5166/41250 [12:29:24<87:03:21, 8.69s/it] {'loss': 0.078, 'grad_norm': 1.2179808616638184, 'learning_rate': 3.905633679926928e-05, 'epoch': 1.25} 13%|█▎ | 5166/41250 [12:29:24<87:03:21, 8.69s/it][2025-04-25 20:27:07,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 20:27:07,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.92 | bwd_microstep: 5698.34 | bwd_inner_microstep: 5685.47 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.95 [2025-04-25 20:27:07,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.92 | bwd: 5698.35 | bwd_inner: 5685.47 | bwd_allreduce: 12.84 | step: 18.95 13%|█▎ | 5167/41250 [12:29:32<86:52:29, 8.67s/it] {'loss': 0.0869, 'grad_norm': 2.3306570053100586, 'learning_rate': 3.90558600748591e-05, 'epoch': 1.25} 13%|█▎ | 5167/41250 [12:29:32<86:52:29, 
8.67s/it][2025-04-25 20:27:16,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 20:27:16,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.15 | bwd_microstep: 5727.39 | bwd_inner_microstep: 5694.90 | bwd_allreduce_microstep: 32.44 | step_microstep: 18.83 [2025-04-25 20:27:16,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.15 | bwd: 5727.41 | bwd_inner: 5694.90 | bwd_allreduce: 32.46 | step: 18.83 13%|█▎ | 5168/41250 [12:29:41<86:51:29, 8.67s/it] {'loss': 0.1157, 'grad_norm': 1.8850886821746826, 'learning_rate': 3.9055383232973294e-05, 'epoch': 1.25} 13%|█▎ | 5168/41250 [12:29:41<86:51:29, 8.67s/it][2025-04-25 20:27:24,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:27:24,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.17 | bwd_microstep: 5733.19 | bwd_inner_microstep: 5690.85 | bwd_allreduce_microstep: 42.29 | step_microstep: 18.56 [2025-04-25 20:27:24,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.17 | bwd: 5733.20 | bwd_inner: 5690.85 | bwd_allreduce: 42.31 | step: 18.57 13%|█▎ | 5169/41250 [12:29:50<86:51:33, 8.67s/it] {'loss': 0.1314, 'grad_norm': 1.5204381942749023, 'learning_rate': 3.905490627361481e-05, 'epoch': 1.25} 13%|█▎ | 5169/41250 [12:29:50<86:51:33, 8.67s/it][2025-04-25 20:27:33,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:27:33,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2878.19 | bwd_microstep: 5769.34 | bwd_inner_microstep: 5756.27 | bwd_allreduce_microstep: 13.02 | step_microstep: 18.89 [2025-04-25 20:27:33,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2878.19 | bwd: 5769.35 | bwd_inner: 5756.27 | 
bwd_allreduce: 13.03 | step: 18.89 13%|█▎ | 5170/41250 [12:29:58<87:02:30, 8.68s/it] {'loss': 0.2834, 'grad_norm': 2.1598241329193115, 'learning_rate': 3.905442919678656e-05, 'epoch': 1.25} 13%|█▎ | 5170/41250 [12:29:58<87:02:30, 8.68s/it][2025-04-25 20:27:42,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-25 20:27:42,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.02 | bwd_microstep: 5728.28 | bwd_inner_microstep: 5685.69 | bwd_allreduce_microstep: 42.54 | step_microstep: 18.70 [2025-04-25 20:27:42,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.02 | bwd: 5728.29 | bwd_inner: 5685.69 | bwd_allreduce: 42.56 | step: 18.70 13%|█▎ | 5171/41250 [12:30:07<86:57:25, 8.68s/it] {'loss': 0.0776, 'grad_norm': 1.346614122390747, 'learning_rate': 3.9053952002491515e-05, 'epoch': 1.25} 13%|█▎ | 5171/41250 [12:30:07<86:57:25, 8.68s/it][2025-04-25 20:27:50,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:27:50,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.14 | bwd_microstep: 5872.18 | bwd_inner_microstep: 5647.44 | bwd_allreduce_microstep: 224.70 | step_microstep: 18.67 [2025-04-25 20:27:50,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.14 | bwd: 5872.20 | bwd_inner: 5647.44 | bwd_allreduce: 224.72 | step: 18.67 13%|█▎ | 5172/41250 [12:30:16<87:16:42, 8.71s/it] {'loss': 0.073, 'grad_norm': 1.1311737298965454, 'learning_rate': 3.905347469073259e-05, 'epoch': 1.25} 13%|█▎ | 5172/41250 [12:30:16<87:16:42, 8.71s/it][2025-04-25 20:27:59,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 20:27:59,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.03 | 
bwd_microstep: 5744.92 | bwd_inner_microstep: 5656.67 | bwd_allreduce_microstep: 88.21 | step_microstep: 18.95 [2025-04-25 20:27:59,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.03 | bwd: 5744.93 | bwd_inner: 5656.67 | bwd_allreduce: 88.22 | step: 18.95 13%|█▎ | 5173/41250 [12:30:24<87:08:38, 8.70s/it] {'loss': 0.2036, 'grad_norm': 3.1469860076904297, 'learning_rate': 3.905299726151275e-05, 'epoch': 1.25} 13%|█▎ | 5173/41250 [12:30:24<87:08:38, 8.70s/it][2025-04-25 20:28:08,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 20:28:08,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.98 | bwd_microstep: 5717.83 | bwd_inner_microstep: 5688.52 | bwd_allreduce_microstep: 29.27 | step_microstep: 18.38 [2025-04-25 20:28:08,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.98 | bwd: 5717.84 | bwd_inner: 5688.52 | bwd_allreduce: 29.28 | step: 18.38 13%|█▎ | 5174/41250 [12:30:33<87:01:34, 8.68s/it] {'loss': 0.2401, 'grad_norm': 3.698014259338379, 'learning_rate': 3.905251971483493e-05, 'epoch': 1.25} 13%|█▎ | 5174/41250 [12:30:33<87:01:34, 8.68s/it][2025-04-25 20:28:16,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 20:28:16,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.95 | bwd_microstep: 5691.56 | bwd_inner_microstep: 5678.75 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.89 [2025-04-25 20:28:16,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.95 | bwd: 5691.58 | bwd_inner: 5678.75 | bwd_allreduce: 12.78 | step: 18.90 13%|█▎ | 5175/41250 [12:30:42<86:52:21, 8.67s/it] {'loss': 0.146, 'grad_norm': 1.6693072319030762, 'learning_rate': 3.905204205070207e-05, 'epoch': 1.25} 13%|█▎ | 5175/41250 [12:30:42<86:52:21, 8.67s/it][2025-04-25 20:28:25,488] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 20:28:25,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.35 | bwd_microstep: 5731.05 | bwd_inner_microstep: 5648.16 | bwd_allreduce_microstep: 82.83 | step_microstep: 18.96 [2025-04-25 20:28:25,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.35 | bwd: 5731.06 | bwd_inner: 5648.16 | bwd_allreduce: 82.86 | step: 18.97 13%|█▎ | 5176/41250 [12:30:50<86:48:41, 8.66s/it] {'loss': 0.0482, 'grad_norm': 0.6208038330078125, 'learning_rate': 3.905156426911712e-05, 'epoch': 1.25} 13%|█▎ | 5176/41250 [12:30:50<86:48:41, 8.66s/it][2025-04-25 20:28:34,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:28:34,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.01 | bwd_microstep: 5725.88 | bwd_inner_microstep: 5691.79 | bwd_allreduce_microstep: 34.04 | step_microstep: 18.84 [2025-04-25 20:28:34,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.01 | bwd: 5725.90 | bwd_inner: 5691.79 | bwd_allreduce: 34.06 | step: 18.84 13%|█▎ | 5177/41250 [12:30:59<86:47:06, 8.66s/it] {'loss': 0.0551, 'grad_norm': 0.625353991985321, 'learning_rate': 3.905108637008302e-05, 'epoch': 1.26} 13%|█▎ | 5177/41250 [12:30:59<86:47:06, 8.66s/it][2025-04-25 20:28:42,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 20:28:42,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.05 | bwd_microstep: 5733.77 | bwd_inner_microstep: 5691.56 | bwd_allreduce_microstep: 42.16 | step_microstep: 18.53 [2025-04-25 20:28:42,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.05 | bwd: 5733.78 | bwd_inner: 5691.56 | bwd_allreduce: 42.18 | step: 18.53 13%|█▎ | 
5178/41250 [12:31:08<86:47:52, 8.66s/it] {'loss': 0.3999, 'grad_norm': 4.0473737716674805, 'learning_rate': 3.9050608353602723e-05, 'epoch': 1.26} 13%|█▎ | 5178/41250 [12:31:08<86:47:52, 8.66s/it][2025-04-25 20:28:51,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-25 20:28:51,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.52 | bwd_microstep: 5726.23 | bwd_inner_microstep: 5675.30 | bwd_allreduce_microstep: 50.89 | step_microstep: 19.24 [2025-04-25 20:28:51,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.52 | bwd: 5726.25 | bwd_inner: 5675.30 | bwd_allreduce: 50.90 | step: 19.24 13%|█▎ | 5179/41250 [12:31:16<86:45:33, 8.66s/it] {'loss': 0.0665, 'grad_norm': 1.2620210647583008, 'learning_rate': 3.905013021967917e-05, 'epoch': 1.26} 13%|█▎ | 5179/41250 [12:31:16<86:45:33, 8.66s/it][2025-04-25 20:29:00,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:29:00,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.03 | bwd_microstep: 5846.77 | bwd_inner_microstep: 5690.56 | bwd_allreduce_microstep: 156.16 | step_microstep: 18.79 [2025-04-25 20:29:00,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.03 | bwd: 5846.78 | bwd_inner: 5690.57 | bwd_allreduce: 156.18 | step: 18.79 13%|█▎ | 5180/41250 [12:31:25<87:06:55, 8.69s/it] {'loss': 0.1691, 'grad_norm': 1.723955750465393, 'learning_rate': 3.904965196831532e-05, 'epoch': 1.26} 13%|█▎ | 5180/41250 [12:31:25<87:06:55, 8.69s/it][2025-04-25 20:29:08,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:29:08,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.87 | bwd_microstep: 5719.53 | bwd_inner_microstep: 5707.17 
| bwd_allreduce_microstep: 12.32 | step_microstep: 18.67 [2025-04-25 20:29:08,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.87 | bwd: 5719.55 | bwd_inner: 5707.17 | bwd_allreduce: 12.33 | step: 18.67 13%|█▎ | 5181/41250 [12:31:34<86:59:38, 8.68s/it] {'loss': 0.3451, 'grad_norm': 5.188592433929443, 'learning_rate': 3.90491735995141e-05, 'epoch': 1.26} 13%|█▎ | 5181/41250 [12:31:34<86:59:38, 8.68s/it][2025-04-25 20:29:17,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 20:29:17,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.04 | bwd_microstep: 5734.64 | bwd_inner_microstep: 5654.46 | bwd_allreduce_microstep: 80.13 | step_microstep: 18.59 [2025-04-25 20:29:17,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.04 | bwd: 5734.65 | bwd_inner: 5654.46 | bwd_allreduce: 80.15 | step: 18.59 13%|█▎ | 5182/41250 [12:31:42<86:52:41, 8.67s/it] {'loss': 0.0872, 'grad_norm': 2.5260002613067627, 'learning_rate': 3.904869511327848e-05, 'epoch': 1.26} 13%|█▎ | 5182/41250 [12:31:42<86:52:41, 8.67s/it][2025-04-25 20:29:26,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-25 20:29:26,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.19 | bwd_microstep: 5699.95 | bwd_inner_microstep: 5687.07 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.92 [2025-04-25 20:29:26,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.19 | bwd: 5699.96 | bwd_inner: 5687.07 | bwd_allreduce: 12.85 | step: 18.93 13%|█▎ | 5183/41250 [12:31:51<86:46:59, 8.66s/it] {'loss': 0.2711, 'grad_norm': 3.271097421646118, 'learning_rate': 3.9048216509611396e-05, 'epoch': 1.26} 13%|█▎ | 5183/41250 [12:31:51<86:46:59, 8.66s/it][2025-04-25 20:29:34,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 20:29:34,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.97 | bwd_microstep: 5757.27 | bwd_inner_microstep: 5642.04 | bwd_allreduce_microstep: 115.18 | step_microstep: 18.92 [2025-04-25 20:29:34,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.97 | bwd: 5757.28 | bwd_inner: 5642.04 | bwd_allreduce: 115.20 | step: 18.92 13%|█▎ | 5184/41250 [12:32:00<86:48:21, 8.66s/it] {'loss': 0.0741, 'grad_norm': 0.8324564695358276, 'learning_rate': 3.90477377885158e-05, 'epoch': 1.26} 13%|█▎ | 5184/41250 [12:32:00<86:48:21, 8.66s/it][2025-04-25 20:29:43,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:29:43,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.97 | bwd_microstep: 5693.57 | bwd_inner_microstep: 5640.85 | bwd_allreduce_microstep: 52.67 | step_microstep: 18.27 [2025-04-25 20:29:43,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.97 | bwd: 5693.58 | bwd_inner: 5640.85 | bwd_allreduce: 52.69 | step: 18.27 13%|█▎ | 5185/41250 [12:32:08<86:36:50, 8.65s/it] {'loss': 0.168, 'grad_norm': 2.392679214477539, 'learning_rate': 3.904725894999466e-05, 'epoch': 1.26} 13%|█▎ | 5185/41250 [12:32:08<86:36:50, 8.65s/it][2025-04-25 20:29:52,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 20:29:52,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.04 | bwd_microstep: 5685.72 | bwd_inner_microstep: 5654.50 | bwd_allreduce_microstep: 31.17 | step_microstep: 18.72 [2025-04-25 20:29:52,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.04 | bwd: 5685.73 | bwd_inner: 5654.50 | bwd_allreduce: 31.19 | step: 18.72 13%|█▎ | 5186/41250 [12:32:17<86:31:12, 8.64s/it] 
{'loss': 0.2355, 'grad_norm': 2.7832961082458496, 'learning_rate': 3.904677999405091e-05, 'epoch': 1.26} 13%|█▎ | 5186/41250 [12:32:17<86:31:12, 8.64s/it][2025-04-25 20:30:00,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-25 20:30:00,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2943.95 | bwd_microstep: 5896.88 | bwd_inner_microstep: 5884.13 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.39 [2025-04-25 20:30:00,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2943.95 | bwd: 5896.90 | bwd_inner: 5884.13 | bwd_allreduce: 12.72 | step: 18.39 13%|█▎ | 5187/41250 [12:32:26<87:23:04, 8.72s/it] {'loss': 0.3155, 'grad_norm': 2.2418243885040283, 'learning_rate': 3.90463009206875e-05, 'epoch': 1.26} 13%|█▎ | 5187/41250 [12:32:26<87:23:04, 8.72s/it][2025-04-25 20:30:09,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:30:09,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.97 | bwd_microstep: 5689.85 | bwd_inner_microstep: 5656.96 | bwd_allreduce_microstep: 32.84 | step_microstep: 18.66 [2025-04-25 20:30:09,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.97 | bwd: 5689.86 | bwd_inner: 5656.96 | bwd_allreduce: 32.86 | step: 18.66 13%|█▎ | 5188/41250 [12:32:34<87:03:05, 8.69s/it] {'loss': 0.061, 'grad_norm': 1.5095945596694946, 'learning_rate': 3.9045821729907395e-05, 'epoch': 1.26} 13%|█▎ | 5188/41250 [12:32:34<87:03:05, 8.69s/it][2025-04-25 20:30:18,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:30:18,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.60 | bwd_microstep: 5764.70 | bwd_inner_microstep: 5646.49 | bwd_allreduce_microstep: 118.17 | 
step_microstep: 18.50 [2025-04-25 20:30:18,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.60 | bwd: 5764.72 | bwd_inner: 5646.49 | bwd_allreduce: 118.19 | step: 18.50 13%|█▎ | 5189/41250 [12:32:43<87:00:47, 8.69s/it] {'loss': 0.2213, 'grad_norm': 2.3301243782043457, 'learning_rate': 3.904534242171355e-05, 'epoch': 1.26} 13%|█▎ | 5189/41250 [12:32:43<87:00:47, 8.69s/it][2025-04-25 20:30:26,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:30:26,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.17 | bwd_microstep: 5734.73 | bwd_inner_microstep: 5703.16 | bwd_allreduce_microstep: 31.52 | step_microstep: 18.53 [2025-04-25 20:30:26,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.17 | bwd: 5734.74 | bwd_inner: 5703.16 | bwd_allreduce: 31.54 | step: 18.53 13%|█▎ | 5190/41250 [12:32:52<86:59:12, 8.68s/it] {'loss': 0.1765, 'grad_norm': 2.400830030441284, 'learning_rate': 3.9044862996108915e-05, 'epoch': 1.26} 13%|█▎ | 5190/41250 [12:32:52<86:59:12, 8.68s/it][2025-04-25 20:30:35,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 20:30:35,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.53 | bwd_microstep: 5763.91 | bwd_inner_microstep: 5678.40 | bwd_allreduce_microstep: 85.47 | step_microstep: 18.96 [2025-04-25 20:30:35,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.53 | bwd: 5763.92 | bwd_inner: 5678.40 | bwd_allreduce: 85.48 | step: 18.97 13%|█▎ | 5191/41250 [12:33:00<86:59:29, 8.68s/it] {'loss': 0.1056, 'grad_norm': 3.665532350540161, 'learning_rate': 3.9044383453096445e-05, 'epoch': 1.26} 13%|█▎ | 5191/41250 [12:33:00<86:59:29, 8.68s/it][2025-04-25 20:30:44,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 20:30:44,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.04 | bwd_microstep: 5690.07 | bwd_inner_microstep: 5663.89 | bwd_allreduce_microstep: 26.13 | step_microstep: 18.69 [2025-04-25 20:30:44,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.04 | bwd: 5690.08 | bwd_inner: 5663.89 | bwd_allreduce: 26.15 | step: 18.69 13%|█▎ | 5192/41250 [12:33:09<86:47:15, 8.66s/it] {'loss': 0.0401, 'grad_norm': 1.02812659740448, 'learning_rate': 3.904390379267909e-05, 'epoch': 1.26} 13%|█▎ | 5192/41250 [12:33:09<86:47:15, 8.66s/it][2025-04-25 20:30:52,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:30:52,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.21 | bwd_microstep: 5750.73 | bwd_inner_microstep: 5689.29 | bwd_allreduce_microstep: 61.39 | step_microstep: 18.30 [2025-04-25 20:30:52,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.21 | bwd: 5750.74 | bwd_inner: 5689.29 | bwd_allreduce: 61.41 | step: 18.30 13%|█▎ | 5193/41250 [12:33:18<86:54:14, 8.68s/it] {'loss': 0.3683, 'grad_norm': 3.2963669300079346, 'learning_rate': 3.9043424014859816e-05, 'epoch': 1.26} 13%|█▎ | 5193/41250 [12:33:18<86:54:14, 8.68s/it][2025-04-25 20:31:01,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 20:31:01,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5706.47 | bwd_inner_microstep: 5693.74 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.35 [2025-04-25 20:31:01,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.45 | bwd: 5706.48 | bwd_inner: 5693.74 | bwd_allreduce: 12.70 | step: 18.35 13%|█▎ | 5194/41250 [12:33:26<86:50:19, 8.67s/it] {'loss': 0.2396, 'grad_norm': 
1.939832091331482, 'learning_rate': 3.904294411964159e-05, 'epoch': 1.26} 13%|█▎ | 5194/41250 [12:33:26<86:50:19, 8.67s/it][2025-04-25 20:31:10,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 20:31:10,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2903.65 | bwd_microstep: 5797.92 | bwd_inner_microstep: 5785.19 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.49 [2025-04-25 20:31:10,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2903.65 | bwd: 5797.94 | bwd_inner: 5785.19 | bwd_allreduce: 12.71 | step: 18.50 13%|█▎ | 5195/41250 [12:33:35<87:11:01, 8.71s/it] {'loss': 0.2349, 'grad_norm': 2.04771089553833, 'learning_rate': 3.904246410702734e-05, 'epoch': 1.26} 13%|█▎ | 5195/41250 [12:33:35<87:11:01, 8.71s/it][2025-04-25 20:31:19,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 20:31:19,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.23 | bwd_microstep: 5786.73 | bwd_inner_microstep: 5709.01 | bwd_allreduce_microstep: 77.68 | step_microstep: 18.60 [2025-04-25 20:31:19,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.23 | bwd: 5786.75 | bwd_inner: 5709.01 | bwd_allreduce: 77.69 | step: 18.60 13%|█▎ | 5196/41250 [12:33:44<87:15:45, 8.71s/it] {'loss': 0.1449, 'grad_norm': 2.763888120651245, 'learning_rate': 3.9041983977020057e-05, 'epoch': 1.26} 13%|█▎ | 5196/41250 [12:33:44<87:15:45, 8.71s/it][2025-04-25 20:31:27,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:31:27,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2869.79 | bwd_microstep: 5739.93 | bwd_inner_microstep: 5709.15 | bwd_allreduce_microstep: 30.73 | step_microstep: 18.53 [2025-04-25 
20:31:27,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2869.79 | bwd: 5739.94 | bwd_inner: 5709.15 | bwd_allreduce: 30.75 | step: 18.53 13%|█▎ | 5197/41250 [12:33:53<87:11:55, 8.71s/it] {'loss': 0.047, 'grad_norm': 0.6097029447555542, 'learning_rate': 3.9041503729622685e-05, 'epoch': 1.26} 13%|█▎ | 5197/41250 [12:33:53<87:11:55, 8.71s/it][2025-04-25 20:31:36,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 20:31:36,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.88 | bwd_microstep: 5779.85 | bwd_inner_microstep: 5672.48 | bwd_allreduce_microstep: 107.32 | step_microstep: 19.44 [2025-04-25 20:31:36,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.88 | bwd: 5779.86 | bwd_inner: 5672.48 | bwd_allreduce: 107.34 | step: 19.45 13%|█▎ | 5198/41250 [12:34:01<87:11:28, 8.71s/it] {'loss': 0.3464, 'grad_norm': 2.8145716190338135, 'learning_rate': 3.904102336483819e-05, 'epoch': 1.26} 13%|█▎ | 5198/41250 [12:34:01<87:11:28, 8.71s/it][2025-04-25 20:31:45,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 20:31:45,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.49 | bwd_microstep: 5789.75 | bwd_inner_microstep: 5669.32 | bwd_allreduce_microstep: 120.39 | step_microstep: 18.72 [2025-04-25 20:31:45,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.49 | bwd: 5789.76 | bwd_inner: 5669.32 | bwd_allreduce: 120.40 | step: 18.73 13%|█▎ | 5199/41250 [12:34:10<87:12:45, 8.71s/it] {'loss': 0.0713, 'grad_norm': 1.0165331363677979, 'learning_rate': 3.904054288266953e-05, 'epoch': 1.26} 13%|█▎ | 5199/41250 [12:34:10<87:12:45, 8.71s/it][2025-04-25 20:31:53,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 
1.01 [2025-04-25 20:31:53,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.22 | bwd_microstep: 5811.32 | bwd_inner_microstep: 5675.75 | bwd_allreduce_microstep: 135.52 | step_microstep: 19.37 [2025-04-25 20:31:53,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.22 | bwd: 5811.33 | bwd_inner: 5675.75 | bwd_allreduce: 135.54 | step: 19.37 13%|█▎ | 5200/41250 [12:34:19<87:17:08, 8.72s/it] {'loss': 0.0126, 'grad_norm': 0.17515237629413605, 'learning_rate': 3.904006228311967e-05, 'epoch': 1.26} 13%|█▎ | 5200/41250 [12:34:19<87:17:08, 8.72s/it][2025-04-25 20:32:02,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 20:32:02,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.85 | bwd_microstep: 5779.12 | bwd_inner_microstep: 5660.19 | bwd_allreduce_microstep: 118.89 | step_microstep: 18.80 [2025-04-25 20:32:02,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.85 | bwd: 5779.14 | bwd_inner: 5660.19 | bwd_allreduce: 118.91 | step: 18.80 13%|█▎ | 5201/41250 [12:34:28<87:16:44, 8.72s/it] {'loss': 0.2155, 'grad_norm': 2.3512213230133057, 'learning_rate': 3.903958156619156e-05, 'epoch': 1.26} 13%|█▎ | 5201/41250 [12:34:28<87:16:44, 8.72s/it][2025-04-25 20:32:11,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:32:11,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.80 | bwd_microstep: 5780.32 | bwd_inner_microstep: 5669.87 | bwd_allreduce_microstep: 110.40 | step_microstep: 18.63 [2025-04-25 20:32:11,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.80 | bwd: 5780.33 | bwd_inner: 5669.87 | bwd_allreduce: 110.42 | step: 18.64 13%|█▎ | 5202/41250 [12:34:36<87:14:05, 8.71s/it] {'loss': 0.101, 'grad_norm': 2.0349414348602295, 'learning_rate': 
3.9039100731888187e-05, 'epoch': 1.26} 13%|█▎ | 5202/41250 [12:34:36<87:14:05, 8.71s/it][2025-04-25 20:32:20,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:32:20,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.48 | bwd_microstep: 5770.40 | bwd_inner_microstep: 5703.92 | bwd_allreduce_microstep: 66.42 | step_microstep: 18.76 [2025-04-25 20:32:20,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.48 | bwd: 5770.41 | bwd_inner: 5703.92 | bwd_allreduce: 66.44 | step: 18.76 13%|█▎ | 5203/41250 [12:34:45<87:15:12, 8.71s/it] {'loss': 0.0602, 'grad_norm': 0.8866743445396423, 'learning_rate': 3.90386197802125e-05, 'epoch': 1.26} 13%|█▎ | 5203/41250 [12:34:45<87:15:12, 8.71s/it][2025-04-25 20:32:28,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-25 20:32:28,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.32 | bwd_microstep: 5707.88 | bwd_inner_microstep: 5667.84 | bwd_allreduce_microstep: 40.00 | step_microstep: 19.20 [2025-04-25 20:32:28,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.32 | bwd: 5707.89 | bwd_inner: 5667.84 | bwd_allreduce: 40.01 | step: 19.21 13%|█▎ | 5204/41250 [12:34:54<87:00:43, 8.69s/it] {'loss': 0.0507, 'grad_norm': 0.7431170344352722, 'learning_rate': 3.903813871116746e-05, 'epoch': 1.26} 13%|█▎ | 5204/41250 [12:34:54<87:00:43, 8.69s/it][2025-04-25 20:32:37,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:32:37,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.18 | bwd_microstep: 5726.45 | bwd_inner_microstep: 5713.35 | bwd_allreduce_microstep: 13.05 | step_microstep: 18.31 [2025-04-25 20:32:37,434] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.18 | bwd: 5726.46 | bwd_inner: 5713.35 | bwd_allreduce: 13.07 | step: 18.32 13%|█▎ | 5205/41250 [12:35:02<86:57:13, 8.68s/it] {'loss': 0.1761, 'grad_norm': 1.4141696691513062, 'learning_rate': 3.903765752475605e-05, 'epoch': 1.26} 13%|█▎ | 5205/41250 [12:35:02<86:57:13, 8.68s/it][2025-04-25 20:32:46,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-25 20:32:46,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.41 | bwd_microstep: 5766.68 | bwd_inner_microstep: 5715.24 | bwd_allreduce_microstep: 51.40 | step_microstep: 18.59 [2025-04-25 20:32:46,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.41 | bwd: 5766.70 | bwd_inner: 5715.24 | bwd_allreduce: 51.42 | step: 18.59 13%|█▎ | 5206/41250 [12:35:11<87:01:28, 8.69s/it] {'loss': 0.1516, 'grad_norm': 1.5096745491027832, 'learning_rate': 3.903717622098121e-05, 'epoch': 1.26} 13%|█▎ | 5206/41250 [12:35:11<87:01:28, 8.69s/it][2025-04-25 20:32:54,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:32:54,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.13 | bwd_microstep: 5730.94 | bwd_inner_microstep: 5715.37 | bwd_allreduce_microstep: 15.53 | step_microstep: 18.66 [2025-04-25 20:32:54,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.13 | bwd: 5730.96 | bwd_inner: 5715.37 | bwd_allreduce: 15.55 | step: 18.67 13%|█▎ | 5207/41250 [12:35:20<86:59:05, 8.69s/it] {'loss': 0.1957, 'grad_norm': 1.5843240022659302, 'learning_rate': 3.903669479984594e-05, 'epoch': 1.26} 13%|█▎ | 5207/41250 [12:35:20<86:59:05, 8.69s/it][2025-04-25 20:33:03,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-25 
20:33:03,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.96 | bwd_microstep: 5952.04 | bwd_inner_microstep: 5674.35 | bwd_allreduce_microstep: 277.64 | step_microstep: 18.69 [2025-04-25 20:33:03,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.96 | bwd: 5952.05 | bwd_inner: 5674.35 | bwd_allreduce: 277.66 | step: 18.69 13%|█▎ | 5208/41250 [12:35:29<87:32:28, 8.74s/it] {'loss': 0.0704, 'grad_norm': 1.0838452577590942, 'learning_rate': 3.903621326135318e-05, 'epoch': 1.26} 13%|█▎ | 5208/41250 [12:35:29<87:32:28, 8.74s/it][2025-04-25 20:33:12,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.24 | optimizer_step: 1.04 [2025-04-25 20:33:12,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.52 | bwd_microstep: 5773.11 | bwd_inner_microstep: 5649.26 | bwd_allreduce_microstep: 123.79 | step_microstep: 20.08 [2025-04-25 20:33:12,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.52 | bwd: 5773.13 | bwd_inner: 5649.26 | bwd_allreduce: 123.82 | step: 20.09 13%|█▎ | 5209/41250 [12:35:37<87:21:15, 8.73s/it] {'loss': 0.3394, 'grad_norm': 3.6605679988861084, 'learning_rate': 3.903573160550592e-05, 'epoch': 1.26} 13%|█▎ | 5209/41250 [12:35:37<87:21:15, 8.73s/it][2025-04-25 20:33:21,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.06 | optimizer_step: 1.00 [2025-04-25 20:33:21,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.14 | bwd_microstep: 5715.29 | bwd_inner_microstep: 5702.36 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.24 [2025-04-25 20:33:21,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.14 | bwd: 5715.30 | bwd_inner: 5702.36 | bwd_allreduce: 12.90 | step: 19.25 13%|█▎ | 5210/41250 [12:35:46<87:09:51, 8.71s/it] {'loss': 0.0715, 'grad_norm': 1.7776235342025757, 'learning_rate': 3.903524983230711e-05, 
'epoch': 1.26} 13%|█▎ | 5210/41250 [12:35:46<87:09:51, 8.71s/it][2025-04-25 20:33:29,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:33:29,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.38 | bwd_microstep: 5758.67 | bwd_inner_microstep: 5684.56 | bwd_allreduce_microstep: 74.07 | step_microstep: 18.84 [2025-04-25 20:33:29,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.38 | bwd: 5758.68 | bwd_inner: 5684.56 | bwd_allreduce: 74.09 | step: 18.84 13%|█▎ | 5211/41250 [12:35:55<87:05:29, 8.70s/it] {'loss': 0.0209, 'grad_norm': 0.24360746145248413, 'learning_rate': 3.9034767941759736e-05, 'epoch': 1.26} 13%|█▎ | 5211/41250 [12:35:55<87:05:29, 8.70s/it][2025-04-25 20:33:38,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:33:38,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.15 | bwd_microstep: 5772.48 | bwd_inner_microstep: 5652.55 | bwd_allreduce_microstep: 119.88 | step_microstep: 18.95 [2025-04-25 20:33:38,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.15 | bwd: 5772.49 | bwd_inner: 5652.55 | bwd_allreduce: 119.90 | step: 18.95 13%|█▎ | 5212/41250 [12:36:03<87:02:06, 8.69s/it] {'loss': 0.3268, 'grad_norm': 2.059321880340576, 'learning_rate': 3.903428593386675e-05, 'epoch': 1.26} 13%|█▎ | 5212/41250 [12:36:03<87:02:06, 8.69s/it][2025-04-25 20:33:47,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 20:33:47,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.33 | bwd_microstep: 5779.46 | bwd_inner_microstep: 5652.20 | bwd_allreduce_microstep: 127.21 | step_microstep: 19.30 [2025-04-25 20:33:47,096] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2827.33 | bwd: 5779.47 | bwd_inner: 5652.20 | bwd_allreduce: 127.23 | step: 19.30 13%|█▎ | 5213/41250 [12:36:12<87:01:38, 8.69s/it] {'loss': 0.1681, 'grad_norm': 1.590093731880188, 'learning_rate': 3.903380380863115e-05, 'epoch': 1.26} 13%|█▎ | 5213/41250 [12:36:12<87:01:38, 8.69s/it][2025-04-25 20:33:55,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-25 20:33:55,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.14 | bwd_microstep: 5732.38 | bwd_inner_microstep: 5697.49 | bwd_allreduce_microstep: 34.85 | step_microstep: 19.27 [2025-04-25 20:33:55,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.14 | bwd: 5732.39 | bwd_inner: 5697.49 | bwd_allreduce: 34.86 | step: 19.27 13%|█▎ | 5214/41250 [12:36:21<86:58:09, 8.69s/it] {'loss': 0.2126, 'grad_norm': 5.252957344055176, 'learning_rate': 3.9033321566055885e-05, 'epoch': 1.26} 13%|█▎ | 5214/41250 [12:36:21<86:58:09, 8.69s/it][2025-04-25 20:34:04,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.33 | optimizer_step: 1.06 [2025-04-25 20:34:04,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.57 | bwd_microstep: 5767.19 | bwd_inner_microstep: 5656.19 | bwd_allreduce_microstep: 110.94 | step_microstep: 20.03 [2025-04-25 20:34:04,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.57 | bwd: 5767.21 | bwd_inner: 5656.19 | bwd_allreduce: 110.96 | step: 20.03 13%|█▎ | 5215/41250 [12:36:29<86:56:04, 8.69s/it] {'loss': 0.0711, 'grad_norm': 1.1080176830291748, 'learning_rate': 3.903283920614393e-05, 'epoch': 1.26} 13%|█▎ | 5215/41250 [12:36:29<86:56:04, 8.69s/it][2025-04-25 20:34:13,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 20:34:13,094] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.83 | bwd_microstep: 5705.98 | bwd_inner_microstep: 5693.06 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.25 [2025-04-25 20:34:13,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.83 | bwd: 5705.99 | bwd_inner: 5693.06 | bwd_allreduce: 12.89 | step: 19.25 13%|█▎ | 5216/41250 [12:36:38<86:48:24, 8.67s/it] {'loss': 0.216, 'grad_norm': 2.104565382003784, 'learning_rate': 3.9032356728898274e-05, 'epoch': 1.26} 13%|█▎ | 5216/41250 [12:36:38<86:48:24, 8.67s/it][2025-04-25 20:34:21,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:34:21,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.04 | bwd_microstep: 5745.96 | bwd_inner_microstep: 5694.48 | bwd_allreduce_microstep: 51.43 | step_microstep: 18.33 [2025-04-25 20:34:21,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.04 | bwd: 5745.98 | bwd_inner: 5694.48 | bwd_allreduce: 51.45 | step: 18.33 13%|█▎ | 5217/41250 [12:36:47<86:50:01, 8.68s/it] {'loss': 0.2479, 'grad_norm': 2.428995132446289, 'learning_rate': 3.9031874134321885e-05, 'epoch': 1.26} 13%|█▎ | 5217/41250 [12:36:47<86:50:01, 8.68s/it][2025-04-25 20:34:30,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 20:34:30,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.04 | bwd_microstep: 5684.96 | bwd_inner_microstep: 5671.94 | bwd_allreduce_microstep: 12.98 | step_microstep: 19.19 [2025-04-25 20:34:30,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.05 | bwd: 5684.98 | bwd_inner: 5671.94 | bwd_allreduce: 13.00 | step: 19.19 13%|█▎ | 5218/41250 [12:36:55<86:38:41, 8.66s/it] {'loss': 0.2144, 'grad_norm': 2.1224312782287598, 'learning_rate': 3.9031391422417725e-05, 'epoch': 1.26} 13%|█▎ | 
5218/41250 [12:36:55<86:38:41, 8.66s/it][2025-04-25 20:34:39,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:34:39,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.29 | bwd_microstep: 5745.00 | bwd_inner_microstep: 5702.60 | bwd_allreduce_microstep: 42.35 | step_microstep: 18.75 [2025-04-25 20:34:39,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.29 | bwd: 5745.01 | bwd_inner: 5702.60 | bwd_allreduce: 42.37 | step: 18.76 13%|█▎ | 5219/41250 [12:37:04<86:43:04, 8.66s/it] {'loss': 0.2486, 'grad_norm': 2.053959369659424, 'learning_rate': 3.903090859318879e-05, 'epoch': 1.27} 13%|█▎ | 5219/41250 [12:37:04<86:43:04, 8.66s/it][2025-04-25 20:34:47,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:34:47,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.60 | bwd_microstep: 5690.79 | bwd_inner_microstep: 5678.19 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.61 [2025-04-25 20:34:47,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.60 | bwd: 5690.80 | bwd_inner: 5678.19 | bwd_allreduce: 12.56 | step: 18.61 13%|█▎ | 5220/41250 [12:37:13<86:35:01, 8.65s/it] {'loss': 0.0842, 'grad_norm': 1.325960397720337, 'learning_rate': 3.9030425646638044e-05, 'epoch': 1.27} 13%|█▎ | 5220/41250 [12:37:13<86:35:01, 8.65s/it][2025-04-25 20:34:56,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.03 | optimizer_step: 1.13 [2025-04-25 20:34:56,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.36 | bwd_microstep: 5683.21 | bwd_inner_microstep: 5655.89 | bwd_allreduce_microstep: 27.28 | step_microstep: 18.96 [2025-04-25 20:34:56,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.36 | bwd: 
5683.23 | bwd_inner: 5655.89 | bwd_allreduce: 27.29 | step: 18.96 13%|█▎ | 5221/41250 [12:37:21<86:23:45, 8.63s/it] {'loss': 0.5577, 'grad_norm': 3.4328904151916504, 'learning_rate': 3.902994258276846e-05, 'epoch': 1.27} 13%|█▎ | 5221/41250 [12:37:21<86:23:45, 8.63s/it][2025-04-25 20:35:04,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:35:04,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.99 | bwd_microstep: 5751.59 | bwd_inner_microstep: 5652.22 | bwd_allreduce_microstep: 99.32 | step_microstep: 18.53 [2025-04-25 20:35:04,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.99 | bwd: 5751.60 | bwd_inner: 5652.22 | bwd_allreduce: 99.34 | step: 18.54 13%|█▎ | 5222/41250 [12:37:30<86:29:32, 8.64s/it] {'loss': 0.1346, 'grad_norm': 1.3831435441970825, 'learning_rate': 3.902945940158303e-05, 'epoch': 1.27} 13%|█▎ | 5222/41250 [12:37:30<86:29:32, 8.64s/it][2025-04-25 20:35:13,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 20:35:13,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.95 | bwd_microstep: 5687.43 | bwd_inner_microstep: 5640.88 | bwd_allreduce_microstep: 46.50 | step_microstep: 19.09 [2025-04-25 20:35:13,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.95 | bwd: 5687.44 | bwd_inner: 5640.88 | bwd_allreduce: 46.52 | step: 19.10 13%|█▎ | 5223/41250 [12:37:38<86:21:43, 8.63s/it] {'loss': 0.0339, 'grad_norm': 0.6695288419723511, 'learning_rate': 3.902897610308472e-05, 'epoch': 1.27} 13%|█▎ | 5223/41250 [12:37:38<86:21:43, 8.63s/it][2025-04-25 20:35:22,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:35:22,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2855.19 | bwd_microstep: 5715.98 | bwd_inner_microstep: 5703.17 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.77 [2025-04-25 20:35:22,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.19 | bwd: 5715.99 | bwd_inner: 5703.17 | bwd_allreduce: 12.78 | step: 18.78 13%|█▎ | 5224/41250 [12:37:47<86:25:30, 8.64s/it] {'loss': 0.0688, 'grad_norm': 0.8529980182647705, 'learning_rate': 3.9028492687276515e-05, 'epoch': 1.27} 13%|█▎ | 5224/41250 [12:37:47<86:25:30, 8.64s/it][2025-04-25 20:35:30,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:35:30,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.56 | bwd_microstep: 5782.85 | bwd_inner_microstep: 5770.32 | bwd_allreduce_microstep: 12.48 | step_microstep: 18.17 [2025-04-25 20:35:30,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.56 | bwd: 5782.86 | bwd_inner: 5770.32 | bwd_allreduce: 12.50 | step: 18.17 13%|█▎ | 5225/41250 [12:37:56<86:46:26, 8.67s/it] {'loss': 0.125, 'grad_norm': 1.3741517066955566, 'learning_rate': 3.9028009154161403e-05, 'epoch': 1.27} 13%|█▎ | 5225/41250 [12:37:56<86:46:26, 8.67s/it][2025-04-25 20:35:39,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 20:35:39,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.21 | bwd_microstep: 5748.60 | bwd_inner_microstep: 5680.97 | bwd_allreduce_microstep: 67.58 | step_microstep: 19.28 [2025-04-25 20:35:39,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.21 | bwd: 5748.62 | bwd_inner: 5680.97 | bwd_allreduce: 67.60 | step: 19.28 13%|█▎ | 5226/41250 [12:38:04<86:47:22, 8.67s/it] {'loss': 0.0568, 'grad_norm': 0.7324853539466858, 'learning_rate': 3.902752550374235e-05, 'epoch': 1.27} 13%|█▎ | 5226/41250 [12:38:04<86:47:22, 
8.67s/it][2025-04-25 20:35:48,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 20:35:48,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.79 | bwd_microstep: 5700.50 | bwd_inner_microstep: 5687.62 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.08 [2025-04-25 20:35:48,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.79 | bwd: 5700.51 | bwd_inner: 5687.62 | bwd_allreduce: 12.85 | step: 19.09 13%|█▎ | 5227/41250 [12:38:13<86:39:18, 8.66s/it] {'loss': 0.1946, 'grad_norm': 1.225992202758789, 'learning_rate': 3.902704173602235e-05, 'epoch': 1.27} 13%|█▎ | 5227/41250 [12:38:13<86:39:18, 8.66s/it][2025-04-25 20:35:56,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:35:56,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.83 | bwd_microstep: 5714.00 | bwd_inner_microstep: 5679.77 | bwd_allreduce_microstep: 34.18 | step_microstep: 18.82 [2025-04-25 20:35:56,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.83 | bwd: 5714.01 | bwd_inner: 5679.77 | bwd_allreduce: 34.20 | step: 18.83 13%|█▎ | 5228/41250 [12:38:22<86:36:50, 8.66s/it] {'loss': 0.0538, 'grad_norm': 0.8265960812568665, 'learning_rate': 3.902655785100438e-05, 'epoch': 1.27} 13%|█▎ | 5228/41250 [12:38:22<86:36:50, 8.66s/it][2025-04-25 20:36:05,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:36:05,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.02 | bwd_microstep: 5679.17 | bwd_inner_microstep: 5649.97 | bwd_allreduce_microstep: 29.15 | step_microstep: 18.47 [2025-04-25 20:36:05,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.02 | bwd: 5679.18 | bwd_inner: 5649.97 | 
bwd_allreduce: 29.17 | step: 18.47 13%|█▎ | 5229/41250 [12:38:30<86:24:49, 8.64s/it] {'loss': 0.0856, 'grad_norm': 1.0257365703582764, 'learning_rate': 3.902607384869142e-05, 'epoch': 1.27} 13%|█▎ | 5229/41250 [12:38:30<86:24:49, 8.64s/it][2025-04-25 20:36:14,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 20:36:14,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.59 | bwd_microstep: 5705.03 | bwd_inner_microstep: 5692.19 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.84 [2025-04-25 20:36:14,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.59 | bwd: 5705.04 | bwd_inner: 5692.19 | bwd_allreduce: 12.81 | step: 18.84 13%|█▎ | 5230/41250 [12:38:39<86:25:22, 8.64s/it] {'loss': 0.2575, 'grad_norm': 1.9906117916107178, 'learning_rate': 3.902558972908646e-05, 'epoch': 1.27} 13%|█▎ | 5230/41250 [12:38:39<86:25:22, 8.64s/it][2025-04-25 20:36:22,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 20:36:22,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.07 | bwd_microstep: 5671.55 | bwd_inner_microstep: 5658.14 | bwd_allreduce_microstep: 13.36 | step_microstep: 18.78 [2025-04-25 20:36:22,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.07 | bwd: 5671.56 | bwd_inner: 5658.14 | bwd_allreduce: 13.38 | step: 18.78 13%|█▎ | 5231/41250 [12:38:48<86:16:34, 8.62s/it] {'loss': 0.1149, 'grad_norm': 3.7135748863220215, 'learning_rate': 3.9025105492192476e-05, 'epoch': 1.27} 13%|█▎ | 5231/41250 [12:38:48<86:16:34, 8.62s/it][2025-04-25 20:36:31,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 20:36:31,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.25 | 
bwd_microstep: 5702.09 | bwd_inner_microstep: 5685.83 | bwd_allreduce_microstep: 16.21 | step_microstep: 19.17 [2025-04-25 20:36:31,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.25 | bwd: 5702.10 | bwd_inner: 5685.83 | bwd_allreduce: 16.23 | step: 19.18 13%|█▎ | 5232/41250 [12:38:56<86:18:55, 8.63s/it] {'loss': 0.0607, 'grad_norm': 0.7787355184555054, 'learning_rate': 3.902462113801246e-05, 'epoch': 1.27} 13%|█▎ | 5232/41250 [12:38:56<86:18:55, 8.63s/it][2025-04-25 20:36:40,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 20:36:40,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.93 | bwd_microstep: 5760.64 | bwd_inner_microstep: 5636.67 | bwd_allreduce_microstep: 123.92 | step_microstep: 18.85 [2025-04-25 20:36:40,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.93 | bwd: 5760.65 | bwd_inner: 5636.67 | bwd_allreduce: 123.94 | step: 18.85 13%|█▎ | 5233/41250 [12:39:05<86:26:45, 8.64s/it] {'loss': 0.3226, 'grad_norm': 1.3911728858947754, 'learning_rate': 3.9024136666549406e-05, 'epoch': 1.27} 13%|█▎ | 5233/41250 [12:39:05<86:26:45, 8.64s/it][2025-04-25 20:36:48,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:36:48,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.11 | bwd_microstep: 5737.04 | bwd_inner_microstep: 5644.76 | bwd_allreduce_microstep: 92.24 | step_microstep: 18.27 [2025-04-25 20:36:48,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.11 | bwd: 5737.05 | bwd_inner: 5644.76 | bwd_allreduce: 92.25 | step: 18.27 13%|█▎ | 5234/41250 [12:39:14<86:29:46, 8.65s/it] {'loss': 0.0518, 'grad_norm': 0.6512015461921692, 'learning_rate': 3.9023652077806284e-05, 'epoch': 1.27} 13%|█▎ | 5234/41250 [12:39:14<86:29:46, 8.65s/it][2025-04-25 20:36:57,352] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:36:57,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.43 | bwd_microstep: 5729.23 | bwd_inner_microstep: 5640.24 | bwd_allreduce_microstep: 88.94 | step_microstep: 18.76 [2025-04-25 20:36:57,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.43 | bwd: 5729.24 | bwd_inner: 5640.24 | bwd_allreduce: 88.96 | step: 18.77 13%|█▎ | 5235/41250 [12:39:22<86:32:20, 8.65s/it] {'loss': 0.1963, 'grad_norm': 1.6571452617645264, 'learning_rate': 3.9023167371786084e-05, 'epoch': 1.27} 13%|█▎ | 5235/41250 [12:39:22<86:32:20, 8.65s/it][2025-04-25 20:37:05,964] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:37:05,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.53 | bwd_microstep: 5696.33 | bwd_inner_microstep: 5653.49 | bwd_allreduce_microstep: 42.79 | step_microstep: 18.83 [2025-04-25 20:37:05,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.53 | bwd: 5696.34 | bwd_inner: 5653.49 | bwd_allreduce: 42.81 | step: 18.84 13%|█▎ | 5236/41250 [12:39:31<86:25:15, 8.64s/it] {'loss': 0.1833, 'grad_norm': 1.9767824411392212, 'learning_rate': 3.9022682548491807e-05, 'epoch': 1.27} 13%|█▎ | 5236/41250 [12:39:31<86:25:15, 8.64s/it][2025-04-25 20:37:14,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:37:14,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.96 | bwd_microstep: 5739.82 | bwd_inner_microstep: 5652.10 | bwd_allreduce_microstep: 87.68 | step_microstep: 18.78 [2025-04-25 20:37:14,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.96 | bwd: 5739.83 | bwd_inner: 5652.10 | bwd_allreduce: 87.69 | step: 
18.78 13%|█▎ | 5237/41250 [12:39:39<86:28:06, 8.64s/it] {'loss': 0.0488, 'grad_norm': 0.972159743309021, 'learning_rate': 3.902219760792643e-05, 'epoch': 1.27} 13%|█▎ | 5237/41250 [12:39:39<86:28:06, 8.64s/it][2025-04-25 20:37:23,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 20:37:23,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.17 | bwd_microstep: 5735.96 | bwd_inner_microstep: 5690.73 | bwd_allreduce_microstep: 45.18 | step_microstep: 19.22 [2025-04-25 20:37:23,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.18 | bwd: 5735.97 | bwd_inner: 5690.74 | bwd_allreduce: 45.20 | step: 19.23 13%|█▎ | 5238/41250 [12:39:48<86:32:10, 8.65s/it] {'loss': 0.1001, 'grad_norm': 1.2373214960098267, 'learning_rate': 3.902171255009294e-05, 'epoch': 1.27} 13%|█▎ | 5238/41250 [12:39:48<86:32:10, 8.65s/it][2025-04-25 20:37:31,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 20:37:31,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.22 | bwd_microstep: 5765.50 | bwd_inner_microstep: 5639.47 | bwd_allreduce_microstep: 125.98 | step_microstep: 18.85 [2025-04-25 20:37:31,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.22 | bwd: 5765.51 | bwd_inner: 5639.47 | bwd_allreduce: 126.00 | step: 18.85 13%|█▎ | 5239/41250 [12:39:57<86:35:18, 8.66s/it] {'loss': 0.2456, 'grad_norm': 1.967659592628479, 'learning_rate': 3.9021227374994346e-05, 'epoch': 1.27} 13%|█▎ | 5239/41250 [12:39:57<86:35:18, 8.66s/it][2025-04-25 20:37:40,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 20:37:40,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.32 | bwd_microstep: 5701.98 | 
bwd_inner_microstep: 5688.46 | bwd_allreduce_microstep: 13.47 | step_microstep: 18.87 [2025-04-25 20:37:40,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.32 | bwd: 5701.99 | bwd_inner: 5688.46 | bwd_allreduce: 13.49 | step: 18.88 13%|█▎ | 5240/41250 [12:40:05<86:32:05, 8.65s/it] {'loss': 0.3665, 'grad_norm': 2.196247100830078, 'learning_rate': 3.902074208263362e-05, 'epoch': 1.27} 13%|█▎ | 5240/41250 [12:40:05<86:32:05, 8.65s/it][2025-04-25 20:37:49,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:37:49,272] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.36 | bwd_microstep: 5747.08 | bwd_inner_microstep: 5693.52 | bwd_allreduce_microstep: 53.51 | step_microstep: 18.75 [2025-04-25 20:37:49,272] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.36 | bwd: 5747.10 | bwd_inner: 5693.52 | bwd_allreduce: 53.53 | step: 18.75 13%|█▎ | 5241/41250 [12:40:14<86:36:39, 8.66s/it] {'loss': 0.2497, 'grad_norm': 2.431851863861084, 'learning_rate': 3.9020256673013765e-05, 'epoch': 1.27} 13%|█▎ | 5241/41250 [12:40:14<86:36:39, 8.66s/it][2025-04-25 20:37:57,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:37:57,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.16 | bwd_microstep: 5696.04 | bwd_inner_microstep: 5683.22 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.85 [2025-04-25 20:37:57,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.16 | bwd: 5696.05 | bwd_inner: 5683.22 | bwd_allreduce: 12.79 | step: 18.85 13%|█▎ | 5242/41250 [12:40:23<86:30:54, 8.65s/it] {'loss': 0.2712, 'grad_norm': 2.8871781826019287, 'learning_rate': 3.9019771146137764e-05, 'epoch': 1.27} 13%|█▎ | 5242/41250 [12:40:23<86:30:54, 8.65s/it][2025-04-25 20:38:06,585] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:38:06,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.15 | bwd_microstep: 5740.26 | bwd_inner_microstep: 5704.39 | bwd_allreduce_microstep: 35.83 | step_microstep: 18.51 [2025-04-25 20:38:06,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.15 | bwd: 5740.27 | bwd_inner: 5704.39 | bwd_allreduce: 35.85 | step: 18.51 13%|█▎ | 5243/41250 [12:40:31<86:37:05, 8.66s/it] {'loss': 0.2867, 'grad_norm': 3.747800588607788, 'learning_rate': 3.901928550200861e-05, 'epoch': 1.27} 13%|█▎ | 5243/41250 [12:40:31<86:37:05, 8.66s/it][2025-04-25 20:38:15,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 20:38:15,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.99 | bwd_microstep: 5709.56 | bwd_inner_microstep: 5658.70 | bwd_allreduce_microstep: 50.82 | step_microstep: 19.07 [2025-04-25 20:38:15,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.99 | bwd: 5709.58 | bwd_inner: 5658.70 | bwd_allreduce: 50.84 | step: 19.07 13%|█▎ | 5244/41250 [12:40:40<86:30:09, 8.65s/it] {'loss': 0.1033, 'grad_norm': 1.5048028230667114, 'learning_rate': 3.901879974062931e-05, 'epoch': 1.27} 13%|█▎ | 5244/41250 [12:40:40<86:30:09, 8.65s/it][2025-04-25 20:38:23,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 20:38:23,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2904.49 | bwd_microstep: 5794.59 | bwd_inner_microstep: 5781.81 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.60 [2025-04-25 20:38:23,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2904.49 | bwd: 5794.60 | bwd_inner: 5781.81 | bwd_allreduce: 12.74 | step: 18.60 13%|█▎ | 
5245/41250 [12:40:49<86:54:48, 8.69s/it] {'loss': 0.073, 'grad_norm': 1.009722352027893, 'learning_rate': 3.9018313862002846e-05, 'epoch': 1.27} 13%|█▎ | 5245/41250 [12:40:49<86:54:48, 8.69s/it][2025-04-25 20:38:32,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:38:32,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.66 | bwd_microstep: 5777.73 | bwd_inner_microstep: 5668.58 | bwd_allreduce_microstep: 109.10 | step_microstep: 18.53 [2025-04-25 20:38:32,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.66 | bwd: 5777.74 | bwd_inner: 5668.58 | bwd_allreduce: 109.12 | step: 18.53 13%|█▎ | 5246/41250 [12:40:58<86:55:08, 8.69s/it] {'loss': 0.0652, 'grad_norm': 0.696983814239502, 'learning_rate': 3.901782786613222e-05, 'epoch': 1.27} 13%|█▎ | 5246/41250 [12:40:58<86:55:08, 8.69s/it][2025-04-25 20:38:41,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:38:41,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.46 | bwd_microstep: 5761.51 | bwd_inner_microstep: 5702.96 | bwd_allreduce_microstep: 58.50 | step_microstep: 18.54 [2025-04-25 20:38:41,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.46 | bwd: 5761.52 | bwd_inner: 5702.96 | bwd_allreduce: 58.52 | step: 18.55 13%|█▎ | 5247/41250 [12:41:06<86:55:13, 8.69s/it] {'loss': 0.1771, 'grad_norm': 1.554031491279602, 'learning_rate': 3.9017341753020424e-05, 'epoch': 1.27} 13%|█▎ | 5247/41250 [12:41:06<86:55:13, 8.69s/it][2025-04-25 20:38:50,048] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.46 | optimizer_step: 1.08 [2025-04-25 20:38:50,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.31 | bwd_microstep: 5725.64 | bwd_inner_microstep: 5711.25 | 
bwd_allreduce_microstep: 14.31 | step_microstep: 20.62 [2025-04-25 20:38:50,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.31 | bwd: 5725.66 | bwd_inner: 5711.25 | bwd_allreduce: 14.35 | step: 20.62 13%|█▎ | 5248/41250 [12:41:15<86:51:56, 8.69s/it] {'loss': 0.0703, 'grad_norm': 0.7013179659843445, 'learning_rate': 3.901685552267046e-05, 'epoch': 1.27} 13%|█▎ | 5248/41250 [12:41:15<86:51:56, 8.69s/it][2025-04-25 20:38:58,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:38:58,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.27 | bwd_microstep: 5776.90 | bwd_inner_microstep: 5716.64 | bwd_allreduce_microstep: 60.22 | step_microstep: 18.62 [2025-04-25 20:38:58,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.27 | bwd: 5776.92 | bwd_inner: 5716.64 | bwd_allreduce: 60.23 | step: 18.62 13%|█▎ | 5249/41250 [12:41:24<86:57:32, 8.70s/it] {'loss': 0.0844, 'grad_norm': 1.4649126529693604, 'learning_rate': 3.901636917508532e-05, 'epoch': 1.27} 13%|█▎ | 5249/41250 [12:41:24<86:57:32, 8.70s/it][2025-04-25 20:39:07,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.92 | optimizer_gradients: 1.25 | optimizer_step: 0.95 [2025-04-25 20:39:07,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.81 | bwd_microstep: 5745.19 | bwd_inner_microstep: 5704.83 | bwd_allreduce_microstep: 40.31 | step_microstep: 19.48 [2025-04-25 20:39:07,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.81 | bwd: 5745.21 | bwd_inner: 5704.82 | bwd_allreduce: 40.33 | step: 19.48 13%|█▎ | 5250/41250 [12:41:32<86:56:59, 8.69s/it] {'loss': 0.1519, 'grad_norm': 3.9304394721984863, 'learning_rate': 3.9015882710268e-05, 'epoch': 1.27} 13%|█▎ | 5250/41250 [12:41:32<86:56:59, 8.69s/it][2025-04-25 20:39:16,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.03 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 20:39:16,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.73 | bwd_microstep: 5749.80 | bwd_inner_microstep: 5702.34 | bwd_allreduce_microstep: 47.41 | step_microstep: 18.59 [2025-04-25 20:39:16,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.73 | bwd: 5749.81 | bwd_inner: 5702.34 | bwd_allreduce: 47.43 | step: 18.59 13%|█▎ | 5251/41250 [12:41:41<86:53:59, 8.69s/it] {'loss': 0.0466, 'grad_norm': 1.1253445148468018, 'learning_rate': 3.901539612822151e-05, 'epoch': 1.27} 13%|█▎ | 5251/41250 [12:41:41<86:53:59, 8.69s/it][2025-04-25 20:39:24,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 20:39:24,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.63 | bwd_microstep: 5705.46 | bwd_inner_microstep: 5665.89 | bwd_allreduce_microstep: 39.53 | step_microstep: 19.09 [2025-04-25 20:39:24,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.63 | bwd: 5705.48 | bwd_inner: 5665.89 | bwd_allreduce: 39.55 | step: 19.09 13%|█▎ | 5252/41250 [12:41:50<86:42:50, 8.67s/it] {'loss': 0.1064, 'grad_norm': 1.393225073814392, 'learning_rate': 3.9014909428948844e-05, 'epoch': 1.27} 13%|█▎ | 5252/41250 [12:41:50<86:42:50, 8.67s/it][2025-04-25 20:39:33,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-25 20:39:33,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2942.33 | bwd_microstep: 5887.63 | bwd_inner_microstep: 5874.83 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.90 [2025-04-25 20:39:33,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2942.33 | bwd: 5887.65 | bwd_inner: 5874.83 | bwd_allreduce: 12.78 | step: 18.91 13%|█▎ | 5253/41250 [12:41:59<87:26:31, 8.74s/it] 
{'loss': 0.1137, 'grad_norm': 1.557653546333313, 'learning_rate': 3.901442261245299e-05, 'epoch': 1.27} 13%|█▎ | 5253/41250 [12:41:59<87:26:31, 8.74s/it][2025-04-25 20:39:42,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:39:42,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.89 | bwd_microstep: 5901.32 | bwd_inner_microstep: 5707.34 | bwd_allreduce_microstep: 193.94 | step_microstep: 18.83 [2025-04-25 20:39:42,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.89 | bwd: 5901.33 | bwd_inner: 5707.34 | bwd_allreduce: 193.96 | step: 18.83 13%|█▎ | 5254/41250 [12:42:07<87:43:15, 8.77s/it] {'loss': 0.2955, 'grad_norm': 2.172912359237671, 'learning_rate': 3.901393567873697e-05, 'epoch': 1.27} 13%|█▎ | 5254/41250 [12:42:07<87:43:15, 8.77s/it][2025-04-25 20:39:51,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-25 20:39:51,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.32 | bwd_microstep: 5757.47 | bwd_inner_microstep: 5715.21 | bwd_allreduce_microstep: 42.22 | step_microstep: 19.37 [2025-04-25 20:39:51,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.32 | bwd: 5757.49 | bwd_inner: 5715.21 | bwd_allreduce: 42.24 | step: 19.37 13%|█▎ | 5255/41250 [12:42:16<87:30:39, 8.75s/it] {'loss': 0.0592, 'grad_norm': 0.9939250349998474, 'learning_rate': 3.901344862780378e-05, 'epoch': 1.27} 13%|█▎ | 5255/41250 [12:42:16<87:30:39, 8.75s/it][2025-04-25 20:39:59,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-25 20:39:59,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.81 | bwd_microstep: 5808.54 | bwd_inner_microstep: 5666.94 | bwd_allreduce_microstep: 141.55 | 
step_microstep: 19.16 [2025-04-25 20:39:59,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.81 | bwd: 5808.55 | bwd_inner: 5666.94 | bwd_allreduce: 141.57 | step: 19.16 13%|█▎ | 5256/41250 [12:42:25<87:25:39, 8.74s/it] {'loss': 0.1937, 'grad_norm': 2.009052276611328, 'learning_rate': 3.901296145965641e-05, 'epoch': 1.27} 13%|█▎ | 5256/41250 [12:42:25<87:25:39, 8.74s/it][2025-04-25 20:40:08,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:40:08,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.25 | bwd_microstep: 5705.16 | bwd_inner_microstep: 5692.39 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.71 [2025-04-25 20:40:08,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.25 | bwd: 5705.17 | bwd_inner: 5692.39 | bwd_allreduce: 12.74 | step: 18.71 13%|█▎ | 5257/41250 [12:42:33<87:08:40, 8.72s/it] {'loss': 0.1493, 'grad_norm': 1.3962600231170654, 'learning_rate': 3.9012474174297875e-05, 'epoch': 1.27} 13%|█▎ | 5257/41250 [12:42:33<87:08:40, 8.72s/it][2025-04-25 20:40:17,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 20:40:17,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.01 | bwd_microstep: 5779.43 | bwd_inner_microstep: 5648.69 | bwd_allreduce_microstep: 130.69 | step_microstep: 18.81 [2025-04-25 20:40:17,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.01 | bwd: 5779.44 | bwd_inner: 5648.69 | bwd_allreduce: 130.71 | step: 18.81 13%|█▎ | 5258/41250 [12:42:42<87:04:53, 8.71s/it] {'loss': 0.0621, 'grad_norm': 0.9779583215713501, 'learning_rate': 3.9011986771731176e-05, 'epoch': 1.27} 13%|█▎ | 5258/41250 [12:42:42<87:04:53, 8.71s/it][2025-04-25 20:40:26,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | 
optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:40:26,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.72 | bwd_microstep: 5773.35 | bwd_inner_microstep: 5670.03 | bwd_allreduce_microstep: 103.28 | step_microstep: 18.81 [2025-04-25 20:40:26,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.72 | bwd: 5773.37 | bwd_inner: 5670.03 | bwd_allreduce: 103.29 | step: 18.81 13%|█▎ | 5259/41250 [12:42:51<87:02:50, 8.71s/it] {'loss': 0.1334, 'grad_norm': 1.3732142448425293, 'learning_rate': 3.9011499251959316e-05, 'epoch': 1.27} 13%|█▎ | 5259/41250 [12:42:51<87:02:50, 8.71s/it][2025-04-25 20:40:34,646] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 20:40:34,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.59 | bwd_microstep: 5733.80 | bwd_inner_microstep: 5640.85 | bwd_allreduce_microstep: 92.91 | step_microstep: 18.92 [2025-04-25 20:40:34,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.59 | bwd: 5733.82 | bwd_inner: 5640.85 | bwd_allreduce: 92.93 | step: 18.94 13%|█▎ | 5260/41250 [12:42:59<86:51:34, 8.69s/it] {'loss': 0.0937, 'grad_norm': 1.0533536672592163, 'learning_rate': 3.901101161498531e-05, 'epoch': 1.28} 13%|█▎ | 5260/41250 [12:42:59<86:51:34, 8.69s/it][2025-04-25 20:40:43,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.07 | optimizer_step: 0.90 [2025-04-25 20:40:43,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.49 | bwd_microstep: 5785.91 | bwd_inner_microstep: 5656.71 | bwd_allreduce_microstep: 129.16 | step_microstep: 18.88 [2025-04-25 20:40:43,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.49 | bwd: 5785.92 | bwd_inner: 5656.71 | bwd_allreduce: 129.17 | step: 18.89 13%|█▎ | 5261/41250 [12:43:08<86:53:59, 8.69s/it] {'loss': 0.2944, 
'grad_norm': 3.5812265872955322, 'learning_rate': 3.901052386081215e-05, 'epoch': 1.28} 13%|█▎ | 5261/41250 [12:43:08<86:53:59, 8.69s/it][2025-04-25 20:40:52,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:40:52,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.94 | bwd_microstep: 5713.98 | bwd_inner_microstep: 5701.19 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.88 [2025-04-25 20:40:52,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.94 | bwd: 5713.99 | bwd_inner: 5701.19 | bwd_allreduce: 12.76 | step: 18.88 13%|█▎ | 5262/41250 [12:43:17<86:47:05, 8.68s/it] {'loss': 0.1228, 'grad_norm': 1.163267970085144, 'learning_rate': 3.901003598944285e-05, 'epoch': 1.28} 13%|█▎ | 5262/41250 [12:43:17<86:47:05, 8.68s/it][2025-04-25 20:41:00,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 1.07 [2025-04-25 20:41:00,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.67 | bwd_microstep: 5735.14 | bwd_inner_microstep: 5722.52 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.74 [2025-04-25 20:41:00,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.67 | bwd: 5735.15 | bwd_inner: 5722.52 | bwd_allreduce: 12.59 | step: 18.74 13%|█▎ | 5263/41250 [12:43:26<86:45:28, 8.68s/it] {'loss': 0.2593, 'grad_norm': 2.3963623046875, 'learning_rate': 3.9009548000880424e-05, 'epoch': 1.28} 13%|█▎ | 5263/41250 [12:43:26<86:45:28, 8.68s/it][2025-04-25 20:41:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 20:41:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.18 | bwd_microstep: 5780.04 | bwd_inner_microstep: 5767.43 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.53 
[2025-04-25 20:41:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.18 | bwd: 5780.06 | bwd_inner: 5767.43 | bwd_allreduce: 12.59 | step: 18.54 13%|█▎ | 5264/41250 [12:43:34<86:57:33, 8.70s/it] {'loss': 0.1576, 'grad_norm': 2.840195417404175, 'learning_rate': 3.9009059895127864e-05, 'epoch': 1.28} 13%|█▎ | 5264/41250 [12:43:34<86:57:33, 8.70s/it][2025-04-25 20:41:18,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.96 | optimizer_step: 1.01 [2025-04-25 20:41:18,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.50 | bwd_microstep: 5685.09 | bwd_inner_microstep: 5657.27 | bwd_allreduce_microstep: 27.78 | step_microstep: 18.32 [2025-04-25 20:41:18,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.50 | bwd: 5685.10 | bwd_inner: 5657.27 | bwd_allreduce: 27.79 | step: 18.32 13%|█▎ | 5265/41250 [12:43:43<86:38:44, 8.67s/it] {'loss': 0.1488, 'grad_norm': 2.8034257888793945, 'learning_rate': 3.9008571672188195e-05, 'epoch': 1.28} 13%|█▎ | 5265/41250 [12:43:43<86:38:44, 8.67s/it][2025-04-25 20:41:26,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:41:26,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.76 | bwd_microstep: 5760.80 | bwd_inner_microstep: 5702.24 | bwd_allreduce_microstep: 58.52 | step_microstep: 18.65 [2025-04-25 20:41:26,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.76 | bwd: 5760.82 | bwd_inner: 5702.24 | bwd_allreduce: 58.54 | step: 18.65 13%|█▎ | 5266/41250 [12:43:52<86:42:53, 8.68s/it] {'loss': 0.2026, 'grad_norm': 2.3955180644989014, 'learning_rate': 3.900808333206442e-05, 'epoch': 1.28} 13%|█▎ | 5266/41250 [12:43:52<86:42:53, 8.68s/it][2025-04-25 20:41:35,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | 
optimizer_step: 0.90 [2025-04-25 20:41:35,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.33 | bwd_microstep: 5684.11 | bwd_inner_microstep: 5662.82 | bwd_allreduce_microstep: 21.25 | step_microstep: 18.82 [2025-04-25 20:41:35,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.33 | bwd: 5684.13 | bwd_inner: 5662.82 | bwd_allreduce: 21.27 | step: 18.82 13%|█▎ | 5267/41250 [12:44:00<86:29:23, 8.65s/it] {'loss': 0.1328, 'grad_norm': 2.4166641235351562, 'learning_rate': 3.900759487475955e-05, 'epoch': 1.28} 13%|█▎ | 5267/41250 [12:44:00<86:29:23, 8.65s/it][2025-04-25 20:41:43,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 20:41:43,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.18 | bwd_microstep: 5695.09 | bwd_inner_microstep: 5664.31 | bwd_allreduce_microstep: 30.73 | step_microstep: 19.37 [2025-04-25 20:41:43,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.18 | bwd: 5695.10 | bwd_inner: 5664.31 | bwd_allreduce: 30.75 | step: 19.37 13%|█▎ | 5268/41250 [12:44:09<86:21:24, 8.64s/it] {'loss': 0.1289, 'grad_norm': 1.1627761125564575, 'learning_rate': 3.900710630027659e-05, 'epoch': 1.28} 13%|█▎ | 5268/41250 [12:44:09<86:21:24, 8.64s/it][2025-04-25 20:41:52,601] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 20:41:52,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.08 | bwd_microstep: 5772.63 | bwd_inner_microstep: 5648.44 | bwd_allreduce_microstep: 124.14 | step_microstep: 18.99 [2025-04-25 20:41:52,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.09 | bwd: 5772.64 | bwd_inner: 5648.44 | bwd_allreduce: 124.16 | step: 18.99 13%|█▎ | 5269/41250 [12:44:17<86:28:15, 8.65s/it] {'loss': 0.187, 'grad_norm': 2.2360453605651855, 
'learning_rate': 3.900661760861857e-05, 'epoch': 1.28} 13%|█▎ | 5269/41250 [12:44:17<86:28:15, 8.65s/it][2025-04-25 20:42:01,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:42:01,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.02 | bwd_microstep: 5754.39 | bwd_inner_microstep: 5645.38 | bwd_allreduce_microstep: 108.96 | step_microstep: 18.72 [2025-04-25 20:42:01,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.02 | bwd: 5754.40 | bwd_inner: 5645.38 | bwd_allreduce: 108.97 | step: 18.72 13%|█▎ | 5270/41250 [12:44:26<86:30:16, 8.66s/it] {'loss': 0.0239, 'grad_norm': 0.6363493204116821, 'learning_rate': 3.900612879978848e-05, 'epoch': 1.28} 13%|█▎ | 5270/41250 [12:44:26<86:30:16, 8.66s/it][2025-04-25 20:42:09,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:42:09,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.41 | bwd_microstep: 5694.38 | bwd_inner_microstep: 5636.98 | bwd_allreduce_microstep: 57.35 | step_microstep: 18.88 [2025-04-25 20:42:09,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.41 | bwd: 5694.39 | bwd_inner: 5636.98 | bwd_allreduce: 57.37 | step: 18.88 13%|█▎ | 5271/41250 [12:44:35<86:20:37, 8.64s/it] {'loss': 0.2919, 'grad_norm': 2.094874620437622, 'learning_rate': 3.900563987378935e-05, 'epoch': 1.28} 13%|█▎ | 5271/41250 [12:44:35<86:20:37, 8.64s/it][2025-04-25 20:42:18,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:42:18,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.12 | bwd_microstep: 5771.97 | bwd_inner_microstep: 5759.06 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.86 [2025-04-25 20:42:18,605] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.12 | bwd: 5771.98 | bwd_inner: 5759.06 | bwd_allreduce: 12.88 | step: 18.86 13%|█▎ | 5272/41250 [12:44:43<86:38:22, 8.67s/it] {'loss': 0.1489, 'grad_norm': 3.87359881401062, 'learning_rate': 3.9005150830624185e-05, 'epoch': 1.28} 13%|█▎ | 5272/41250 [12:44:43<86:38:22, 8.67s/it][2025-04-25 20:42:27,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.04 | optimizer_step: 1.14 [2025-04-25 20:42:27,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.19 | bwd_microstep: 5710.95 | bwd_inner_microstep: 5642.93 | bwd_allreduce_microstep: 67.98 | step_microstep: 19.61 [2025-04-25 20:42:27,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.19 | bwd: 5710.97 | bwd_inner: 5642.93 | bwd_allreduce: 67.99 | step: 19.62 13%|█▎ | 5273/41250 [12:44:52<86:28:34, 8.65s/it] {'loss': 0.1616, 'grad_norm': 2.7788050174713135, 'learning_rate': 3.9004661670296e-05, 'epoch': 1.28} 13%|█▎ | 5273/41250 [12:44:52<86:28:34, 8.65s/it][2025-04-25 20:42:35,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 1.03 [2025-04-25 20:42:35,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.03 | bwd_microstep: 5772.09 | bwd_inner_microstep: 5759.06 | bwd_allreduce_microstep: 12.98 | step_microstep: 18.96 [2025-04-25 20:42:35,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.03 | bwd: 5772.10 | bwd_inner: 5759.06 | bwd_allreduce: 13.00 | step: 18.96 13%|█▎ | 5274/41250 [12:45:01<86:45:08, 8.68s/it] {'loss': 0.0852, 'grad_norm': 0.916693389415741, 'learning_rate': 3.900417239280782e-05, 'epoch': 1.28} 13%|█▎ | 5274/41250 [12:45:01<86:45:08, 8.68s/it][2025-04-25 20:42:44,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.95 | optimizer_step: 0.95 [2025-04-25 
20:42:44,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.77 | bwd_microstep: 5724.93 | bwd_inner_microstep: 5696.83 | bwd_allreduce_microstep: 28.04 | step_microstep: 18.30 [2025-04-25 20:42:44,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.77 | bwd: 5724.94 | bwd_inner: 5696.83 | bwd_allreduce: 28.06 | step: 18.31 13%|█▎ | 5275/41250 [12:45:09<86:41:26, 8.68s/it] {'loss': 0.103, 'grad_norm': 1.159017562866211, 'learning_rate': 3.900368299816266e-05, 'epoch': 1.28} 13%|█▎ | 5275/41250 [12:45:09<86:41:26, 8.68s/it][2025-04-25 20:42:53,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 20:42:53,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.05 | bwd_microstep: 5748.17 | bwd_inner_microstep: 5653.36 | bwd_allreduce_microstep: 94.77 | step_microstep: 19.38 [2025-04-25 20:42:53,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.05 | bwd: 5748.19 | bwd_inner: 5653.36 | bwd_allreduce: 94.79 | step: 19.38 13%|█▎ | 5276/41250 [12:45:18<86:38:47, 8.67s/it] {'loss': 0.2825, 'grad_norm': 2.1857359409332275, 'learning_rate': 3.900319348636352e-05, 'epoch': 1.28} 13%|█▎ | 5276/41250 [12:45:18<86:38:47, 8.67s/it][2025-04-25 20:43:01,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.93 [2025-04-25 20:43:01,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.31 | bwd_microstep: 5696.59 | bwd_inner_microstep: 5683.96 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.51 [2025-04-25 20:43:01,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.31 | bwd: 5696.60 | bwd_inner: 5683.96 | bwd_allreduce: 12.60 | step: 18.52 13%|█▎ | 5277/41250 [12:45:27<86:31:14, 8.66s/it] {'loss': 0.0779, 'grad_norm': 1.9813487529754639, 'learning_rate': 3.900270385741344e-05, 
'epoch': 1.28} 13%|█▎ | 5277/41250 [12:45:27<86:31:14, 8.66s/it][2025-04-25 20:43:10,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:43:10,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2928.06 | bwd_microstep: 5870.98 | bwd_inner_microstep: 5858.14 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.92 [2025-04-25 20:43:10,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2928.06 | bwd: 5870.99 | bwd_inner: 5858.14 | bwd_allreduce: 12.81 | step: 18.92 13%|█▎ | 5278/41250 [12:45:36<87:10:39, 8.72s/it] {'loss': 0.0898, 'grad_norm': 2.324204444885254, 'learning_rate': 3.900221411131542e-05, 'epoch': 1.28} 13%|█▎ | 5278/41250 [12:45:36<87:10:39, 8.72s/it][2025-04-25 20:43:19,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:43:19,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.45 | bwd_microstep: 5688.28 | bwd_inner_microstep: 5653.91 | bwd_allreduce_microstep: 34.32 | step_microstep: 18.72 [2025-04-25 20:43:19,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.45 | bwd: 5688.30 | bwd_inner: 5653.91 | bwd_allreduce: 34.34 | step: 18.73 13%|█▎ | 5279/41250 [12:45:44<86:47:33, 8.69s/it] {'loss': 0.1231, 'grad_norm': 2.2180676460266113, 'learning_rate': 3.90017242480725e-05, 'epoch': 1.28} 13%|█▎ | 5279/41250 [12:45:44<86:47:33, 8.69s/it][2025-04-25 20:43:27,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.03 | optimizer_step: 1.21 [2025-04-25 20:43:27,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.69 | bwd_microstep: 5672.84 | bwd_inner_microstep: 5656.18 | bwd_allreduce_microstep: 16.62 | step_microstep: 19.29 [2025-04-25 20:43:27,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2835.69 | bwd: 5672.85 | bwd_inner: 5656.18 | bwd_allreduce: 16.63 | step: 19.29 13%|█▎ | 5280/41250 [12:45:53<86:30:33, 8.66s/it] {'loss': 0.1447, 'grad_norm': 1.8251460790634155, 'learning_rate': 3.900123426768767e-05, 'epoch': 1.28} 13%|█▎ | 5280/41250 [12:45:53<86:30:33, 8.66s/it][2025-04-25 20:43:36,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 20:43:36,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.79 | bwd_microstep: 5760.45 | bwd_inner_microstep: 5644.59 | bwd_allreduce_microstep: 115.82 | step_microstep: 18.98 [2025-04-25 20:43:36,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.79 | bwd: 5760.47 | bwd_inner: 5644.59 | bwd_allreduce: 115.84 | step: 18.97 13%|█▎ | 5281/41250 [12:46:01<86:32:40, 8.66s/it] {'loss': 0.1121, 'grad_norm': 1.335759162902832, 'learning_rate': 3.900074417016398e-05, 'epoch': 1.28} 13%|█▎ | 5281/41250 [12:46:01<86:32:40, 8.66s/it][2025-04-25 20:43:45,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:43:45,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.66 | bwd_microstep: 5699.07 | bwd_inner_microstep: 5686.30 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.66 [2025-04-25 20:43:45,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.66 | bwd: 5699.08 | bwd_inner: 5686.30 | bwd_allreduce: 12.75 | step: 18.67 13%|█▎ | 5282/41250 [12:46:10<86:25:45, 8.65s/it] {'loss': 0.3314, 'grad_norm': 3.1146724224090576, 'learning_rate': 3.900025395550444e-05, 'epoch': 1.28} 13%|█▎ | 5282/41250 [12:46:10<86:25:45, 8.65s/it][2025-04-25 20:43:53,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:43:53,925] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2848.55 | bwd_microstep: 5709.27 | bwd_inner_microstep: 5696.72 | bwd_allreduce_microstep: 12.51 | step_microstep: 18.50 [2025-04-25 20:43:53,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.55 | bwd: 5709.28 | bwd_inner: 5696.72 | bwd_allreduce: 12.52 | step: 18.51 13%|█▎ | 5283/41250 [12:46:19<86:23:47, 8.65s/it] {'loss': 0.3809, 'grad_norm': 2.8897244930267334, 'learning_rate': 3.8999763623712065e-05, 'epoch': 1.28} 13%|█▎ | 5283/41250 [12:46:19<86:23:47, 8.65s/it][2025-04-25 20:44:02,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:44:02,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.27 | bwd_microstep: 5734.32 | bwd_inner_microstep: 5645.61 | bwd_allreduce_microstep: 88.67 | step_microstep: 18.27 [2025-04-25 20:44:02,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.27 | bwd: 5734.33 | bwd_inner: 5645.61 | bwd_allreduce: 88.68 | step: 18.27 13%|█▎ | 5284/41250 [12:46:27<86:23:22, 8.65s/it] {'loss': 0.0388, 'grad_norm': 0.653759777545929, 'learning_rate': 3.899927317478989e-05, 'epoch': 1.28} 13%|█▎ | 5284/41250 [12:46:27<86:23:22, 8.65s/it][2025-04-25 20:44:11,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.14 | optimizer_step: 1.04 [2025-04-25 20:44:11,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.29 | bwd_microstep: 5848.08 | bwd_inner_microstep: 5692.36 | bwd_allreduce_microstep: 155.66 | step_microstep: 19.95 [2025-04-25 20:44:11,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.29 | bwd: 5848.10 | bwd_inner: 5692.36 | bwd_allreduce: 155.69 | step: 19.95 13%|█▎ | 5285/41250 [12:46:36<86:46:58, 8.69s/it] {'loss': 0.1058, 'grad_norm': 1.5419501066207886, 'learning_rate': 3.899878260874092e-05, 'epoch': 1.28} 13%|█▎ | 5285/41250 
[12:46:36<86:46:58, 8.69s/it][2025-04-25 20:44:19,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.19 | optimizer_step: 0.89 [2025-04-25 20:44:19,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.57 | bwd_microstep: 5697.17 | bwd_inner_microstep: 5643.93 | bwd_allreduce_microstep: 53.19 | step_microstep: 18.86 [2025-04-25 20:44:19,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.57 | bwd: 5697.18 | bwd_inner: 5643.93 | bwd_allreduce: 53.21 | step: 18.86 13%|█▎ | 5286/41250 [12:46:45<86:32:23, 8.66s/it] {'loss': 0.1653, 'grad_norm': 2.969127655029297, 'learning_rate': 3.8998291925568213e-05, 'epoch': 1.28} 13%|█▎ | 5286/41250 [12:46:45<86:32:23, 8.66s/it][2025-04-25 20:44:28,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 20:44:28,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.61 | bwd_microstep: 5726.97 | bwd_inner_microstep: 5692.88 | bwd_allreduce_microstep: 34.04 | step_microstep: 18.55 [2025-04-25 20:44:28,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.61 | bwd: 5726.99 | bwd_inner: 5692.88 | bwd_allreduce: 34.06 | step: 18.56 13%|█▎ | 5287/41250 [12:46:53<86:31:10, 8.66s/it] {'loss': 0.257, 'grad_norm': 3.8821935653686523, 'learning_rate': 3.8997801125274754e-05, 'epoch': 1.28} 13%|█▎ | 5287/41250 [12:46:53<86:31:10, 8.66s/it][2025-04-25 20:44:37,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 20:44:37,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.93 | bwd_microstep: 5769.08 | bwd_inner_microstep: 5665.37 | bwd_allreduce_microstep: 103.67 | step_microstep: 18.58 [2025-04-25 20:44:37,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.93 | bwd: 5769.10 | 
bwd_inner: 5665.37 | bwd_allreduce: 103.69 | step: 18.59 13%|█▎ | 5288/41250 [12:47:02<86:33:59, 8.67s/it] {'loss': 0.0378, 'grad_norm': 1.343819499015808, 'learning_rate': 3.8997310207863594e-05, 'epoch': 1.28} 13%|█▎ | 5288/41250 [12:47:02<86:33:59, 8.67s/it][2025-04-25 20:44:45,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 20:44:45,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.59 | bwd_microstep: 5691.82 | bwd_inner_microstep: 5678.91 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.74 [2025-04-25 20:44:45,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.59 | bwd: 5691.83 | bwd_inner: 5678.91 | bwd_allreduce: 12.88 | step: 18.74 13%|█▎ | 5289/41250 [12:47:11<86:26:24, 8.65s/it] {'loss': 0.1978, 'grad_norm': 1.6305135488510132, 'learning_rate': 3.899681917333776e-05, 'epoch': 1.28} 13%|█▎ | 5289/41250 [12:47:11<86:26:24, 8.65s/it][2025-04-25 20:44:54,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 20:44:54,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.04 | bwd_microstep: 5794.88 | bwd_inner_microstep: 5782.15 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.60 [2025-04-25 20:44:54,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.04 | bwd: 5794.89 | bwd_inner: 5782.15 | bwd_allreduce: 12.70 | step: 18.60 13%|█▎ | 5290/41250 [12:47:20<86:46:42, 8.69s/it] {'loss': 0.0753, 'grad_norm': 0.7305170297622681, 'learning_rate': 3.899632802170027e-05, 'epoch': 1.28} 13%|█▎ | 5290/41250 [12:47:20<86:46:42, 8.69s/it][2025-04-25 20:45:03,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.32 | optimizer_step: 1.04 [2025-04-25 20:45:03,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2849.57 | bwd_microstep: 5759.53 | bwd_inner_microstep: 5688.31 | bwd_allreduce_microstep: 71.16 | step_microstep: 20.09 [2025-04-25 20:45:03,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.57 | bwd: 5759.54 | bwd_inner: 5688.31 | bwd_allreduce: 71.19 | step: 20.09 13%|█▎ | 5291/41250 [12:47:28<86:48:04, 8.69s/it] {'loss': 0.2865, 'grad_norm': 3.5986685752868652, 'learning_rate': 3.899583675295415e-05, 'epoch': 1.28} 13%|█▎ | 5291/41250 [12:47:28<86:48:04, 8.69s/it][2025-04-25 20:45:12,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.04 | optimizer_step: 1.10 [2025-04-25 20:45:12,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.68 | bwd_microstep: 5766.93 | bwd_inner_microstep: 5659.85 | bwd_allreduce_microstep: 107.03 | step_microstep: 19.63 [2025-04-25 20:45:12,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.68 | bwd: 5766.94 | bwd_inner: 5659.85 | bwd_allreduce: 107.06 | step: 19.63 13%|█▎ | 5292/41250 [12:47:37<86:48:42, 8.69s/it] {'loss': 0.2434, 'grad_norm': 3.064574956893921, 'learning_rate': 3.899534536710243e-05, 'epoch': 1.28} 13%|█▎ | 5292/41250 [12:47:37<86:48:42, 8.69s/it][2025-04-25 20:45:20,753] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 20:45:20,753] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.67 | bwd_microstep: 5771.96 | bwd_inner_microstep: 5656.73 | bwd_allreduce_microstep: 115.19 | step_microstep: 18.66 [2025-04-25 20:45:20,753] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.67 | bwd: 5771.97 | bwd_inner: 5656.73 | bwd_allreduce: 115.20 | step: 18.66 13%|█▎ | 5293/41250 [12:47:46<86:46:36, 8.69s/it] {'loss': 0.0744, 'grad_norm': 0.7591884136199951, 'learning_rate': 3.899485386414815e-05, 'epoch': 1.28} 13%|█▎ | 5293/41250 [12:47:46<86:46:36, 8.69s/it][2025-04-25 
20:45:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 20:45:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.47 | bwd_microstep: 5805.01 | bwd_inner_microstep: 5660.76 | bwd_allreduce_microstep: 144.20 | step_microstep: 18.45 [2025-04-25 20:45:29,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.47 | bwd: 5805.02 | bwd_inner: 5660.76 | bwd_allreduce: 144.22 | step: 18.45 13%|█▎ | 5294/41250 [12:47:54<86:51:50, 8.70s/it] {'loss': 0.1101, 'grad_norm': 2.268676519393921, 'learning_rate': 3.8994362244094326e-05, 'epoch': 1.28} 13%|█▎ | 5294/41250 [12:47:54<86:51:50, 8.70s/it][2025-04-25 20:45:38,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.23 | optimizer_step: 1.11 [2025-04-25 20:45:38,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.19 | bwd_microstep: 5754.59 | bwd_inner_microstep: 5718.40 | bwd_allreduce_microstep: 36.13 | step_microstep: 19.80 [2025-04-25 20:45:38,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.19 | bwd: 5754.61 | bwd_inner: 5718.40 | bwd_allreduce: 36.16 | step: 19.80 13%|█▎ | 5295/41250 [12:48:03<86:51:47, 8.70s/it] {'loss': 0.0992, 'grad_norm': 1.4767423868179321, 'learning_rate': 3.8993870506944e-05, 'epoch': 1.28} 13%|█▎ | 5295/41250 [12:48:03<86:51:47, 8.70s/it][2025-04-25 20:45:46,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 20:45:46,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.51 | bwd_microstep: 5709.70 | bwd_inner_microstep: 5696.48 | bwd_allreduce_microstep: 13.17 | step_microstep: 19.30 [2025-04-25 20:45:46,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.51 | bwd: 5709.71 | bwd_inner: 5696.48 | bwd_allreduce: 13.19 
| step: 19.30 13%|█▎ | 5296/41250 [12:48:12<86:41:18, 8.68s/it] {'loss': 0.0464, 'grad_norm': 1.1050931215286255, 'learning_rate': 3.8993378652700196e-05, 'epoch': 1.28} 13%|█▎ | 5296/41250 [12:48:12<86:41:18, 8.68s/it][2025-04-25 20:45:55,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:45:55,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.13 | bwd_microstep: 5776.65 | bwd_inner_microstep: 5707.89 | bwd_allreduce_microstep: 68.72 | step_microstep: 18.82 [2025-04-25 20:45:55,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.13 | bwd: 5776.66 | bwd_inner: 5707.89 | bwd_allreduce: 68.73 | step: 18.82 13%|█▎ | 5297/41250 [12:48:20<86:47:56, 8.69s/it] {'loss': 0.1397, 'grad_norm': 2.3522632122039795, 'learning_rate': 3.8992886681365946e-05, 'epoch': 1.28} 13%|█▎ | 5297/41250 [12:48:20<86:47:56, 8.69s/it][2025-04-25 20:46:04,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 20:46:04,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2936.15 | bwd_microstep: 5891.04 | bwd_inner_microstep: 5877.76 | bwd_allreduce_microstep: 13.22 | step_microstep: 19.01 [2025-04-25 20:46:04,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2936.15 | bwd: 5891.05 | bwd_inner: 5877.76 | bwd_allreduce: 13.25 | step: 19.02 13%|█▎ | 5298/41250 [12:48:29<87:27:23, 8.76s/it] {'loss': 0.1554, 'grad_norm': 1.4029279947280884, 'learning_rate': 3.8992394592944286e-05, 'epoch': 1.28} 13%|█▎ | 5298/41250 [12:48:29<87:27:23, 8.76s/it][2025-04-25 20:46:13,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 20:46:13,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.42 | bwd_microstep: 5769.35 | 
bwd_inner_microstep: 5662.67 | bwd_allreduce_microstep: 106.63 | step_microstep: 18.69 [2025-04-25 20:46:13,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.42 | bwd: 5769.36 | bwd_inner: 5662.67 | bwd_allreduce: 106.65 | step: 18.69 13%|█▎ | 5299/41250 [12:48:38<87:14:40, 8.74s/it] {'loss': 0.0444, 'grad_norm': 0.5884720087051392, 'learning_rate': 3.899190238743825e-05, 'epoch': 1.28} 13%|█▎ | 5299/41250 [12:48:38<87:14:40, 8.74s/it][2025-04-25 20:46:21,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:46:21,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.50 | bwd_microstep: 5706.19 | bwd_inner_microstep: 5693.33 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.64 [2025-04-25 20:46:21,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.50 | bwd: 5706.21 | bwd_inner: 5693.33 | bwd_allreduce: 12.83 | step: 18.65 13%|█▎ | 5300/41250 [12:48:47<86:57:36, 8.71s/it] {'loss': 0.2198, 'grad_norm': 1.9400051832199097, 'learning_rate': 3.899141006485087e-05, 'epoch': 1.28} 13%|█▎ | 5300/41250 [12:48:47<86:57:36, 8.71s/it][2025-04-25 20:46:30,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.23 | optimizer_step: 1.00 [2025-04-25 20:46:30,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.08 | bwd_microstep: 5799.64 | bwd_inner_microstep: 5656.99 | bwd_allreduce_microstep: 142.60 | step_microstep: 19.39 [2025-04-25 20:46:30,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.08 | bwd: 5799.66 | bwd_inner: 5656.99 | bwd_allreduce: 142.62 | step: 19.39 13%|█▎ | 5301/41250 [12:48:55<86:58:46, 8.71s/it] {'loss': 0.2973, 'grad_norm': 2.47452712059021, 'learning_rate': 3.8990917625185186e-05, 'epoch': 1.29} 13%|█▎ | 5301/41250 [12:48:55<86:58:46, 8.71s/it][2025-04-25 20:46:39,148] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 20:46:39,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.26 | bwd_microstep: 5728.86 | bwd_inner_microstep: 5716.07 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.80 [2025-04-25 20:46:39,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.26 | bwd: 5728.87 | bwd_inner: 5716.07 | bwd_allreduce: 12.76 | step: 18.80 13%|█▎ | 5302/41250 [12:49:04<86:50:49, 8.70s/it] {'loss': 0.0297, 'grad_norm': 0.7823610901832581, 'learning_rate': 3.8990425068444227e-05, 'epoch': 1.29} 13%|█▎ | 5302/41250 [12:49:04<86:50:49, 8.70s/it][2025-04-25 20:46:47,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 20:46:47,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.27 | bwd_microstep: 5779.79 | bwd_inner_microstep: 5715.31 | bwd_allreduce_microstep: 64.42 | step_microstep: 18.81 [2025-04-25 20:46:47,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.27 | bwd: 5779.80 | bwd_inner: 5715.31 | bwd_allreduce: 64.45 | step: 18.82 13%|█▎ | 5303/41250 [12:49:13<86:54:47, 8.70s/it] {'loss': 0.1164, 'grad_norm': 2.0880062580108643, 'learning_rate': 3.898993239463103e-05, 'epoch': 1.29} 13%|█▎ | 5303/41250 [12:49:13<86:54:47, 8.70s/it][2025-04-25 20:46:56,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:46:56,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.63 | bwd_microstep: 5804.69 | bwd_inner_microstep: 5792.21 | bwd_allreduce_microstep: 12.44 | step_microstep: 18.51 [2025-04-25 20:46:56,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.63 | bwd: 5804.71 | bwd_inner: 5792.21 | bwd_allreduce: 12.46 | step: 18.51 13%|█▎ 
| 5304/41250 [12:49:21<87:07:50, 8.73s/it] {'loss': 0.1643, 'grad_norm': 1.278975009918213, 'learning_rate': 3.898943960374864e-05, 'epoch': 1.29} 13%|█▎ | 5304/41250 [12:49:21<87:07:50, 8.73s/it][2025-04-25 20:47:05,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:47:05,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.45 | bwd_microstep: 5734.30 | bwd_inner_microstep: 5654.23 | bwd_allreduce_microstep: 80.03 | step_microstep: 18.77 [2025-04-25 20:47:05,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.45 | bwd: 5734.31 | bwd_inner: 5654.23 | bwd_allreduce: 80.05 | step: 18.77 13%|█▎ | 5305/41250 [12:49:30<86:52:52, 8.70s/it] {'loss': 0.0825, 'grad_norm': 2.7062830924987793, 'learning_rate': 3.898894669580009e-05, 'epoch': 1.29} 13%|█▎ | 5305/41250 [12:49:30<86:52:52, 8.70s/it][2025-04-25 20:47:14,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:47:14,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.06 | bwd_microstep: 5879.50 | bwd_inner_microstep: 5704.00 | bwd_allreduce_microstep: 175.46 | step_microstep: 18.38 [2025-04-25 20:47:14,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.07 | bwd: 5879.51 | bwd_inner: 5704.00 | bwd_allreduce: 175.47 | step: 18.38 13%|█▎ | 5306/41250 [12:49:39<87:14:44, 8.74s/it] {'loss': 0.0688, 'grad_norm': 1.2679163217544556, 'learning_rate': 3.8988453670788416e-05, 'epoch': 1.29} 13%|█▎ | 5306/41250 [12:49:39<87:14:44, 8.74s/it][2025-04-25 20:47:22,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:47:22,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.09 | bwd_microstep: 5711.60 | bwd_inner_microstep: 
5663.68 | bwd_allreduce_microstep: 47.88 | step_microstep: 18.43 [2025-04-25 20:47:22,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.09 | bwd: 5711.61 | bwd_inner: 5663.68 | bwd_allreduce: 47.89 | step: 18.43 13%|█▎ | 5307/41250 [12:49:48<86:53:44, 8.70s/it] {'loss': 0.2662, 'grad_norm': 3.1570379734039307, 'learning_rate': 3.898796052871666e-05, 'epoch': 1.29} 13%|█▎ | 5307/41250 [12:49:48<86:53:44, 8.70s/it][2025-04-25 20:47:31,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 20:47:31,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.47 | bwd_microstep: 5708.58 | bwd_inner_microstep: 5668.78 | bwd_allreduce_microstep: 39.75 | step_microstep: 18.84 [2025-04-25 20:47:31,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.47 | bwd: 5708.59 | bwd_inner: 5668.78 | bwd_allreduce: 39.77 | step: 18.84 13%|█▎ | 5308/41250 [12:49:56<86:42:52, 8.69s/it] {'loss': 0.1416, 'grad_norm': 3.7084598541259766, 'learning_rate': 3.898746726958786e-05, 'epoch': 1.29} 13%|█▎ | 5308/41250 [12:49:56<86:42:52, 8.69s/it][2025-04-25 20:47:40,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:47:40,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.68 | bwd_microstep: 5731.55 | bwd_inner_microstep: 5718.52 | bwd_allreduce_microstep: 12.99 | step_microstep: 18.89 [2025-04-25 20:47:40,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.68 | bwd: 5731.57 | bwd_inner: 5718.52 | bwd_allreduce: 13.01 | step: 18.90 13%|█▎ | 5309/41250 [12:50:05<86:40:42, 8.68s/it] {'loss': 0.5431, 'grad_norm': 6.383234024047852, 'learning_rate': 3.8986973893405064e-05, 'epoch': 1.29} 13%|█▎ | 5309/41250 [12:50:05<86:40:42, 8.68s/it][2025-04-25 20:47:48,720] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:47:48,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.10 | bwd_microstep: 5754.93 | bwd_inner_microstep: 5661.83 | bwd_allreduce_microstep: 93.06 | step_microstep: 18.33 [2025-04-25 20:47:48,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.10 | bwd: 5754.95 | bwd_inner: 5661.83 | bwd_allreduce: 93.08 | step: 18.33 13%|█▎ | 5310/41250 [12:50:14<86:37:47, 8.68s/it] {'loss': 0.0755, 'grad_norm': 1.9767638444900513, 'learning_rate': 3.8986480400171306e-05, 'epoch': 1.29} 13%|█▎ | 5310/41250 [12:50:14<86:37:47, 8.68s/it][2025-04-25 20:47:57,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:47:57,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.77 | bwd_microstep: 5722.02 | bwd_inner_microstep: 5709.24 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.68 [2025-04-25 20:47:57,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.77 | bwd: 5722.03 | bwd_inner: 5709.24 | bwd_allreduce: 12.75 | step: 18.68 13%|█▎ | 5311/41250 [12:50:22<86:36:48, 8.68s/it] {'loss': 0.0748, 'grad_norm': 1.579447627067566, 'learning_rate': 3.898598678988963e-05, 'epoch': 1.29} 13%|█▎ | 5311/41250 [12:50:22<86:36:48, 8.68s/it][2025-04-25 20:48:06,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:48:06,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.67 | bwd_microstep: 5695.43 | bwd_inner_microstep: 5660.35 | bwd_allreduce_microstep: 35.03 | step_microstep: 18.79 [2025-04-25 20:48:06,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.67 | bwd: 5695.45 | bwd_inner: 5660.35 | bwd_allreduce: 35.05 | step: 18.79 13%|█▎ | 5312/41250 [12:50:31<86:24:10, 8.66s/it] 
{'loss': 0.2175, 'grad_norm': 3.4025449752807617, 'learning_rate': 3.8985493062563075e-05, 'epoch': 1.29} 13%|█▎ | 5312/41250 [12:50:31<86:24:10, 8.66s/it][2025-04-25 20:48:14,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.17 | optimizer_step: 1.12 [2025-04-25 20:48:14,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.81 | bwd_microstep: 5755.81 | bwd_inner_microstep: 5684.37 | bwd_allreduce_microstep: 71.38 | step_microstep: 19.45 [2025-04-25 20:48:14,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.81 | bwd: 5755.83 | bwd_inner: 5684.37 | bwd_allreduce: 71.41 | step: 19.45 13%|█▎ | 5313/41250 [12:50:40<86:31:31, 8.67s/it] {'loss': 0.1003, 'grad_norm': 1.4936007261276245, 'learning_rate': 3.89849992181947e-05, 'epoch': 1.29} 13%|█▎ | 5313/41250 [12:50:40<86:31:31, 8.67s/it][2025-04-25 20:48:23,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:48:23,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.65 | bwd_microstep: 5763.82 | bwd_inner_microstep: 5692.72 | bwd_allreduce_microstep: 71.06 | step_microstep: 18.84 [2025-04-25 20:48:23,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.65 | bwd: 5763.83 | bwd_inner: 5692.72 | bwd_allreduce: 71.07 | step: 18.85 13%|█▎ | 5314/41250 [12:50:48<86:35:00, 8.67s/it] {'loss': 0.2692, 'grad_norm': 2.176755428314209, 'learning_rate': 3.898450525678754e-05, 'epoch': 1.29} 13%|█▎ | 5314/41250 [12:50:48<86:35:00, 8.67s/it][2025-04-25 20:48:32,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.02 | optimizer_step: 1.09 [2025-04-25 20:48:32,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.05 | bwd_microstep: 5712.67 | bwd_inner_microstep: 5658.09 | bwd_allreduce_microstep: 54.54 | 
step_microstep: 19.01 [2025-04-25 20:48:32,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.05 | bwd: 5712.69 | bwd_inner: 5658.09 | bwd_allreduce: 54.55 | step: 19.01 13%|█▎ | 5315/41250 [12:50:57<86:25:53, 8.66s/it] {'loss': 0.0582, 'grad_norm': 1.550394892692566, 'learning_rate': 3.8984011178344625e-05, 'epoch': 1.29} 13%|█▎ | 5315/41250 [12:50:57<86:25:53, 8.66s/it][2025-04-25 20:48:40,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:48:40,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.80 | bwd_microstep: 5741.26 | bwd_inner_microstep: 5681.19 | bwd_allreduce_microstep: 60.03 | step_microstep: 18.78 [2025-04-25 20:48:40,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.80 | bwd: 5741.28 | bwd_inner: 5681.19 | bwd_allreduce: 60.05 | step: 18.78 13%|█▎ | 5316/41250 [12:51:05<86:26:49, 8.66s/it] {'loss': 0.1575, 'grad_norm': 1.5664092302322388, 'learning_rate': 3.898351698286902e-05, 'epoch': 1.29} 13%|█▎ | 5316/41250 [12:51:06<86:26:49, 8.66s/it][2025-04-25 20:48:49,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:48:49,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.50 | bwd_microstep: 5712.08 | bwd_inner_microstep: 5699.10 | bwd_allreduce_microstep: 12.93 | step_microstep: 18.89 [2025-04-25 20:48:49,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.50 | bwd: 5712.10 | bwd_inner: 5699.10 | bwd_allreduce: 12.95 | step: 18.89 13%|█▎ | 5317/41250 [12:51:14<86:23:35, 8.66s/it] {'loss': 0.0541, 'grad_norm': 1.3806331157684326, 'learning_rate': 3.898302267036378e-05, 'epoch': 1.29} 13%|█▎ | 5317/41250 [12:51:14<86:23:35, 8.66s/it][2025-04-25 20:48:57,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | 
optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:48:57,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.56 | bwd_microstep: 5708.35 | bwd_inner_microstep: 5695.41 | bwd_allreduce_microstep: 12.90 | step_microstep: 18.84 [2025-04-25 20:48:57,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.56 | bwd: 5708.36 | bwd_inner: 5695.41 | bwd_allreduce: 12.91 | step: 18.84 13%|█▎ | 5318/41250 [12:51:23<86:20:13, 8.65s/it] {'loss': 0.1044, 'grad_norm': 2.6360912322998047, 'learning_rate': 3.898252824083192e-05, 'epoch': 1.29} 13%|█▎ | 5318/41250 [12:51:23<86:20:13, 8.65s/it][2025-04-25 20:49:06,635] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:49:06,635] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.45 | bwd_microstep: 5767.45 | bwd_inner_microstep: 5660.65 | bwd_allreduce_microstep: 106.75 | step_microstep: 18.46 [2025-04-25 20:49:06,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.45 | bwd: 5767.47 | bwd_inner: 5660.65 | bwd_allreduce: 106.77 | step: 18.46 13%|█▎ | 5319/41250 [12:51:31<86:25:42, 8.66s/it] {'loss': 0.0727, 'grad_norm': 1.7556785345077515, 'learning_rate': 3.898203369427652e-05, 'epoch': 1.29} 13%|█▎ | 5319/41250 [12:51:31<86:25:42, 8.66s/it][2025-04-25 20:49:15,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:49:15,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.06 | bwd_microstep: 5723.39 | bwd_inner_microstep: 5710.60 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.80 [2025-04-25 20:49:15,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.06 | bwd: 5723.40 | bwd_inner: 5710.60 | bwd_allreduce: 12.76 | step: 18.80 13%|█▎ | 5320/41250 [12:51:40<86:25:10, 8.66s/it] {'loss': 0.3169, 'grad_norm': 
3.668302059173584, 'learning_rate': 3.8981539030700614e-05, 'epoch': 1.29} 13%|█▎ | 5320/41250 [12:51:40<86:25:10, 8.66s/it][2025-04-25 20:49:23,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 20:49:23,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.74 | bwd_microstep: 5763.72 | bwd_inner_microstep: 5642.06 | bwd_allreduce_microstep: 121.60 | step_microstep: 18.83 [2025-04-25 20:49:23,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.74 | bwd: 5763.74 | bwd_inner: 5642.06 | bwd_allreduce: 121.62 | step: 18.83 13%|█▎ | 5321/41250 [12:51:49<86:28:21, 8.66s/it] {'loss': 0.081, 'grad_norm': 1.7508270740509033, 'learning_rate': 3.8981044250107246e-05, 'epoch': 1.29} 13%|█▎ | 5321/41250 [12:51:49<86:28:21, 8.66s/it][2025-04-25 20:49:32,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:49:32,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.24 | bwd_microstep: 5720.21 | bwd_inner_microstep: 5693.47 | bwd_allreduce_microstep: 26.70 | step_microstep: 18.84 [2025-04-25 20:49:32,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.24 | bwd: 5720.22 | bwd_inner: 5693.47 | bwd_allreduce: 26.71 | step: 18.84 13%|█▎ | 5322/41250 [12:51:57<86:25:36, 8.66s/it] {'loss': 0.1557, 'grad_norm': 1.391118049621582, 'learning_rate': 3.898054935249948e-05, 'epoch': 1.29} 13%|█▎ | 5322/41250 [12:51:57<86:25:36, 8.66s/it][2025-04-25 20:49:41,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 20:49:41,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.01 | bwd_microstep: 5719.11 | bwd_inner_microstep: 5635.90 | bwd_allreduce_microstep: 83.17 | step_microstep: 18.84 [2025-04-25 
20:49:41,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.01 | bwd: 5719.13 | bwd_inner: 5635.90 | bwd_allreduce: 83.19 | step: 18.84 13%|█▎ | 5323/41250 [12:52:06<86:18:42, 8.65s/it] {'loss': 0.263, 'grad_norm': 3.46669602394104, 'learning_rate': 3.898005433788036e-05, 'epoch': 1.29} 13%|█▎ | 5323/41250 [12:52:06<86:18:42, 8.65s/it][2025-04-25 20:49:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:49:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.65 | bwd_microstep: 5685.50 | bwd_inner_microstep: 5650.14 | bwd_allreduce_microstep: 35.31 | step_microstep: 18.58 [2025-04-25 20:49:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.65 | bwd: 5685.52 | bwd_inner: 5650.14 | bwd_allreduce: 35.33 | step: 18.58 13%|█▎ | 5324/41250 [12:52:15<86:09:12, 8.63s/it] {'loss': 0.1042, 'grad_norm': 1.6797568798065186, 'learning_rate': 3.897955920625294e-05, 'epoch': 1.29} 13%|█▎ | 5324/41250 [12:52:15<86:09:12, 8.63s/it][2025-04-25 20:49:58,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 20:49:58,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.14 | bwd_microstep: 5724.31 | bwd_inner_microstep: 5711.72 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.55 [2025-04-25 20:49:58,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.14 | bwd: 5724.33 | bwd_inner: 5711.72 | bwd_allreduce: 12.57 | step: 18.55 13%|█▎ | 5325/41250 [12:52:23<86:14:13, 8.64s/it] {'loss': 0.0417, 'grad_norm': 0.763512372970581, 'learning_rate': 3.897906395762027e-05, 'epoch': 1.29} 13%|█▎ | 5325/41250 [12:52:23<86:14:13, 8.64s/it][2025-04-25 20:50:07,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.03 | optimizer_step: 0.91 
[2025-04-25 20:50:07,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.64 | bwd_microstep: 5784.54 | bwd_inner_microstep: 5771.83 | bwd_allreduce_microstep: 12.67 | step_microstep: 19.01 [2025-04-25 20:50:07,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.64 | bwd: 5784.56 | bwd_inner: 5771.83 | bwd_allreduce: 12.69 | step: 19.02 13%|█▎ | 5326/41250 [12:52:32<86:34:48, 8.68s/it] {'loss': 0.1669, 'grad_norm': 1.1182764768600464, 'learning_rate': 3.8978568591985397e-05, 'epoch': 1.29} 13%|█▎ | 5326/41250 [12:52:32<86:34:48, 8.68s/it][2025-04-25 20:50:15,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:50:15,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.26 | bwd_microstep: 5717.37 | bwd_inner_microstep: 5704.50 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.88 [2025-04-25 20:50:15,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.26 | bwd: 5717.39 | bwd_inner: 5704.49 | bwd_allreduce: 12.85 | step: 18.88 13%|█▎ | 5327/41250 [12:52:41<86:30:01, 8.67s/it] {'loss': 0.102, 'grad_norm': 1.007108211517334, 'learning_rate': 3.897807310935139e-05, 'epoch': 1.29} 13%|█▎ | 5327/41250 [12:52:41<86:30:01, 8.67s/it][2025-04-25 20:50:24,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.32 | optimizer_step: 1.04 [2025-04-25 20:50:24,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.20 | bwd_microstep: 5685.10 | bwd_inner_microstep: 5671.18 | bwd_allreduce_microstep: 13.86 | step_microstep: 20.20 [2025-04-25 20:50:24,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.20 | bwd: 5685.11 | bwd_inner: 5671.18 | bwd_allreduce: 13.89 | step: 20.20 13%|█▎ | 5328/41250 [12:52:49<86:19:45, 8.65s/it] {'loss': 0.1037, 'grad_norm': 2.1144182682037354, 'learning_rate': 
3.897757750972129e-05, 'epoch': 1.29} 13%|█▎ | 5328/41250 [12:52:49<86:19:45, 8.65s/it][2025-04-25 20:50:33,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 20:50:33,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.30 | bwd_microstep: 5694.36 | bwd_inner_microstep: 5681.58 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.83 [2025-04-25 20:50:33,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.30 | bwd: 5694.37 | bwd_inner: 5681.58 | bwd_allreduce: 12.75 | step: 18.83 13%|█▎ | 5329/41250 [12:52:58<86:14:29, 8.64s/it] {'loss': 0.1879, 'grad_norm': 1.248883843421936, 'learning_rate': 3.8977081793098156e-05, 'epoch': 1.29} 13%|█▎ | 5329/41250 [12:52:58<86:14:29, 8.64s/it][2025-04-25 20:50:41,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:50:41,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.52 | bwd_microstep: 5711.22 | bwd_inner_microstep: 5698.33 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.89 [2025-04-25 20:50:41,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.53 | bwd: 5711.23 | bwd_inner: 5698.33 | bwd_allreduce: 12.86 | step: 18.89 13%|█▎ | 5330/41250 [12:53:07<86:14:09, 8.64s/it] {'loss': 0.0687, 'grad_norm': 2.392622709274292, 'learning_rate': 3.897658595948505e-05, 'epoch': 1.29} 13%|█▎ | 5330/41250 [12:53:07<86:14:09, 8.64s/it][2025-04-25 20:50:50,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 20:50:50,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.36 | bwd_microstep: 5713.53 | bwd_inner_microstep: 5700.87 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.70 [2025-04-25 20:50:50,433] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.36 | bwd: 5713.54 | bwd_inner: 5700.87 | bwd_allreduce: 12.63 | step: 18.70 13%|█▎ | 5331/41250 [12:53:15<86:14:33, 8.64s/it] {'loss': 0.2547, 'grad_norm': 2.2274696826934814, 'learning_rate': 3.8976090008885024e-05, 'epoch': 1.29} 13%|█▎ | 5331/41250 [12:53:15<86:14:33, 8.64s/it][2025-04-25 20:50:59,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-25 20:50:59,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.54 | bwd_microstep: 5845.22 | bwd_inner_microstep: 5635.06 | bwd_allreduce_microstep: 210.11 | step_microstep: 19.30 [2025-04-25 20:50:59,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.54 | bwd: 5845.24 | bwd_inner: 5635.06 | bwd_allreduce: 210.13 | step: 19.30 13%|█▎ | 5332/41250 [12:53:24<86:34:31, 8.68s/it] {'loss': 0.1289, 'grad_norm': 2.136383056640625, 'learning_rate': 3.897559394130113e-05, 'epoch': 1.29} 13%|█▎ | 5332/41250 [12:53:24<86:34:31, 8.68s/it][2025-04-25 20:51:07,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:51:07,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.17 | bwd_microstep: 5744.81 | bwd_inner_microstep: 5662.83 | bwd_allreduce_microstep: 81.93 | step_microstep: 18.84 [2025-04-25 20:51:07,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.17 | bwd: 5744.83 | bwd_inner: 5662.83 | bwd_allreduce: 81.95 | step: 18.85 13%|█▎ | 5333/41250 [12:53:33<86:30:29, 8.67s/it] {'loss': 0.2246, 'grad_norm': 3.028548240661621, 'learning_rate': 3.8975097756736434e-05, 'epoch': 1.29} 13%|█▎ | 5333/41250 [12:53:33<86:30:29, 8.67s/it][2025-04-25 20:51:16,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-25 
20:51:16,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.75 | bwd_microstep: 5708.59 | bwd_inner_microstep: 5687.68 | bwd_allreduce_microstep: 20.87 | step_microstep: 18.54 [2025-04-25 20:51:16,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.75 | bwd: 5708.60 | bwd_inner: 5687.68 | bwd_allreduce: 20.89 | step: 18.55 13%|█▎ | 5334/41250 [12:53:41<86:25:15, 8.66s/it] {'loss': 0.0834, 'grad_norm': 1.4576940536499023, 'learning_rate': 3.8974601455193995e-05, 'epoch': 1.29} 13%|█▎ | 5334/41250 [12:53:41<86:25:15, 8.66s/it][2025-04-25 20:51:25,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.32 | optimizer_step: 1.04 [2025-04-25 20:51:25,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.11 | bwd_microstep: 5730.30 | bwd_inner_microstep: 5708.47 | bwd_allreduce_microstep: 21.77 | step_microstep: 20.19 [2025-04-25 20:51:25,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.11 | bwd: 5730.31 | bwd_inner: 5708.47 | bwd_allreduce: 21.80 | step: 20.19 13%|█▎ | 5335/41250 [12:53:50<86:27:16, 8.67s/it] {'loss': 0.1017, 'grad_norm': 1.5402885675430298, 'learning_rate': 3.897410503667687e-05, 'epoch': 1.29} 13%|█▎ | 5335/41250 [12:53:50<86:27:16, 8.67s/it][2025-04-25 20:51:33,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 20:51:33,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.62 | bwd_microstep: 5761.52 | bwd_inner_microstep: 5644.23 | bwd_allreduce_microstep: 117.24 | step_microstep: 18.86 [2025-04-25 20:51:33,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.62 | bwd: 5761.54 | bwd_inner: 5644.23 | bwd_allreduce: 117.26 | step: 18.86 13%|█▎ | 5336/41250 [12:53:59<86:27:52, 8.67s/it] {'loss': 0.0177, 'grad_norm': 0.5361781716346741, 'learning_rate': 3.897360850118812e-05, 
'epoch': 1.29} 13%|█▎ | 5336/41250 [12:53:59<86:27:52, 8.67s/it][2025-04-25 20:51:42,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 20:51:42,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.33 | bwd_microstep: 5762.11 | bwd_inner_microstep: 5656.00 | bwd_allreduce_microstep: 106.07 | step_microstep: 18.68 [2025-04-25 20:51:42,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.33 | bwd: 5762.13 | bwd_inner: 5656.00 | bwd_allreduce: 106.09 | step: 18.68 13%|█▎ | 5337/41250 [12:54:07<86:28:02, 8.67s/it] {'loss': 0.0248, 'grad_norm': 0.3793656527996063, 'learning_rate': 3.89731118487308e-05, 'epoch': 1.29} 13%|█▎ | 5337/41250 [12:54:07<86:28:02, 8.67s/it][2025-04-25 20:51:51,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:51:51,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.30 | bwd_microstep: 5724.93 | bwd_inner_microstep: 5701.43 | bwd_allreduce_microstep: 23.46 | step_microstep: 19.00 [2025-04-25 20:51:51,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.30 | bwd: 5724.94 | bwd_inner: 5701.43 | bwd_allreduce: 23.47 | step: 19.00 13%|█▎ | 5338/41250 [12:54:16<86:25:47, 8.66s/it] {'loss': 0.1266, 'grad_norm': 1.6418863534927368, 'learning_rate': 3.897261507930798e-05, 'epoch': 1.29} 13%|█▎ | 5338/41250 [12:54:16<86:25:47, 8.66s/it][2025-04-25 20:51:59,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:51:59,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.36 | bwd_microstep: 5772.08 | bwd_inner_microstep: 5682.82 | bwd_allreduce_microstep: 89.21 | step_microstep: 18.58 [2025-04-25 20:51:59,860] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2850.36 | bwd: 5772.09 | bwd_inner: 5682.82 | bwd_allreduce: 89.23 | step: 18.58 13%|█▎ | 5339/41250 [12:54:25<86:32:52, 8.68s/it] {'loss': 0.133, 'grad_norm': 1.769993782043457, 'learning_rate': 3.897211819292272e-05, 'epoch': 1.29} 13%|█▎ | 5339/41250 [12:54:25<86:32:52, 8.68s/it][2025-04-25 20:52:08,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 20:52:08,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.67 | bwd_microstep: 5773.53 | bwd_inner_microstep: 5651.40 | bwd_allreduce_microstep: 122.09 | step_microstep: 18.61 [2025-04-25 20:52:08,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.68 | bwd: 5773.54 | bwd_inner: 5651.40 | bwd_allreduce: 122.10 | step: 18.61 13%|█▎ | 5340/41250 [12:54:33<86:32:58, 8.68s/it] {'loss': 0.209, 'grad_norm': 2.334758996963501, 'learning_rate': 3.8971621189578085e-05, 'epoch': 1.29} 13%|█▎ | 5340/41250 [12:54:33<86:32:58, 8.68s/it][2025-04-25 20:52:17,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.10 | optimizer_step: 0.99 [2025-04-25 20:52:17,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.76 | bwd_microstep: 5708.04 | bwd_inner_microstep: 5695.10 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.54 [2025-04-25 20:52:17,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.76 | bwd: 5708.06 | bwd_inner: 5695.10 | bwd_allreduce: 12.91 | step: 19.54 13%|█▎ | 5341/41250 [12:54:42<86:25:52, 8.67s/it] {'loss': 0.0705, 'grad_norm': 1.5546172857284546, 'learning_rate': 3.897112406927713e-05, 'epoch': 1.29} 13%|█▎ | 5341/41250 [12:54:42<86:25:52, 8.67s/it][2025-04-25 20:52:26,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 0.94 [2025-04-25 20:52:26,031] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2849.26 | bwd_microstep: 5920.43 | bwd_inner_microstep: 5691.69 | bwd_allreduce_microstep: 228.68 | step_microstep: 19.31 [2025-04-25 20:52:26,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.26 | bwd: 5920.45 | bwd_inner: 5691.69 | bwd_allreduce: 228.71 | step: 19.31 13%|█▎ | 5342/41250 [12:54:51<87:00:14, 8.72s/it] {'loss': 0.1173, 'grad_norm': 1.6627469062805176, 'learning_rate': 3.8970626832022934e-05, 'epoch': 1.3} 13%|█▎ | 5342/41250 [12:54:51<87:00:14, 8.72s/it][2025-04-25 20:52:34,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:52:34,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.63 | bwd_microstep: 5944.38 | bwd_inner_microstep: 5651.56 | bwd_allreduce_microstep: 292.77 | step_microstep: 19.00 [2025-04-25 20:52:34,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.63 | bwd: 5944.40 | bwd_inner: 5651.56 | bwd_allreduce: 292.79 | step: 19.00 13%|█▎ | 5343/41250 [12:55:00<87:23:58, 8.76s/it] {'loss': 0.0449, 'grad_norm': 0.8988699316978455, 'learning_rate': 3.897012947781855e-05, 'epoch': 1.3} 13%|█▎ | 5343/41250 [12:55:00<87:23:58, 8.76s/it][2025-04-25 20:52:43,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.04 | optimizer_step: 0.91 [2025-04-25 20:52:43,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.81 | bwd_microstep: 5766.62 | bwd_inner_microstep: 5688.01 | bwd_allreduce_microstep: 78.55 | step_microstep: 19.08 [2025-04-25 20:52:43,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.81 | bwd: 5766.63 | bwd_inner: 5688.01 | bwd_allreduce: 78.57 | step: 19.08 13%|█▎ | 5344/41250 [12:55:08<87:12:21, 8.74s/it] {'loss': 0.1197, 'grad_norm': 2.405693531036377, 'learning_rate': 3.896963200666706e-05, 'epoch': 1.3} 13%|█▎ | 5344/41250 
[12:55:08<87:12:21, 8.74s/it][2025-04-25 20:52:52,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.21 | optimizer_step: 1.06 [2025-04-25 20:52:52,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.84 | bwd_microstep: 5743.43 | bwd_inner_microstep: 5700.10 | bwd_allreduce_microstep: 43.27 | step_microstep: 20.55 [2025-04-25 20:52:52,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.84 | bwd: 5743.45 | bwd_inner: 5700.10 | bwd_allreduce: 43.30 | step: 20.55 13%|█▎ | 5345/41250 [12:55:17<87:00:43, 8.72s/it] {'loss': 0.3924, 'grad_norm': 3.4353883266448975, 'learning_rate': 3.896913441857151e-05, 'epoch': 1.3} 13%|█▎ | 5345/41250 [12:55:17<87:00:43, 8.72s/it][2025-04-25 20:53:00,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 20:53:00,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.60 | bwd_microstep: 5763.20 | bwd_inner_microstep: 5703.98 | bwd_allreduce_microstep: 59.17 | step_microstep: 19.05 [2025-04-25 20:53:00,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.60 | bwd: 5763.21 | bwd_inner: 5703.98 | bwd_allreduce: 59.19 | step: 19.05 13%|█▎ | 5346/41250 [12:55:26<86:57:05, 8.72s/it] {'loss': 0.0191, 'grad_norm': 0.2267979085445404, 'learning_rate': 3.8968636713534975e-05, 'epoch': 1.3} 13%|█▎ | 5346/41250 [12:55:26<86:57:05, 8.72s/it][2025-04-25 20:53:09,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:53:09,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.85 | bwd_microstep: 5714.38 | bwd_inner_microstep: 5667.19 | bwd_allreduce_microstep: 47.14 | step_microstep: 18.75 [2025-04-25 20:53:09,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.85 | bwd: 5714.39 | 
bwd_inner: 5667.19 | bwd_allreduce: 47.16 | step: 18.75 13%|█▎ | 5347/41250 [12:55:34<86:41:19, 8.69s/it] {'loss': 0.1644, 'grad_norm': 2.3520567417144775, 'learning_rate': 3.896813889156053e-05, 'epoch': 1.3} 13%|█▎ | 5347/41250 [12:55:34<86:41:19, 8.69s/it][2025-04-25 20:53:18,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:53:18,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2890.16 | bwd_microstep: 5801.69 | bwd_inner_microstep: 5788.98 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.77 [2025-04-25 20:53:18,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2890.16 | bwd: 5801.70 | bwd_inner: 5788.98 | bwd_allreduce: 12.69 | step: 18.78 13%|█▎ | 5348/41250 [12:55:43<86:56:14, 8.72s/it] {'loss': 0.214, 'grad_norm': 3.2804782390594482, 'learning_rate': 3.896764095265124e-05, 'epoch': 1.3} 13%|█▎ | 5348/41250 [12:55:43<86:56:14, 8.72s/it][2025-04-25 20:53:27,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:53:27,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.76 | bwd_microstep: 5720.14 | bwd_inner_microstep: 5707.44 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.82 [2025-04-25 20:53:27,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.76 | bwd: 5720.15 | bwd_inner: 5707.44 | bwd_allreduce: 12.67 | step: 18.82 13%|█▎ | 5349/41250 [12:55:52<86:45:15, 8.70s/it] {'loss': 0.295, 'grad_norm': 2.3660857677459717, 'learning_rate': 3.8967142896810177e-05, 'epoch': 1.3} 13%|█▎ | 5349/41250 [12:55:52<86:45:15, 8.70s/it][2025-04-25 20:53:35,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:53:35,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2855.28 | bwd_microstep: 5727.18 | bwd_inner_microstep: 5714.46 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.70 [2025-04-25 20:53:35,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.28 | bwd: 5727.19 | bwd_inner: 5714.45 | bwd_allreduce: 12.70 | step: 18.70 13%|█▎ | 5350/41250 [12:56:01<86:38:23, 8.69s/it] {'loss': 0.1228, 'grad_norm': 2.381660223007202, 'learning_rate': 3.89666447240404e-05, 'epoch': 1.3} 13%|█▎ | 5350/41250 [12:56:01<86:38:23, 8.69s/it][2025-04-25 20:53:44,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:53:44,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.04 | bwd_microstep: 5724.93 | bwd_inner_microstep: 5712.17 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.86 [2025-04-25 20:53:44,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.04 | bwd: 5724.94 | bwd_inner: 5712.17 | bwd_allreduce: 12.73 | step: 18.86 13%|█▎ | 5351/41250 [12:56:09<86:34:52, 8.68s/it] {'loss': 0.2828, 'grad_norm': 1.6854013204574585, 'learning_rate': 3.8966146434344996e-05, 'epoch': 1.3} 13%|█▎ | 5351/41250 [12:56:09<86:34:52, 8.68s/it][2025-04-25 20:53:53,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:53:53,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.43 | bwd_microstep: 5735.54 | bwd_inner_microstep: 5719.36 | bwd_allreduce_microstep: 16.13 | step_microstep: 18.99 [2025-04-25 20:53:53,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.43 | bwd: 5735.55 | bwd_inner: 5719.36 | bwd_allreduce: 16.15 | step: 18.99 13%|█▎ | 5352/41250 [12:56:18<86:35:28, 8.68s/it] {'loss': 0.1967, 'grad_norm': 2.0797078609466553, 'learning_rate': 3.8965648027727024e-05, 'epoch': 1.3} 13%|█▎ | 5352/41250 [12:56:18<86:35:28, 8.68s/it][2025-04-25 20:54:01,679] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:54:01,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.62 | bwd_microstep: 5717.25 | bwd_inner_microstep: 5661.60 | bwd_allreduce_microstep: 55.60 | step_microstep: 18.55 [2025-04-25 20:54:01,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.62 | bwd: 5717.26 | bwd_inner: 5661.60 | bwd_allreduce: 55.62 | step: 18.55 13%|█▎ | 5353/41250 [12:56:27<86:25:00, 8.67s/it] {'loss': 0.2529, 'grad_norm': 2.134019613265991, 'learning_rate': 3.896514950418957e-05, 'epoch': 1.3} 13%|█▎ | 5353/41250 [12:56:27<86:25:00, 8.67s/it][2025-04-25 20:54:10,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-25 20:54:10,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.98 | bwd_microstep: 5815.40 | bwd_inner_microstep: 5661.47 | bwd_allreduce_microstep: 153.88 | step_microstep: 19.27 [2025-04-25 20:54:10,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.98 | bwd: 5815.41 | bwd_inner: 5661.47 | bwd_allreduce: 153.90 | step: 19.27 13%|█▎ | 5354/41250 [12:56:35<86:36:44, 8.69s/it] {'loss': 0.0211, 'grad_norm': 0.23205003142356873, 'learning_rate': 3.89646508637357e-05, 'epoch': 1.3} 13%|█▎ | 5354/41250 [12:56:35<86:36:44, 8.69s/it][2025-04-25 20:54:19,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.14 | optimizer_step: 0.96 [2025-04-25 20:54:19,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.87 | bwd_microstep: 5796.39 | bwd_inner_microstep: 5661.55 | bwd_allreduce_microstep: 134.79 | step_microstep: 19.05 [2025-04-25 20:54:19,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.88 | bwd: 5796.40 | bwd_inner: 5661.56 | bwd_allreduce: 134.81 | step: 19.06 
13%|█▎ | 5355/41250 [12:56:44<86:41:09, 8.69s/it] {'loss': 0.112, 'grad_norm': 2.975705862045288, 'learning_rate': 3.896415210636848e-05, 'epoch': 1.3} 13%|█▎ | 5355/41250 [12:56:44<86:41:09, 8.69s/it][2025-04-25 20:54:27,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:54:27,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.34 | bwd_microstep: 5808.95 | bwd_inner_microstep: 5657.84 | bwd_allreduce_microstep: 151.06 | step_microstep: 18.59 [2025-04-25 20:54:27,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.34 | bwd: 5808.96 | bwd_inner: 5657.84 | bwd_allreduce: 151.08 | step: 18.60 13%|█▎ | 5356/41250 [12:56:53<86:46:03, 8.70s/it] {'loss': 0.2154, 'grad_norm': 1.9624645709991455, 'learning_rate': 3.896365323209099e-05, 'epoch': 1.3} 13%|█▎ | 5356/41250 [12:56:53<86:46:03, 8.70s/it][2025-04-25 20:54:36,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 20:54:36,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.75 | bwd_microstep: 5772.08 | bwd_inner_microstep: 5705.19 | bwd_allreduce_microstep: 66.85 | step_microstep: 18.26 [2025-04-25 20:54:36,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.75 | bwd: 5772.09 | bwd_inner: 5705.19 | bwd_allreduce: 66.86 | step: 18.26 13%|█▎ | 5357/41250 [12:57:01<86:46:23, 8.70s/it] {'loss': 0.1432, 'grad_norm': 3.642180919647217, 'learning_rate': 3.8963154240906315e-05, 'epoch': 1.3} 13%|█▎ | 5357/41250 [12:57:01<86:46:23, 8.70s/it][2025-04-25 20:54:45,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 20:54:45,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.09 | bwd_microstep: 5713.52 | bwd_inner_microstep: 
5700.56 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.63 [2025-04-25 20:54:45,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.09 | bwd: 5713.54 | bwd_inner: 5700.56 | bwd_allreduce: 12.93 | step: 18.63 13%|█▎ | 5358/41250 [12:57:10<86:35:26, 8.69s/it] {'loss': 0.0565, 'grad_norm': 1.4851106405258179, 'learning_rate': 3.896265513281752e-05, 'epoch': 1.3} 13%|█▎ | 5358/41250 [12:57:10<86:35:26, 8.69s/it][2025-04-25 20:54:53,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.10 | optimizer_step: 1.01 [2025-04-25 20:54:53,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.26 | bwd_microstep: 5777.76 | bwd_inner_microstep: 5708.22 | bwd_allreduce_microstep: 69.49 | step_microstep: 18.74 [2025-04-25 20:54:53,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.26 | bwd: 5777.77 | bwd_inner: 5708.22 | bwd_allreduce: 69.51 | step: 18.74 13%|█▎ | 5359/41250 [12:57:19<86:40:19, 8.69s/it] {'loss': 0.1017, 'grad_norm': 3.4346909523010254, 'learning_rate': 3.8962155907827695e-05, 'epoch': 1.3} 13%|█▎ | 5359/41250 [12:57:19<86:40:19, 8.69s/it][2025-04-25 20:55:02,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 20:55:02,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.28 | bwd_microstep: 5743.66 | bwd_inner_microstep: 5691.92 | bwd_allreduce_microstep: 51.69 | step_microstep: 18.71 [2025-04-25 20:55:02,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.28 | bwd: 5743.67 | bwd_inner: 5691.92 | bwd_allreduce: 51.71 | step: 18.71 13%|█▎ | 5360/41250 [12:57:27<86:42:13, 8.70s/it] {'loss': 0.0426, 'grad_norm': 0.8912521600723267, 'learning_rate': 3.89616565659399e-05, 'epoch': 1.3} 13%|█▎ | 5360/41250 [12:57:27<86:42:13, 8.70s/it][2025-04-25 20:55:11,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.05 [2025-04-25 20:55:11,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2900.73 | bwd_microstep: 5795.96 | bwd_inner_microstep: 5782.98 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.39 [2025-04-25 20:55:11,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2900.73 | bwd: 5795.98 | bwd_inner: 5782.98 | bwd_allreduce: 12.95 | step: 19.39 13%|█▎ | 5361/41250 [12:57:36<86:57:35, 8.72s/it] {'loss': 0.1213, 'grad_norm': 0.9914132952690125, 'learning_rate': 3.896115710715722e-05, 'epoch': 1.3} 13%|█▎ | 5361/41250 [12:57:36<86:57:35, 8.72s/it][2025-04-25 20:55:20,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 20:55:20,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.59 | bwd_microstep: 5907.32 | bwd_inner_microstep: 5688.38 | bwd_allreduce_microstep: 218.90 | step_microstep: 18.93 [2025-04-25 20:55:20,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.59 | bwd: 5907.34 | bwd_inner: 5688.38 | bwd_allreduce: 218.92 | step: 18.93 13%|█▎ | 5362/41250 [12:57:45<87:19:02, 8.76s/it] {'loss': 0.2273, 'grad_norm': 3.973499298095703, 'learning_rate': 3.896065753148274e-05, 'epoch': 1.3} 13%|█▎ | 5362/41250 [12:57:45<87:19:02, 8.76s/it][2025-04-25 20:55:28,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 20:55:28,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.34 | bwd_microstep: 5737.33 | bwd_inner_microstep: 5711.18 | bwd_allreduce_microstep: 26.10 | step_microstep: 19.11 [2025-04-25 20:55:28,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.34 | bwd: 5737.34 | bwd_inner: 5711.18 | bwd_allreduce: 26.12 | step: 19.11 13%|█▎ | 5363/41250 [12:57:54<87:03:30, 8.73s/it] 
{'loss': 0.1304, 'grad_norm': 2.1885979175567627, 'learning_rate': 3.896015783891954e-05, 'epoch': 1.3} 13%|█▎ | 5363/41250 [12:57:54<87:03:30, 8.73s/it][2025-04-25 20:55:37,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:55:37,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.42 | bwd_microstep: 5706.54 | bwd_inner_microstep: 5659.69 | bwd_allreduce_microstep: 46.80 | step_microstep: 18.91 [2025-04-25 20:55:37,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.42 | bwd: 5706.55 | bwd_inner: 5659.69 | bwd_allreduce: 46.82 | step: 18.92 13%|█▎ | 5364/41250 [12:58:02<86:44:11, 8.70s/it] {'loss': 0.1981, 'grad_norm': 3.0357449054718018, 'learning_rate': 3.8959658029470686e-05, 'epoch': 1.3} 13%|█▎ | 5364/41250 [12:58:02<86:44:11, 8.70s/it][2025-04-25 20:55:46,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.34 | optimizer_step: 1.06 [2025-04-25 20:55:46,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.03 | bwd_microstep: 5684.54 | bwd_inner_microstep: 5667.47 | bwd_allreduce_microstep: 17.01 | step_microstep: 19.96 [2025-04-25 20:55:46,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.03 | bwd: 5684.56 | bwd_inner: 5667.47 | bwd_allreduce: 17.04 | step: 19.96 13%|█▎ | 5365/41250 [12:58:11<86:25:49, 8.67s/it] {'loss': 0.1636, 'grad_norm': 2.1661264896392822, 'learning_rate': 3.8959158103139275e-05, 'epoch': 1.3} 13%|█▎ | 5365/41250 [12:58:11<86:25:49, 8.67s/it][2025-04-25 20:55:54,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.02 | optimizer_step: 0.94 [2025-04-25 20:55:54,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.76 | bwd_microstep: 5734.14 | bwd_inner_microstep: 5685.43 | bwd_allreduce_microstep: 48.67 | 
step_microstep: 19.11 [2025-04-25 20:55:54,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.76 | bwd: 5734.16 | bwd_inner: 5685.43 | bwd_allreduce: 48.69 | step: 19.11 13%|█▎ | 5366/41250 [12:58:20<86:24:51, 8.67s/it] {'loss': 0.128, 'grad_norm': 2.3526453971862793, 'learning_rate': 3.895865805992838e-05, 'epoch': 1.3} 13%|█▎ | 5366/41250 [12:58:20<86:24:51, 8.67s/it][2025-04-25 20:56:03,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 20:56:03,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.23 | bwd_microstep: 5746.71 | bwd_inner_microstep: 5694.90 | bwd_allreduce_microstep: 51.77 | step_microstep: 18.60 [2025-04-25 20:56:03,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.23 | bwd: 5746.72 | bwd_inner: 5694.90 | bwd_allreduce: 51.78 | step: 18.60 13%|█▎ | 5367/41250 [12:58:28<86:25:34, 8.67s/it] {'loss': 0.104, 'grad_norm': 1.4592487812042236, 'learning_rate': 3.895815789984109e-05, 'epoch': 1.3} 13%|█▎ | 5367/41250 [12:58:28<86:25:34, 8.67s/it][2025-04-25 20:56:12,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 20:56:12,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.22 | bwd_microstep: 5773.21 | bwd_inner_microstep: 5663.21 | bwd_allreduce_microstep: 109.95 | step_microstep: 18.72 [2025-04-25 20:56:12,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.22 | bwd: 5773.22 | bwd_inner: 5663.21 | bwd_allreduce: 109.96 | step: 18.73 13%|█▎ | 5368/41250 [12:58:37<86:28:39, 8.68s/it] {'loss': 0.0633, 'grad_norm': 0.9066182374954224, 'learning_rate': 3.895765762288049e-05, 'epoch': 1.3} 13%|█▎ | 5368/41250 [12:58:37<86:28:39, 8.68s/it][2025-04-25 20:56:20,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | 
optimizer_gradients: 1.26 | optimizer_step: 0.97 [2025-04-25 20:56:20,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.37 | bwd_microstep: 5711.07 | bwd_inner_microstep: 5651.15 | bwd_allreduce_microstep: 59.86 | step_microstep: 19.51 [2025-04-25 20:56:20,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.37 | bwd: 5711.08 | bwd_inner: 5651.15 | bwd_allreduce: 59.89 | step: 19.52 13%|█▎ | 5369/41250 [12:58:46<86:19:23, 8.66s/it] {'loss': 0.1105, 'grad_norm': 1.4671945571899414, 'learning_rate': 3.895715722904965e-05, 'epoch': 1.3} 13%|█▎ | 5369/41250 [12:58:46<86:19:23, 8.66s/it][2025-04-25 20:56:29,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 20:56:29,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.56 | bwd_microstep: 5750.63 | bwd_inner_microstep: 5678.77 | bwd_allreduce_microstep: 71.82 | step_microstep: 19.05 [2025-04-25 20:56:29,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.56 | bwd: 5750.64 | bwd_inner: 5678.77 | bwd_allreduce: 71.83 | step: 19.05 13%|█▎ | 5370/41250 [12:58:54<86:22:04, 8.67s/it] {'loss': 0.1065, 'grad_norm': 0.8624488711357117, 'learning_rate': 3.8956656718351674e-05, 'epoch': 1.3} 13%|█▎ | 5370/41250 [12:58:54<86:22:04, 8.67s/it][2025-04-25 20:56:38,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 20:56:38,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.62 | bwd_microstep: 5733.93 | bwd_inner_microstep: 5654.20 | bwd_allreduce_microstep: 79.69 | step_microstep: 18.44 [2025-04-25 20:56:38,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.62 | bwd: 5733.94 | bwd_inner: 5654.20 | bwd_allreduce: 79.70 | step: 18.44 13%|█▎ | 5371/41250 [12:59:03<86:18:07, 8.66s/it] {'loss': 0.2834, 'grad_norm': 
1.972223162651062, 'learning_rate': 3.895615609078963e-05, 'epoch': 1.3} 13%|█▎ | 5371/41250 [12:59:03<86:18:07, 8.66s/it][2025-04-25 20:56:46,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 20:56:46,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.06 | bwd_microstep: 5744.18 | bwd_inner_microstep: 5684.01 | bwd_allreduce_microstep: 60.12 | step_microstep: 18.77 [2025-04-25 20:56:46,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.06 | bwd: 5744.19 | bwd_inner: 5684.01 | bwd_allreduce: 60.14 | step: 18.77 13%|█▎ | 5372/41250 [12:59:12<86:20:05, 8.66s/it] {'loss': 0.139, 'grad_norm': 1.191310167312622, 'learning_rate': 3.895565534636661e-05, 'epoch': 1.3} 13%|█▎ | 5372/41250 [12:59:12<86:20:05, 8.66s/it][2025-04-25 20:56:55,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 20:56:55,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.43 | bwd_microstep: 5759.22 | bwd_inner_microstep: 5691.67 | bwd_allreduce_microstep: 67.51 | step_microstep: 18.80 [2025-04-25 20:56:55,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.43 | bwd: 5759.23 | bwd_inner: 5691.67 | bwd_allreduce: 67.52 | step: 18.80 13%|█▎ | 5373/41250 [12:59:20<86:24:38, 8.67s/it] {'loss': 0.0645, 'grad_norm': 1.540278434753418, 'learning_rate': 3.8955154485085716e-05, 'epoch': 1.3} 13%|█▎ | 5373/41250 [12:59:20<86:24:38, 8.67s/it][2025-04-25 20:57:04,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 20:57:04,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.76 | bwd_microstep: 5742.18 | bwd_inner_microstep: 5652.34 | bwd_allreduce_microstep: 89.79 | step_microstep: 18.71 [2025-04-25 
20:57:04,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.76 | bwd: 5742.19 | bwd_inner: 5652.34 | bwd_allreduce: 89.81 | step: 18.72 13%|█▎ | 5374/41250 [12:59:29<86:23:09, 8.67s/it] {'loss': 0.1971, 'grad_norm': 2.541630744934082, 'learning_rate': 3.895465350695001e-05, 'epoch': 1.3} 13%|█▎ | 5374/41250 [12:59:29<86:23:09, 8.67s/it][2025-04-25 20:57:12,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.08 | optimizer_step: 1.01 [2025-04-25 20:57:12,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.97 | bwd_microstep: 5746.00 | bwd_inner_microstep: 5689.61 | bwd_allreduce_microstep: 56.33 | step_microstep: 19.55 [2025-04-25 20:57:12,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.97 | bwd: 5746.02 | bwd_inner: 5689.61 | bwd_allreduce: 56.35 | step: 19.56 13%|█▎ | 5375/41250 [12:59:38<86:26:21, 8.67s/it] {'loss': 0.0428, 'grad_norm': 0.5607209205627441, 'learning_rate': 3.89541524119626e-05, 'epoch': 1.3} 13%|█▎ | 5375/41250 [12:59:38<86:26:21, 8.67s/it][2025-04-25 20:57:21,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:57:21,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.87 | bwd_microstep: 5723.34 | bwd_inner_microstep: 5696.10 | bwd_allreduce_microstep: 27.19 | step_microstep: 18.48 [2025-04-25 20:57:21,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.87 | bwd: 5723.35 | bwd_inner: 5696.10 | bwd_allreduce: 27.20 | step: 18.48 13%|█▎ | 5376/41250 [12:59:46<86:21:59, 8.67s/it] {'loss': 0.0822, 'grad_norm': 0.9110553860664368, 'learning_rate': 3.895365120012657e-05, 'epoch': 1.3} 13%|█▎ | 5376/41250 [12:59:46<86:21:59, 8.67s/it][2025-04-25 20:57:30,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.98 | optimizer_step: 0.89 
[2025-04-25 20:57:30,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.47 | bwd_microstep: 5743.96 | bwd_inner_microstep: 5703.94 | bwd_allreduce_microstep: 39.98 | step_microstep: 18.36 [2025-04-25 20:57:30,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.47 | bwd: 5743.97 | bwd_inner: 5703.94 | bwd_allreduce: 40.00 | step: 18.36 13%|█▎ | 5377/41250 [12:59:55<86:23:32, 8.67s/it] {'loss': 0.2316, 'grad_norm': 2.963923931121826, 'learning_rate': 3.8953149871445005e-05, 'epoch': 1.3} 13%|█▎ | 5377/41250 [12:59:55<86:23:32, 8.67s/it][2025-04-25 20:57:38,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.06 | optimizer_step: 0.94 [2025-04-25 20:57:38,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.26 | bwd_microstep: 5706.63 | bwd_inner_microstep: 5693.87 | bwd_allreduce_microstep: 12.70 | step_microstep: 19.25 [2025-04-25 20:57:38,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.26 | bwd: 5706.64 | bwd_inner: 5693.87 | bwd_allreduce: 12.73 | step: 19.25 13%|█▎ | 5378/41250 [13:00:04<86:20:02, 8.66s/it] {'loss': 0.126, 'grad_norm': 2.3201181888580322, 'learning_rate': 3.8952648425921e-05, 'epoch': 1.3} 13%|█▎ | 5378/41250 [13:00:04<86:20:02, 8.66s/it][2025-04-25 20:57:47,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 20:57:47,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.03 | bwd_microstep: 5692.90 | bwd_inner_microstep: 5680.17 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.74 [2025-04-25 20:57:47,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.03 | bwd: 5692.92 | bwd_inner: 5680.17 | bwd_allreduce: 12.70 | step: 18.75 13%|█▎ | 5379/41250 [13:00:12<86:15:16, 8.66s/it] {'loss': 0.229, 'grad_norm': 3.53867506980896, 'learning_rate': 3.895214686355764e-05, 
'epoch': 1.3} 13%|█▎ | 5379/41250 [13:00:12<86:15:16, 8.66s/it][2025-04-25 20:57:56,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-25 20:57:56,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.03 | bwd_microstep: 5756.11 | bwd_inner_microstep: 5652.98 | bwd_allreduce_microstep: 103.08 | step_microstep: 19.17 [2025-04-25 20:57:56,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.03 | bwd: 5756.13 | bwd_inner: 5652.98 | bwd_allreduce: 103.11 | step: 19.17 13%|█▎ | 5380/41250 [13:00:21<86:20:19, 8.67s/it] {'loss': 0.0486, 'grad_norm': 0.9903451204299927, 'learning_rate': 3.895164518435803e-05, 'epoch': 1.3} 13%|█▎ | 5380/41250 [13:00:21<86:20:19, 8.67s/it][2025-04-25 20:58:04,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:58:04,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.88 | bwd_microstep: 5708.82 | bwd_inner_microstep: 5696.01 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.56 [2025-04-25 20:58:04,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.88 | bwd: 5708.83 | bwd_inner: 5696.01 | bwd_allreduce: 12.78 | step: 18.57 13%|█▎ | 5381/41250 [13:00:30<86:17:14, 8.66s/it] {'loss': 0.0734, 'grad_norm': 1.1505106687545776, 'learning_rate': 3.895114338832525e-05, 'epoch': 1.3} 13%|█▎ | 5381/41250 [13:00:30<86:17:14, 8.66s/it][2025-04-25 20:58:13,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 20:58:13,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5716.16 | bwd_inner_microstep: 5703.25 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.76 [2025-04-25 20:58:13,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2853.45 | bwd: 5716.18 | bwd_inner: 5703.25 | bwd_allreduce: 12.87 | step: 18.76 13%|█▎ | 5382/41250 [13:00:38<86:16:18, 8.66s/it] {'loss': 0.1197, 'grad_norm': 1.0969042778015137, 'learning_rate': 3.89506414754624e-05, 'epoch': 1.3} 13%|█▎ | 5382/41250 [13:00:38<86:16:18, 8.66s/it][2025-04-25 20:58:22,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 20:58:22,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.22 | bwd_microstep: 5712.19 | bwd_inner_microstep: 5699.31 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.07 [2025-04-25 20:58:22,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.22 | bwd: 5712.21 | bwd_inner: 5699.31 | bwd_allreduce: 12.86 | step: 19.07 13%|█▎ | 5383/41250 [13:00:47<86:14:00, 8.66s/it] {'loss': 0.1696, 'grad_norm': 1.6733300685882568, 'learning_rate': 3.8950139445772574e-05, 'epoch': 1.3} 13%|█▎ | 5383/41250 [13:00:47<86:14:00, 8.66s/it][2025-04-25 20:58:30,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 20:58:30,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.33 | bwd_microstep: 5709.85 | bwd_inner_microstep: 5697.04 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.66 [2025-04-25 20:58:30,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.33 | bwd: 5709.86 | bwd_inner: 5697.04 | bwd_allreduce: 12.78 | step: 18.66 13%|█▎ | 5384/41250 [13:00:56<86:13:52, 8.66s/it] {'loss': 0.1339, 'grad_norm': 1.2872401475906372, 'learning_rate': 3.894963729925886e-05, 'epoch': 1.31} 13%|█▎ | 5384/41250 [13:00:56<86:13:52, 8.66s/it][2025-04-25 20:58:39,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.04 | optimizer_step: 0.93 [2025-04-25 20:58:39,337] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2829.74 | bwd_microstep: 5687.08 | bwd_inner_microstep: 5647.24 | bwd_allreduce_microstep: 39.79 | step_microstep: 18.88 [2025-04-25 20:58:39,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.74 | bwd: 5687.09 | bwd_inner: 5647.24 | bwd_allreduce: 39.81 | step: 18.88 13%|█▎ | 5385/41250 [13:01:04<86:04:43, 8.64s/it] {'loss': 0.1595, 'grad_norm': 1.3653745651245117, 'learning_rate': 3.894913503592437e-05, 'epoch': 1.31} 13%|█▎ | 5385/41250 [13:01:04<86:04:43, 8.64s/it][2025-04-25 20:58:47,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.08 | optimizer_step: 1.07 [2025-04-25 20:58:47,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.99 | bwd_microstep: 5700.74 | bwd_inner_microstep: 5659.27 | bwd_allreduce_microstep: 41.42 | step_microstep: 19.45 [2025-04-25 20:58:47,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.99 | bwd: 5700.76 | bwd_inner: 5659.27 | bwd_allreduce: 41.44 | step: 19.45 13%|█▎ | 5386/41250 [13:01:13<86:00:55, 8.63s/it] {'loss': 0.4091, 'grad_norm': 2.8910772800445557, 'learning_rate': 3.894863265577218e-05, 'epoch': 1.31} 13%|█▎ | 5386/41250 [13:01:13<86:00:55, 8.63s/it][2025-04-25 20:58:56,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 20:58:56,638] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.40 | bwd_microstep: 5769.28 | bwd_inner_microstep: 5657.30 | bwd_allreduce_microstep: 111.92 | step_microstep: 19.03 [2025-04-25 20:58:56,638] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.40 | bwd: 5769.29 | bwd_inner: 5657.30 | bwd_allreduce: 111.94 | step: 19.03 13%|█▎ | 5387/41250 [13:01:21<86:09:53, 8.65s/it] {'loss': 0.0717, 'grad_norm': 1.632311224937439, 'learning_rate': 3.89481301588054e-05, 'epoch': 1.31} 13%|█▎ | 5387/41250 [13:01:21<86:09:53, 
8.65s/it][2025-04-25 20:59:05,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.07 | optimizer_step: 1.05 [2025-04-25 20:59:05,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.50 | bwd_microstep: 5697.34 | bwd_inner_microstep: 5678.21 | bwd_allreduce_microstep: 19.08 | step_microstep: 19.73 [2025-04-25 20:59:05,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.49 | bwd: 5697.35 | bwd_inner: 5678.21 | bwd_allreduce: 19.10 | step: 19.73 13%|█▎ | 5388/41250 [13:01:30<86:05:34, 8.64s/it] {'loss': 0.0778, 'grad_norm': 2.2529516220092773, 'learning_rate': 3.8947627545027124e-05, 'epoch': 1.31} 13%|█▎ | 5388/41250 [13:01:30<86:05:34, 8.64s/it][2025-04-25 20:59:13,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 20:59:13,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.15 | bwd_microstep: 5752.96 | bwd_inner_microstep: 5701.09 | bwd_allreduce_microstep: 51.82 | step_microstep: 18.94 [2025-04-25 20:59:13,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.15 | bwd: 5752.98 | bwd_inner: 5701.09 | bwd_allreduce: 51.84 | step: 18.94 13%|█▎ | 5389/41250 [13:01:39<86:12:28, 8.65s/it] {'loss': 0.122, 'grad_norm': 2.237243890762329, 'learning_rate': 3.894712481444045e-05, 'epoch': 1.31} 13%|█▎ | 5389/41250 [13:01:39<86:12:28, 8.65s/it][2025-04-25 20:59:22,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 20:59:22,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.51 | bwd_microstep: 5715.25 | bwd_inner_microstep: 5660.77 | bwd_allreduce_microstep: 54.43 | step_microstep: 18.80 [2025-04-25 20:59:22,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.51 | bwd: 5715.26 | bwd_inner: 5660.77 | 
bwd_allreduce: 54.45 | step: 18.80 13%|█▎ | 5390/41250 [13:01:47<86:08:19, 8.65s/it] {'loss': 0.0719, 'grad_norm': 1.352735161781311, 'learning_rate': 3.894662196704848e-05, 'epoch': 1.31} 13%|█▎ | 5390/41250 [13:01:47<86:08:19, 8.65s/it][2025-04-25 20:59:31,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 20:59:31,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.61 | bwd_microstep: 5713.69 | bwd_inner_microstep: 5701.00 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.66 [2025-04-25 20:59:31,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.61 | bwd: 5713.70 | bwd_inner: 5701.00 | bwd_allreduce: 12.66 | step: 18.66 13%|█▎ | 5391/41250 [13:01:56<86:07:40, 8.65s/it] {'loss': 0.248, 'grad_norm': 2.4674599170684814, 'learning_rate': 3.894611900285431e-05, 'epoch': 1.31} 13%|█▎ | 5391/41250 [13:01:56<86:07:40, 8.65s/it][2025-04-25 20:59:39,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 20:59:39,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.79 | bwd_microstep: 5717.00 | bwd_inner_microstep: 5664.33 | bwd_allreduce_microstep: 52.62 | step_microstep: 18.60 [2025-04-25 20:59:39,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.79 | bwd: 5717.01 | bwd_inner: 5664.33 | bwd_allreduce: 52.64 | step: 18.61 13%|█▎ | 5392/41250 [13:02:05<86:05:02, 8.64s/it] {'loss': 0.108, 'grad_norm': 2.345466136932373, 'learning_rate': 3.894561592186105e-05, 'epoch': 1.31} 13%|█▎ | 5392/41250 [13:02:05<86:05:02, 8.64s/it][2025-04-25 20:59:48,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 20:59:48,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.28 | bwd_microstep: 
5763.08 | bwd_inner_microstep: 5652.58 | bwd_allreduce_microstep: 110.46 | step_microstep: 18.68 [2025-04-25 20:59:48,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.28 | bwd: 5763.09 | bwd_inner: 5652.58 | bwd_allreduce: 110.47 | step: 18.69 13%|█▎ | 5393/41250 [13:02:13<86:11:11, 8.65s/it] {'loss': 0.1283, 'grad_norm': 1.6437636613845825, 'learning_rate': 3.8945112724071784e-05, 'epoch': 1.31} 13%|█▎ | 5393/41250 [13:02:13<86:11:11, 8.65s/it][2025-04-25 20:59:57,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 1.18 [2025-04-25 20:59:57,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.66 | bwd_microstep: 5717.86 | bwd_inner_microstep: 5705.23 | bwd_allreduce_microstep: 12.59 | step_microstep: 19.65 [2025-04-25 20:59:57,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.66 | bwd: 5717.88 | bwd_inner: 5705.23 | bwd_allreduce: 12.61 | step: 19.65 13%|█▎ | 5394/41250 [13:02:22<86:12:07, 8.65s/it] {'loss': 0.0715, 'grad_norm': 1.340706467628479, 'learning_rate': 3.894460940948963e-05, 'epoch': 1.31} 13%|█▎ | 5394/41250 [13:02:22<86:12:07, 8.65s/it][2025-04-25 21:00:05,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 21:00:05,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.11 | bwd_microstep: 5777.81 | bwd_inner_microstep: 5699.36 | bwd_allreduce_microstep: 78.40 | step_microstep: 18.76 [2025-04-25 21:00:05,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.10 | bwd: 5777.82 | bwd_inner: 5699.36 | bwd_allreduce: 78.42 | step: 18.77 13%|█▎ | 5395/41250 [13:02:31<86:22:12, 8.67s/it] {'loss': 0.0379, 'grad_norm': 0.39968129992485046, 'learning_rate': 3.8944105978117685e-05, 'epoch': 1.31} 13%|█▎ | 5395/41250 [13:02:31<86:22:12, 8.67s/it][2025-04-25 21:00:14,588] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:00:14,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.88 | bwd_microstep: 5739.60 | bwd_inner_microstep: 5705.73 | bwd_allreduce_microstep: 33.83 | step_microstep: 18.77 [2025-04-25 21:00:14,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.88 | bwd: 5739.62 | bwd_inner: 5705.73 | bwd_allreduce: 33.85 | step: 18.77 13%|█▎ | 5396/41250 [13:02:39<86:23:29, 8.67s/it] {'loss': 0.157, 'grad_norm': 2.3863630294799805, 'learning_rate': 3.894360242995905e-05, 'epoch': 1.31} 13%|█▎ | 5396/41250 [13:02:39<86:23:29, 8.67s/it][2025-04-25 21:00:23,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:00:23,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.60 | bwd_microstep: 5715.90 | bwd_inner_microstep: 5656.18 | bwd_allreduce_microstep: 59.66 | step_microstep: 18.40 [2025-04-25 21:00:23,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.60 | bwd: 5715.91 | bwd_inner: 5656.18 | bwd_allreduce: 59.68 | step: 18.41 13%|█▎ | 5397/41250 [13:02:48<86:15:19, 8.66s/it] {'loss': 0.0399, 'grad_norm': 0.8622846603393555, 'learning_rate': 3.894309876501684e-05, 'epoch': 1.31} 13%|█▎ | 5397/41250 [13:02:48<86:15:19, 8.66s/it][2025-04-25 21:00:31,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-25 21:00:31,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.66 | bwd_microstep: 5758.85 | bwd_inner_microstep: 5702.12 | bwd_allreduce_microstep: 56.67 | step_microstep: 19.01 [2025-04-25 21:00:31,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.66 | bwd: 5758.86 | bwd_inner: 5702.12 | bwd_allreduce: 56.69 | step: 19.01 13%|█▎ | 
5398/41250 [13:02:57<86:21:58, 8.67s/it] {'loss': 0.2071, 'grad_norm': 3.9497554302215576, 'learning_rate': 3.894259498329415e-05, 'epoch': 1.31} 13%|█▎ | 5398/41250 [13:02:57<86:21:58, 8.67s/it][2025-04-25 21:00:40,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 21:00:40,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.27 | bwd_microstep: 5755.96 | bwd_inner_microstep: 5708.15 | bwd_allreduce_microstep: 47.75 | step_microstep: 19.09 [2025-04-25 21:00:40,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.27 | bwd: 5755.97 | bwd_inner: 5708.15 | bwd_allreduce: 47.77 | step: 19.09 13%|█▎ | 5399/41250 [13:03:05<86:25:49, 8.68s/it] {'loss': 0.1596, 'grad_norm': 2.5982367992401123, 'learning_rate': 3.894209108479408e-05, 'epoch': 1.31} 13%|█▎ | 5399/41250 [13:03:05<86:25:49, 8.68s/it][2025-04-25 21:00:49,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-25 21:00:49,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.56 | bwd_microstep: 5726.54 | bwd_inner_microstep: 5713.68 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.48 [2025-04-25 21:00:49,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.56 | bwd: 5726.55 | bwd_inner: 5713.68 | bwd_allreduce: 12.83 | step: 19.48 13%|█▎ | 5400/41250 [13:03:14<86:23:58, 8.68s/it] {'loss': 0.046, 'grad_norm': 0.7645642161369324, 'learning_rate': 3.8941587069519746e-05, 'epoch': 1.31} 13%|█▎ | 5400/41250 [13:03:14<86:23:58, 8.68s/it][2025-04-25 21:00:57,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.06 | optimizer_step: 0.94 [2025-04-25 21:00:57,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.75 | bwd_microstep: 5759.09 | bwd_inner_microstep: 5718.15 | 
bwd_allreduce_microstep: 40.90 | step_microstep: 18.72 [2025-04-25 21:00:57,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.75 | bwd: 5759.11 | bwd_inner: 5718.15 | bwd_allreduce: 40.92 | step: 18.72 13%|█▎ | 5401/41250 [13:03:23<86:27:39, 8.68s/it] {'loss': 0.1107, 'grad_norm': 5.893352508544922, 'learning_rate': 3.8941082937474256e-05, 'epoch': 1.31} 13%|█▎ | 5401/41250 [13:03:23<86:27:39, 8.68s/it][2025-04-25 21:01:06,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:01:06,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.12 | bwd_microstep: 5773.61 | bwd_inner_microstep: 5695.00 | bwd_allreduce_microstep: 78.57 | step_microstep: 18.76 [2025-04-25 21:01:06,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.12 | bwd: 5773.63 | bwd_inner: 5695.00 | bwd_allreduce: 78.59 | step: 18.77 13%|█▎ | 5402/41250 [13:03:32<86:31:54, 8.69s/it] {'loss': 0.0988, 'grad_norm': 1.104246973991394, 'learning_rate': 3.894057868866071e-05, 'epoch': 1.31} 13%|█▎ | 5402/41250 [13:03:32<86:31:54, 8.69s/it][2025-04-25 21:01:15,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:01:15,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.13 | bwd_microstep: 5736.29 | bwd_inner_microstep: 5723.87 | bwd_allreduce_microstep: 12.37 | step_microstep: 18.91 [2025-04-25 21:01:15,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.13 | bwd: 5736.30 | bwd_inner: 5723.87 | bwd_allreduce: 12.38 | step: 18.91 13%|█▎ | 5403/41250 [13:03:40<86:28:53, 8.69s/it] {'loss': 0.1091, 'grad_norm': 4.1121625900268555, 'learning_rate': 3.8940074323082224e-05, 'epoch': 1.31} 13%|█▎ | 5403/41250 [13:03:40<86:28:53, 8.69s/it][2025-04-25 21:01:24,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.32 | optimizer_gradients: 1.07 | optimizer_step: 1.02 [2025-04-25 21:01:24,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.32 | bwd_microstep: 5779.00 | bwd_inner_microstep: 5726.32 | bwd_allreduce_microstep: 52.63 | step_microstep: 19.15 [2025-04-25 21:01:24,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.32 | bwd: 5779.02 | bwd_inner: 5726.32 | bwd_allreduce: 52.66 | step: 19.15 13%|█▎ | 5404/41250 [13:03:49<86:34:58, 8.70s/it] {'loss': 0.268, 'grad_norm': 1.5505865812301636, 'learning_rate': 3.8939569840741916e-05, 'epoch': 1.31} 13%|█▎ | 5404/41250 [13:03:49<86:34:58, 8.70s/it][2025-04-25 21:01:32,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.08 | optimizer_step: 1.02 [2025-04-25 21:01:32,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.09 | bwd_microstep: 5714.11 | bwd_inner_microstep: 5701.68 | bwd_allreduce_microstep: 12.38 | step_microstep: 19.66 [2025-04-25 21:01:32,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.09 | bwd: 5714.12 | bwd_inner: 5701.68 | bwd_allreduce: 12.40 | step: 19.66 13%|█▎ | 5405/41250 [13:03:58<86:28:47, 8.69s/it] {'loss': 0.2592, 'grad_norm': 4.284611225128174, 'learning_rate': 3.893906524164288e-05, 'epoch': 1.31} 13%|█▎ | 5405/41250 [13:03:58<86:28:47, 8.69s/it][2025-04-25 21:01:41,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.58 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 21:01:41,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.38 | bwd_microstep: 5706.51 | bwd_inner_microstep: 5693.49 | bwd_allreduce_microstep: 12.97 | step_microstep: 19.21 [2025-04-25 21:01:41,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.38 | bwd: 5706.53 | bwd_inner: 5693.49 | bwd_allreduce: 12.99 | step: 19.22 13%|█▎ | 5406/41250 [13:04:06<86:21:33, 8.67s/it] 
{'loss': 0.1396, 'grad_norm': 1.4438618421554565, 'learning_rate': 3.893856052578822e-05, 'epoch': 1.31} 13%|█▎ | 5406/41250 [13:04:06<86:21:33, 8.67s/it][2025-04-25 21:01:50,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 21:01:50,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.91 | bwd_microstep: 5729.91 | bwd_inner_microstep: 5663.47 | bwd_allreduce_microstep: 66.40 | step_microstep: 18.92 [2025-04-25 21:01:50,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.91 | bwd: 5729.93 | bwd_inner: 5663.47 | bwd_allreduce: 66.42 | step: 18.93 13%|█▎ | 5407/41250 [13:04:15<86:17:01, 8.67s/it] {'loss': 0.0766, 'grad_norm': 0.9799110293388367, 'learning_rate': 3.893805569318107e-05, 'epoch': 1.31} 13%|█▎ | 5407/41250 [13:04:15<86:17:01, 8.67s/it][2025-04-25 21:01:58,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:01:58,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.32 | bwd_microstep: 5772.15 | bwd_inner_microstep: 5657.07 | bwd_allreduce_microstep: 115.03 | step_microstep: 18.82 [2025-04-25 21:01:58,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.32 | bwd: 5772.16 | bwd_inner: 5657.07 | bwd_allreduce: 115.05 | step: 18.82 13%|█▎ | 5408/41250 [13:04:24<86:20:21, 8.67s/it] {'loss': 0.2607, 'grad_norm': 2.6129093170166016, 'learning_rate': 3.893755074382453e-05, 'epoch': 1.31} 13%|█▎ | 5408/41250 [13:04:24<86:20:21, 8.67s/it][2025-04-25 21:02:07,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:02:07,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.94 | bwd_microstep: 5768.68 | bwd_inner_microstep: 5709.87 | bwd_allreduce_microstep: 58.75 | 
step_microstep: 18.84 [2025-04-25 21:02:07,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.94 | bwd: 5768.69 | bwd_inner: 5709.87 | bwd_allreduce: 58.78 | step: 18.84 13%|█▎ | 5409/41250 [13:04:32<86:27:28, 8.68s/it] {'loss': 0.2248, 'grad_norm': 1.9308488368988037, 'learning_rate': 3.893704567772171e-05, 'epoch': 1.31} 13%|█▎ | 5409/41250 [13:04:32<86:27:28, 8.68s/it][2025-04-25 21:02:16,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:02:16,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.98 | bwd_microstep: 5725.19 | bwd_inner_microstep: 5658.50 | bwd_allreduce_microstep: 66.64 | step_microstep: 18.90 [2025-04-25 21:02:16,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.98 | bwd: 5725.21 | bwd_inner: 5658.50 | bwd_allreduce: 66.66 | step: 18.90 13%|█▎ | 5410/41250 [13:04:41<86:18:41, 8.67s/it] {'loss': 0.0797, 'grad_norm': 1.3268511295318604, 'learning_rate': 3.893654049487574e-05, 'epoch': 1.31} 13%|█▎ | 5410/41250 [13:04:41<86:18:41, 8.67s/it][2025-04-25 21:02:24,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.24 | optimizer_step: 1.08 [2025-04-25 21:02:24,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.74 | bwd_microstep: 5905.29 | bwd_inner_microstep: 5695.86 | bwd_allreduce_microstep: 209.38 | step_microstep: 19.90 [2025-04-25 21:02:24,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.74 | bwd: 5905.32 | bwd_inner: 5695.86 | bwd_allreduce: 209.40 | step: 19.90 13%|█▎ | 5411/41250 [13:04:50<86:49:35, 8.72s/it] {'loss': 0.2004, 'grad_norm': 1.7883166074752808, 'learning_rate': 3.893603519528971e-05, 'epoch': 1.31} 13%|█▎ | 5411/41250 [13:04:50<86:49:35, 8.72s/it][2025-04-25 21:02:33,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 21:02:33,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.89 | bwd_microstep: 5795.18 | bwd_inner_microstep: 5648.38 | bwd_allreduce_microstep: 146.76 | step_microstep: 19.13 [2025-04-25 21:02:33,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.89 | bwd: 5795.20 | bwd_inner: 5648.38 | bwd_allreduce: 146.78 | step: 19.13 13%|█▎ | 5412/41250 [13:04:58<86:46:13, 8.72s/it] {'loss': 0.046, 'grad_norm': 1.0099287033081055, 'learning_rate': 3.8935529778966755e-05, 'epoch': 1.31} 13%|█▎ | 5412/41250 [13:04:58<86:46:13, 8.72s/it][2025-04-25 21:02:42,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 21:02:42,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.72 | bwd_microstep: 5703.49 | bwd_inner_microstep: 5689.73 | bwd_allreduce_microstep: 13.71 | step_microstep: 18.78 [2025-04-25 21:02:42,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.72 | bwd: 5703.50 | bwd_inner: 5689.73 | bwd_allreduce: 13.72 | step: 18.79 13%|█▎ | 5413/41250 [13:05:07<86:30:46, 8.69s/it] {'loss': 0.0797, 'grad_norm': 1.1743543148040771, 'learning_rate': 3.893502424590998e-05, 'epoch': 1.31} 13%|█▎ | 5413/41250 [13:05:07<86:30:46, 8.69s/it][2025-04-25 21:02:50,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:02:50,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.27 | bwd_microstep: 5784.66 | bwd_inner_microstep: 5653.67 | bwd_allreduce_microstep: 130.95 | step_microstep: 18.85 [2025-04-25 21:02:50,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.27 | bwd: 5784.68 | bwd_inner: 5653.67 | bwd_allreduce: 130.97 | step: 18.85 13%|█▎ | 5414/41250 [13:05:16<86:32:43, 8.69s/it] {'loss': 0.1451, 
'grad_norm': 3.203889846801758, 'learning_rate': 3.8934518596122504e-05, 'epoch': 1.31} 13%|█▎ | 5414/41250 [13:05:16<86:32:43, 8.69s/it][2025-04-25 21:02:59,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.95 | optimizer_step: 1.06 [2025-04-25 21:02:59,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.55 | bwd_microstep: 5796.61 | bwd_inner_microstep: 5645.29 | bwd_allreduce_microstep: 151.27 | step_microstep: 18.41 [2025-04-25 21:02:59,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.55 | bwd: 5796.62 | bwd_inner: 5645.29 | bwd_allreduce: 151.29 | step: 18.41 13%|█▎ | 5415/41250 [13:05:24<86:33:59, 8.70s/it] {'loss': 0.0809, 'grad_norm': 1.5897918939590454, 'learning_rate': 3.893401282960745e-05, 'epoch': 1.31} 13%|█▎ | 5415/41250 [13:05:24<86:33:59, 8.70s/it][2025-04-25 21:03:08,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 21:03:08,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.76 | bwd_microstep: 5702.56 | bwd_inner_microstep: 5689.94 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.61 [2025-04-25 21:03:08,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.76 | bwd: 5702.57 | bwd_inner: 5689.94 | bwd_allreduce: 12.59 | step: 18.62 13%|█▎ | 5416/41250 [13:05:33<86:21:22, 8.68s/it] {'loss': 0.0206, 'grad_norm': 0.2628147602081299, 'learning_rate': 3.893350694636792e-05, 'epoch': 1.31} 13%|█▎ | 5416/41250 [13:05:33<86:21:22, 8.68s/it][2025-04-25 21:03:16,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 21:03:16,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.81 | bwd_microstep: 5729.70 | bwd_inner_microstep: 5702.48 | bwd_allreduce_microstep: 27.16 | step_microstep: 18.90 
[2025-04-25 21:03:16,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.81 | bwd: 5729.71 | bwd_inner: 5702.48 | bwd_allreduce: 27.18 | step: 18.91 13%|█▎ | 5417/41250 [13:05:42<86:18:52, 8.67s/it] {'loss': 0.0899, 'grad_norm': 3.6997830867767334, 'learning_rate': 3.893300094640705e-05, 'epoch': 1.31} 13%|█▎ | 5417/41250 [13:05:42<86:18:52, 8.67s/it][2025-04-25 21:03:25,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.02 | optimizer_step: 1.11 [2025-04-25 21:03:25,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.58 | bwd_microstep: 5711.25 | bwd_inner_microstep: 5698.07 | bwd_allreduce_microstep: 13.14 | step_microstep: 19.08 [2025-04-25 21:03:25,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.58 | bwd: 5711.27 | bwd_inner: 5698.07 | bwd_allreduce: 13.16 | step: 19.08 13%|█▎ | 5418/41250 [13:05:50<86:14:01, 8.66s/it] {'loss': 0.2456, 'grad_norm': 3.1421515941619873, 'learning_rate': 3.893249482972796e-05, 'epoch': 1.31} 13%|█▎ | 5418/41250 [13:05:50<86:14:01, 8.66s/it][2025-04-25 21:03:34,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 21:03:34,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.73 | bwd_microstep: 5727.48 | bwd_inner_microstep: 5705.54 | bwd_allreduce_microstep: 21.89 | step_microstep: 18.64 [2025-04-25 21:03:34,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.73 | bwd: 5727.49 | bwd_inner: 5705.54 | bwd_allreduce: 21.91 | step: 18.64 13%|█▎ | 5419/41250 [13:05:59<86:13:33, 8.66s/it] {'loss': 0.0283, 'grad_norm': 0.6052314043045044, 'learning_rate': 3.893198859633375e-05, 'epoch': 1.31} 13%|█▎ | 5419/41250 [13:05:59<86:13:33, 8.66s/it][2025-04-25 21:03:42,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | 
optimizer_step: 1.04 [2025-04-25 21:03:42,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2879.37 | bwd_microstep: 5767.96 | bwd_inner_microstep: 5755.23 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.76 [2025-04-25 21:03:42,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2879.37 | bwd: 5767.98 | bwd_inner: 5755.23 | bwd_allreduce: 12.70 | step: 18.77 13%|█▎ | 5420/41250 [13:06:08<86:25:28, 8.68s/it] {'loss': 0.0813, 'grad_norm': 2.1148884296417236, 'learning_rate': 3.893148224622757e-05, 'epoch': 1.31} 13%|█▎ | 5420/41250 [13:06:08<86:25:28, 8.68s/it][2025-04-25 21:03:51,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 21:03:51,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.04 | bwd_microstep: 5706.03 | bwd_inner_microstep: 5642.42 | bwd_allreduce_microstep: 63.56 | step_microstep: 18.81 [2025-04-25 21:03:51,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.04 | bwd: 5706.05 | bwd_inner: 5642.42 | bwd_allreduce: 63.58 | step: 18.81 13%|█▎ | 5421/41250 [13:06:16<86:11:44, 8.66s/it] {'loss': 0.1193, 'grad_norm': 4.322851181030273, 'learning_rate': 3.8930975779412513e-05, 'epoch': 1.31} 13%|█▎ | 5421/41250 [13:06:16<86:11:44, 8.66s/it][2025-04-25 21:04:00,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:04:00,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.08 | bwd_microstep: 5708.00 | bwd_inner_microstep: 5695.23 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.83 [2025-04-25 21:04:00,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.08 | bwd: 5708.01 | bwd_inner: 5695.23 | bwd_allreduce: 12.73 | step: 18.83 13%|█▎ | 5422/41250 [13:06:25<86:10:04, 8.66s/it] {'loss': 0.2472, 'grad_norm': 3.3339078426361084, 
'learning_rate': 3.8930469195891715e-05, 'epoch': 1.31} 13%|█▎ | 5422/41250 [13:06:25<86:10:04, 8.66s/it][2025-04-25 21:04:08,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 21:04:08,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.90 | bwd_microstep: 5717.80 | bwd_inner_microstep: 5648.63 | bwd_allreduce_microstep: 69.12 | step_microstep: 19.28 [2025-04-25 21:04:08,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.90 | bwd: 5717.81 | bwd_inner: 5648.63 | bwd_allreduce: 69.14 | step: 19.28 13%|█▎ | 5423/41250 [13:06:34<86:03:10, 8.65s/it] {'loss': 0.03, 'grad_norm': 0.7398542761802673, 'learning_rate': 3.892996249566831e-05, 'epoch': 1.31} 13%|█▎ | 5423/41250 [13:06:34<86:03:10, 8.65s/it][2025-04-25 21:04:17,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:04:17,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.17 | bwd_microstep: 5729.73 | bwd_inner_microstep: 5685.15 | bwd_allreduce_microstep: 44.53 | step_microstep: 18.68 [2025-04-25 21:04:17,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.18 | bwd: 5729.74 | bwd_inner: 5685.15 | bwd_allreduce: 44.55 | step: 18.67 13%|█▎ | 5424/41250 [13:06:42<86:04:58, 8.65s/it] {'loss': 0.1605, 'grad_norm': 1.0949814319610596, 'learning_rate': 3.8929455678745394e-05, 'epoch': 1.31} 13%|█▎ | 5424/41250 [13:06:42<86:04:58, 8.65s/it][2025-04-25 21:04:26,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:04:26,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.08 | bwd_microstep: 5842.04 | bwd_inner_microstep: 5700.39 | bwd_allreduce_microstep: 141.61 | step_microstep: 18.51 [2025-04-25 21:04:26,291] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.08 | bwd: 5842.06 | bwd_inner: 5700.39 | bwd_allreduce: 141.63 | step: 18.51 13%|█▎ | 5425/41250 [13:06:51<86:26:57, 8.69s/it] {'loss': 0.0875, 'grad_norm': 1.1240589618682861, 'learning_rate': 3.892894874512611e-05, 'epoch': 1.32} 13%|█▎ | 5425/41250 [13:06:51<86:26:57, 8.69s/it][2025-04-25 21:04:34,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.10 | optimizer_step: 0.97 [2025-04-25 21:04:34,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.99 | bwd_microstep: 5774.68 | bwd_inner_microstep: 5633.38 | bwd_allreduce_microstep: 141.25 | step_microstep: 18.98 [2025-04-25 21:04:34,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.99 | bwd: 5774.69 | bwd_inner: 5633.38 | bwd_allreduce: 141.27 | step: 18.98 13%|█▎ | 5426/41250 [13:07:00<86:25:05, 8.68s/it] {'loss': 0.2893, 'grad_norm': 4.271388053894043, 'learning_rate': 3.892844169481359e-05, 'epoch': 1.32} 13%|█▎ | 5426/41250 [13:07:00<86:25:05, 8.68s/it][2025-04-25 21:04:43,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:04:43,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.09 | bwd_microstep: 5889.24 | bwd_inner_microstep: 5641.32 | bwd_allreduce_microstep: 247.87 | step_microstep: 18.81 [2025-04-25 21:04:43,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.09 | bwd: 5889.25 | bwd_inner: 5641.32 | bwd_allreduce: 247.89 | step: 18.81 13%|█▎ | 5427/41250 [13:07:09<86:45:24, 8.72s/it] {'loss': 0.0395, 'grad_norm': 1.9438347816467285, 'learning_rate': 3.892793452781094e-05, 'epoch': 1.32} 13%|█▎ | 5427/41250 [13:07:09<86:45:24, 8.72s/it][2025-04-25 21:04:52,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 
21:04:52,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.18 | bwd_microstep: 5738.82 | bwd_inner_microstep: 5682.34 | bwd_allreduce_microstep: 56.44 | step_microstep: 18.45 [2025-04-25 21:04:52,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.18 | bwd: 5738.84 | bwd_inner: 5682.34 | bwd_allreduce: 56.46 | step: 18.45 13%|█▎ | 5428/41250 [13:07:17<86:36:11, 8.70s/it] {'loss': 0.0405, 'grad_norm': 0.7490013241767883, 'learning_rate': 3.8927427244121295e-05, 'epoch': 1.32} 13%|█▎ | 5428/41250 [13:07:17<86:36:11, 8.70s/it][2025-04-25 21:05:01,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:05:01,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.68 | bwd_microstep: 5703.04 | bwd_inner_microstep: 5690.14 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.49 [2025-04-25 21:05:01,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.68 | bwd: 5703.05 | bwd_inner: 5690.14 | bwd_allreduce: 12.87 | step: 18.49 13%|█▎ | 5429/41250 [13:07:26<86:23:32, 8.68s/it] {'loss': 0.0548, 'grad_norm': 1.8502411842346191, 'learning_rate': 3.8926919843747795e-05, 'epoch': 1.32} 13%|█▎ | 5429/41250 [13:07:26<86:23:32, 8.68s/it][2025-04-25 21:05:09,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:05:09,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.36 | bwd_microstep: 5704.76 | bwd_inner_microstep: 5692.05 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.83 [2025-04-25 21:05:09,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.36 | bwd: 5704.77 | bwd_inner: 5692.05 | bwd_allreduce: 12.68 | step: 18.83 13%|█▎ | 5430/41250 [13:07:35<86:14:06, 8.67s/it] {'loss': 0.1481, 'grad_norm': 2.1636273860931396, 'learning_rate': 3.892641232669355e-05, 
'epoch': 1.32} 13%|█▎ | 5430/41250 [13:07:35<86:14:06, 8.67s/it][2025-04-25 21:05:18,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 21:05:18,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.50 | bwd_microstep: 5780.68 | bwd_inner_microstep: 5646.65 | bwd_allreduce_microstep: 133.98 | step_microstep: 19.10 [2025-04-25 21:05:18,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.50 | bwd: 5780.70 | bwd_inner: 5646.65 | bwd_allreduce: 134.00 | step: 19.11 13%|█▎ | 5431/41250 [13:07:43<86:17:46, 8.67s/it] {'loss': 0.0907, 'grad_norm': 2.026061773300171, 'learning_rate': 3.8925904692961697e-05, 'epoch': 1.32} 13%|█▎ | 5431/41250 [13:07:43<86:17:46, 8.67s/it][2025-04-25 21:05:27,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 0.96 [2025-04-25 21:05:27,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.13 | bwd_microstep: 5850.50 | bwd_inner_microstep: 5650.85 | bwd_allreduce_microstep: 199.60 | step_microstep: 19.08 [2025-04-25 21:05:27,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.13 | bwd: 5850.51 | bwd_inner: 5650.85 | bwd_allreduce: 199.62 | step: 19.08 13%|█▎ | 5432/41250 [13:07:52<86:35:13, 8.70s/it] {'loss': 0.0375, 'grad_norm': 0.48664146661758423, 'learning_rate': 3.892539694255536e-05, 'epoch': 1.32} 13%|█▎ | 5432/41250 [13:07:52<86:35:13, 8.70s/it][2025-04-25 21:05:35,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 21:05:35,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.09 | bwd_microstep: 5709.23 | bwd_inner_microstep: 5684.54 | bwd_allreduce_microstep: 24.64 | step_microstep: 18.87 [2025-04-25 21:05:35,807] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd: 2853.09 | bwd: 5709.24 | bwd_inner: 5684.54 | bwd_allreduce: 24.66 | step: 18.87 13%|█▎ | 5433/41250 [13:08:01<86:25:17, 8.69s/it] {'loss': 0.0341, 'grad_norm': 0.8262473940849304, 'learning_rate': 3.892488907547768e-05, 'epoch': 1.32} 13%|█▎ | 5433/41250 [13:08:01<86:25:17, 8.69s/it][2025-04-25 21:05:44,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:05:44,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.14 | bwd_microstep: 5763.22 | bwd_inner_microstep: 5643.76 | bwd_allreduce_microstep: 119.42 | step_microstep: 18.60 [2025-04-25 21:05:44,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.14 | bwd: 5763.23 | bwd_inner: 5643.76 | bwd_allreduce: 119.43 | step: 18.60 13%|█▎ | 5434/41250 [13:08:09<86:22:33, 8.68s/it] {'loss': 0.2434, 'grad_norm': 3.668747901916504, 'learning_rate': 3.8924381091731775e-05, 'epoch': 1.32} 13%|█▎ | 5434/41250 [13:08:09<86:22:33, 8.68s/it][2025-04-25 21:05:53,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:05:53,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.91 | bwd_microstep: 5679.54 | bwd_inner_microstep: 5643.87 | bwd_allreduce_microstep: 35.63 | step_microstep: 18.74 [2025-04-25 21:05:53,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.91 | bwd: 5679.56 | bwd_inner: 5643.87 | bwd_allreduce: 35.65 | step: 18.74 13%|█▎ | 5435/41250 [13:08:18<86:07:28, 8.66s/it] {'loss': 0.3185, 'grad_norm': 5.491805553436279, 'learning_rate': 3.892387299132079e-05, 'epoch': 1.32} 13%|█▎ | 5435/41250 [13:08:18<86:07:28, 8.66s/it][2025-04-25 21:06:01,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:06:01,770] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.23 | bwd_microstep: 5751.01 | bwd_inner_microstep: 5682.65 | bwd_allreduce_microstep: 68.31 | step_microstep: 18.53 [2025-04-25 21:06:01,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.23 | bwd: 5751.03 | bwd_inner: 5682.65 | bwd_allreduce: 68.33 | step: 18.53 13%|█▎ | 5436/41250 [13:08:27<86:14:19, 8.67s/it] {'loss': 0.3172, 'grad_norm': 2.7136361598968506, 'learning_rate': 3.892336477424784e-05, 'epoch': 1.32} 13%|█▎ | 5436/41250 [13:08:27<86:14:19, 8.67s/it][2025-04-25 21:06:10,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 21:06:10,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.18 | bwd_microstep: 5727.73 | bwd_inner_microstep: 5686.72 | bwd_allreduce_microstep: 40.96 | step_microstep: 18.96 [2025-04-25 21:06:10,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.18 | bwd: 5727.74 | bwd_inner: 5686.72 | bwd_allreduce: 40.98 | step: 18.96 13%|█▎ | 5437/41250 [13:08:35<86:13:18, 8.67s/it] {'loss': 0.0737, 'grad_norm': 1.2436856031417847, 'learning_rate': 3.892285644051607e-05, 'epoch': 1.32} 13%|█▎ | 5437/41250 [13:08:35<86:13:18, 8.67s/it][2025-04-25 21:06:19,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 21:06:19,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.73 | bwd_microstep: 5694.78 | bwd_inner_microstep: 5647.73 | bwd_allreduce_microstep: 47.01 | step_microstep: 18.68 [2025-04-25 21:06:19,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.73 | bwd: 5694.79 | bwd_inner: 5647.73 | bwd_allreduce: 47.02 | step: 18.68 13%|█▎ | 5438/41250 [13:08:44<86:00:56, 8.65s/it] {'loss': 0.1401, 'grad_norm': 3.042322874069214, 'learning_rate': 3.892234799012862e-05, 'epoch': 1.32} 13%|█▎ | 
5438/41250 [13:08:44<86:00:56, 8.65s/it][2025-04-25 21:06:27,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:06:27,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2878.48 | bwd_microstep: 5771.10 | bwd_inner_microstep: 5758.27 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.62 [2025-04-25 21:06:27,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2878.48 | bwd: 5771.11 | bwd_inner: 5758.27 | bwd_allreduce: 12.79 | step: 18.62 13%|█▎ | 5439/41250 [13:08:53<86:16:53, 8.67s/it] {'loss': 0.1019, 'grad_norm': 2.8826181888580322, 'learning_rate': 3.892183942308862e-05, 'epoch': 1.32} 13%|█▎ | 5439/41250 [13:08:53<86:16:53, 8.67s/it][2025-04-25 21:06:36,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:06:36,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.12 | bwd_microstep: 5724.41 | bwd_inner_microstep: 5711.62 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.85 [2025-04-25 21:06:36,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.12 | bwd: 5724.43 | bwd_inner: 5711.62 | bwd_allreduce: 12.76 | step: 18.86 13%|█▎ | 5440/41250 [13:09:01<86:16:26, 8.67s/it] {'loss': 0.0624, 'grad_norm': 1.2844380140304565, 'learning_rate': 3.892133073939919e-05, 'epoch': 1.32} 13%|█▎ | 5440/41250 [13:09:01<86:16:26, 8.67s/it][2025-04-25 21:06:45,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:06:45,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.10 | bwd_microstep: 5780.51 | bwd_inner_microstep: 5665.29 | bwd_allreduce_microstep: 115.17 | step_microstep: 18.67 [2025-04-25 21:06:45,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.10 | bwd: 
5780.52 | bwd_inner: 5665.29 | bwd_allreduce: 115.18 | step: 18.67 13%|█▎ | 5441/41250 [13:09:10<86:24:06, 8.69s/it] {'loss': 0.0557, 'grad_norm': 0.9871096014976501, 'learning_rate': 3.8920821939063476e-05, 'epoch': 1.32} 13%|█▎ | 5441/41250 [13:09:10<86:24:06, 8.69s/it][2025-04-25 21:06:53,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.13 [2025-04-25 21:06:53,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.21 | bwd_microstep: 5702.17 | bwd_inner_microstep: 5650.24 | bwd_allreduce_microstep: 51.89 | step_microstep: 19.16 [2025-04-25 21:06:53,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.21 | bwd: 5702.19 | bwd_inner: 5650.24 | bwd_allreduce: 51.91 | step: 19.16 13%|█▎ | 5442/41250 [13:09:19<86:10:13, 8.66s/it] {'loss': 0.1111, 'grad_norm': 3.059678554534912, 'learning_rate': 3.8920313022084625e-05, 'epoch': 1.32} 13%|█▎ | 5442/41250 [13:09:19<86:10:13, 8.66s/it][2025-04-25 21:07:02,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:07:02,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.42 | bwd_microstep: 6024.77 | bwd_inner_microstep: 5661.50 | bwd_allreduce_microstep: 363.22 | step_microstep: 18.47 [2025-04-25 21:07:02,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.42 | bwd: 6024.78 | bwd_inner: 5661.50 | bwd_allreduce: 363.24 | step: 18.48 13%|█▎ | 5443/41250 [13:09:28<86:59:19, 8.75s/it] {'loss': 0.0967, 'grad_norm': 1.5255939960479736, 'learning_rate': 3.891980398846576e-05, 'epoch': 1.32} 13%|█▎ | 5443/41250 [13:09:28<86:59:19, 8.75s/it][2025-04-25 21:07:11,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:07:11,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2835.85 | bwd_microstep: 5763.02 | bwd_inner_microstep: 5660.96 | bwd_allreduce_microstep: 102.01 | step_microstep: 18.83 [2025-04-25 21:07:11,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.85 | bwd: 5763.04 | bwd_inner: 5660.96 | bwd_allreduce: 102.03 | step: 18.83 13%|█▎ | 5444/41250 [13:09:36<86:48:40, 8.73s/it] {'loss': 0.2959, 'grad_norm': 2.8076705932617188, 'learning_rate': 3.891929483821003e-05, 'epoch': 1.32} 13%|█▎ | 5444/41250 [13:09:36<86:48:40, 8.73s/it][2025-04-25 21:07:20,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:07:20,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.69 | bwd_microstep: 5717.94 | bwd_inner_microstep: 5705.14 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.46 [2025-04-25 21:07:20,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.69 | bwd: 5717.95 | bwd_inner: 5705.14 | bwd_allreduce: 12.78 | step: 18.46 13%|█▎ | 5445/41250 [13:09:45<86:36:30, 8.71s/it] {'loss': 0.1582, 'grad_norm': 2.3650522232055664, 'learning_rate': 3.891878557132057e-05, 'epoch': 1.32} 13%|█▎ | 5445/41250 [13:09:45<86:36:30, 8.71s/it][2025-04-25 21:07:28,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 21:07:28,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.85 | bwd_microstep: 5775.77 | bwd_inner_microstep: 5670.67 | bwd_allreduce_microstep: 105.05 | step_microstep: 18.52 [2025-04-25 21:07:28,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.85 | bwd: 5775.78 | bwd_inner: 5670.67 | bwd_allreduce: 105.07 | step: 18.52 13%|█▎ | 5446/41250 [13:09:54<86:33:16, 8.70s/it] {'loss': 0.1108, 'grad_norm': 7.728677272796631, 'learning_rate': 3.891827618780051e-05, 'epoch': 1.32} 13%|█▎ | 5446/41250 [13:09:54<86:33:16, 
8.70s/it][2025-04-25 21:07:37,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-25 21:07:37,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.87 | bwd_microstep: 5748.35 | bwd_inner_microstep: 5703.14 | bwd_allreduce_microstep: 45.17 | step_microstep: 19.31 [2025-04-25 21:07:37,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.87 | bwd: 5748.37 | bwd_inner: 5703.14 | bwd_allreduce: 45.19 | step: 19.32 13%|█▎ | 5447/41250 [13:10:02<86:30:52, 8.70s/it] {'loss': 0.0156, 'grad_norm': 0.22974760830402374, 'learning_rate': 3.8917766687653e-05, 'epoch': 1.32} 13%|█▎ | 5447/41250 [13:10:02<86:30:52, 8.70s/it][2025-04-25 21:07:46,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:07:46,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.96 | bwd_microstep: 5762.59 | bwd_inner_microstep: 5673.32 | bwd_allreduce_microstep: 89.23 | step_microstep: 18.36 [2025-04-25 21:07:46,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.96 | bwd: 5762.60 | bwd_inner: 5673.32 | bwd_allreduce: 89.24 | step: 18.36 13%|█▎ | 5448/41250 [13:10:11<86:28:17, 8.69s/it] {'loss': 0.0331, 'grad_norm': 0.7451435923576355, 'learning_rate': 3.8917257070881184e-05, 'epoch': 1.32} 13%|█▎ | 5448/41250 [13:10:11<86:28:17, 8.69s/it][2025-04-25 21:07:54,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 21:07:54,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2898.76 | bwd_microstep: 5795.73 | bwd_inner_microstep: 5783.04 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.53 [2025-04-25 21:07:54,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2898.76 | bwd: 5795.74 | bwd_inner: 5783.04 | 
bwd_allreduce: 12.66 | step: 18.53 13%|█▎ | 5449/41250 [13:10:20<86:42:49, 8.72s/it] {'loss': 0.3177, 'grad_norm': 3.7769882678985596, 'learning_rate': 3.89167473374882e-05, 'epoch': 1.32} 13%|█▎ | 5449/41250 [13:10:20<86:42:49, 8.72s/it][2025-04-25 21:08:03,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:08:03,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.50 | bwd_microstep: 5714.24 | bwd_inner_microstep: 5701.72 | bwd_allreduce_microstep: 12.48 | step_microstep: 18.72 [2025-04-25 21:08:03,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.50 | bwd: 5714.26 | bwd_inner: 5701.72 | bwd_allreduce: 12.50 | step: 18.72 13%|█▎ | 5450/41250 [13:10:28<86:31:36, 8.70s/it] {'loss': 0.1553, 'grad_norm': 2.403780698776245, 'learning_rate': 3.891623748747718e-05, 'epoch': 1.32} 13%|█▎ | 5450/41250 [13:10:28<86:31:36, 8.70s/it][2025-04-25 21:08:12,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 1.02 [2025-04-25 21:08:12,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.77 | bwd_microstep: 5870.91 | bwd_inner_microstep: 5672.49 | bwd_allreduce_microstep: 198.36 | step_microstep: 19.39 [2025-04-25 21:08:12,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.77 | bwd: 5870.92 | bwd_inner: 5672.49 | bwd_allreduce: 198.38 | step: 19.39 13%|█▎ | 5451/41250 [13:10:37<86:50:00, 8.73s/it] {'loss': 0.0495, 'grad_norm': 1.1586625576019287, 'learning_rate': 3.891572752085128e-05, 'epoch': 1.32} 13%|█▎ | 5451/41250 [13:10:37<86:50:00, 8.73s/it][2025-04-25 21:08:21,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:08:21,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.49 | 
bwd_microstep: 5754.79 | bwd_inner_microstep: 5699.36 | bwd_allreduce_microstep: 55.38 | step_microstep: 18.71 [2025-04-25 21:08:21,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.49 | bwd: 5754.81 | bwd_inner: 5699.36 | bwd_allreduce: 55.40 | step: 18.71 13%|█▎ | 5452/41250 [13:10:46<86:44:02, 8.72s/it] {'loss': 0.064, 'grad_norm': 1.7462748289108276, 'learning_rate': 3.8915217437613646e-05, 'epoch': 1.32} 13%|█▎ | 5452/41250 [13:10:46<86:44:02, 8.72s/it][2025-04-25 21:08:29,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 21:08:29,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.55 | bwd_microstep: 5796.57 | bwd_inner_microstep: 5705.76 | bwd_allreduce_microstep: 90.75 | step_microstep: 18.98 [2025-04-25 21:08:29,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.55 | bwd: 5796.58 | bwd_inner: 5705.76 | bwd_allreduce: 90.77 | step: 18.98 13%|█▎ | 5453/41250 [13:10:55<86:46:46, 8.73s/it] {'loss': 0.0203, 'grad_norm': 0.2604392170906067, 'learning_rate': 3.891470723776741e-05, 'epoch': 1.32} 13%|█▎ | 5453/41250 [13:10:55<86:46:46, 8.73s/it][2025-04-25 21:08:38,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.09 | optimizer_step: 1.03 [2025-04-25 21:08:38,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.30 | bwd_microstep: 5804.25 | bwd_inner_microstep: 5663.68 | bwd_allreduce_microstep: 140.52 | step_microstep: 19.66 [2025-04-25 21:08:38,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.30 | bwd: 5804.26 | bwd_inner: 5663.68 | bwd_allreduce: 140.54 | step: 19.67 13%|█▎ | 5454/41250 [13:11:03<86:45:01, 8.72s/it] {'loss': 0.0509, 'grad_norm': 0.7223865389823914, 'learning_rate': 3.891419692131573e-05, 'epoch': 1.32} 13%|█▎ | 5454/41250 [13:11:03<86:45:01, 8.72s/it][2025-04-25 21:08:47,301] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 21:08:47,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.56 | bwd_microstep: 5805.93 | bwd_inner_microstep: 5792.88 | bwd_allreduce_microstep: 13.01 | step_microstep: 18.99 [2025-04-25 21:08:47,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.56 | bwd: 5805.95 | bwd_inner: 5792.88 | bwd_allreduce: 13.03 | step: 18.99 13%|█▎ | 5455/41250 [13:11:12<86:55:20, 8.74s/it] {'loss': 0.0262, 'grad_norm': 0.47586020827293396, 'learning_rate': 3.891368648826174e-05, 'epoch': 1.32} 13%|█▎ | 5455/41250 [13:11:12<86:55:20, 8.74s/it][2025-04-25 21:08:55,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 21:08:55,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.38 | bwd_microstep: 5752.98 | bwd_inner_microstep: 5698.28 | bwd_allreduce_microstep: 54.64 | step_microstep: 19.02 [2025-04-25 21:08:55,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.38 | bwd: 5753.00 | bwd_inner: 5698.28 | bwd_allreduce: 54.66 | step: 19.02 13%|█▎ | 5456/41250 [13:11:21<86:46:17, 8.73s/it] {'loss': 0.1823, 'grad_norm': 2.898010730743408, 'learning_rate': 3.8913175938608603e-05, 'epoch': 1.32} 13%|█▎ | 5456/41250 [13:11:21<86:46:17, 8.73s/it][2025-04-25 21:09:04,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 21:09:04,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.84 | bwd_microstep: 5782.42 | bwd_inner_microstep: 5769.59 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.45 [2025-04-25 21:09:04,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.84 | bwd: 5782.44 | bwd_inner: 5769.59 | bwd_allreduce: 12.80 | step: 19.45 
13%|█▎ | 5457/41250 [13:11:30<86:51:08, 8.74s/it] {'loss': 0.1782, 'grad_norm': 1.8374974727630615, 'learning_rate': 3.891266527235945e-05, 'epoch': 1.32} 13%|█▎ | 5457/41250 [13:11:30<86:51:08, 8.74s/it][2025-04-25 21:09:13,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.12 [2025-04-25 21:09:13,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.88 | bwd_microstep: 5755.48 | bwd_inner_microstep: 5717.07 | bwd_allreduce_microstep: 38.35 | step_microstep: 19.42 [2025-04-25 21:09:13,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.89 | bwd: 5755.49 | bwd_inner: 5717.07 | bwd_allreduce: 38.38 | step: 19.42 13%|█▎ | 5458/41250 [13:11:38<86:43:45, 8.72s/it] {'loss': 0.3303, 'grad_norm': 2.472928047180176, 'learning_rate': 3.8912154489517434e-05, 'epoch': 1.32} 13%|█▎ | 5458/41250 [13:11:38<86:43:45, 8.72s/it][2025-04-25 21:09:22,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:09:22,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.95 | bwd_microstep: 5778.76 | bwd_inner_microstep: 5669.16 | bwd_allreduce_microstep: 109.55 | step_microstep: 18.68 [2025-04-25 21:09:22,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.95 | bwd: 5778.78 | bwd_inner: 5669.16 | bwd_allreduce: 109.57 | step: 18.68 13%|█▎ | 5459/41250 [13:11:47<86:39:16, 8.72s/it] {'loss': 0.0975, 'grad_norm': 3.11958909034729, 'learning_rate': 3.8911643590085706e-05, 'epoch': 1.32} 13%|█▎ | 5459/41250 [13:11:47<86:39:16, 8.72s/it][2025-04-25 21:09:30,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.20 | optimizer_step: 0.97 [2025-04-25 21:09:30,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.75 | bwd_microstep: 5793.32 | bwd_inner_microstep: 
5655.34 | bwd_allreduce_microstep: 137.92 | step_microstep: 19.45 [2025-04-25 21:09:30,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.75 | bwd: 5793.33 | bwd_inner: 5655.34 | bwd_allreduce: 137.94 | step: 19.45 13%|█▎ | 5460/41250 [13:11:56<86:37:02, 8.71s/it] {'loss': 0.0243, 'grad_norm': 0.4908677637577057, 'learning_rate': 3.8911132574067413e-05, 'epoch': 1.32} 13%|█▎ | 5460/41250 [13:11:56<86:37:02, 8.71s/it][2025-04-25 21:09:39,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 21:09:39,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.63 | bwd_microstep: 5728.07 | bwd_inner_microstep: 5662.85 | bwd_allreduce_microstep: 65.17 | step_microstep: 18.82 [2025-04-25 21:09:39,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.63 | bwd: 5728.08 | bwd_inner: 5662.85 | bwd_allreduce: 65.19 | step: 18.82 13%|█▎ | 5461/41250 [13:12:04<86:25:49, 8.69s/it] {'loss': 0.0135, 'grad_norm': 0.402901291847229, 'learning_rate': 3.8910621441465716e-05, 'epoch': 1.32} 13%|█▎ | 5461/41250 [13:12:04<86:25:49, 8.69s/it][2025-04-25 21:09:48,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.27 | optimizer_step: 0.90 [2025-04-25 21:09:48,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.49 | bwd_microstep: 5793.65 | bwd_inner_microstep: 5667.98 | bwd_allreduce_microstep: 125.62 | step_microstep: 19.59 [2025-04-25 21:09:48,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.49 | bwd: 5793.66 | bwd_inner: 5667.98 | bwd_allreduce: 125.64 | step: 19.59 13%|█▎ | 5462/41250 [13:12:13<86:29:33, 8.70s/it] {'loss': 0.2015, 'grad_norm': 2.8371548652648926, 'learning_rate': 3.891011019228375e-05, 'epoch': 1.32} 13%|█▎ | 5462/41250 [13:12:13<86:29:33, 8.70s/it][2025-04-25 21:09:56,900] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-25 21:09:56,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.20 | bwd_microstep: 5773.20 | bwd_inner_microstep: 5648.44 | bwd_allreduce_microstep: 124.70 | step_microstep: 19.17 [2025-04-25 21:09:56,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.20 | bwd: 5773.22 | bwd_inner: 5648.44 | bwd_allreduce: 124.72 | step: 19.17 13%|█▎ | 5463/41250 [13:12:22<86:27:19, 8.70s/it] {'loss': 0.2296, 'grad_norm': 1.677554726600647, 'learning_rate': 3.890959882652467e-05, 'epoch': 1.32} 13%|█▎ | 5463/41250 [13:12:22<86:27:19, 8.70s/it][2025-04-25 21:10:05,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.15 | optimizer_step: 1.04 [2025-04-25 21:10:05,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.04 | bwd_microstep: 5847.27 | bwd_inner_microstep: 5694.56 | bwd_allreduce_microstep: 152.65 | step_microstep: 19.96 [2025-04-25 21:10:05,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.04 | bwd: 5847.29 | bwd_inner: 5694.56 | bwd_allreduce: 152.68 | step: 19.96 13%|█▎ | 5464/41250 [13:12:31<86:43:46, 8.72s/it] {'loss': 0.0533, 'grad_norm': 2.072084426879883, 'learning_rate': 3.890908734419164e-05, 'epoch': 1.32} 13%|█▎ | 5464/41250 [13:12:31<86:43:46, 8.72s/it][2025-04-25 21:10:14,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.30 | optimizer_step: 1.09 [2025-04-25 21:10:14,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.02 | bwd_microstep: 5770.36 | bwd_inner_microstep: 5706.66 | bwd_allreduce_microstep: 63.64 | step_microstep: 20.43 [2025-04-25 21:10:14,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.02 | bwd: 5770.38 | bwd_inner: 5706.66 | bwd_allreduce: 63.66 | step: 20.43 13%|█▎ | 5465/41250 [13:12:39<86:41:32, 
8.72s/it] {'loss': 0.3754, 'grad_norm': 2.4627952575683594, 'learning_rate': 3.890857574528781e-05, 'epoch': 1.32} 13%|█▎ | 5465/41250 [13:12:39<86:41:32, 8.72s/it][2025-04-25 21:10:23,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 21:10:23,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.11 | bwd_microstep: 5995.88 | bwd_inner_microstep: 5696.87 | bwd_allreduce_microstep: 298.97 | step_microstep: 18.68 [2025-04-25 21:10:23,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.11 | bwd: 5995.90 | bwd_inner: 5696.87 | bwd_allreduce: 298.99 | step: 18.68 13%|█▎ | 5466/41250 [13:12:48<87:17:54, 8.78s/it] {'loss': 0.2051, 'grad_norm': 2.9105887413024902, 'learning_rate': 3.890806402981632e-05, 'epoch': 1.33} 13%|█▎ | 5466/41250 [13:12:48<87:17:54, 8.78s/it][2025-04-25 21:10:32,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:10:32,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.70 | bwd_microstep: 5784.78 | bwd_inner_microstep: 5653.22 | bwd_allreduce_microstep: 131.51 | step_microstep: 18.50 [2025-04-25 21:10:32,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.70 | bwd: 5784.80 | bwd_inner: 5653.22 | bwd_allreduce: 131.53 | step: 18.50 13%|█▎ | 5467/41250 [13:12:57<87:01:17, 8.75s/it] {'loss': 0.3624, 'grad_norm': 3.3244874477386475, 'learning_rate': 3.890755219778034e-05, 'epoch': 1.33} 13%|█▎ | 5467/41250 [13:12:57<87:01:17, 8.75s/it][2025-04-25 21:10:40,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 21:10:40,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.29 | bwd_microstep: 5766.05 | bwd_inner_microstep: 5655.81 | bwd_allreduce_microstep: 
110.19 | step_microstep: 18.52 [2025-04-25 21:10:40,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.29 | bwd: 5766.06 | bwd_inner: 5655.81 | bwd_allreduce: 110.21 | step: 18.52 13%|█▎ | 5468/41250 [13:13:06<86:46:20, 8.73s/it] {'loss': 0.0856, 'grad_norm': 6.781400680541992, 'learning_rate': 3.890704024918302e-05, 'epoch': 1.33} 13%|█▎ | 5468/41250 [13:13:06<86:46:20, 8.73s/it][2025-04-25 21:10:49,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:10:49,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.52 | bwd_microstep: 5701.84 | bwd_inner_microstep: 5647.62 | bwd_allreduce_microstep: 54.17 | step_microstep: 18.42 [2025-04-25 21:10:49,304] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.52 | bwd: 5701.85 | bwd_inner: 5647.62 | bwd_allreduce: 54.19 | step: 18.42 13%|█▎ | 5469/41250 [13:13:14<86:24:24, 8.69s/it] {'loss': 0.0522, 'grad_norm': 0.9585870504379272, 'learning_rate': 3.890652818402752e-05, 'epoch': 1.33} 13%|█▎ | 5469/41250 [13:13:14<86:24:24, 8.69s/it][2025-04-25 21:10:58,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 21:10:58,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.64 | bwd_microstep: 5764.54 | bwd_inner_microstep: 5693.59 | bwd_allreduce_microstep: 70.89 | step_microstep: 19.33 [2025-04-25 21:10:58,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.64 | bwd: 5764.55 | bwd_inner: 5693.59 | bwd_allreduce: 70.91 | step: 19.33 13%|█▎ | 5470/41250 [13:13:23<86:25:11, 8.70s/it] {'loss': 0.1856, 'grad_norm': 1.8189021348953247, 'learning_rate': 3.8906016002316974e-05, 'epoch': 1.33} 13%|█▎ | 5470/41250 [13:13:23<86:25:11, 8.70s/it][2025-04-25 21:11:06,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | 
optimizer_gradients: 1.16 | optimizer_step: 0.97 [2025-04-25 21:11:06,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.24 | bwd_microstep: 5688.00 | bwd_inner_microstep: 5646.21 | bwd_allreduce_microstep: 41.75 | step_microstep: 19.38 [2025-04-25 21:11:06,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.24 | bwd: 5688.01 | bwd_inner: 5646.21 | bwd_allreduce: 41.77 | step: 19.39 13%|█▎ | 5471/41250 [13:13:31<86:07:20, 8.67s/it] {'loss': 0.1907, 'grad_norm': 1.5859200954437256, 'learning_rate': 3.890550370405457e-05, 'epoch': 1.33} 13%|█▎ | 5471/41250 [13:13:31<86:07:20, 8.67s/it][2025-04-25 21:11:15,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.16 | optimizer_step: 0.90 [2025-04-25 21:11:15,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.34 | bwd_microstep: 5725.82 | bwd_inner_microstep: 5697.44 | bwd_allreduce_microstep: 28.33 | step_microstep: 18.88 [2025-04-25 21:11:15,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.34 | bwd: 5725.84 | bwd_inner: 5697.44 | bwd_allreduce: 28.35 | step: 18.89 13%|█▎ | 5472/41250 [13:13:40<86:06:12, 8.66s/it] {'loss': 0.127, 'grad_norm': 1.532275915145874, 'learning_rate': 3.890499128924345e-05, 'epoch': 1.33} 13%|█▎ | 5472/41250 [13:13:40<86:06:12, 8.66s/it][2025-04-25 21:11:23,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.10 | optimizer_step: 1.02 [2025-04-25 21:11:23,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.92 | bwd_microstep: 5739.60 | bwd_inner_microstep: 5648.39 | bwd_allreduce_microstep: 91.15 | step_microstep: 19.02 [2025-04-25 21:11:23,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.92 | bwd: 5739.62 | bwd_inner: 5648.39 | bwd_allreduce: 91.17 | step: 19.02 13%|█▎ | 5473/41250 [13:13:49<86:05:45, 8.66s/it] {'loss': 0.0664, 'grad_norm': 
1.3962030410766602, 'learning_rate': 3.890447875788679e-05, 'epoch': 1.33} 13%|█▎ | 5473/41250 [13:13:49<86:05:45, 8.66s/it][2025-04-25 21:11:32,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 21:11:32,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.08 | bwd_microstep: 5730.88 | bwd_inner_microstep: 5698.65 | bwd_allreduce_microstep: 32.18 | step_microstep: 19.21 [2025-04-25 21:11:32,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.08 | bwd: 5730.89 | bwd_inner: 5698.65 | bwd_allreduce: 32.20 | step: 19.21 13%|█▎ | 5474/41250 [13:13:57<86:06:13, 8.66s/it] {'loss': 0.0719, 'grad_norm': 1.631029725074768, 'learning_rate': 3.890396610998773e-05, 'epoch': 1.33} 13%|█▎ | 5474/41250 [13:13:57<86:06:13, 8.66s/it][2025-04-25 21:11:41,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 21:11:41,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.72 | bwd_microstep: 6018.66 | bwd_inner_microstep: 5658.19 | bwd_allreduce_microstep: 360.43 | step_microstep: 18.93 [2025-04-25 21:11:41,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.72 | bwd: 6018.68 | bwd_inner: 5658.19 | bwd_allreduce: 360.45 | step: 18.93 13%|█▎ | 5475/41250 [13:14:06<86:53:18, 8.74s/it] {'loss': 0.5073, 'grad_norm': 5.107706546783447, 'learning_rate': 3.890345334554943e-05, 'epoch': 1.33} 13%|█▎ | 5475/41250 [13:14:06<86:53:18, 8.74s/it][2025-04-25 21:11:50,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:11:50,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.98 | bwd_microstep: 5764.81 | bwd_inner_microstep: 5682.00 | bwd_allreduce_microstep: 82.76 | step_microstep: 18.61 [2025-04-25 
21:11:50,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.98 | bwd: 5764.82 | bwd_inner: 5682.00 | bwd_allreduce: 82.78 | step: 18.62 13%|█▎ | 5476/41250 [13:14:15<86:42:55, 8.73s/it] {'loss': 0.1439, 'grad_norm': 1.3937944173812866, 'learning_rate': 3.890294046457507e-05, 'epoch': 1.33} 13%|█▎ | 5476/41250 [13:14:15<86:42:55, 8.73s/it][2025-04-25 21:11:58,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:11:58,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.21 | bwd_microstep: 5770.10 | bwd_inner_microstep: 5642.59 | bwd_allreduce_microstep: 127.47 | step_microstep: 18.72 [2025-04-25 21:11:58,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.22 | bwd: 5770.11 | bwd_inner: 5642.59 | bwd_allreduce: 127.48 | step: 18.72 13%|█▎ | 5477/41250 [13:14:24<86:34:07, 8.71s/it] {'loss': 0.102, 'grad_norm': 0.9179266095161438, 'learning_rate': 3.8902427467067787e-05, 'epoch': 1.33} 13%|█▎ | 5477/41250 [13:14:24<86:34:07, 8.71s/it][2025-04-25 21:12:07,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:12:07,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.36 | bwd_microstep: 5768.51 | bwd_inner_microstep: 5648.07 | bwd_allreduce_microstep: 120.39 | step_microstep: 19.03 [2025-04-25 21:12:07,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.36 | bwd: 5768.52 | bwd_inner: 5648.07 | bwd_allreduce: 120.41 | step: 19.03 13%|█▎ | 5478/41250 [13:14:32<86:27:48, 8.70s/it] {'loss': 0.0195, 'grad_norm': 0.47357940673828125, 'learning_rate': 3.890191435303076e-05, 'epoch': 1.33} 13%|█▎ | 5478/41250 [13:14:32<86:27:48, 8.70s/it][2025-04-25 21:12:16,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.12 | 
optimizer_step: 1.05 [2025-04-25 21:12:16,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.19 | bwd_microstep: 5765.47 | bwd_inner_microstep: 5631.41 | bwd_allreduce_microstep: 134.00 | step_microstep: 19.74 [2025-04-25 21:12:16,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.20 | bwd: 5765.49 | bwd_inner: 5631.41 | bwd_allreduce: 134.03 | step: 19.74 13%|█▎ | 5479/41250 [13:14:41<86:21:53, 8.69s/it] {'loss': 0.2338, 'grad_norm': 2.8662660121917725, 'learning_rate': 3.890140112246715e-05, 'epoch': 1.33} 13%|█▎ | 5479/41250 [13:14:41<86:21:53, 8.69s/it][2025-04-25 21:12:25,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 21:12:25,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5891.15 | bwd_inner_microstep: 5633.81 | bwd_allreduce_microstep: 257.29 | step_microstep: 19.01 [2025-04-25 21:12:25,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.00 | bwd: 5891.16 | bwd_inner: 5633.81 | bwd_allreduce: 257.31 | step: 19.02 13%|█▎ | 5480/41250 [13:14:50<86:41:05, 8.72s/it] {'loss': 0.0731, 'grad_norm': 0.7891958355903625, 'learning_rate': 3.890088777538012e-05, 'epoch': 1.33} 13%|█▎ | 5480/41250 [13:14:50<86:41:05, 8.72s/it][2025-04-25 21:12:33,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:12:33,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.33 | bwd_microstep: 5777.69 | bwd_inner_microstep: 5633.42 | bwd_allreduce_microstep: 144.22 | step_microstep: 18.87 [2025-04-25 21:12:33,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.33 | bwd: 5777.70 | bwd_inner: 5633.42 | bwd_allreduce: 144.24 | step: 18.87 13%|█▎ | 5481/41250 [13:14:59<86:32:26, 8.71s/it] {'loss': 0.0525, 'grad_norm': 1.8168134689331055, 
'learning_rate': 3.890037431177284e-05, 'epoch': 1.33} 13%|█▎ | 5481/41250 [13:14:59<86:32:26, 8.71s/it][2025-04-25 21:12:42,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.07 | optimizer_step: 0.98 [2025-04-25 21:12:42,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.28 | bwd_microstep: 5694.71 | bwd_inner_microstep: 5643.40 | bwd_allreduce_microstep: 51.27 | step_microstep: 18.86 [2025-04-25 21:12:42,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.28 | bwd: 5694.73 | bwd_inner: 5643.40 | bwd_allreduce: 51.28 | step: 18.86 13%|█▎ | 5482/41250 [13:15:07<86:14:57, 8.68s/it] {'loss': 0.0249, 'grad_norm': 0.39514419436454773, 'learning_rate': 3.8899860731648466e-05, 'epoch': 1.33} 13%|█▎ | 5482/41250 [13:15:07<86:14:57, 8.68s/it][2025-04-25 21:12:50,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 21:12:50,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.45 | bwd_microstep: 5687.94 | bwd_inner_microstep: 5644.02 | bwd_allreduce_microstep: 43.87 | step_microstep: 18.97 [2025-04-25 21:12:50,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.45 | bwd: 5687.96 | bwd_inner: 5644.02 | bwd_allreduce: 43.89 | step: 18.98 13%|█▎ | 5483/41250 [13:15:16<86:02:01, 8.66s/it] {'loss': 0.3749, 'grad_norm': 2.1315784454345703, 'learning_rate': 3.889934703501017e-05, 'epoch': 1.33} 13%|█▎ | 5483/41250 [13:15:16<86:02:01, 8.66s/it][2025-04-25 21:12:59,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.23 | optimizer_step: 1.05 [2025-04-25 21:12:59,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.25 | bwd_microstep: 5740.63 | bwd_inner_microstep: 5641.67 | bwd_allreduce_microstep: 98.91 | step_microstep: 19.57 [2025-04-25 21:12:59,579] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.25 | bwd: 5740.65 | bwd_inner: 5641.67 | bwd_allreduce: 98.93 | step: 19.57 13%|█▎ | 5484/41250 [13:15:24<86:01:17, 8.66s/it] {'loss': 0.1904, 'grad_norm': 2.0090699195861816, 'learning_rate': 3.8898833221861115e-05, 'epoch': 1.33} 13%|█▎ | 5484/41250 [13:15:24<86:01:17, 8.66s/it][2025-04-25 21:13:08,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:13:08,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.67 | bwd_microstep: 5734.83 | bwd_inner_microstep: 5691.68 | bwd_allreduce_microstep: 43.10 | step_microstep: 18.97 [2025-04-25 21:13:08,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.67 | bwd: 5734.84 | bwd_inner: 5691.68 | bwd_allreduce: 43.12 | step: 18.98 13%|█▎ | 5485/41250 [13:15:33<86:02:13, 8.66s/it] {'loss': 0.1404, 'grad_norm': 2.889772891998291, 'learning_rate': 3.8898319292204464e-05, 'epoch': 1.33} 13%|█▎ | 5485/41250 [13:15:33<86:02:13, 8.66s/it][2025-04-25 21:13:16,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 21:13:16,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.01 | bwd_microstep: 5698.74 | bwd_inner_microstep: 5644.74 | bwd_allreduce_microstep: 53.95 | step_microstep: 19.01 [2025-04-25 21:13:16,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.01 | bwd: 5698.75 | bwd_inner: 5644.74 | bwd_allreduce: 53.96 | step: 19.01 13%|█▎ | 5486/41250 [13:15:42<85:52:36, 8.64s/it] {'loss': 0.4454, 'grad_norm': 3.715909242630005, 'learning_rate': 3.88978052460434e-05, 'epoch': 1.33} 13%|█▎ | 5486/41250 [13:15:42<85:52:36, 8.64s/it][2025-04-25 21:13:25,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 
21:13:25,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.83 | bwd_microstep: 5704.30 | bwd_inner_microstep: 5691.31 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.20 [2025-04-25 21:13:25,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.83 | bwd: 5704.32 | bwd_inner: 5691.31 | bwd_allreduce: 12.95 | step: 19.20 13%|█▎ | 5487/41250 [13:15:50<85:50:47, 8.64s/it] {'loss': 0.0498, 'grad_norm': 1.4337108135223389, 'learning_rate': 3.889729108338108e-05, 'epoch': 1.33} 13%|█▎ | 5487/41250 [13:15:50<85:50:47, 8.64s/it][2025-04-25 21:13:34,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-25 21:13:34,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.53 | bwd_microstep: 5729.43 | bwd_inner_microstep: 5716.65 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.22 [2025-04-25 21:13:34,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.53 | bwd: 5729.44 | bwd_inner: 5716.65 | bwd_allreduce: 12.76 | step: 19.22 13%|█▎ | 5488/41250 [13:15:59<85:55:56, 8.65s/it] {'loss': 0.1787, 'grad_norm': 2.047485589981079, 'learning_rate': 3.889677680422068e-05, 'epoch': 1.33} 13%|█▎ | 5488/41250 [13:15:59<85:55:56, 8.65s/it][2025-04-25 21:13:42,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:13:42,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.99 | bwd_microstep: 5729.18 | bwd_inner_microstep: 5709.61 | bwd_allreduce_microstep: 19.52 | step_microstep: 18.78 [2025-04-25 21:13:42,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.99 | bwd: 5729.20 | bwd_inner: 5709.61 | bwd_allreduce: 19.55 | step: 18.79 13%|█▎ | 5489/41250 [13:16:08<85:59:02, 8.66s/it] {'loss': 0.1453, 'grad_norm': 4.281248092651367, 'learning_rate': 3.8896262408565365e-05, 
'epoch': 1.33} 13%|█▎ | 5489/41250 [13:16:08<85:59:02, 8.66s/it][2025-04-25 21:13:51,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.05 | optimizer_step: 0.93 [2025-04-25 21:13:51,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.61 | bwd_microstep: 5686.33 | bwd_inner_microstep: 5662.79 | bwd_allreduce_microstep: 23.48 | step_microstep: 19.23 [2025-04-25 21:13:51,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.62 | bwd: 5686.34 | bwd_inner: 5662.79 | bwd_allreduce: 23.50 | step: 19.23 13%|█▎ | 5490/41250 [13:16:16<85:54:03, 8.65s/it] {'loss': 0.1843, 'grad_norm': 2.1088523864746094, 'learning_rate': 3.889574789641831e-05, 'epoch': 1.33} 13%|█▎ | 5490/41250 [13:16:16<85:54:03, 8.65s/it][2025-04-25 21:14:00,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 21:14:00,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.28 | bwd_microstep: 5737.01 | bwd_inner_microstep: 5707.38 | bwd_allreduce_microstep: 29.59 | step_microstep: 18.91 [2025-04-25 21:14:00,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.28 | bwd: 5737.03 | bwd_inner: 5707.38 | bwd_allreduce: 29.61 | step: 18.91 13%|█▎ | 5491/41250 [13:16:25<85:58:40, 8.66s/it] {'loss': 0.1696, 'grad_norm': 2.3685309886932373, 'learning_rate': 3.889523326778269e-05, 'epoch': 1.33} 13%|█▎ | 5491/41250 [13:16:25<85:58:40, 8.66s/it][2025-04-25 21:14:08,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.18 | optimizer_step: 1.04 [2025-04-25 21:14:08,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.12 | bwd_microstep: 5721.00 | bwd_inner_microstep: 5708.08 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.90 [2025-04-25 21:14:08,789] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2854.12 | bwd: 5721.02 | bwd_inner: 5708.08 | bwd_allreduce: 12.90 | step: 19.90 13%|█▎ | 5492/41250 [13:16:34<85:59:20, 8.66s/it] {'loss': 0.2582, 'grad_norm': 2.6051878929138184, 'learning_rate': 3.889471852266167e-05, 'epoch': 1.33} 13%|█▎ | 5492/41250 [13:16:34<85:59:20, 8.66s/it][2025-04-25 21:14:17,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-25 21:14:17,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2894.86 | bwd_microstep: 5795.74 | bwd_inner_microstep: 5782.83 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.99 [2025-04-25 21:14:17,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2894.86 | bwd: 5795.76 | bwd_inner: 5782.83 | bwd_allreduce: 12.88 | step: 18.99 13%|█▎ | 5493/41250 [13:16:42<86:20:17, 8.69s/it] {'loss': 0.1994, 'grad_norm': 2.134068489074707, 'learning_rate': 3.889420366105842e-05, 'epoch': 1.33} 13%|█▎ | 5493/41250 [13:16:42<86:20:17, 8.69s/it][2025-04-25 21:14:26,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 21:14:26,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.05 | bwd_microstep: 5773.73 | bwd_inner_microstep: 5664.64 | bwd_allreduce_microstep: 109.04 | step_microstep: 19.15 [2025-04-25 21:14:26,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.05 | bwd: 5773.74 | bwd_inner: 5664.64 | bwd_allreduce: 109.06 | step: 19.15 13%|█▎ | 5494/41250 [13:16:51<86:20:00, 8.69s/it] {'loss': 0.0688, 'grad_norm': 1.203930139541626, 'learning_rate': 3.889368868297613e-05, 'epoch': 1.33} 13%|█▎ | 5494/41250 [13:16:51<86:20:00, 8.69s/it][2025-04-25 21:14:34,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 1.21 [2025-04-25 21:14:34,939] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2848.88 | bwd_microstep: 5748.42 | bwd_inner_microstep: 5690.03 | bwd_allreduce_microstep: 58.35 | step_microstep: 19.36 [2025-04-25 21:14:34,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.88 | bwd: 5748.44 | bwd_inner: 5690.03 | bwd_allreduce: 58.37 | step: 19.36 13%|█▎ | 5495/41250 [13:17:00<86:18:18, 8.69s/it] {'loss': 0.1758, 'grad_norm': 2.577298879623413, 'learning_rate': 3.889317358841797e-05, 'epoch': 1.33} 13%|█▎ | 5495/41250 [13:17:00<86:18:18, 8.69s/it][2025-04-25 21:14:43,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 21:14:43,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.76 | bwd_microstep: 5805.33 | bwd_inner_microstep: 5662.85 | bwd_allreduce_microstep: 142.43 | step_microstep: 19.04 [2025-04-25 21:14:43,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.76 | bwd: 5805.35 | bwd_inner: 5662.85 | bwd_allreduce: 142.45 | step: 19.05 13%|█▎ | 5496/41250 [13:17:08<86:23:38, 8.70s/it] {'loss': 0.1694, 'grad_norm': 2.8119096755981445, 'learning_rate': 3.88926583773871e-05, 'epoch': 1.33} 13%|█▎ | 5496/41250 [13:17:08<86:23:38, 8.70s/it][2025-04-25 21:14:52,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:14:52,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.07 | bwd_microstep: 5773.83 | bwd_inner_microstep: 5703.23 | bwd_allreduce_microstep: 70.55 | step_microstep: 18.84 [2025-04-25 21:14:52,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.07 | bwd: 5773.84 | bwd_inner: 5703.23 | bwd_allreduce: 70.57 | step: 18.84 13%|█▎ | 5497/41250 [13:17:17<86:24:50, 8.70s/it] {'loss': 0.0228, 'grad_norm': 0.5616419315338135, 'learning_rate': 3.889214304988671e-05, 'epoch': 1.33} 13%|█▎ | 5497/41250 
[13:17:17<86:24:50, 8.70s/it][2025-04-25 21:15:01,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:15:01,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.80 | bwd_microstep: 5805.59 | bwd_inner_microstep: 5792.75 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.69 [2025-04-25 21:15:01,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.80 | bwd: 5805.61 | bwd_inner: 5792.75 | bwd_allreduce: 12.82 | step: 18.70 13%|█▎ | 5498/41250 [13:17:26<86:39:17, 8.73s/it] {'loss': 0.1287, 'grad_norm': 1.7678215503692627, 'learning_rate': 3.8891627605919977e-05, 'epoch': 1.33} 13%|█▎ | 5498/41250 [13:17:26<86:39:17, 8.73s/it][2025-04-25 21:15:09,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.06 | optimizer_step: 1.00 [2025-04-25 21:15:09,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.47 | bwd_microstep: 5715.88 | bwd_inner_microstep: 5669.23 | bwd_allreduce_microstep: 46.59 | step_microstep: 19.28 [2025-04-25 21:15:09,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.47 | bwd: 5715.89 | bwd_inner: 5669.23 | bwd_allreduce: 46.61 | step: 19.28 13%|█▎ | 5499/41250 [13:17:35<86:23:37, 8.70s/it] {'loss': 0.0618, 'grad_norm': 1.4847462177276611, 'learning_rate': 3.889111204549007e-05, 'epoch': 1.33} 13%|█▎ | 5499/41250 [13:17:35<86:23:37, 8.70s/it][2025-04-25 21:15:18,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 21:15:18,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.48 | bwd_microstep: 5721.05 | bwd_inner_microstep: 5708.12 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.97 [2025-04-25 21:15:18,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.48 | bwd: 5721.07 | 
bwd_inner: 5708.12 | bwd_allreduce: 12.90 | step: 18.97 13%|█▎ | 5500/41250 [13:17:43<86:18:10, 8.69s/it] {'loss': 0.1607, 'grad_norm': 0.8976242542266846, 'learning_rate': 3.889059636860017e-05, 'epoch': 1.33} 13%|█▎ | 5500/41250 [13:17:43<86:18:10, 8.69s/it][2025-04-25 21:15:27,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:15:27,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.24 | bwd_microstep: 5764.86 | bwd_inner_microstep: 5652.64 | bwd_allreduce_microstep: 112.18 | step_microstep: 18.72 [2025-04-25 21:15:27,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.24 | bwd: 5764.87 | bwd_inner: 5652.64 | bwd_allreduce: 112.19 | step: 18.73 13%|█▎ | 5501/41250 [13:17:52<86:15:56, 8.69s/it] {'loss': 0.0649, 'grad_norm': 1.8299729824066162, 'learning_rate': 3.889008057525347e-05, 'epoch': 1.33} 13%|█▎ | 5501/41250 [13:17:52<86:15:56, 8.69s/it][2025-04-25 21:15:35,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.02 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:15:35,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.26 | bwd_microstep: 5727.31 | bwd_inner_microstep: 5714.42 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.27 [2025-04-25 21:15:35,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.26 | bwd: 5727.32 | bwd_inner: 5714.42 | bwd_allreduce: 12.86 | step: 19.27 13%|█▎ | 5502/41250 [13:18:01<86:11:36, 8.68s/it] {'loss': 0.1546, 'grad_norm': 1.486798644065857, 'learning_rate': 3.8889564665453135e-05, 'epoch': 1.33} 13%|█▎ | 5502/41250 [13:18:01<86:11:36, 8.68s/it][2025-04-25 21:15:44,513] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:15:44,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2846.28 | bwd_microstep: 5781.51 | bwd_inner_microstep: 5669.72 | bwd_allreduce_microstep: 111.74 | step_microstep: 19.06 [2025-04-25 21:15:44,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.29 | bwd: 5781.52 | bwd_inner: 5669.73 | bwd_allreduce: 111.76 | step: 19.06 13%|█▎ | 5503/41250 [13:18:09<86:17:17, 8.69s/it] {'loss': 0.1301, 'grad_norm': 1.4280034303665161, 'learning_rate': 3.8889048639202346e-05, 'epoch': 1.33} 13%|█▎ | 5503/41250 [13:18:09<86:17:17, 8.69s/it][2025-04-25 21:15:53,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:15:53,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2865.68 | bwd_microstep: 5719.22 | bwd_inner_microstep: 5706.31 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.88 [2025-04-25 21:15:53,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2865.68 | bwd: 5719.24 | bwd_inner: 5706.31 | bwd_allreduce: 12.88 | step: 18.88 13%|█▎ | 5504/41250 [13:18:18<86:14:00, 8.68s/it] {'loss': 0.2158, 'grad_norm': 2.5425209999084473, 'learning_rate': 3.8888532496504286e-05, 'epoch': 1.33} 13%|█▎ | 5504/41250 [13:18:18<86:14:00, 8.68s/it][2025-04-25 21:16:01,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 21:16:01,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2898.29 | bwd_microstep: 5802.77 | bwd_inner_microstep: 5789.85 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.08 [2025-04-25 21:16:01,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2898.29 | bwd: 5802.78 | bwd_inner: 5789.85 | bwd_allreduce: 12.89 | step: 19.08 13%|█▎ | 5505/41250 [13:18:27<86:31:51, 8.71s/it] {'loss': 0.2176, 'grad_norm': 4.34747838973999, 'learning_rate': 3.888801623736215e-05, 'epoch': 1.33} 13%|█▎ | 5505/41250 [13:18:27<86:31:51, 8.71s/it][2025-04-25 
21:16:10,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:16:10,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.18 | bwd_microstep: 5800.42 | bwd_inner_microstep: 5708.76 | bwd_allreduce_microstep: 91.61 | step_microstep: 18.62 [2025-04-25 21:16:10,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.18 | bwd: 5800.43 | bwd_inner: 5708.76 | bwd_allreduce: 91.63 | step: 18.63 13%|█▎ | 5506/41250 [13:18:36<86:35:22, 8.72s/it] {'loss': 0.1358, 'grad_norm': 1.7987653017044067, 'learning_rate': 3.8887499861779093e-05, 'epoch': 1.33} 13%|█▎ | 5506/41250 [13:18:36<86:35:22, 8.72s/it][2025-04-25 21:16:19,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 1.07 [2025-04-25 21:16:19,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.44 | bwd_microstep: 5903.16 | bwd_inner_microstep: 5677.19 | bwd_allreduce_microstep: 225.93 | step_microstep: 19.09 [2025-04-25 21:16:19,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.44 | bwd: 5903.18 | bwd_inner: 5677.19 | bwd_allreduce: 225.95 | step: 19.10 13%|█▎ | 5507/41250 [13:18:44<86:52:39, 8.75s/it] {'loss': 0.1887, 'grad_norm': 2.370481014251709, 'learning_rate': 3.8886983369758324e-05, 'epoch': 1.34} 13%|█▎ | 5507/41250 [13:18:44<86:52:39, 8.75s/it][2025-04-25 21:16:28,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 21:16:28,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.67 | bwd_microstep: 5750.69 | bwd_inner_microstep: 5705.46 | bwd_allreduce_microstep: 45.19 | step_microstep: 18.74 [2025-04-25 21:16:28,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.67 | bwd: 5750.70 | bwd_inner: 5705.45 | bwd_allreduce: 
45.20 | step: 18.74 13%|█▎ | 5508/41250 [13:18:53<86:41:55, 8.73s/it] {'loss': 0.1088, 'grad_norm': 2.67803955078125, 'learning_rate': 3.8886466761303016e-05, 'epoch': 1.34} 13%|█▎ | 5508/41250 [13:18:53<86:41:55, 8.73s/it][2025-04-25 21:16:36,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:16:36,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.22 | bwd_microstep: 5708.22 | bwd_inner_microstep: 5667.23 | bwd_allreduce_microstep: 40.94 | step_microstep: 18.51 [2025-04-25 21:16:36,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.22 | bwd: 5708.24 | bwd_inner: 5667.23 | bwd_allreduce: 40.96 | step: 18.52 13%|█▎ | 5509/41250 [13:19:02<86:22:46, 8.70s/it] {'loss': 0.0636, 'grad_norm': 1.9809812307357788, 'learning_rate': 3.8885950036416355e-05, 'epoch': 1.34} 13%|█▎ | 5509/41250 [13:19:02<86:22:46, 8.70s/it][2025-04-25 21:16:45,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:16:45,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.02 | bwd_microstep: 5767.42 | bwd_inner_microstep: 5702.16 | bwd_allreduce_microstep: 65.21 | step_microstep: 18.50 [2025-04-25 21:16:45,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.02 | bwd: 5767.43 | bwd_inner: 5702.16 | bwd_allreduce: 65.23 | step: 18.50 13%|█▎ | 5510/41250 [13:19:10<86:23:23, 8.70s/it] {'loss': 0.1855, 'grad_norm': 3.606659173965454, 'learning_rate': 3.8885433195101526e-05, 'epoch': 1.34} 13%|█▎ | 5510/41250 [13:19:10<86:23:23, 8.70s/it][2025-04-25 21:16:54,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 21:16:54,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.54 | bwd_microstep: 5755.17 | 
bwd_inner_microstep: 5705.65 | bwd_allreduce_microstep: 49.47 | step_microstep: 19.43 [2025-04-25 21:16:54,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.54 | bwd: 5755.18 | bwd_inner: 5705.65 | bwd_allreduce: 49.49 | step: 19.43 13%|█▎ | 5511/41250 [13:19:19<86:22:05, 8.70s/it] {'loss': 0.0717, 'grad_norm': 1.6154595613479614, 'learning_rate': 3.888491623736171e-05, 'epoch': 1.34} 13%|█▎ | 5511/41250 [13:19:19<86:22:05, 8.70s/it][2025-04-25 21:17:03,014] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:17:03,014] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.05 | bwd_microstep: 5797.14 | bwd_inner_microstep: 5784.27 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.74 [2025-04-25 21:17:03,014] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.05 | bwd: 5797.16 | bwd_inner: 5784.27 | bwd_allreduce: 12.84 | step: 18.75 13%|█▎ | 5512/41250 [13:19:28<86:34:47, 8.72s/it] {'loss': 0.1833, 'grad_norm': 2.5004568099975586, 'learning_rate': 3.8884399163200115e-05, 'epoch': 1.34} 13%|█▎ | 5512/41250 [13:19:28<86:34:47, 8.72s/it][2025-04-25 21:17:11,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 21:17:11,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.02 | bwd_microstep: 5918.27 | bwd_inner_microstep: 5646.17 | bwd_allreduce_microstep: 272.05 | step_microstep: 18.95 [2025-04-25 21:17:11,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.02 | bwd: 5918.28 | bwd_inner: 5646.17 | bwd_allreduce: 272.07 | step: 18.95 13%|█▎ | 5513/41250 [13:19:37<86:53:46, 8.75s/it] {'loss': 0.0447, 'grad_norm': 1.013670563697815, 'learning_rate': 3.8883881972619904e-05, 'epoch': 1.34} 13%|█▎ | 5513/41250 [13:19:37<86:53:46, 8.75s/it][2025-04-25 21:17:20,477] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:17:20,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.24 | bwd_microstep: 5704.39 | bwd_inner_microstep: 5691.52 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.14 [2025-04-25 21:17:20,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.24 | bwd: 5704.41 | bwd_inner: 5691.52 | bwd_allreduce: 12.85 | step: 19.14 13%|█▎ | 5514/41250 [13:19:45<86:32:22, 8.72s/it] {'loss': 0.0844, 'grad_norm': 1.0146520137786865, 'learning_rate': 3.888336466562427e-05, 'epoch': 1.34} 13%|█▎ | 5514/41250 [13:19:45<86:32:22, 8.72s/it][2025-04-25 21:17:29,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:17:29,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.46 | bwd_microstep: 5753.87 | bwd_inner_microstep: 5707.76 | bwd_allreduce_microstep: 46.06 | step_microstep: 18.66 [2025-04-25 21:17:29,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.46 | bwd: 5753.88 | bwd_inner: 5707.76 | bwd_allreduce: 46.08 | step: 18.66 13%|█▎ | 5515/41250 [13:19:54<86:27:06, 8.71s/it] {'loss': 0.3565, 'grad_norm': 2.572575092315674, 'learning_rate': 3.888284724221641e-05, 'epoch': 1.34} 13%|█▎ | 5515/41250 [13:19:54<86:27:06, 8.71s/it][2025-04-25 21:17:37,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:17:37,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.12 | bwd_microstep: 5680.75 | bwd_inner_microstep: 5655.45 | bwd_allreduce_microstep: 25.25 | step_microstep: 18.76 [2025-04-25 21:17:37,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.12 | bwd: 5680.76 | bwd_inner: 5655.45 | bwd_allreduce: 25.26 | step: 18.76 13%|█▎ | 
5516/41250 [13:20:03<86:07:50, 8.68s/it] {'loss': 0.1748, 'grad_norm': 4.495077133178711, 'learning_rate': 3.8882329702399515e-05, 'epoch': 1.34} 13%|█▎ | 5516/41250 [13:20:03<86:07:50, 8.68s/it][2025-04-25 21:17:46,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:17:46,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.06 | bwd_microstep: 5743.56 | bwd_inner_microstep: 5684.40 | bwd_allreduce_microstep: 59.12 | step_microstep: 18.60 [2025-04-25 21:17:46,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.06 | bwd: 5743.58 | bwd_inner: 5684.40 | bwd_allreduce: 59.13 | step: 18.60 13%|█▎ | 5517/41250 [13:20:11<86:07:10, 8.68s/it] {'loss': 0.0926, 'grad_norm': 1.09340238571167, 'learning_rate': 3.888181204617677e-05, 'epoch': 1.34} 13%|█▎ | 5517/41250 [13:20:11<86:07:10, 8.68s/it][2025-04-25 21:17:55,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.09 | optimizer_step: 1.02 [2025-04-25 21:17:55,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.82 | bwd_microstep: 5757.18 | bwd_inner_microstep: 5681.19 | bwd_allreduce_microstep: 75.94 | step_microstep: 18.95 [2025-04-25 21:17:55,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.82 | bwd: 5757.20 | bwd_inner: 5681.19 | bwd_allreduce: 75.96 | step: 18.96 13%|█▎ | 5518/41250 [13:20:20<86:08:41, 8.68s/it] {'loss': 0.2007, 'grad_norm': 2.256058931350708, 'learning_rate': 3.8881294273551364e-05, 'epoch': 1.34} 13%|█▎ | 5518/41250 [13:20:20<86:08:41, 8.68s/it][2025-04-25 21:18:03,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:18:03,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.91 | bwd_microstep: 5753.09 | bwd_inner_microstep: 5657.56 | 
bwd_allreduce_microstep: 95.48 | step_microstep: 18.45 [2025-04-25 21:18:03,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.91 | bwd: 5753.10 | bwd_inner: 5657.56 | bwd_allreduce: 95.50 | step: 18.45 13%|█▎ | 5519/41250 [13:20:29<86:05:55, 8.67s/it] {'loss': 0.1747, 'grad_norm': 2.9966695308685303, 'learning_rate': 3.88807763845265e-05, 'epoch': 1.34} 13%|█▎ | 5519/41250 [13:20:29<86:05:55, 8.67s/it][2025-04-25 21:18:12,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 1.13 [2025-04-25 21:18:12,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.45 | bwd_microstep: 5694.43 | bwd_inner_microstep: 5641.46 | bwd_allreduce_microstep: 52.93 | step_microstep: 19.20 [2025-04-25 21:18:12,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.45 | bwd: 5694.45 | bwd_inner: 5641.46 | bwd_allreduce: 52.95 | step: 19.20 13%|█▎ | 5520/41250 [13:20:37<85:53:01, 8.65s/it] {'loss': 0.0564, 'grad_norm': 1.0483108758926392, 'learning_rate': 3.888025837910536e-05, 'epoch': 1.34} 13%|█▎ | 5520/41250 [13:20:37<85:53:01, 8.65s/it][2025-04-25 21:18:21,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 21:18:21,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.03 | bwd_microstep: 5762.23 | bwd_inner_microstep: 5680.85 | bwd_allreduce_microstep: 81.33 | step_microstep: 18.71 [2025-04-25 21:18:21,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.03 | bwd: 5762.24 | bwd_inner: 5680.85 | bwd_allreduce: 81.35 | step: 18.71 13%|█▎ | 5521/41250 [13:20:46<85:58:54, 8.66s/it] {'loss': 0.1182, 'grad_norm': 1.8858964443206787, 'learning_rate': 3.887974025729114e-05, 'epoch': 1.34} 13%|█▎ | 5521/41250 [13:20:46<85:58:54, 8.66s/it][2025-04-25 21:18:29,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:18:29,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.21 | bwd_microstep: 5709.28 | bwd_inner_microstep: 5643.15 | bwd_allreduce_microstep: 66.08 | step_microstep: 19.00 [2025-04-25 21:18:29,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.21 | bwd: 5709.30 | bwd_inner: 5643.15 | bwd_allreduce: 66.10 | step: 19.00 13%|█▎ | 5522/41250 [13:20:55<85:49:28, 8.65s/it] {'loss': 0.0219, 'grad_norm': 0.5126562118530273, 'learning_rate': 3.887922201908703e-05, 'epoch': 1.34} 13%|█▎ | 5522/41250 [13:20:55<85:49:28, 8.65s/it][2025-04-25 21:18:38,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 21:18:38,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.41 | bwd_microstep: 5713.59 | bwd_inner_microstep: 5700.80 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.55 [2025-04-25 21:18:38,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.41 | bwd: 5713.60 | bwd_inner: 5700.80 | bwd_allreduce: 12.76 | step: 18.55 13%|█▎ | 5523/41250 [13:21:03<85:50:53, 8.65s/it] {'loss': 0.0654, 'grad_norm': 1.146093726158142, 'learning_rate': 3.8878703664496235e-05, 'epoch': 1.34} 13%|█▎ | 5523/41250 [13:21:03<85:50:53, 8.65s/it][2025-04-25 21:18:47,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-25 21:18:47,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.20 | bwd_microstep: 5778.28 | bwd_inner_microstep: 5643.27 | bwd_allreduce_microstep: 134.96 | step_microstep: 18.87 [2025-04-25 21:18:47,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.20 | bwd: 5778.29 | bwd_inner: 5643.27 | bwd_allreduce: 134.98 | step: 18.88 13%|█▎ | 5524/41250 [13:21:12<85:57:07, 8.66s/it] 
{'loss': 0.0497, 'grad_norm': 0.916450560092926, 'learning_rate': 3.8878185193521945e-05, 'epoch': 1.34} 13%|█▎ | 5524/41250 [13:21:12<85:57:07, 8.66s/it][2025-04-25 21:18:55,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:18:55,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.41 | bwd_microstep: 5740.94 | bwd_inner_microstep: 5699.35 | bwd_allreduce_microstep: 41.55 | step_microstep: 18.71 [2025-04-25 21:18:55,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.41 | bwd: 5740.95 | bwd_inner: 5699.35 | bwd_allreduce: 41.56 | step: 18.71 13%|█▎ | 5525/41250 [13:21:21<85:59:42, 8.67s/it] {'loss': 0.0758, 'grad_norm': 1.8716577291488647, 'learning_rate': 3.8877666606167354e-05, 'epoch': 1.34} 13%|█▎ | 5525/41250 [13:21:21<85:59:42, 8.67s/it][2025-04-25 21:19:04,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:19:04,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.92 | bwd_microstep: 5761.76 | bwd_inner_microstep: 5666.14 | bwd_allreduce_microstep: 95.57 | step_microstep: 18.84 [2025-04-25 21:19:04,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.92 | bwd: 5761.77 | bwd_inner: 5666.14 | bwd_allreduce: 95.59 | step: 18.84 13%|█▎ | 5526/41250 [13:21:29<86:01:02, 8.67s/it] {'loss': 0.0528, 'grad_norm': 1.3208311796188354, 'learning_rate': 3.887714790243566e-05, 'epoch': 1.34} 13%|█▎ | 5526/41250 [13:21:29<86:01:02, 8.67s/it][2025-04-25 21:19:12,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:19:12,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.04 | bwd_microstep: 5697.67 | bwd_inner_microstep: 5646.18 | bwd_allreduce_microstep: 51.44 | 
step_microstep: 18.94 [2025-04-25 21:19:12,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.04 | bwd: 5697.69 | bwd_inner: 5646.18 | bwd_allreduce: 51.46 | step: 18.95 13%|█▎ | 5527/41250 [13:21:38<85:49:29, 8.65s/it] {'loss': 0.2225, 'grad_norm': 2.131061553955078, 'learning_rate': 3.887662908233007e-05, 'epoch': 1.34} 13%|█▎ | 5527/41250 [13:21:38<85:49:29, 8.65s/it][2025-04-25 21:19:21,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 0.98 [2025-04-25 21:19:21,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.06 | bwd_microstep: 5697.71 | bwd_inner_microstep: 5684.94 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.64 [2025-04-25 21:19:21,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.06 | bwd: 5697.73 | bwd_inner: 5684.94 | bwd_allreduce: 12.75 | step: 18.64 13%|█▎ | 5528/41250 [13:21:46<85:45:39, 8.64s/it] {'loss': 0.205, 'grad_norm': 2.88101863861084, 'learning_rate': 3.887611014585377e-05, 'epoch': 1.34} 13%|█▎ | 5528/41250 [13:21:46<85:45:39, 8.64s/it][2025-04-25 21:19:30,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.22 | optimizer_step: 1.04 [2025-04-25 21:19:30,299] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.30 | bwd_microstep: 5743.53 | bwd_inner_microstep: 5707.92 | bwd_allreduce_microstep: 35.55 | step_microstep: 19.82 [2025-04-25 21:19:30,299] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.30 | bwd: 5743.55 | bwd_inner: 5707.92 | bwd_allreduce: 35.58 | step: 19.82 13%|█▎ | 5529/41250 [13:21:55<85:51:58, 8.65s/it] {'loss': 0.1095, 'grad_norm': 4.274804592132568, 'learning_rate': 3.8875591093009964e-05, 'epoch': 1.34} 13%|█▎ | 5529/41250 [13:21:55<85:51:58, 8.65s/it][2025-04-25 21:19:38,964] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | 
optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:19:38,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.29 | bwd_microstep: 5731.51 | bwd_inner_microstep: 5688.97 | bwd_allreduce_microstep: 42.49 | step_microstep: 18.79 [2025-04-25 21:19:38,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.30 | bwd: 5731.52 | bwd_inner: 5688.97 | bwd_allreduce: 42.50 | step: 18.79 13%|█▎ | 5530/41250 [13:22:04<85:53:42, 8.66s/it] {'loss': 0.2012, 'grad_norm': 3.0580852031707764, 'learning_rate': 3.887507192380185e-05, 'epoch': 1.34} 13%|█▎ | 5530/41250 [13:22:04<85:53:42, 8.66s/it][2025-04-25 21:19:47,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 21:19:47,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.23 | bwd_microstep: 5788.61 | bwd_inner_microstep: 5775.82 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.99 [2025-04-25 21:19:47,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.23 | bwd: 5788.63 | bwd_inner: 5775.82 | bwd_allreduce: 12.76 | step: 19.00 13%|█▎ | 5531/41250 [13:22:13<86:10:58, 8.69s/it] {'loss': 0.1165, 'grad_norm': 2.864020347595215, 'learning_rate': 3.887455263823263e-05, 'epoch': 1.34} 13%|█▎ | 5531/41250 [13:22:13<86:10:58, 8.69s/it][2025-04-25 21:19:56,355] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:19:56,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.04 | bwd_microstep: 5713.93 | bwd_inner_microstep: 5691.68 | bwd_allreduce_microstep: 22.21 | step_microstep: 18.35 [2025-04-25 21:19:56,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.04 | bwd: 5713.94 | bwd_inner: 5691.68 | bwd_allreduce: 22.22 | step: 18.35 13%|█▎ | 5532/41250 [13:22:21<86:02:04, 8.67s/it] {'loss': 0.2305, 'grad_norm': 
1.948059320449829, 'learning_rate': 3.88740332363055e-05, 'epoch': 1.34} 13%|█▎ | 5532/41250 [13:22:21<86:02:04, 8.67s/it][2025-04-25 21:20:05,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:20:05,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.57 | bwd_microstep: 5736.55 | bwd_inner_microstep: 5677.01 | bwd_allreduce_microstep: 59.49 | step_microstep: 18.83 [2025-04-25 21:20:05,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.57 | bwd: 5736.56 | bwd_inner: 5677.01 | bwd_allreduce: 59.51 | step: 18.84 13%|█▎ | 5533/41250 [13:22:30<86:02:04, 8.67s/it] {'loss': 0.2975, 'grad_norm': 3.5976412296295166, 'learning_rate': 3.887351371802368e-05, 'epoch': 1.34} 13%|█▎ | 5533/41250 [13:22:30<86:02:04, 8.67s/it][2025-04-25 21:20:13,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 21:20:13,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.32 | bwd_microstep: 5690.16 | bwd_inner_microstep: 5655.69 | bwd_allreduce_microstep: 34.42 | step_microstep: 19.27 [2025-04-25 21:20:13,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.32 | bwd: 5690.17 | bwd_inner: 5655.69 | bwd_allreduce: 34.44 | step: 19.27 13%|█▎ | 5534/41250 [13:22:38<85:49:39, 8.65s/it] {'loss': 0.2349, 'grad_norm': 2.965714454650879, 'learning_rate': 3.8872994083390355e-05, 'epoch': 1.34} 13%|█▎ | 5534/41250 [13:22:38<85:49:39, 8.65s/it][2025-04-25 21:20:22,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-25 21:20:22,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.72 | bwd_microstep: 5731.18 | bwd_inner_microstep: 5700.17 | bwd_allreduce_microstep: 30.96 | step_microstep: 19.10 [2025-04-25 
21:20:22,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.72 | bwd: 5731.20 | bwd_inner: 5700.17 | bwd_allreduce: 30.98 | step: 19.10 13%|█▎ | 5535/41250 [13:22:47<85:53:12, 8.66s/it] {'loss': 0.0106, 'grad_norm': 0.1783975064754486, 'learning_rate': 3.887247433240873e-05, 'epoch': 1.34} 13%|█▎ | 5535/41250 [13:22:47<85:53:12, 8.66s/it][2025-04-25 21:20:31,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 21:20:31,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.24 | bwd_microstep: 5788.30 | bwd_inner_microstep: 5652.53 | bwd_allreduce_microstep: 135.71 | step_microstep: 18.97 [2025-04-25 21:20:31,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.24 | bwd: 5788.31 | bwd_inner: 5652.53 | bwd_allreduce: 135.74 | step: 18.97 13%|█▎ | 5536/41250 [13:22:56<86:00:35, 8.67s/it] {'loss': 0.1923, 'grad_norm': 3.8296260833740234, 'learning_rate': 3.887195446508202e-05, 'epoch': 1.34} 13%|█▎ | 5536/41250 [13:22:56<86:00:35, 8.67s/it][2025-04-25 21:20:39,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:20:39,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.13 | bwd_microstep: 5753.09 | bwd_inner_microstep: 5712.74 | bwd_allreduce_microstep: 40.31 | step_microstep: 18.52 [2025-04-25 21:20:39,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.14 | bwd: 5753.10 | bwd_inner: 5712.74 | bwd_allreduce: 40.32 | step: 18.53 13%|█▎ | 5537/41250 [13:23:05<86:04:11, 8.68s/it] {'loss': 0.1643, 'grad_norm': 3.2234268188476562, 'learning_rate': 3.887143448141341e-05, 'epoch': 1.34} 13%|█▎ | 5537/41250 [13:23:05<86:04:11, 8.68s/it][2025-04-25 21:20:48,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-25 21:20:48,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.18 | bwd_microstep: 5769.09 | bwd_inner_microstep: 5654.47 | bwd_allreduce_microstep: 114.57 | step_microstep: 18.64 [2025-04-25 21:20:48,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.18 | bwd: 5769.10 | bwd_inner: 5654.47 | bwd_allreduce: 114.59 | step: 18.64 13%|█▎ | 5538/41250 [13:23:13<86:03:54, 8.68s/it] {'loss': 0.2457, 'grad_norm': 2.760890483856201, 'learning_rate': 3.887091438140613e-05, 'epoch': 1.34} 13%|█▎ | 5538/41250 [13:23:13<86:03:54, 8.68s/it][2025-04-25 21:20:57,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 21:20:57,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.94 | bwd_microstep: 5768.57 | bwd_inner_microstep: 5667.18 | bwd_allreduce_microstep: 101.34 | step_microstep: 18.83 [2025-04-25 21:20:57,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.94 | bwd: 5768.58 | bwd_inner: 5667.18 | bwd_allreduce: 101.36 | step: 18.84 13%|█▎ | 5539/41250 [13:23:22<86:05:23, 8.68s/it] {'loss': 0.2305, 'grad_norm': 3.2994790077209473, 'learning_rate': 3.887039416506337e-05, 'epoch': 1.34} 13%|█▎ | 5539/41250 [13:23:22<86:05:23, 8.68s/it][2025-04-25 21:21:05,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 21:21:05,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.45 | bwd_microstep: 5738.47 | bwd_inner_microstep: 5710.92 | bwd_allreduce_microstep: 27.51 | step_microstep: 18.86 [2025-04-25 21:21:05,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.45 | bwd: 5738.49 | bwd_inner: 5710.92 | bwd_allreduce: 27.53 | step: 18.85 13%|█▎ | 5540/41250 [13:23:31<86:05:15, 8.68s/it] {'loss': 0.2486, 'grad_norm': 3.8228609561920166, 'learning_rate': 
3.886987383238834e-05, 'epoch': 1.34} 13%|█▎ | 5540/41250 [13:23:31<86:05:15, 8.68s/it][2025-04-25 21:21:14,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 21:21:14,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.51 | bwd_microstep: 5760.33 | bwd_inner_microstep: 5665.91 | bwd_allreduce_microstep: 94.37 | step_microstep: 18.81 [2025-04-25 21:21:14,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.51 | bwd: 5760.34 | bwd_inner: 5665.91 | bwd_allreduce: 94.39 | step: 18.81 13%|█▎ | 5541/41250 [13:23:39<86:04:43, 8.68s/it] {'loss': 0.0825, 'grad_norm': 2.0746686458587646, 'learning_rate': 3.886935338338425e-05, 'epoch': 1.34} 13%|█▎ | 5541/41250 [13:23:39<86:04:43, 8.68s/it][2025-04-25 21:21:23,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:21:23,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.29 | bwd_microstep: 5773.14 | bwd_inner_microstep: 5663.21 | bwd_allreduce_microstep: 109.88 | step_microstep: 18.66 [2025-04-25 21:21:23,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.28 | bwd: 5773.15 | bwd_inner: 5663.21 | bwd_allreduce: 109.90 | step: 18.66 13%|█▎ | 5542/41250 [13:23:48<86:05:56, 8.68s/it] {'loss': 0.0437, 'grad_norm': 1.3790768384933472, 'learning_rate': 3.886883281805431e-05, 'epoch': 1.34} 13%|█▎ | 5542/41250 [13:23:48<86:05:56, 8.68s/it][2025-04-25 21:21:31,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:21:31,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.21 | bwd_microstep: 5782.08 | bwd_inner_microstep: 5665.73 | bwd_allreduce_microstep: 116.30 | step_microstep: 18.76 [2025-04-25 21:21:31,791] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.21 | bwd: 5782.09 | bwd_inner: 5665.73 | bwd_allreduce: 116.32 | step: 18.76 13%|█▎ | 5543/41250 [13:23:57<86:08:37, 8.69s/it] {'loss': 0.0787, 'grad_norm': 1.313536524772644, 'learning_rate': 3.886831213640172e-05, 'epoch': 1.34} 13%|█▎ | 5543/41250 [13:23:57<86:08:37, 8.69s/it][2025-04-25 21:21:40,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:21:40,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.78 | bwd_microstep: 5733.02 | bwd_inner_microstep: 5720.25 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.84 [2025-04-25 21:21:40,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.78 | bwd: 5733.03 | bwd_inner: 5720.25 | bwd_allreduce: 12.75 | step: 18.84 13%|█▎ | 5544/41250 [13:24:05<86:06:02, 8.68s/it] {'loss': 0.1961, 'grad_norm': 2.2888612747192383, 'learning_rate': 3.886779133842971e-05, 'epoch': 1.34} 13%|█▎ | 5544/41250 [13:24:05<86:06:02, 8.68s/it][2025-04-25 21:21:49,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:21:49,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.55 | bwd_microstep: 5764.83 | bwd_inner_microstep: 5660.43 | bwd_allreduce_microstep: 104.36 | step_microstep: 18.75 [2025-04-25 21:21:49,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.55 | bwd: 5764.84 | bwd_inner: 5660.43 | bwd_allreduce: 104.38 | step: 18.75 13%|█▎ | 5545/41250 [13:24:14<86:05:47, 8.68s/it] {'loss': 0.1472, 'grad_norm': 2.335383415222168, 'learning_rate': 3.8867270424141466e-05, 'epoch': 1.34} 13%|█▎ | 5545/41250 [13:24:14<86:05:47, 8.68s/it][2025-04-25 21:21:57,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 
21:21:57,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.33 | bwd_microstep: 5771.82 | bwd_inner_microstep: 5658.47 | bwd_allreduce_microstep: 113.29 | step_microstep: 19.06 [2025-04-25 21:21:57,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.33 | bwd: 5771.84 | bwd_inner: 5658.47 | bwd_allreduce: 113.32 | step: 19.06 13%|█▎ | 5546/41250 [13:24:23<86:06:56, 8.68s/it] {'loss': 0.0791, 'grad_norm': 1.3003511428833008, 'learning_rate': 3.886674939354022e-05, 'epoch': 1.34} 13%|█▎ | 5546/41250 [13:24:23<86:06:56, 8.68s/it][2025-04-25 21:22:06,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.29 | optimizer_step: 1.05 [2025-04-25 21:22:06,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.63 | bwd_microstep: 5747.29 | bwd_inner_microstep: 5725.60 | bwd_allreduce_microstep: 21.63 | step_microstep: 19.89 [2025-04-25 21:22:06,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.63 | bwd: 5747.31 | bwd_inner: 5725.60 | bwd_allreduce: 21.66 | step: 19.89 13%|█▎ | 5547/41250 [13:24:31<86:07:32, 8.68s/it] {'loss': 0.1619, 'grad_norm': 2.5763514041900635, 'learning_rate': 3.886622824662917e-05, 'epoch': 1.34} 13%|█▎ | 5547/41250 [13:24:31<86:07:32, 8.68s/it][2025-04-25 21:22:15,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:22:15,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.02 | bwd_microstep: 5717.66 | bwd_inner_microstep: 5704.90 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.73 [2025-04-25 21:22:15,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.02 | bwd: 5717.68 | bwd_inner: 5704.90 | bwd_allreduce: 12.73 | step: 18.74 13%|█▎ | 5548/41250 [13:24:40<86:02:29, 8.68s/it] {'loss': 0.5752, 'grad_norm': 3.633366584777832, 'learning_rate': 3.886570698341153e-05, 
'epoch': 1.34} 13%|█▎ | 5548/41250 [13:24:40<86:02:29, 8.68s/it][2025-04-25 21:22:23,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 21:22:23,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.39 | bwd_microstep: 5798.58 | bwd_inner_microstep: 5658.31 | bwd_allreduce_microstep: 140.21 | step_microstep: 18.98 [2025-04-25 21:22:23,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.39 | bwd: 5798.59 | bwd_inner: 5658.31 | bwd_allreduce: 140.24 | step: 18.98 13%|█▎ | 5549/41250 [13:24:49<86:09:47, 8.69s/it] {'loss': 0.0297, 'grad_norm': 0.6086517572402954, 'learning_rate': 3.886518560389053e-05, 'epoch': 1.35} 13%|█▎ | 5549/41250 [13:24:49<86:09:47, 8.69s/it][2025-04-25 21:22:32,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:22:32,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.73 | bwd_microstep: 5714.12 | bwd_inner_microstep: 5667.74 | bwd_allreduce_microstep: 46.34 | step_microstep: 18.85 [2025-04-25 21:22:32,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.73 | bwd: 5714.14 | bwd_inner: 5667.74 | bwd_allreduce: 46.36 | step: 18.85 13%|█▎ | 5550/41250 [13:24:57<86:01:42, 8.68s/it] {'loss': 0.2043, 'grad_norm': 2.834888219833374, 'learning_rate': 3.886466410806936e-05, 'epoch': 1.35} 13%|█▎ | 5550/41250 [13:24:57<86:01:42, 8.68s/it][2025-04-25 21:22:41,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:22:41,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.49 | bwd_microstep: 5755.28 | bwd_inner_microstep: 5712.50 | bwd_allreduce_microstep: 42.73 | step_microstep: 18.77 [2025-04-25 21:22:41,241] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2864.49 | bwd: 5755.29 | bwd_inner: 5712.50 | bwd_allreduce: 42.75 | step: 18.77 13%|█▎ | 5551/41250 [13:25:06<86:07:04, 8.68s/it] {'loss': 0.2453, 'grad_norm': 3.4232537746429443, 'learning_rate': 3.886414249595125e-05, 'epoch': 1.35} 13%|█▎ | 5551/41250 [13:25:06<86:07:04, 8.68s/it][2025-04-25 21:22:49,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:22:49,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.20 | bwd_microstep: 5774.40 | bwd_inner_microstep: 5675.42 | bwd_allreduce_microstep: 98.93 | step_microstep: 18.79 [2025-04-25 21:22:49,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.20 | bwd: 5774.41 | bwd_inner: 5675.42 | bwd_allreduce: 98.95 | step: 18.80 13%|█▎ | 5552/41250 [13:25:15<86:08:08, 8.69s/it] {'loss': 0.2157, 'grad_norm': 3.613065242767334, 'learning_rate': 3.8863620767539406e-05, 'epoch': 1.35} 13%|█▎ | 5552/41250 [13:25:15<86:08:08, 8.69s/it][2025-04-25 21:22:58,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:22:58,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.10 | bwd_microstep: 5731.36 | bwd_inner_microstep: 5657.55 | bwd_allreduce_microstep: 73.76 | step_microstep: 18.49 [2025-04-25 21:22:58,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.10 | bwd: 5731.37 | bwd_inner: 5657.55 | bwd_allreduce: 73.78 | step: 18.49 13%|█▎ | 5553/41250 [13:25:23<86:01:25, 8.68s/it] {'loss': 0.1398, 'grad_norm': 2.1698226928710938, 'learning_rate': 3.8863098922837055e-05, 'epoch': 1.35} 13%|█▎ | 5553/41250 [13:25:23<86:01:25, 8.68s/it][2025-04-25 21:23:07,299] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-25 21:23:07,300] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2853.11 | bwd_microstep: 5781.75 | bwd_inner_microstep: 5703.87 | bwd_allreduce_microstep: 77.83 | step_microstep: 18.94 [2025-04-25 21:23:07,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.11 | bwd: 5781.76 | bwd_inner: 5703.87 | bwd_allreduce: 77.85 | step: 18.95 13%|█▎ | 5554/41250 [13:25:32<86:08:40, 8.69s/it] {'loss': 0.1704, 'grad_norm': 1.6090606451034546, 'learning_rate': 3.886257696184741e-05, 'epoch': 1.35} 13%|█▎ | 5554/41250 [13:25:32<86:08:40, 8.69s/it][2025-04-25 21:23:15,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:23:15,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.21 | bwd_microstep: 5714.36 | bwd_inner_microstep: 5675.69 | bwd_allreduce_microstep: 38.62 | step_microstep: 18.61 [2025-04-25 21:23:15,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.21 | bwd: 5714.37 | bwd_inner: 5675.69 | bwd_allreduce: 38.64 | step: 18.62 13%|█▎ | 5555/41250 [13:25:41<85:58:30, 8.67s/it] {'loss': 0.2387, 'grad_norm': 2.0865395069122314, 'learning_rate': 3.886205488457369e-05, 'epoch': 1.35} 13%|█▎ | 5555/41250 [13:25:41<85:58:30, 8.67s/it][2025-04-25 21:23:24,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:23:24,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.58 | bwd_microstep: 5796.94 | bwd_inner_microstep: 5659.29 | bwd_allreduce_microstep: 137.60 | step_microstep: 18.75 [2025-04-25 21:23:24,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.58 | bwd: 5796.96 | bwd_inner: 5659.29 | bwd_allreduce: 137.62 | step: 18.75 13%|█▎ | 5556/41250 [13:25:49<86:04:35, 8.68s/it] {'loss': 0.1746, 'grad_norm': 3.1179513931274414, 'learning_rate': 3.88615326910191e-05, 'epoch': 1.35} 13%|█▎ | 5556/41250 
[13:25:49<86:04:35, 8.68s/it][2025-04-25 21:23:33,359] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:23:33,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.70 | bwd_microstep: 5790.15 | bwd_inner_microstep: 5710.25 | bwd_allreduce_microstep: 79.85 | step_microstep: 18.63 [2025-04-25 21:23:33,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.70 | bwd: 5790.17 | bwd_inner: 5710.25 | bwd_allreduce: 79.87 | step: 18.63 13%|█▎ | 5557/41250 [13:25:58<86:12:18, 8.69s/it] {'loss': 0.2462, 'grad_norm': 3.1114351749420166, 'learning_rate': 3.886101038118688e-05, 'epoch': 1.35} 13%|█▎ | 5557/41250 [13:25:58<86:12:18, 8.69s/it][2025-04-25 21:23:42,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:23:42,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.32 | bwd_microstep: 5764.99 | bwd_inner_microstep: 5705.83 | bwd_allreduce_microstep: 59.11 | step_microstep: 18.90 [2025-04-25 21:23:42,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.32 | bwd: 5765.01 | bwd_inner: 5705.83 | bwd_allreduce: 59.13 | step: 18.91 13%|█▎ | 5558/41250 [13:26:07<86:12:41, 8.70s/it] {'loss': 0.0692, 'grad_norm': 1.680681824684143, 'learning_rate': 3.886048795508024e-05, 'epoch': 1.35} 13%|█▎ | 5558/41250 [13:26:07<86:12:41, 8.70s/it][2025-04-25 21:23:50,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:23:50,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.07 | bwd_microstep: 5730.05 | bwd_inner_microstep: 5655.09 | bwd_allreduce_microstep: 74.91 | step_microstep: 18.74 [2025-04-25 21:23:50,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.07 | bwd: 5730.06 | 
bwd_inner: 5655.09 | bwd_allreduce: 74.93 | step: 18.74 13%|█▎ | 5559/41250 [13:26:16<86:02:18, 8.68s/it] {'loss': 0.0768, 'grad_norm': 1.6565593481063843, 'learning_rate': 3.8859965412702406e-05, 'epoch': 1.35} 13%|█▎ | 5559/41250 [13:26:16<86:02:18, 8.68s/it][2025-04-25 21:23:59,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 21:23:59,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.38 | bwd_microstep: 5780.87 | bwd_inner_microstep: 5656.68 | bwd_allreduce_microstep: 124.14 | step_microstep: 19.17 [2025-04-25 21:23:59,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.38 | bwd: 5780.88 | bwd_inner: 5656.68 | bwd_allreduce: 124.16 | step: 19.17 13%|█▎ | 5560/41250 [13:26:24<86:05:43, 8.68s/it] {'loss': 0.0212, 'grad_norm': 0.5249252915382385, 'learning_rate': 3.885944275405659e-05, 'epoch': 1.35} 13%|█▎ | 5560/41250 [13:26:24<86:05:43, 8.68s/it][2025-04-25 21:24:08,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 21:24:08,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.04 | bwd_microstep: 5719.28 | bwd_inner_microstep: 5651.07 | bwd_allreduce_microstep: 68.16 | step_microstep: 18.69 [2025-04-25 21:24:08,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.04 | bwd: 5719.29 | bwd_inner: 5651.07 | bwd_allreduce: 68.18 | step: 18.70 13%|█▎ | 5561/41250 [13:26:33<85:55:14, 8.67s/it] {'loss': 0.1919, 'grad_norm': 1.2650532722473145, 'learning_rate': 3.885891997914602e-05, 'epoch': 1.35} 13%|█▎ | 5561/41250 [13:26:33<85:55:14, 8.67s/it][2025-04-25 21:24:16,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 21:24:16,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2825.00 | bwd_microstep: 5785.76 | bwd_inner_microstep: 5644.03 | bwd_allreduce_microstep: 141.69 | step_microstep: 18.94 [2025-04-25 21:24:16,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.00 | bwd: 5785.78 | bwd_inner: 5644.03 | bwd_allreduce: 141.71 | step: 18.95 13%|█▎ | 5562/41250 [13:26:42<86:00:05, 8.68s/it] {'loss': 0.0541, 'grad_norm': 0.892282247543335, 'learning_rate': 3.885839708797392e-05, 'epoch': 1.35} 13%|█▎ | 5562/41250 [13:26:42<86:00:05, 8.68s/it][2025-04-25 21:24:25,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-25 21:24:25,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.74 | bwd_microstep: 5749.65 | bwd_inner_microstep: 5707.20 | bwd_allreduce_microstep: 42.39 | step_microstep: 18.86 [2025-04-25 21:24:25,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.74 | bwd: 5749.66 | bwd_inner: 5707.20 | bwd_allreduce: 42.42 | step: 18.86 13%|█▎ | 5563/41250 [13:26:50<86:01:09, 8.68s/it] {'loss': 0.1442, 'grad_norm': 3.5981218814849854, 'learning_rate': 3.8857874080543504e-05, 'epoch': 1.35} 13%|█▎ | 5563/41250 [13:26:50<86:01:09, 8.68s/it][2025-04-25 21:24:34,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:24:34,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.31 | bwd_microstep: 5768.31 | bwd_inner_microstep: 5656.47 | bwd_allreduce_microstep: 111.81 | step_microstep: 18.52 [2025-04-25 21:24:34,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.31 | bwd: 5768.33 | bwd_inner: 5656.47 | bwd_allreduce: 111.82 | step: 18.53 13%|█▎ | 5564/41250 [13:26:59<86:01:25, 8.68s/it] {'loss': 0.107, 'grad_norm': 1.8996095657348633, 'learning_rate': 3.885735095685801e-05, 'epoch': 1.35} 13%|█▎ | 5564/41250 [13:26:59<86:01:25, 
8.68s/it][2025-04-25 21:24:42,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.04 | optimizer_step: 1.07 [2025-04-25 21:24:42,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.90 | bwd_microstep: 5715.99 | bwd_inner_microstep: 5659.39 | bwd_allreduce_microstep: 56.56 | step_microstep: 18.71 [2025-04-25 21:24:42,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.90 | bwd: 5716.01 | bwd_inner: 5659.39 | bwd_allreduce: 56.58 | step: 18.71 13%|█▎ | 5565/41250 [13:27:08<85:52:55, 8.66s/it] {'loss': 0.2075, 'grad_norm': 1.5018328428268433, 'learning_rate': 3.8856827716920654e-05, 'epoch': 1.35} 13%|█▎ | 5565/41250 [13:27:08<85:52:55, 8.66s/it][2025-04-25 21:24:51,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:24:51,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.11 | bwd_microstep: 5737.06 | bwd_inner_microstep: 5698.78 | bwd_allreduce_microstep: 38.23 | step_microstep: 18.88 [2025-04-25 21:24:51,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.11 | bwd: 5737.07 | bwd_inner: 5698.78 | bwd_allreduce: 38.24 | step: 18.89 13%|█▎ | 5566/41250 [13:27:16<85:54:42, 8.67s/it] {'loss': 0.0599, 'grad_norm': 2.3577446937561035, 'learning_rate': 3.885630436073466e-05, 'epoch': 1.35} 13%|█▎ | 5566/41250 [13:27:16<85:54:42, 8.67s/it][2025-04-25 21:25:00,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:25:00,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.39 | bwd_microstep: 5790.00 | bwd_inner_microstep: 5777.18 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.74 [2025-04-25 21:25:00,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.39 | bwd: 5790.02 | bwd_inner: 5777.18 | 
bwd_allreduce: 12.79 | step: 18.74 13%|█▎ | 5567/41250 [13:27:25<86:12:04, 8.70s/it] {'loss': 0.1543, 'grad_norm': 2.3263967037200928, 'learning_rate': 3.8855780888303266e-05, 'epoch': 1.35} 13%|█▎ | 5567/41250 [13:27:25<86:12:04, 8.70s/it][2025-04-25 21:25:08,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:25:08,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.50 | bwd_microstep: 5708.60 | bwd_inner_microstep: 5651.38 | bwd_allreduce_microstep: 57.18 | step_microstep: 18.84 [2025-04-25 21:25:08,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.50 | bwd: 5708.62 | bwd_inner: 5651.38 | bwd_allreduce: 57.20 | step: 18.84 13%|█▎ | 5568/41250 [13:27:34<85:57:01, 8.67s/it] {'loss': 0.1221, 'grad_norm': 3.1319596767425537, 'learning_rate': 3.885525729962969e-05, 'epoch': 1.35} 13%|█▎ | 5568/41250 [13:27:34<85:57:01, 8.67s/it][2025-04-25 21:25:17,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:25:17,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.13 | bwd_microstep: 5748.50 | bwd_inner_microstep: 5656.13 | bwd_allreduce_microstep: 92.33 | step_microstep: 18.77 [2025-04-25 21:25:17,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.13 | bwd: 5748.51 | bwd_inner: 5656.13 | bwd_allreduce: 92.34 | step: 18.78 14%|█▎ | 5569/41250 [13:27:42<85:55:31, 8.67s/it] {'loss': 0.0627, 'grad_norm': 1.277498722076416, 'learning_rate': 3.885473359471716e-05, 'epoch': 1.35} 14%|█▎ | 5569/41250 [13:27:42<85:55:31, 8.67s/it][2025-04-25 21:25:26,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:25:26,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.11 | 
bwd_microstep: 5700.88 | bwd_inner_microstep: 5688.06 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.77 [2025-04-25 21:25:26,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.11 | bwd: 5700.89 | bwd_inner: 5688.06 | bwd_allreduce: 12.79 | step: 18.77 14%|█▎ | 5570/41250 [13:27:51<85:48:07, 8.66s/it] {'loss': 0.0432, 'grad_norm': 0.6865904331207275, 'learning_rate': 3.88542097735689e-05, 'epoch': 1.35} 14%|█▎ | 5570/41250 [13:27:51<85:48:07, 8.66s/it][2025-04-25 21:25:34,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 21:25:34,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.43 | bwd_microstep: 5712.42 | bwd_inner_microstep: 5699.57 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.62 [2025-04-25 21:25:34,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.43 | bwd: 5712.44 | bwd_inner: 5699.57 | bwd_allreduce: 12.82 | step: 18.63 14%|█▎ | 5571/41250 [13:28:00<85:46:37, 8.65s/it] {'loss': 0.1115, 'grad_norm': 1.7200937271118164, 'learning_rate': 3.885368583618816e-05, 'epoch': 1.35} 14%|█▎ | 5571/41250 [13:28:00<85:46:37, 8.65s/it][2025-04-25 21:25:43,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 21:25:43,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.00 | bwd_microstep: 6013.48 | bwd_inner_microstep: 5700.62 | bwd_allreduce_microstep: 312.82 | step_microstep: 19.18 [2025-04-25 21:25:43,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.00 | bwd: 6013.50 | bwd_inner: 5700.62 | bwd_allreduce: 312.84 | step: 19.18 14%|█▎ | 5572/41250 [13:28:08<86:39:06, 8.74s/it] {'loss': 0.1158, 'grad_norm': 1.838423728942871, 'learning_rate': 3.885316178257814e-05, 'epoch': 1.35} 14%|█▎ | 5572/41250 [13:28:08<86:39:06, 8.74s/it][2025-04-25 21:25:52,352] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:25:52,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.86 | bwd_microstep: 5786.86 | bwd_inner_microstep: 5645.06 | bwd_allreduce_microstep: 141.76 | step_microstep: 18.57 [2025-04-25 21:25:52,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.86 | bwd: 5786.88 | bwd_inner: 5645.06 | bwd_allreduce: 141.78 | step: 18.57 14%|█▎ | 5573/41250 [13:28:17<86:30:25, 8.73s/it] {'loss': 0.1738, 'grad_norm': 2.2174878120422363, 'learning_rate': 3.885263761274209e-05, 'epoch': 1.35} 14%|█▎ | 5573/41250 [13:28:17<86:30:25, 8.73s/it][2025-04-25 21:26:01,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:26:01,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.18 | bwd_microstep: 5756.65 | bwd_inner_microstep: 5654.07 | bwd_allreduce_microstep: 102.53 | step_microstep: 18.55 [2025-04-25 21:26:01,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.18 | bwd: 5756.66 | bwd_inner: 5654.07 | bwd_allreduce: 102.55 | step: 18.55 14%|█▎ | 5574/41250 [13:28:26<86:18:40, 8.71s/it] {'loss': 0.0585, 'grad_norm': 1.440537929534912, 'learning_rate': 3.8852113326683236e-05, 'epoch': 1.35} 14%|█▎ | 5574/41250 [13:28:26<86:18:40, 8.71s/it][2025-04-25 21:26:09,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 21:26:09,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.70 | bwd_microstep: 5860.01 | bwd_inner_microstep: 5686.57 | bwd_allreduce_microstep: 173.39 | step_microstep: 18.73 [2025-04-25 21:26:09,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.70 | bwd: 5860.02 | bwd_inner: 5686.57 | bwd_allreduce: 173.41 | step: 18.73 
14%|█▎ | 5575/41250 [13:28:35<86:32:08, 8.73s/it] {'loss': 0.1101, 'grad_norm': 2.3798370361328125, 'learning_rate': 3.885158892440481e-05, 'epoch': 1.35} 14%|█▎ | 5575/41250 [13:28:35<86:32:08, 8.73s/it][2025-04-25 21:26:18,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:26:18,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.21 | bwd_microstep: 5695.20 | bwd_inner_microstep: 5682.44 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.83 [2025-04-25 21:26:18,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.21 | bwd: 5695.21 | bwd_inner: 5682.44 | bwd_allreduce: 12.74 | step: 18.83 14%|█▎ | 5576/41250 [13:28:43<86:14:05, 8.70s/it] {'loss': 0.133, 'grad_norm': 2.703328847885132, 'learning_rate': 3.885106440591006e-05, 'epoch': 1.35} 14%|█▎ | 5576/41250 [13:28:43<86:14:05, 8.70s/it][2025-04-25 21:26:27,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-25 21:26:27,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.99 | bwd_microstep: 5784.56 | bwd_inner_microstep: 5771.87 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.75 [2025-04-25 21:26:27,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.99 | bwd: 5784.58 | bwd_inner: 5771.87 | bwd_allreduce: 12.67 | step: 18.75 14%|█▎ | 5577/41250 [13:28:52<86:22:19, 8.72s/it] {'loss': 0.1849, 'grad_norm': 1.9333152770996094, 'learning_rate': 3.885053977120219e-05, 'epoch': 1.35} 14%|█▎ | 5577/41250 [13:28:52<86:22:19, 8.72s/it][2025-04-25 21:26:35,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:26:35,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.34 | bwd_microstep: 5767.19 | bwd_inner_microstep: 
5649.89 | bwd_allreduce_microstep: 117.25 | step_microstep: 18.40 [2025-04-25 21:26:35,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.34 | bwd: 5767.20 | bwd_inner: 5649.89 | bwd_allreduce: 117.26 | step: 18.40 14%|█▎ | 5578/41250 [13:29:01<86:14:02, 8.70s/it] {'loss': 0.1415, 'grad_norm': 3.4903173446655273, 'learning_rate': 3.885001502028446e-05, 'epoch': 1.35} 14%|█▎ | 5578/41250 [13:29:01<86:14:02, 8.70s/it][2025-04-25 21:26:44,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.24 | optimizer_step: 1.04 [2025-04-25 21:26:44,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.42 | bwd_microstep: 5758.70 | bwd_inner_microstep: 5658.04 | bwd_allreduce_microstep: 100.59 | step_microstep: 19.68 [2025-04-25 21:26:44,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.42 | bwd: 5758.71 | bwd_inner: 5658.04 | bwd_allreduce: 100.62 | step: 19.68 14%|█▎ | 5579/41250 [13:29:09<86:08:41, 8.69s/it] {'loss': 0.1173, 'grad_norm': 1.816239833831787, 'learning_rate': 3.884949015316009e-05, 'epoch': 1.35} 14%|█▎ | 5579/41250 [13:29:09<86:08:41, 8.69s/it][2025-04-25 21:26:53,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-25 21:26:53,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.85 | bwd_microstep: 5767.04 | bwd_inner_microstep: 5646.05 | bwd_allreduce_microstep: 120.95 | step_microstep: 18.94 [2025-04-25 21:26:53,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.85 | bwd: 5767.05 | bwd_inner: 5646.05 | bwd_allreduce: 120.96 | step: 18.95 14%|█▎ | 5580/41250 [13:29:18<86:05:29, 8.69s/it] {'loss': 0.0802, 'grad_norm': 1.1241098642349243, 'learning_rate': 3.884896516983232e-05, 'epoch': 1.35} 14%|█▎ | 5580/41250 [13:29:18<86:05:29, 8.69s/it][2025-04-25 21:27:01,835] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:27:01,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.83 | bwd_microstep: 5698.11 | bwd_inner_microstep: 5685.29 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.61 [2025-04-25 21:27:01,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.84 | bwd: 5698.12 | bwd_inner: 5685.29 | bwd_allreduce: 12.78 | step: 18.61 14%|█▎ | 5581/41250 [13:29:27<85:55:03, 8.67s/it] {'loss': 0.1845, 'grad_norm': 4.35075044631958, 'learning_rate': 3.884844007030439e-05, 'epoch': 1.35} 14%|█▎ | 5581/41250 [13:29:27<85:55:03, 8.67s/it][2025-04-25 21:27:10,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:27:10,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.60 | bwd_microstep: 5758.44 | bwd_inner_microstep: 5641.62 | bwd_allreduce_microstep: 116.77 | step_microstep: 18.94 [2025-04-25 21:27:10,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.60 | bwd: 5758.45 | bwd_inner: 5641.62 | bwd_allreduce: 116.79 | step: 18.95 14%|█▎ | 5582/41250 [13:29:35<85:53:48, 8.67s/it] {'loss': 0.1141, 'grad_norm': 2.520479917526245, 'learning_rate': 3.884791485457953e-05, 'epoch': 1.35} 14%|█▎ | 5582/41250 [13:29:35<85:53:48, 8.67s/it][2025-04-25 21:27:19,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:27:19,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.13 | bwd_microstep: 5731.41 | bwd_inner_microstep: 5685.07 | bwd_allreduce_microstep: 46.29 | step_microstep: 18.38 [2025-04-25 21:27:19,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.13 | bwd: 5731.42 | bwd_inner: 5685.07 | bwd_allreduce: 46.31 | step: 18.39 14%|█▎ | 5583/41250 [13:29:44<85:51:42, 
8.67s/it] {'loss': 0.1906, 'grad_norm': 1.4924067258834839, 'learning_rate': 3.884738952266099e-05, 'epoch': 1.35} 14%|█▎ | 5583/41250 [13:29:44<85:51:42, 8.67s/it][2025-04-25 21:27:27,814] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-25 21:27:27,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.75 | bwd_microstep: 5722.92 | bwd_inner_microstep: 5709.86 | bwd_allreduce_microstep: 13.02 | step_microstep: 19.46 [2025-04-25 21:27:27,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.75 | bwd: 5722.94 | bwd_inner: 5709.86 | bwd_allreduce: 13.04 | step: 19.46 14%|█▎ | 5584/41250 [13:29:53<85:50:01, 8.66s/it] {'loss': 0.1322, 'grad_norm': 6.728465557098389, 'learning_rate': 3.8846864074551995e-05, 'epoch': 1.35} 14%|█▎ | 5584/41250 [13:29:53<85:50:01, 8.66s/it][2025-04-25 21:27:36,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.08 | optimizer_step: 0.94 [2025-04-25 21:27:36,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.96 | bwd_microstep: 5853.83 | bwd_inner_microstep: 5672.65 | bwd_allreduce_microstep: 181.12 | step_microstep: 19.27 [2025-04-25 21:27:36,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.96 | bwd: 5853.85 | bwd_inner: 5672.65 | bwd_allreduce: 181.15 | step: 19.27 14%|█▎ | 5585/41250 [13:30:01<86:11:06, 8.70s/it] {'loss': 0.0825, 'grad_norm': 1.4726125001907349, 'learning_rate': 3.884633851025579e-05, 'epoch': 1.35} 14%|█▎ | 5585/41250 [13:30:01<86:11:06, 8.70s/it][2025-04-25 21:27:45,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.25 | optimizer_step: 0.91 [2025-04-25 21:27:45,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.93 | bwd_microstep: 5698.67 | bwd_inner_microstep: 5685.20 | bwd_allreduce_microstep: 
13.40 | step_microstep: 19.48 [2025-04-25 21:27:45,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.93 | bwd: 5698.68 | bwd_inner: 5685.20 | bwd_allreduce: 13.43 | step: 19.48 14%|█▎ | 5586/41250 [13:30:10<85:58:02, 8.68s/it] {'loss': 0.0845, 'grad_norm': 1.978248119354248, 'learning_rate': 3.884581282977561e-05, 'epoch': 1.35} 14%|█▎ | 5586/41250 [13:30:10<85:58:02, 8.68s/it][2025-04-25 21:27:53,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.24 | optimizer_step: 1.03 [2025-04-25 21:27:53,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.17 | bwd_microstep: 5748.19 | bwd_inner_microstep: 5686.73 | bwd_allreduce_microstep: 61.41 | step_microstep: 19.43 [2025-04-25 21:27:53,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.17 | bwd: 5748.21 | bwd_inner: 5686.73 | bwd_allreduce: 61.43 | step: 19.43 14%|█▎ | 5587/41250 [13:30:19<85:58:14, 8.68s/it] {'loss': 0.2495, 'grad_norm': 3.417552947998047, 'learning_rate': 3.884528703311471e-05, 'epoch': 1.35} 14%|█▎ | 5587/41250 [13:30:19<85:58:14, 8.68s/it][2025-04-25 21:28:02,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:28:02,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.12 | bwd_microstep: 5758.54 | bwd_inner_microstep: 5692.86 | bwd_allreduce_microstep: 65.64 | step_microstep: 19.02 [2025-04-25 21:28:02,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.12 | bwd: 5758.55 | bwd_inner: 5692.86 | bwd_allreduce: 65.65 | step: 19.03 14%|█▎ | 5588/41250 [13:30:27<86:01:57, 8.68s/it] {'loss': 0.1455, 'grad_norm': 3.415832757949829, 'learning_rate': 3.8844761120276325e-05, 'epoch': 1.35} 14%|█▎ | 5588/41250 [13:30:27<86:01:57, 8.68s/it][2025-04-25 21:28:11,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | 
optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:28:11,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.61 | bwd_microstep: 5726.74 | bwd_inner_microstep: 5700.05 | bwd_allreduce_microstep: 26.63 | step_microstep: 18.76 [2025-04-25 21:28:11,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.61 | bwd: 5726.76 | bwd_inner: 5700.05 | bwd_allreduce: 26.65 | step: 18.76 14%|█▎ | 5589/41250 [13:30:36<85:58:16, 8.68s/it] {'loss': 0.1193, 'grad_norm': 1.8688690662384033, 'learning_rate': 3.8844235091263686e-05, 'epoch': 1.35} 14%|█▎ | 5589/41250 [13:30:36<85:58:16, 8.68s/it][2025-04-25 21:28:19,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 21:28:19,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.57 | bwd_microstep: 5713.23 | bwd_inner_microstep: 5700.48 | bwd_allreduce_microstep: 12.70 | step_microstep: 19.09 [2025-04-25 21:28:19,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.57 | bwd: 5713.24 | bwd_inner: 5700.48 | bwd_allreduce: 12.72 | step: 19.09 14%|█▎ | 5590/41250 [13:30:45<85:52:38, 8.67s/it] {'loss': 0.0401, 'grad_norm': 1.4990702867507935, 'learning_rate': 3.8843708946080044e-05, 'epoch': 1.36} 14%|█▎ | 5590/41250 [13:30:45<85:52:38, 8.67s/it][2025-04-25 21:28:28,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-25 21:28:28,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.88 | bwd_microstep: 5690.70 | bwd_inner_microstep: 5656.50 | bwd_allreduce_microstep: 34.15 | step_microstep: 18.86 [2025-04-25 21:28:28,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.88 | bwd: 5690.72 | bwd_inner: 5656.50 | bwd_allreduce: 34.17 | step: 18.86 14%|█▎ | 5591/41250 [13:30:53<85:40:32, 8.65s/it] {'loss': 0.0842, 'grad_norm': 
2.3593077659606934, 'learning_rate': 3.8843182684728646e-05, 'epoch': 1.36} 14%|█▎ | 5591/41250 [13:30:53<85:40:32, 8.65s/it][2025-04-25 21:28:37,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:28:37,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.50 | bwd_microstep: 5752.78 | bwd_inner_microstep: 5705.48 | bwd_allreduce_microstep: 47.26 | step_microstep: 18.75 [2025-04-25 21:28:37,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.50 | bwd: 5752.80 | bwd_inner: 5705.48 | bwd_allreduce: 47.28 | step: 18.75 14%|█▎ | 5592/41250 [13:31:02<85:47:27, 8.66s/it] {'loss': 0.0656, 'grad_norm': 1.8343713283538818, 'learning_rate': 3.884265630721273e-05, 'epoch': 1.36} 14%|█▎ | 5592/41250 [13:31:02<85:47:27, 8.66s/it][2025-04-25 21:28:45,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:28:45,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.09 | bwd_microstep: 5783.03 | bwd_inner_microstep: 5654.45 | bwd_allreduce_microstep: 128.54 | step_microstep: 18.70 [2025-04-25 21:28:45,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.09 | bwd: 5783.05 | bwd_inner: 5654.45 | bwd_allreduce: 128.56 | step: 18.70 14%|█▎ | 5593/41250 [13:31:11<85:53:05, 8.67s/it] {'loss': 0.291, 'grad_norm': 2.758741617202759, 'learning_rate': 3.884212981353555e-05, 'epoch': 1.36} 14%|█▎ | 5593/41250 [13:31:11<85:53:05, 8.67s/it][2025-04-25 21:28:54,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 21:28:54,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.75 | bwd_microstep: 5740.14 | bwd_inner_microstep: 5711.94 | bwd_allreduce_microstep: 28.15 | step_microstep: 19.12 [2025-04-25 
21:28:54,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.75 | bwd: 5740.16 | bwd_inner: 5711.95 | bwd_allreduce: 28.17 | step: 19.12 14%|█▎ | 5594/41250 [13:31:19<85:54:12, 8.67s/it] {'loss': 0.1629, 'grad_norm': 3.3252358436584473, 'learning_rate': 3.884160320370034e-05, 'epoch': 1.36} 14%|█▎ | 5594/41250 [13:31:19<85:54:12, 8.67s/it][2025-04-25 21:29:03,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:29:03,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.71 | bwd_microstep: 5704.50 | bwd_inner_microstep: 5691.72 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.54 [2025-04-25 21:29:03,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.71 | bwd: 5704.51 | bwd_inner: 5691.72 | bwd_allreduce: 12.75 | step: 18.54 14%|█▎ | 5595/41250 [13:31:28<85:49:26, 8.67s/it] {'loss': 0.6303, 'grad_norm': 3.311692714691162, 'learning_rate': 3.884107647771036e-05, 'epoch': 1.36} 14%|█▎ | 5595/41250 [13:31:28<85:49:26, 8.67s/it][2025-04-25 21:29:11,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:29:11,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.64 | bwd_microstep: 5740.80 | bwd_inner_microstep: 5701.62 | bwd_allreduce_microstep: 39.13 | step_microstep: 18.54 [2025-04-25 21:29:11,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.64 | bwd: 5740.81 | bwd_inner: 5701.62 | bwd_allreduce: 39.15 | step: 18.54 14%|█▎ | 5596/41250 [13:31:37<85:54:55, 8.67s/it] {'loss': 0.0599, 'grad_norm': 1.2621138095855713, 'learning_rate': 3.884054963556884e-05, 'epoch': 1.36} 14%|█▎ | 5596/41250 [13:31:37<85:54:55, 8.67s/it][2025-04-25 21:29:20,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.95 | optimizer_step: 0.89 
[2025-04-25 21:29:20,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.33 | bwd_microstep: 5777.62 | bwd_inner_microstep: 5718.03 | bwd_allreduce_microstep: 59.55 | step_microstep: 17.93 [2025-04-25 21:29:20,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.33 | bwd: 5777.63 | bwd_inner: 5718.03 | bwd_allreduce: 59.56 | step: 17.94 14%|█▎ | 5597/41250 [13:31:45<86:02:54, 8.69s/it] {'loss': 0.0523, 'grad_norm': 0.9731487035751343, 'learning_rate': 3.8840022677279035e-05, 'epoch': 1.36} 14%|█▎ | 5597/41250 [13:31:45<86:02:54, 8.69s/it][2025-04-25 21:29:29,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:29:29,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.07 | bwd_microstep: 5714.37 | bwd_inner_microstep: 5663.01 | bwd_allreduce_microstep: 51.31 | step_microstep: 18.59 [2025-04-25 21:29:29,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.07 | bwd: 5714.38 | bwd_inner: 5663.01 | bwd_allreduce: 51.33 | step: 18.59 14%|█▎ | 5598/41250 [13:31:54<85:53:44, 8.67s/it] {'loss': 0.0523, 'grad_norm': 1.1931345462799072, 'learning_rate': 3.8839495602844204e-05, 'epoch': 1.36} 14%|█▎ | 5598/41250 [13:31:54<85:53:44, 8.67s/it][2025-04-25 21:29:37,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 1.09 [2025-04-25 21:29:37,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.61 | bwd_microstep: 5767.31 | bwd_inner_microstep: 5657.52 | bwd_allreduce_microstep: 109.75 | step_microstep: 18.90 [2025-04-25 21:29:37,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.61 | bwd: 5767.33 | bwd_inner: 5657.52 | bwd_allreduce: 109.77 | step: 18.90 14%|█▎ | 5599/41250 [13:32:03<85:55:06, 8.68s/it] {'loss': 0.3435, 'grad_norm': 2.04844069480896, 'learning_rate': 
3.883896841226758e-05, 'epoch': 1.36} 14%|█▎ | 5599/41250 [13:32:03<85:55:06, 8.68s/it][2025-04-25 21:29:46,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 21:29:46,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.53 | bwd_microstep: 5775.77 | bwd_inner_microstep: 5666.03 | bwd_allreduce_microstep: 109.69 | step_microstep: 18.80 [2025-04-25 21:29:46,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.53 | bwd: 5775.78 | bwd_inner: 5666.03 | bwd_allreduce: 109.71 | step: 18.80 14%|█▎ | 5600/41250 [13:32:11<85:57:44, 8.68s/it] {'loss': 0.0447, 'grad_norm': 1.2559137344360352, 'learning_rate': 3.883844110555243e-05, 'epoch': 1.36} 14%|█▎ | 5600/41250 [13:32:11<85:57:44, 8.68s/it][2025-04-25 21:29:55,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:29:55,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.34 | bwd_microstep: 5796.10 | bwd_inner_microstep: 5667.55 | bwd_allreduce_microstep: 128.51 | step_microstep: 18.34 [2025-04-25 21:29:55,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.34 | bwd: 5796.11 | bwd_inner: 5667.55 | bwd_allreduce: 128.52 | step: 18.34 14%|█▎ | 5601/41250 [13:32:20<86:03:59, 8.69s/it] {'loss': 0.1002, 'grad_norm': 2.6007039546966553, 'learning_rate': 3.883791368270199e-05, 'epoch': 1.36} 14%|█▎ | 5601/41250 [13:32:20<86:03:59, 8.69s/it][2025-04-25 21:30:04,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:30:04,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.63 | bwd_microstep: 5719.85 | bwd_inner_microstep: 5707.26 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.52 [2025-04-25 21:30:04,040] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.63 | bwd: 5719.87 | bwd_inner: 5707.26 | bwd_allreduce: 12.56 | step: 18.52 14%|█▎ | 5602/41250 [13:32:29<85:59:06, 8.68s/it] {'loss': 0.2588, 'grad_norm': 3.5359280109405518, 'learning_rate': 3.883738614371952e-05, 'epoch': 1.36} 14%|█▎ | 5602/41250 [13:32:29<85:59:06, 8.68s/it][2025-04-25 21:30:12,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.08 | optimizer_step: 1.05 [2025-04-25 21:30:12,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.28 | bwd_microstep: 5747.63 | bwd_inner_microstep: 5653.73 | bwd_allreduce_microstep: 93.85 | step_microstep: 19.66 [2025-04-25 21:30:12,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.28 | bwd: 5747.64 | bwd_inner: 5653.73 | bwd_allreduce: 93.86 | step: 19.67 14%|█▎ | 5603/41250 [13:32:38<85:55:29, 8.68s/it] {'loss': 0.192, 'grad_norm': 2.546175479888916, 'learning_rate': 3.883685848860827e-05, 'epoch': 1.36} 14%|█▎ | 5603/41250 [13:32:38<85:55:29, 8.68s/it][2025-04-25 21:30:21,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:30:21,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.96 | bwd_microstep: 5928.61 | bwd_inner_microstep: 5670.26 | bwd_allreduce_microstep: 258.29 | step_microstep: 19.03 [2025-04-25 21:30:21,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.96 | bwd: 5928.62 | bwd_inner: 5670.26 | bwd_allreduce: 258.31 | step: 19.04 14%|█▎ | 5604/41250 [13:32:46<86:25:30, 8.73s/it] {'loss': 0.0367, 'grad_norm': 0.7083260416984558, 'learning_rate': 3.883633071737149e-05, 'epoch': 1.36} 14%|█▎ | 5604/41250 [13:32:46<86:25:30, 8.73s/it][2025-04-25 21:30:30,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 
21:30:30,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.72 | bwd_microstep: 5737.96 | bwd_inner_microstep: 5651.31 | bwd_allreduce_microstep: 86.60 | step_microstep: 19.01 [2025-04-25 21:30:30,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.72 | bwd: 5737.97 | bwd_inner: 5651.31 | bwd_allreduce: 86.62 | step: 19.01 14%|█▎ | 5605/41250 [13:32:55<86:10:45, 8.70s/it] {'loss': 0.0228, 'grad_norm': 0.9621867537498474, 'learning_rate': 3.883580283001244e-05, 'epoch': 1.36} 14%|█▎ | 5605/41250 [13:32:55<86:10:45, 8.70s/it][2025-04-25 21:30:38,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 1.10 [2025-04-25 21:30:38,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.12 | bwd_microstep: 5726.72 | bwd_inner_microstep: 5713.84 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.42 [2025-04-25 21:30:38,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.12 | bwd: 5726.73 | bwd_inner: 5713.84 | bwd_allreduce: 12.85 | step: 19.42 14%|█▎ | 5606/41250 [13:33:04<86:04:58, 8.69s/it] {'loss': 0.0923, 'grad_norm': 0.9281152486801147, 'learning_rate': 3.883527482653436e-05, 'epoch': 1.36} 14%|█▎ | 5606/41250 [13:33:04<86:04:58, 8.69s/it][2025-04-25 21:30:47,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 0.92 [2025-04-25 21:30:47,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.76 | bwd_microstep: 5922.53 | bwd_inner_microstep: 5713.13 | bwd_allreduce_microstep: 209.34 | step_microstep: 19.08 [2025-04-25 21:30:47,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.76 | bwd: 5922.54 | bwd_inner: 5713.13 | bwd_allreduce: 209.36 | step: 19.08 14%|█▎ | 5607/41250 [13:33:13<86:34:44, 8.74s/it] {'loss': 0.1534, 'grad_norm': 2.35833477973938, 'learning_rate': 3.883474670694053e-05, 
'epoch': 1.36} 14%|█▎ | 5607/41250 [13:33:13<86:34:44, 8.74s/it][2025-04-25 21:30:56,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:30:56,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.62 | bwd_microstep: 5712.23 | bwd_inner_microstep: 5699.37 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.10 [2025-04-25 21:30:56,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.62 | bwd: 5712.24 | bwd_inner: 5699.37 | bwd_allreduce: 12.83 | step: 19.10 14%|█▎ | 5608/41250 [13:33:21<86:17:04, 8.72s/it] {'loss': 0.1491, 'grad_norm': 1.6757112741470337, 'learning_rate': 3.883421847123419e-05, 'epoch': 1.36} 14%|█▎ | 5608/41250 [13:33:21<86:17:04, 8.72s/it][2025-04-25 21:31:05,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:31:05,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.91 | bwd_microstep: 5788.13 | bwd_inner_microstep: 5720.02 | bwd_allreduce_microstep: 68.07 | step_microstep: 18.36 [2025-04-25 21:31:05,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.91 | bwd: 5788.15 | bwd_inner: 5720.02 | bwd_allreduce: 68.09 | step: 18.36 14%|█▎ | 5609/41250 [13:33:30<86:20:00, 8.72s/it] {'loss': 0.0352, 'grad_norm': 0.9720394611358643, 'learning_rate': 3.883369011941859e-05, 'epoch': 1.36} 14%|█▎ | 5609/41250 [13:33:30<86:20:00, 8.72s/it][2025-04-25 21:31:13,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:31:13,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.08 | bwd_microstep: 5892.73 | bwd_inner_microstep: 5700.25 | bwd_allreduce_microstep: 192.44 | step_microstep: 18.41 [2025-04-25 21:31:13,932] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2845.08 | bwd: 5892.74 | bwd_inner: 5700.25 | bwd_allreduce: 192.45 | step: 18.41 14%|█▎ | 5610/41250 [13:33:39<86:37:39, 8.75s/it] {'loss': 0.0305, 'grad_norm': 1.1314795017242432, 'learning_rate': 3.883316165149699e-05, 'epoch': 1.36} 14%|█▎ | 5610/41250 [13:33:39<86:37:39, 8.75s/it][2025-04-25 21:31:22,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.26 | optimizer_step: 1.00 [2025-04-25 21:31:22,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.58 | bwd_microstep: 5727.06 | bwd_inner_microstep: 5658.86 | bwd_allreduce_microstep: 68.16 | step_microstep: 19.53 [2025-04-25 21:31:22,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.58 | bwd: 5727.08 | bwd_inner: 5658.86 | bwd_allreduce: 68.18 | step: 19.53 14%|█▎ | 5611/41250 [13:33:47<86:17:56, 8.72s/it] {'loss': 0.2084, 'grad_norm': 3.372615337371826, 'learning_rate': 3.883263306747267e-05, 'epoch': 1.36} 14%|█▎ | 5611/41250 [13:33:47<86:17:56, 8.72s/it][2025-04-25 21:31:31,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 21:31:31,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.98 | bwd_microstep: 5776.23 | bwd_inner_microstep: 5763.61 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.59 [2025-04-25 21:31:31,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.98 | bwd: 5776.24 | bwd_inner: 5763.61 | bwd_allreduce: 12.59 | step: 18.59 14%|█▎ | 5612/41250 [13:33:56<86:22:04, 8.72s/it] {'loss': 0.0922, 'grad_norm': 1.7531616687774658, 'learning_rate': 3.883210436734885e-05, 'epoch': 1.36} 14%|█▎ | 5612/41250 [13:33:56<86:22:04, 8.72s/it][2025-04-25 21:31:40,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:31:40,019] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2848.93 | bwd_microstep: 5774.87 | bwd_inner_microstep: 5699.14 | bwd_allreduce_microstep: 75.69 | step_microstep: 18.49 [2025-04-25 21:31:40,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.93 | bwd: 5774.89 | bwd_inner: 5699.14 | bwd_allreduce: 75.71 | step: 18.49 14%|█▎ | 5613/41250 [13:34:05<86:18:35, 8.72s/it] {'loss': 0.0322, 'grad_norm': 1.0576574802398682, 'learning_rate': 3.8831575551128825e-05, 'epoch': 1.36} 14%|█▎ | 5613/41250 [13:34:05<86:18:35, 8.72s/it][2025-04-25 21:31:48,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 21:31:48,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.70 | bwd_microstep: 5776.02 | bwd_inner_microstep: 5653.38 | bwd_allreduce_microstep: 122.60 | step_microstep: 18.55 [2025-04-25 21:31:48,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.70 | bwd: 5776.04 | bwd_inner: 5653.38 | bwd_allreduce: 122.62 | step: 18.55 14%|█▎ | 5614/41250 [13:34:14<86:12:36, 8.71s/it] {'loss': 0.0144, 'grad_norm': 0.4161290228366852, 'learning_rate': 3.883104661881584e-05, 'epoch': 1.36} 14%|█▎ | 5614/41250 [13:34:14<86:12:36, 8.71s/it][2025-04-25 21:31:57,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:31:57,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.61 | bwd_microstep: 5685.06 | bwd_inner_microstep: 5667.34 | bwd_allreduce_microstep: 17.68 | step_microstep: 18.43 [2025-04-25 21:31:57,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.61 | bwd: 5685.08 | bwd_inner: 5667.34 | bwd_allreduce: 17.69 | step: 18.43 14%|█▎ | 5615/41250 [13:34:22<85:52:24, 8.68s/it] {'loss': 0.0374, 'grad_norm': 2.3548338413238525, 'learning_rate': 3.8830517570413156e-05, 'epoch': 1.36} 14%|█▎ | 5615/41250 
[13:34:22<85:52:24, 8.68s/it][2025-04-25 21:32:06,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 21:32:06,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.52 | bwd_microstep: 5874.18 | bwd_inner_microstep: 5688.04 | bwd_allreduce_microstep: 186.09 | step_microstep: 18.38 [2025-04-25 21:32:06,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.52 | bwd: 5874.19 | bwd_inner: 5688.04 | bwd_allreduce: 186.11 | step: 18.38 14%|█▎ | 5616/41250 [13:34:31<86:14:25, 8.71s/it] {'loss': 0.0979, 'grad_norm': 4.407697677612305, 'learning_rate': 3.8829988405924026e-05, 'epoch': 1.36} 14%|█▎ | 5616/41250 [13:34:31<86:14:25, 8.71s/it][2025-04-25 21:32:14,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.02 | optimizer_step: 1.13 [2025-04-25 21:32:14,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.04 | bwd_microstep: 5718.07 | bwd_inner_microstep: 5705.15 | bwd_allreduce_microstep: 12.86 | step_microstep: 19.20 [2025-04-25 21:32:14,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.04 | bwd: 5718.08 | bwd_inner: 5705.15 | bwd_allreduce: 12.88 | step: 19.20 14%|█▎ | 5617/41250 [13:34:40<86:02:28, 8.69s/it] {'loss': 0.0205, 'grad_norm': 0.4558219015598297, 'learning_rate': 3.8829459125351734e-05, 'epoch': 1.36} 14%|█▎ | 5617/41250 [13:34:40<86:02:28, 8.69s/it][2025-04-25 21:32:23,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:32:23,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.53 | bwd_microstep: 5790.04 | bwd_inner_microstep: 5777.09 | bwd_allreduce_microstep: 12.90 | step_microstep: 18.71 [2025-04-25 21:32:23,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.53 | bwd: 5790.05 
| bwd_inner: 5777.09 | bwd_allreduce: 12.92 | step: 18.71 14%|█▎ | 5618/41250 [13:34:48<86:14:10, 8.71s/it] {'loss': 0.0637, 'grad_norm': 1.9162484407424927, 'learning_rate': 3.882892972869952e-05, 'epoch': 1.36} 14%|█▎ | 5618/41250 [13:34:48<86:14:10, 8.71s/it][2025-04-25 21:32:32,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:32:32,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.80 | bwd_microstep: 5714.59 | bwd_inner_microstep: 5651.60 | bwd_allreduce_microstep: 62.95 | step_microstep: 18.41 [2025-04-25 21:32:32,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.80 | bwd: 5714.61 | bwd_inner: 5651.60 | bwd_allreduce: 62.97 | step: 18.41 14%|█▎ | 5619/41250 [13:34:57<85:57:27, 8.68s/it] {'loss': 0.3918, 'grad_norm': 2.5789434909820557, 'learning_rate': 3.882840021597066e-05, 'epoch': 1.36} 14%|█▎ | 5619/41250 [13:34:57<85:57:27, 8.68s/it][2025-04-25 21:32:40,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 21:32:40,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.49 | bwd_microstep: 5730.48 | bwd_inner_microstep: 5683.84 | bwd_allreduce_microstep: 46.60 | step_microstep: 18.89 [2025-04-25 21:32:40,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.49 | bwd: 5730.49 | bwd_inner: 5683.84 | bwd_allreduce: 46.62 | step: 18.90 14%|█▎ | 5620/41250 [13:35:06<85:51:47, 8.68s/it] {'loss': 0.1586, 'grad_norm': 4.1691789627075195, 'learning_rate': 3.882787058716842e-05, 'epoch': 1.36} 14%|█▎ | 5620/41250 [13:35:06<85:51:47, 8.68s/it][2025-04-25 21:32:49,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 21:32:49,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2827.32 | bwd_microstep: 5772.14 | bwd_inner_microstep: 5653.09 | bwd_allreduce_microstep: 119.00 | step_microstep: 18.87 [2025-04-25 21:32:49,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.32 | bwd: 5772.15 | bwd_inner: 5653.09 | bwd_allreduce: 119.02 | step: 18.87 14%|█▎ | 5621/41250 [13:35:14<85:52:57, 8.68s/it] {'loss': 0.0928, 'grad_norm': 3.250216484069824, 'learning_rate': 3.882734084229606e-05, 'epoch': 1.36} 14%|█▎ | 5621/41250 [13:35:14<85:52:57, 8.68s/it][2025-04-25 21:32:58,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 21:32:58,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.24 | bwd_microstep: 5729.53 | bwd_inner_microstep: 5710.33 | bwd_allreduce_microstep: 19.14 | step_microstep: 19.59 [2025-04-25 21:32:58,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.24 | bwd: 5729.54 | bwd_inner: 5710.33 | bwd_allreduce: 19.16 | step: 19.60 14%|█▎ | 5622/41250 [13:35:23<85:50:41, 8.67s/it] {'loss': 0.0594, 'grad_norm': 1.2396906614303589, 'learning_rate': 3.882681098135685e-05, 'epoch': 1.36} 14%|█▎ | 5622/41250 [13:35:23<85:50:41, 8.67s/it][2025-04-25 21:33:06,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 21:33:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.54 | bwd_microstep: 5722.68 | bwd_inner_microstep: 5703.30 | bwd_allreduce_microstep: 19.32 | step_microstep: 19.11 [2025-04-25 21:33:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.54 | bwd: 5722.69 | bwd_inner: 5703.30 | bwd_allreduce: 19.35 | step: 19.11 14%|█▎ | 5623/41250 [13:35:32<85:48:09, 8.67s/it] {'loss': 0.1831, 'grad_norm': 2.304884195327759, 'learning_rate': 3.882628100435404e-05, 'epoch': 1.36} 14%|█▎ | 5623/41250 [13:35:32<85:48:09, 8.67s/it][2025-04-25 
21:33:15,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 21:33:15,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.83 | bwd_microstep: 5729.90 | bwd_inner_microstep: 5695.26 | bwd_allreduce_microstep: 34.59 | step_microstep: 18.92 [2025-04-25 21:33:15,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.83 | bwd: 5729.91 | bwd_inner: 5695.26 | bwd_allreduce: 34.61 | step: 18.92 14%|█▎ | 5624/41250 [13:35:40<85:46:36, 8.67s/it] {'loss': 0.4025, 'grad_norm': 8.447113990783691, 'learning_rate': 3.882575091129092e-05, 'epoch': 1.36} 14%|█▎ | 5624/41250 [13:35:40<85:46:36, 8.67s/it][2025-04-25 21:33:24,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:33:24,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.43 | bwd_microstep: 5790.69 | bwd_inner_microstep: 5777.96 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.78 [2025-04-25 21:33:24,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.43 | bwd: 5790.70 | bwd_inner: 5777.96 | bwd_allreduce: 12.70 | step: 18.79 14%|█▎ | 5625/41250 [13:35:49<86:02:56, 8.70s/it] {'loss': 0.0373, 'grad_norm': 1.1791044473648071, 'learning_rate': 3.882522070217075e-05, 'epoch': 1.36} 14%|█▎ | 5625/41250 [13:35:49<86:02:56, 8.70s/it][2025-04-25 21:33:32,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:33:32,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.18 | bwd_microstep: 5720.14 | bwd_inner_microstep: 5697.93 | bwd_allreduce_microstep: 22.17 | step_microstep: 18.84 [2025-04-25 21:33:32,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.18 | bwd: 5720.15 | bwd_inner: 5697.93 | bwd_allreduce: 22.18 | 
step: 18.85 14%|█▎ | 5626/41250 [13:35:58<85:54:48, 8.68s/it] {'loss': 0.1065, 'grad_norm': 1.727655291557312, 'learning_rate': 3.8824690376996794e-05, 'epoch': 1.36} 14%|█▎ | 5626/41250 [13:35:58<85:54:48, 8.68s/it][2025-04-25 21:33:41,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-25 21:33:41,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.43 | bwd_microstep: 5853.96 | bwd_inner_microstep: 5647.29 | bwd_allreduce_microstep: 206.61 | step_microstep: 19.42 [2025-04-25 21:33:41,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.43 | bwd: 5853.97 | bwd_inner: 5647.29 | bwd_allreduce: 206.64 | step: 19.42 14%|█▎ | 5627/41250 [13:36:06<86:09:00, 8.71s/it] {'loss': 0.087, 'grad_norm': 2.512007474899292, 'learning_rate': 3.882415993577232e-05, 'epoch': 1.36} 14%|█▎ | 5627/41250 [13:36:06<86:09:00, 8.71s/it][2025-04-25 21:33:50,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:33:50,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.38 | bwd_microstep: 5733.90 | bwd_inner_microstep: 5694.34 | bwd_allreduce_microstep: 39.51 | step_microstep: 18.56 [2025-04-25 21:33:50,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.38 | bwd: 5733.92 | bwd_inner: 5694.34 | bwd_allreduce: 39.53 | step: 18.56 14%|█▎ | 5628/41250 [13:36:15<86:02:09, 8.69s/it] {'loss': 0.0793, 'grad_norm': 1.9861080646514893, 'learning_rate': 3.8823629378500614e-05, 'epoch': 1.36} 14%|█▎ | 5628/41250 [13:36:15<86:02:09, 8.69s/it][2025-04-25 21:33:58,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 21:33:58,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.68 | bwd_microstep: 5690.10 | 
bwd_inner_microstep: 5648.62 | bwd_allreduce_microstep: 41.44 | step_microstep: 19.12 [2025-04-25 21:33:58,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.68 | bwd: 5690.12 | bwd_inner: 5648.62 | bwd_allreduce: 41.46 | step: 19.12 14%|█▎ | 5629/41250 [13:36:24<85:44:37, 8.67s/it] {'loss': 0.4645, 'grad_norm': 2.7274010181427, 'learning_rate': 3.8823098705184926e-05, 'epoch': 1.36} 14%|█▎ | 5629/41250 [13:36:24<85:44:37, 8.67s/it][2025-04-25 21:34:07,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.06 | optimizer_step: 1.07 [2025-04-25 21:34:07,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5759.89 | bwd_inner_microstep: 5648.14 | bwd_allreduce_microstep: 111.70 | step_microstep: 19.25 [2025-04-25 21:34:07,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.00 | bwd: 5759.91 | bwd_inner: 5648.14 | bwd_allreduce: 111.73 | step: 19.25 14%|█▎ | 5630/41250 [13:36:32<85:44:58, 8.67s/it] {'loss': 0.0441, 'grad_norm': 0.8993282318115234, 'learning_rate': 3.882256791582854e-05, 'epoch': 1.36} 14%|█▎ | 5630/41250 [13:36:32<85:44:58, 8.67s/it][2025-04-25 21:34:16,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:34:16,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.93 | bwd_microstep: 5707.60 | bwd_inner_microstep: 5650.59 | bwd_allreduce_microstep: 56.96 | step_microstep: 18.75 [2025-04-25 21:34:16,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.93 | bwd: 5707.61 | bwd_inner: 5650.59 | bwd_allreduce: 56.98 | step: 18.75 14%|█▎ | 5631/41250 [13:36:41<85:36:01, 8.65s/it] {'loss': 0.115, 'grad_norm': 2.43778395652771, 'learning_rate': 3.882203701043473e-05, 'epoch': 1.37} 14%|█▎ | 5631/41250 [13:36:41<85:36:01, 8.65s/it][mov,mp4,m4a,3gp,3g2,mj2 @ 0xc90db80] moov atom not found 
[21:34:16] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Allvideos/Animate/00776.mp4, Invalid data found when processing input
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00776.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00776.mp4...
[2025-04-25 21:34:24,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:34:24,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.06 | bwd_microstep: 5705.26 | bwd_inner_microstep: 5692.26 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.79 [2025-04-25 21:34:24,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.06 | bwd: 5705.27 | bwd_inner: 5692.26 | bwd_allreduce: 12.97 | step: 18.79 14%|█▎ | 5632/41250 [13:36:50<85:32:10, 8.65s/it] {'loss': 0.2289, 'grad_norm': 1.743869423866272, 'learning_rate': 3.8821505989006756e-05, 'epoch': 1.37} 14%|█▎ | 5632/41250 [13:36:50<85:32:10, 8.65s/it][2025-04-25 21:34:33,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-25 21:34:33,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.31 | bwd_microstep: 5716.69 | bwd_inner_microstep: 5703.99 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.63 [2025-04-25 21:34:33,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.31 | bwd: 5716.70 | bwd_inner: 5703.99 | bwd_allreduce: 12.67 | step: 18.63 14%|█▎ | 5633/41250 [13:36:58<85:33:16, 8.65s/it] {'loss': 0.4112, 'grad_norm': 4.213136672973633, 'learning_rate': 3.8820974851547906e-05, 'epoch': 1.37} 14%|█▎ | 5633/41250 [13:36:58<85:33:16, 8.65s/it][2025-04-25 21:34:42,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 21:34:42,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.95 | bwd_microstep: 5755.96 | bwd_inner_microstep: 5641.66 | bwd_allreduce_microstep: 114.25 | step_microstep: 18.96 [2025-04-25 21:34:42,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.95 | bwd: 5755.97 | bwd_inner: 5641.66 | 
bwd_allreduce: 114.27 | step: 18.97 14%|█▎ | 5634/41250 [13:37:07<85:36:04, 8.65s/it] {'loss': 0.1015, 'grad_norm': 2.4715452194213867, 'learning_rate': 3.882044359806144e-05, 'epoch': 1.37} 14%|█▎ | 5634/41250 [13:37:07<85:36:04, 8.65s/it][2025-04-25 21:34:50,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:34:50,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.66 | bwd_microstep: 5693.51 | bwd_inner_microstep: 5680.45 | bwd_allreduce_microstep: 13.01 | step_microstep: 18.63 [2025-04-25 21:34:50,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.66 | bwd: 5693.52 | bwd_inner: 5680.45 | bwd_allreduce: 13.03 | step: 18.63 14%|█▎ | 5635/41250 [13:37:16<85:29:52, 8.64s/it] {'loss': 0.0965, 'grad_norm': 1.0410650968551636, 'learning_rate': 3.881991222855065e-05, 'epoch': 1.37} 14%|█▎ | 5635/41250 [13:37:16<85:29:52, 8.64s/it][2025-04-25 21:34:59,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 21:34:59,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.30 | bwd_microstep: 5679.73 | bwd_inner_microstep: 5642.74 | bwd_allreduce_microstep: 36.94 | step_microstep: 18.62 [2025-04-25 21:34:59,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.30 | bwd: 5679.74 | bwd_inner: 5642.74 | bwd_allreduce: 36.95 | step: 18.63 14%|█▎ | 5636/41250 [13:37:24<85:19:31, 8.63s/it] {'loss': 0.2283, 'grad_norm': 2.855778217315674, 'learning_rate': 3.8819380743018794e-05, 'epoch': 1.37} 14%|█▎ | 5636/41250 [13:37:24<85:19:31, 8.63s/it][2025-04-25 21:35:07,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-25 21:35:07,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.14 | 
bwd_microstep: 5757.89 | bwd_inner_microstep: 5655.91 | bwd_allreduce_microstep: 101.93 | step_microstep: 18.35 [2025-04-25 21:35:07,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.14 | bwd: 5757.90 | bwd_inner: 5655.91 | bwd_allreduce: 101.95 | step: 18.35 14%|█▎ | 5637/41250 [13:37:33<85:27:31, 8.64s/it] {'loss': 0.3016, 'grad_norm': 2.748706579208374, 'learning_rate': 3.881884914146916e-05, 'epoch': 1.37} 14%|█▎ | 5637/41250 [13:37:33<85:27:31, 8.64s/it][2025-04-25 21:35:16,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 21:35:16,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.84 | bwd_microstep: 5755.97 | bwd_inner_microstep: 5648.88 | bwd_allreduce_microstep: 107.04 | step_microstep: 18.95 [2025-04-25 21:35:16,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.84 | bwd: 5755.99 | bwd_inner: 5648.88 | bwd_allreduce: 107.06 | step: 18.96 14%|█▎ | 5638/41250 [13:37:41<85:31:50, 8.65s/it] {'loss': 0.0908, 'grad_norm': 1.8227765560150146, 'learning_rate': 3.8818317423905025e-05, 'epoch': 1.37} 14%|█▎ | 5638/41250 [13:37:41<85:31:50, 8.65s/it][2025-04-25 21:35:25,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:35:25,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.82 | bwd_microstep: 5743.02 | bwd_inner_microstep: 5639.58 | bwd_allreduce_microstep: 103.38 | step_microstep: 18.63 [2025-04-25 21:35:25,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.82 | bwd: 5743.03 | bwd_inner: 5639.58 | bwd_allreduce: 103.40 | step: 18.63 14%|█▎ | 5639/41250 [13:37:50<85:31:32, 8.65s/it] {'loss': 0.0968, 'grad_norm': 1.524330735206604, 'learning_rate': 3.8817785590329664e-05, 'epoch': 1.37} 14%|█▎ | 5639/41250 [13:37:50<85:31:32, 8.65s/it][2025-04-25 21:35:33,992] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:35:33,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.48 | bwd_microstep: 5758.09 | bwd_inner_microstep: 5679.35 | bwd_allreduce_microstep: 78.69 | step_microstep: 18.75 [2025-04-25 21:35:33,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.48 | bwd: 5758.10 | bwd_inner: 5679.35 | bwd_allreduce: 78.71 | step: 18.75 14%|█▎ | 5640/41250 [13:37:59<85:38:39, 8.66s/it] {'loss': 0.2818, 'grad_norm': 3.4086015224456787, 'learning_rate': 3.881725364074636e-05, 'epoch': 1.37} 14%|█▎ | 5640/41250 [13:37:59<85:38:39, 8.66s/it][2025-04-25 21:35:42,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 21:35:42,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.42 | bwd_microstep: 5706.12 | bwd_inner_microstep: 5659.84 | bwd_allreduce_microstep: 46.23 | step_microstep: 18.82 [2025-04-25 21:35:42,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.42 | bwd: 5706.14 | bwd_inner: 5659.84 | bwd_allreduce: 46.25 | step: 18.82 14%|█▎ | 5641/41250 [13:38:07<85:32:19, 8.65s/it] {'loss': 0.1329, 'grad_norm': 3.8678267002105713, 'learning_rate': 3.881672157515838e-05, 'epoch': 1.37} 14%|█▎ | 5641/41250 [13:38:07<85:32:19, 8.65s/it][2025-04-25 21:35:51,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 21:35:51,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.16 | bwd_microstep: 5704.94 | bwd_inner_microstep: 5649.66 | bwd_allreduce_microstep: 55.23 | step_microstep: 18.96 [2025-04-25 21:35:51,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.16 | bwd: 5704.95 | bwd_inner: 5649.66 | bwd_allreduce: 55.25 | step: 18.96 
14%|█▎ | 5642/41250 [13:38:16<85:26:11, 8.64s/it] {'loss': 0.3987, 'grad_norm': 3.1546993255615234, 'learning_rate': 3.881618939356901e-05, 'epoch': 1.37} 14%|█▎ | 5642/41250 [13:38:16<85:26:11, 8.64s/it][2025-04-25 21:35:59,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:35:59,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.13 | bwd_microstep: 5781.27 | bwd_inner_microstep: 5639.21 | bwd_allreduce_microstep: 142.02 | step_microstep: 18.72 [2025-04-25 21:35:59,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.13 | bwd: 5781.29 | bwd_inner: 5639.21 | bwd_allreduce: 142.04 | step: 18.72 14%|█▎ | 5643/41250 [13:38:25<85:34:41, 8.65s/it] {'loss': 0.2714, 'grad_norm': 4.207535266876221, 'learning_rate': 3.881565709598154e-05, 'epoch': 1.37} 14%|█▎ | 5643/41250 [13:38:25<85:34:41, 8.65s/it][2025-04-25 21:36:08,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:36:08,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.90 | bwd_microstep: 5761.39 | bwd_inner_microstep: 5705.23 | bwd_allreduce_microstep: 56.12 | step_microstep: 18.66 [2025-04-25 21:36:08,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.90 | bwd: 5761.41 | bwd_inner: 5705.23 | bwd_allreduce: 56.13 | step: 18.66 14%|█▎ | 5644/41250 [13:38:33<85:42:38, 8.67s/it] {'loss': 0.2097, 'grad_norm': 2.3665547370910645, 'learning_rate': 3.881512468239925e-05, 'epoch': 1.37} 14%|█▎ | 5644/41250 [13:38:33<85:42:38, 8.67s/it][2025-04-25 21:36:17,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:36:17,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.66 | bwd_microstep: 5717.77 | bwd_inner_microstep: 
5705.11 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.46 [2025-04-25 21:36:17,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.66 | bwd: 5717.79 | bwd_inner: 5705.11 | bwd_allreduce: 12.63 | step: 18.46 14%|█▎ | 5645/41250 [13:38:42<85:39:47, 8.66s/it] {'loss': 0.0361, 'grad_norm': 1.2173537015914917, 'learning_rate': 3.881459215282541e-05, 'epoch': 1.37} 14%|█▎ | 5645/41250 [13:38:42<85:39:47, 8.66s/it][2025-04-25 21:36:25,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 21:36:25,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.70 | bwd_microstep: 5695.73 | bwd_inner_microstep: 5682.93 | bwd_allreduce_microstep: 12.76 | step_microstep: 19.33 [2025-04-25 21:36:25,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.70 | bwd: 5695.74 | bwd_inner: 5682.93 | bwd_allreduce: 12.78 | step: 19.33 14%|█▎ | 5646/41250 [13:38:51<85:34:57, 8.65s/it] {'loss': 0.1305, 'grad_norm': 1.590264081954956, 'learning_rate': 3.881405950726331e-05, 'epoch': 1.37} 14%|█▎ | 5646/41250 [13:38:51<85:34:57, 8.65s/it][2025-04-25 21:36:34,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-25 21:36:34,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.35 | bwd_microstep: 5704.52 | bwd_inner_microstep: 5656.73 | bwd_allreduce_microstep: 47.74 | step_microstep: 19.23 [2025-04-25 21:36:34,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.35 | bwd: 5704.54 | bwd_inner: 5656.73 | bwd_allreduce: 47.76 | step: 19.23 14%|█▎ | 5647/41250 [13:38:59<85:28:13, 8.64s/it] {'loss': 0.1251, 'grad_norm': 2.260995388031006, 'learning_rate': 3.881352674571624e-05, 'epoch': 1.37} 14%|█▎ | 5647/41250 [13:38:59<85:28:13, 8.64s/it][2025-04-25 21:36:43,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:36:43,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.49 | bwd_microstep: 5740.42 | bwd_inner_microstep: 5714.46 | bwd_allreduce_microstep: 25.91 | step_microstep: 19.02 [2025-04-25 21:36:43,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.50 | bwd: 5740.43 | bwd_inner: 5714.46 | bwd_allreduce: 25.93 | step: 19.02 14%|█▎ | 5648/41250 [13:39:08<85:35:10, 8.65s/it] {'loss': 0.0831, 'grad_norm': 2.4447743892669678, 'learning_rate': 3.881299386818748e-05, 'epoch': 1.37} 14%|█▎ | 5648/41250 [13:39:08<85:35:10, 8.65s/it][2025-04-25 21:36:51,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:36:51,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.68 | bwd_microstep: 5727.17 | bwd_inner_microstep: 5714.38 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.75 [2025-04-25 21:36:51,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.68 | bwd: 5727.19 | bwd_inner: 5714.38 | bwd_allreduce: 12.77 | step: 18.75 14%|█▎ | 5649/41250 [13:39:17<85:37:29, 8.66s/it] {'loss': 0.1265, 'grad_norm': 1.1228097677230835, 'learning_rate': 3.88124608746803e-05, 'epoch': 1.37} 14%|█▎ | 5649/41250 [13:39:17<85:37:29, 8.66s/it][2025-04-25 21:37:00,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:37:00,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.99 | bwd_microstep: 5786.92 | bwd_inner_microstep: 5774.18 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.75 [2025-04-25 21:37:00,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.99 | bwd: 5786.93 | bwd_inner: 5774.18 | bwd_allreduce: 12.71 | step: 18.75 14%|█▎ | 5650/41250 [13:39:25<85:54:46, 8.69s/it] 
{'loss': 0.0456, 'grad_norm': 0.6913841962814331, 'learning_rate': 3.8811927765198005e-05, 'epoch': 1.37} 14%|█▎ | 5650/41250 [13:39:25<85:54:46, 8.69s/it][2025-04-25 21:37:09,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:37:09,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.14 | bwd_microstep: 5793.58 | bwd_inner_microstep: 5663.79 | bwd_allreduce_microstep: 129.73 | step_microstep: 19.10 [2025-04-25 21:37:09,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.14 | bwd: 5793.59 | bwd_inner: 5663.79 | bwd_allreduce: 129.75 | step: 19.10 14%|█▎ | 5651/41250 [13:39:34<85:58:49, 8.69s/it] {'loss': 0.0548, 'grad_norm': 0.9587269425392151, 'learning_rate': 3.881139453974387e-05, 'epoch': 1.37} 14%|█▎ | 5651/41250 [13:39:34<85:58:49, 8.69s/it][2025-04-25 21:37:17,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 21:37:17,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.96 | bwd_microstep: 5717.22 | bwd_inner_microstep: 5652.02 | bwd_allreduce_microstep: 65.16 | step_microstep: 18.95 [2025-04-25 21:37:17,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.96 | bwd: 5717.24 | bwd_inner: 5652.02 | bwd_allreduce: 65.18 | step: 18.96 14%|█▎ | 5652/41250 [13:39:43<85:48:11, 8.68s/it] {'loss': 0.0975, 'grad_norm': 1.3430930376052856, 'learning_rate': 3.8810861198321196e-05, 'epoch': 1.37} 14%|█▎ | 5652/41250 [13:39:43<85:48:11, 8.68s/it][2025-04-25 21:37:26,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:37:26,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.84 | bwd_microstep: 5850.59 | bwd_inner_microstep: 5711.24 | bwd_allreduce_microstep: 139.30 | 
step_microstep: 18.40 [2025-04-25 21:37:26,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.84 | bwd: 5850.60 | bwd_inner: 5711.24 | bwd_allreduce: 139.32 | step: 18.41 14%|█▎ | 5653/41250 [13:39:52<86:08:59, 8.71s/it] {'loss': 0.1305, 'grad_norm': 1.3731762170791626, 'learning_rate': 3.881032774093326e-05, 'epoch': 1.37} 14%|█▎ | 5653/41250 [13:39:52<86:08:59, 8.71s/it][2025-04-25 21:37:35,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:37:35,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2875.89 | bwd_microstep: 5751.49 | bwd_inner_microstep: 5692.90 | bwd_allreduce_microstep: 58.54 | step_microstep: 18.53 [2025-04-25 21:37:35,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2875.89 | bwd: 5751.50 | bwd_inner: 5692.90 | bwd_allreduce: 58.56 | step: 18.54 14%|█▎ | 5654/41250 [13:40:00<86:08:30, 8.71s/it] {'loss': 0.2211, 'grad_norm': 2.1762962341308594, 'learning_rate': 3.8809794167583354e-05, 'epoch': 1.37} 14%|█▎ | 5654/41250 [13:40:00<86:08:30, 8.71s/it][2025-04-25 21:37:44,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-25 21:37:44,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.43 | bwd_microstep: 5781.22 | bwd_inner_microstep: 5716.50 | bwd_allreduce_microstep: 64.68 | step_microstep: 18.43 [2025-04-25 21:37:44,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.43 | bwd: 5781.24 | bwd_inner: 5716.50 | bwd_allreduce: 64.70 | step: 18.43 14%|█▎ | 5655/41250 [13:40:09<86:09:04, 8.71s/it] {'loss': 0.1266, 'grad_norm': 1.1526734828948975, 'learning_rate': 3.8809260478274774e-05, 'epoch': 1.37} 14%|█▎ | 5655/41250 [13:40:09<86:09:04, 8.71s/it][2025-04-25 21:37:52,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | 
optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 21:37:52,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.46 | bwd_microstep: 5719.22 | bwd_inner_microstep: 5706.20 | bwd_allreduce_microstep: 12.97 | step_microstep: 19.01 [2025-04-25 21:37:52,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.46 | bwd: 5719.24 | bwd_inner: 5706.20 | bwd_allreduce: 12.99 | step: 19.02 14%|█▎ | 5656/41250 [13:40:18<85:58:31, 8.70s/it] {'loss': 0.0177, 'grad_norm': 0.22662527859210968, 'learning_rate': 3.8808726673010785e-05, 'epoch': 1.37} 14%|█▎ | 5656/41250 [13:40:18<85:58:31, 8.70s/it][2025-04-25 21:38:01,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.12 | optimizer_step: 0.96 [2025-04-25 21:38:01,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.13 | bwd_microstep: 5733.87 | bwd_inner_microstep: 5659.08 | bwd_allreduce_microstep: 74.73 | step_microstep: 18.96 [2025-04-25 21:38:01,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.13 | bwd: 5733.88 | bwd_inner: 5659.08 | bwd_allreduce: 74.75 | step: 18.96 14%|█▎ | 5657/41250 [13:40:26<85:50:33, 8.68s/it] {'loss': 0.1722, 'grad_norm': 1.7321053743362427, 'learning_rate': 3.880819275179471e-05, 'epoch': 1.37} 14%|█▎ | 5657/41250 [13:40:26<85:50:33, 8.68s/it][2025-04-25 21:38:10,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.22 | optimizer_step: 0.99 [2025-04-25 21:38:10,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.80 | bwd_microstep: 5714.07 | bwd_inner_microstep: 5700.56 | bwd_allreduce_microstep: 13.46 | step_microstep: 19.60 [2025-04-25 21:38:10,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.80 | bwd: 5714.09 | bwd_inner: 5700.56 | bwd_allreduce: 13.48 | step: 19.60 14%|█▎ | 5658/41250 [13:40:35<85:44:13, 8.67s/it] {'loss': 0.1208, 'grad_norm': 
1.132050633430481, 'learning_rate': 3.880765871462982e-05, 'epoch': 1.37} 14%|█▎ | 5658/41250 [13:40:35<85:44:13, 8.67s/it][2025-04-25 21:38:18,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 21:38:18,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.35 | bwd_microstep: 5717.57 | bwd_inner_microstep: 5667.46 | bwd_allreduce_microstep: 50.06 | step_microstep: 19.25 [2025-04-25 21:38:18,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.35 | bwd: 5717.58 | bwd_inner: 5667.46 | bwd_allreduce: 50.09 | step: 19.25 14%|█▎ | 5659/41250 [13:40:44<85:37:27, 8.66s/it] {'loss': 0.1909, 'grad_norm': 1.4809356927871704, 'learning_rate': 3.8807124561519414e-05, 'epoch': 1.37} 14%|█▎ | 5659/41250 [13:40:44<85:37:27, 8.66s/it][2025-04-25 21:38:27,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-25 21:38:27,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.44 | bwd_microstep: 5721.56 | bwd_inner_microstep: 5708.74 | bwd_allreduce_microstep: 12.77 | step_microstep: 19.63 [2025-04-25 21:38:27,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.44 | bwd: 5721.57 | bwd_inner: 5708.74 | bwd_allreduce: 12.79 | step: 19.64 14%|█▎ | 5660/41250 [13:40:52<85:37:35, 8.66s/it] {'loss': 0.0932, 'grad_norm': 1.1971241235733032, 'learning_rate': 3.880659029246679e-05, 'epoch': 1.37} 14%|█▎ | 5660/41250 [13:40:52<85:37:35, 8.66s/it][2025-04-25 21:38:36,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.03 | optimizer_step: 1.05 [2025-04-25 21:38:36,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.72 | bwd_microstep: 5727.55 | bwd_inner_microstep: 5668.23 | bwd_allreduce_microstep: 59.27 | step_microstep: 19.25 [2025-04-25 
21:38:36,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.72 | bwd: 5727.56 | bwd_inner: 5668.23 | bwd_allreduce: 59.29 | step: 19.25 14%|█▎ | 5661/41250 [13:41:01<85:35:09, 8.66s/it] {'loss': 0.1065, 'grad_norm': 1.7765965461730957, 'learning_rate': 3.8806055907475224e-05, 'epoch': 1.37} 14%|█▎ | 5661/41250 [13:41:01<85:35:09, 8.66s/it][2025-04-25 21:38:44,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 21:38:44,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.71 | bwd_microstep: 5722.59 | bwd_inner_microstep: 5660.48 | bwd_allreduce_microstep: 62.07 | step_microstep: 19.00 [2025-04-25 21:38:44,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.71 | bwd: 5722.61 | bwd_inner: 5660.48 | bwd_allreduce: 62.09 | step: 19.00 14%|█▎ | 5662/41250 [13:41:10<85:31:52, 8.65s/it] {'loss': 0.0533, 'grad_norm': 0.766474723815918, 'learning_rate': 3.880552140654803e-05, 'epoch': 1.37} 14%|█▎ | 5662/41250 [13:41:10<85:31:52, 8.65s/it][2025-04-25 21:38:53,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:38:53,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.41 | bwd_microstep: 5758.93 | bwd_inner_microstep: 5718.18 | bwd_allreduce_microstep: 40.70 | step_microstep: 18.91 [2025-04-25 21:38:53,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.41 | bwd: 5758.94 | bwd_inner: 5718.18 | bwd_allreduce: 40.72 | step: 18.91 14%|█▎ | 5663/41250 [13:41:18<85:40:31, 8.67s/it] {'loss': 0.2171, 'grad_norm': 3.9582526683807373, 'learning_rate': 3.88049867896885e-05, 'epoch': 1.37} 14%|█▎ | 5663/41250 [13:41:18<85:40:31, 8.67s/it][2025-04-25 21:39:02,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.98 
[2025-04-25 21:39:02,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.40 | bwd_microstep: 5794.52 | bwd_inner_microstep: 5691.64 | bwd_allreduce_microstep: 102.83 | step_microstep: 19.33 [2025-04-25 21:39:02,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.40 | bwd: 5794.54 | bwd_inner: 5691.64 | bwd_allreduce: 102.85 | step: 19.33 14%|█▎ | 5664/41250 [13:41:27<85:51:35, 8.69s/it] {'loss': 0.1024, 'grad_norm': 1.4898571968078613, 'learning_rate': 3.880445205689991e-05, 'epoch': 1.37} 14%|█▎ | 5664/41250 [13:41:27<85:51:35, 8.69s/it][2025-04-25 21:39:10,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:39:10,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.45 | bwd_microstep: 5758.55 | bwd_inner_microstep: 5666.53 | bwd_allreduce_microstep: 91.97 | step_microstep: 18.66 [2025-04-25 21:39:10,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.45 | bwd: 5758.56 | bwd_inner: 5666.53 | bwd_allreduce: 91.99 | step: 18.67 14%|█▎ | 5665/41250 [13:41:36<85:49:13, 8.68s/it] {'loss': 0.085, 'grad_norm': 1.9482736587524414, 'learning_rate': 3.8803917208185583e-05, 'epoch': 1.37} 14%|█▎ | 5665/41250 [13:41:36<85:49:13, 8.68s/it][2025-04-25 21:39:19,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 21:39:19,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.07 | bwd_microstep: 5784.60 | bwd_inner_microstep: 5709.84 | bwd_allreduce_microstep: 74.72 | step_microstep: 18.93 [2025-04-25 21:39:19,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.07 | bwd: 5784.62 | bwd_inner: 5709.84 | bwd_allreduce: 74.74 | step: 18.94 14%|█▎ | 5666/41250 [13:41:44<85:55:43, 8.69s/it] {'loss': 0.0937, 'grad_norm': 2.194953203201294, 'learning_rate': 
3.8803382243548806e-05, 'epoch': 1.37} 14%|█▎ | 5666/41250 [13:41:44<85:55:43, 8.69s/it][2025-04-25 21:39:28,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.15 | optimizer_step: 0.93 [2025-04-25 21:39:28,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.71 | bwd_microstep: 5764.43 | bwd_inner_microstep: 5688.09 | bwd_allreduce_microstep: 76.29 | step_microstep: 18.96 [2025-04-25 21:39:28,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.71 | bwd: 5764.44 | bwd_inner: 5688.09 | bwd_allreduce: 76.31 | step: 18.96 14%|█▎ | 5667/41250 [13:41:53<85:55:21, 8.69s/it] {'loss': 0.0958, 'grad_norm': 1.327024221420288, 'learning_rate': 3.880284716299287e-05, 'epoch': 1.37} 14%|█▎ | 5667/41250 [13:41:53<85:55:21, 8.69s/it][2025-04-25 21:39:36,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:39:36,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.23 | bwd_microstep: 5732.07 | bwd_inner_microstep: 5710.42 | bwd_allreduce_microstep: 21.60 | step_microstep: 18.90 [2025-04-25 21:39:36,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.23 | bwd: 5732.08 | bwd_inner: 5710.42 | bwd_allreduce: 21.62 | step: 18.91 14%|█▎ | 5668/41250 [13:42:02<85:52:29, 8.69s/it] {'loss': 0.1478, 'grad_norm': 4.88890266418457, 'learning_rate': 3.880231196652108e-05, 'epoch': 1.37} 14%|█▎ | 5668/41250 [13:42:02<85:52:29, 8.69s/it][2025-04-25 21:39:45,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 21:39:45,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.68 | bwd_microstep: 5705.57 | bwd_inner_microstep: 5692.76 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.60 [2025-04-25 21:39:45,564] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.68 | bwd: 5705.58 | bwd_inner: 5692.76 | bwd_allreduce: 12.78 | step: 18.61 14%|█▎ | 5669/41250 [13:42:10<85:43:28, 8.67s/it] {'loss': 0.1502, 'grad_norm': 4.2795090675354, 'learning_rate': 3.8801776654136744e-05, 'epoch': 1.37} 14%|█▎ | 5669/41250 [13:42:10<85:43:28, 8.67s/it][2025-04-25 21:39:54,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-25 21:39:54,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.88 | bwd_microstep: 5759.08 | bwd_inner_microstep: 5700.40 | bwd_allreduce_microstep: 58.63 | step_microstep: 19.27 [2025-04-25 21:39:54,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.88 | bwd: 5759.09 | bwd_inner: 5700.40 | bwd_allreduce: 58.65 | step: 19.27 14%|█▎ | 5670/41250 [13:42:19<85:46:48, 8.68s/it] {'loss': 0.0408, 'grad_norm': 0.6673532128334045, 'learning_rate': 3.880124122584315e-05, 'epoch': 1.37} 14%|█▎ | 5670/41250 [13:42:19<85:46:48, 8.68s/it][2025-04-25 21:40:02,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:40:02,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.13 | bwd_microstep: 5762.66 | bwd_inner_microstep: 5688.77 | bwd_allreduce_microstep: 73.84 | step_microstep: 18.44 [2025-04-25 21:40:02,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.13 | bwd: 5762.67 | bwd_inner: 5688.77 | bwd_allreduce: 73.86 | step: 18.45 14%|█▎ | 5671/41250 [13:42:28<85:49:06, 8.68s/it] {'loss': 0.2906, 'grad_norm': 2.000570297241211, 'learning_rate': 3.8800705681643594e-05, 'epoch': 1.37} 14%|█▎ | 5671/41250 [13:42:28<85:49:06, 8.68s/it][2025-04-25 21:40:11,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 
21:40:11,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.12 | bwd_microstep: 5718.46 | bwd_inner_microstep: 5663.78 | bwd_allreduce_microstep: 54.64 | step_microstep: 18.55 [2025-04-25 21:40:11,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.12 | bwd: 5718.47 | bwd_inner: 5663.78 | bwd_allreduce: 54.65 | step: 18.55 14%|█▍ | 5672/41250 [13:42:36<85:39:16, 8.67s/it] {'loss': 0.0453, 'grad_norm': 1.0938867330551147, 'learning_rate': 3.880017002154139e-05, 'epoch': 1.38} 14%|█▍ | 5672/41250 [13:42:36<85:39:16, 8.67s/it][2025-04-25 21:40:20,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:40:20,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.98 | bwd_microstep: 5752.32 | bwd_inner_microstep: 5697.30 | bwd_allreduce_microstep: 54.97 | step_microstep: 18.52 [2025-04-25 21:40:20,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.98 | bwd: 5752.33 | bwd_inner: 5697.30 | bwd_allreduce: 54.99 | step: 18.52 14%|█▍ | 5673/41250 [13:42:45<85:42:17, 8.67s/it] {'loss': 0.1525, 'grad_norm': 3.697646379470825, 'learning_rate': 3.8799634245539835e-05, 'epoch': 1.38} 14%|█▍ | 5673/41250 [13:42:45<85:42:17, 8.67s/it][2025-04-25 21:40:28,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-25 21:40:28,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.22 | bwd_microstep: 5733.77 | bwd_inner_microstep: 5695.68 | bwd_allreduce_microstep: 38.05 | step_microstep: 18.71 [2025-04-25 21:40:28,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.22 | bwd: 5733.78 | bwd_inner: 5695.68 | bwd_allreduce: 38.06 | step: 18.72 14%|█▍ | 5674/41250 [13:42:54<85:41:24, 8.67s/it] {'loss': 0.2965, 'grad_norm': 1.5924859046936035, 'learning_rate': 3.879909835364223e-05, 
'epoch': 1.38} 14%|█▍ | 5674/41250 [13:42:54<85:41:24, 8.67s/it][2025-04-25 21:40:37,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:40:37,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.55 | bwd_microstep: 5852.96 | bwd_inner_microstep: 5709.89 | bwd_allreduce_microstep: 143.02 | step_microstep: 18.62 [2025-04-25 21:40:37,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.55 | bwd: 5852.98 | bwd_inner: 5709.89 | bwd_allreduce: 143.05 | step: 18.62 14%|█▍ | 5675/41250 [13:43:03<86:02:49, 8.71s/it] {'loss': 0.2503, 'grad_norm': 3.517625570297241, 'learning_rate': 3.879856234585189e-05, 'epoch': 1.38} 14%|█▍ | 5675/41250 [13:43:03<86:02:49, 8.71s/it][2025-04-25 21:40:46,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 1.12 [2025-04-25 21:40:46,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.85 | bwd_microstep: 5788.83 | bwd_inner_microstep: 5776.07 | bwd_allreduce_microstep: 12.71 | step_microstep: 19.28 [2025-04-25 21:40:46,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.85 | bwd: 5788.84 | bwd_inner: 5776.07 | bwd_allreduce: 12.73 | step: 19.28 14%|█▍ | 5676/41250 [13:43:11<86:11:32, 8.72s/it] {'loss': 0.0364, 'grad_norm': 0.6321446299552917, 'learning_rate': 3.87980262221721e-05, 'epoch': 1.38} 14%|█▍ | 5676/41250 [13:43:11<86:11:32, 8.72s/it][2025-04-25 21:40:55,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:40:55,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.60 | bwd_microstep: 5726.02 | bwd_inner_microstep: 5713.20 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.79 [2025-04-25 21:40:55,149] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2858.60 | bwd: 5726.04 | bwd_inner: 5713.20 | bwd_allreduce: 12.80 | step: 18.80 14%|█▍ | 5677/41250 [13:43:20<86:01:28, 8.71s/it] {'loss': 0.0785, 'grad_norm': 1.5374460220336914, 'learning_rate': 3.879748998260619e-05, 'epoch': 1.38} 14%|█▍ | 5677/41250 [13:43:20<86:01:28, 8.71s/it][2025-04-25 21:41:03,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:41:03,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.51 | bwd_microstep: 5753.77 | bwd_inner_microstep: 5696.42 | bwd_allreduce_microstep: 57.30 | step_microstep: 18.68 [2025-04-25 21:41:03,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.51 | bwd: 5753.78 | bwd_inner: 5696.42 | bwd_allreduce: 57.32 | step: 18.68 14%|█▍ | 5678/41250 [13:43:29<85:58:29, 8.70s/it] {'loss': 0.1856, 'grad_norm': 4.93423318862915, 'learning_rate': 3.879695362715744e-05, 'epoch': 1.38} 14%|█▍ | 5678/41250 [13:43:29<85:58:29, 8.70s/it][2025-04-25 21:41:12,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:41:12,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.10 | bwd_microstep: 5757.08 | bwd_inner_microstep: 5641.69 | bwd_allreduce_microstep: 115.34 | step_microstep: 18.62 [2025-04-25 21:41:12,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.10 | bwd: 5757.09 | bwd_inner: 5641.69 | bwd_allreduce: 115.35 | step: 18.62 14%|█▍ | 5679/41250 [13:43:37<85:51:15, 8.69s/it] {'loss': 0.2899, 'grad_norm': 1.6766951084136963, 'learning_rate': 3.8796417155829166e-05, 'epoch': 1.38} 14%|█▍ | 5679/41250 [13:43:37<85:51:15, 8.69s/it][2025-04-25 21:41:21,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 21:41:21,085] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2819.77 | bwd_microstep: 5679.96 | bwd_inner_microstep: 5639.00 | bwd_allreduce_microstep: 40.91 | step_microstep: 18.73 [2025-04-25 21:41:21,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.77 | bwd: 5679.98 | bwd_inner: 5639.00 | bwd_allreduce: 40.93 | step: 18.73 14%|█▍ | 5680/41250 [13:43:46<85:32:39, 8.66s/it] {'loss': 0.2479, 'grad_norm': 2.0503129959106445, 'learning_rate': 3.879588056862468e-05, 'epoch': 1.38} 14%|█▍ | 5680/41250 [13:43:46<85:32:39, 8.66s/it][2025-04-25 21:41:30,036] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:41:30,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.64 | bwd_microstep: 6048.28 | bwd_inner_microstep: 5635.02 | bwd_allreduce_microstep: 413.21 | step_microstep: 18.66 [2025-04-25 21:41:30,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.64 | bwd: 6048.29 | bwd_inner: 5635.02 | bwd_allreduce: 413.23 | step: 18.66 14%|█▍ | 5681/41250 [13:43:55<86:24:49, 8.75s/it] {'loss': 0.2306, 'grad_norm': 2.9101030826568604, 'learning_rate': 3.879534386554729e-05, 'epoch': 1.38} 14%|█▍ | 5681/41250 [13:43:55<86:24:49, 8.75s/it][2025-04-25 21:41:38,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:41:38,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5679.22 | bwd_inner_microstep: 5645.46 | bwd_allreduce_microstep: 33.71 | step_microstep: 18.53 [2025-04-25 21:41:38,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.00 | bwd: 5679.23 | bwd_inner: 5645.46 | bwd_allreduce: 33.73 | step: 18.53 14%|█▍ | 5682/41250 [13:44:03<85:56:55, 8.70s/it] {'loss': 0.1706, 'grad_norm': 2.070218801498413, 'learning_rate': 3.87948070466003e-05, 'epoch': 1.38} 14%|█▍ | 5682/41250 
[13:44:03<85:56:55, 8.70s/it][2025-04-25 21:41:47,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-25 21:41:47,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.54 | bwd_microstep: 5736.00 | bwd_inner_microstep: 5673.15 | bwd_allreduce_microstep: 62.82 | step_microstep: 18.76 [2025-04-25 21:41:47,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.54 | bwd: 5736.02 | bwd_inner: 5673.14 | bwd_allreduce: 62.83 | step: 18.76 14%|█▍ | 5683/41250 [13:44:12<85:49:04, 8.69s/it] {'loss': 0.0595, 'grad_norm': 1.7908625602722168, 'learning_rate': 3.8794270111787024e-05, 'epoch': 1.38} 14%|█▍ | 5683/41250 [13:44:12<85:49:04, 8.69s/it][2025-04-25 21:41:55,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:41:55,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.06 | bwd_microstep: 5693.45 | bwd_inner_microstep: 5680.67 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.88 [2025-04-25 21:41:55,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.06 | bwd: 5693.46 | bwd_inner: 5680.67 | bwd_allreduce: 12.75 | step: 18.88 14%|█▍ | 5684/41250 [13:44:21<85:36:12, 8.66s/it] {'loss': 0.162, 'grad_norm': 2.7150332927703857, 'learning_rate': 3.879373306111077e-05, 'epoch': 1.38} 14%|█▍ | 5684/41250 [13:44:21<85:36:12, 8.66s/it][2025-04-25 21:42:04,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.17 | optimizer_step: 0.94 [2025-04-25 21:42:04,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.47 | bwd_microstep: 5705.45 | bwd_inner_microstep: 5692.15 | bwd_allreduce_microstep: 13.25 | step_microstep: 19.18 [2025-04-25 21:42:04,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.47 | bwd: 5705.47 | 
bwd_inner: 5692.15 | bwd_allreduce: 13.27 | step: 19.18 14%|█▍ | 5685/41250 [13:44:29<85:30:09, 8.65s/it] {'loss': 0.1643, 'grad_norm': 1.6177912950515747, 'learning_rate': 3.879319589457484e-05, 'epoch': 1.38} 14%|█▍ | 5685/41250 [13:44:29<85:30:09, 8.65s/it][2025-04-25 21:42:13,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:42:13,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.69 | bwd_microstep: 5689.66 | bwd_inner_microstep: 5676.67 | bwd_allreduce_microstep: 12.94 | step_microstep: 18.74 [2025-04-25 21:42:13,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.69 | bwd: 5689.67 | bwd_inner: 5676.67 | bwd_allreduce: 12.96 | step: 18.74 14%|█▍ | 5686/41250 [13:44:38<85:25:25, 8.65s/it] {'loss': 0.3128, 'grad_norm': 1.8212904930114746, 'learning_rate': 3.879265861218256e-05, 'epoch': 1.38} 14%|█▍ | 5686/41250 [13:44:38<85:25:25, 8.65s/it][2025-04-25 21:42:21,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:42:21,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.46 | bwd_microstep: 5778.01 | bwd_inner_microstep: 5765.16 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.57 [2025-04-25 21:42:21,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.46 | bwd: 5778.03 | bwd_inner: 5765.16 | bwd_allreduce: 12.83 | step: 18.57 14%|█▍ | 5687/41250 [13:44:47<85:42:40, 8.68s/it] {'loss': 0.2135, 'grad_norm': 2.758702039718628, 'learning_rate': 3.8792121213937234e-05, 'epoch': 1.38} 14%|█▍ | 5687/41250 [13:44:47<85:42:40, 8.68s/it][2025-04-25 21:42:30,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:42:30,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2822.68 | bwd_microstep: 6043.67 | bwd_inner_microstep: 5647.51 | bwd_allreduce_microstep: 396.11 | step_microstep: 18.92 [2025-04-25 21:42:30,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.68 | bwd: 6043.68 | bwd_inner: 5647.51 | bwd_allreduce: 396.13 | step: 18.92 14%|█▍ | 5688/41250 [13:44:56<86:31:27, 8.76s/it] {'loss': 0.1004, 'grad_norm': 2.7078707218170166, 'learning_rate': 3.879158369984218e-05, 'epoch': 1.38} 14%|█▍ | 5688/41250 [13:44:56<86:31:27, 8.76s/it][2025-04-25 21:42:39,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-25 21:42:39,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.24 | bwd_microstep: 5777.50 | bwd_inner_microstep: 5655.87 | bwd_allreduce_microstep: 121.58 | step_microstep: 18.96 [2025-04-25 21:42:39,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.24 | bwd: 5777.52 | bwd_inner: 5655.87 | bwd_allreduce: 121.61 | step: 18.96 14%|█▍ | 5689/41250 [13:45:04<86:19:25, 8.74s/it] {'loss': 0.1639, 'grad_norm': 3.0762693881988525, 'learning_rate': 3.879104606990071e-05, 'epoch': 1.38} 14%|█▍ | 5689/41250 [13:45:04<86:19:25, 8.74s/it][2025-04-25 21:42:48,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:42:48,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.53 | bwd_microstep: 5759.47 | bwd_inner_microstep: 5649.29 | bwd_allreduce_microstep: 110.14 | step_microstep: 18.11 [2025-04-25 21:42:48,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.53 | bwd: 5759.49 | bwd_inner: 5649.29 | bwd_allreduce: 110.16 | step: 18.11 14%|█▍ | 5690/41250 [13:45:13<86:08:40, 8.72s/it] {'loss': 0.2262, 'grad_norm': 3.3197925090789795, 'learning_rate': 3.879050832411613e-05, 'epoch': 1.38} 14%|█▍ | 5690/41250 [13:45:13<86:08:40, 8.72s/it][2025-04-25 
21:42:56,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.96 | optimizer_step: 1.09 [2025-04-25 21:42:56,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.63 | bwd_microstep: 5763.20 | bwd_inner_microstep: 5647.82 | bwd_allreduce_microstep: 115.34 | step_microstep: 18.33 [2025-04-25 21:42:56,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.64 | bwd: 5763.21 | bwd_inner: 5647.82 | bwd_allreduce: 115.35 | step: 18.34 14%|█▍ | 5691/41250 [13:45:22<85:59:58, 8.71s/it] {'loss': 0.0699, 'grad_norm': 2.224595785140991, 'learning_rate': 3.8789970462491764e-05, 'epoch': 1.38} 14%|█▍ | 5691/41250 [13:45:22<85:59:58, 8.71s/it][2025-04-25 21:43:05,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 21:43:05,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.12 | bwd_microstep: 5710.63 | bwd_inner_microstep: 5697.79 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.23 [2025-04-25 21:43:05,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.12 | bwd: 5710.64 | bwd_inner: 5697.79 | bwd_allreduce: 12.81 | step: 18.24 14%|█▍ | 5692/41250 [13:45:30<85:51:14, 8.69s/it] {'loss': 0.1388, 'grad_norm': 1.692805290222168, 'learning_rate': 3.878943248503093e-05, 'epoch': 1.38} 14%|█▍ | 5692/41250 [13:45:30<85:51:14, 8.69s/it][2025-04-25 21:43:14,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 21:43:14,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.85 | bwd_microstep: 5787.73 | bwd_inner_microstep: 5775.18 | bwd_allreduce_microstep: 12.51 | step_microstep: 18.63 [2025-04-25 21:43:14,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.85 | bwd: 5787.75 | bwd_inner: 5775.18 | bwd_allreduce: 12.52 
| step: 18.63 14%|█▍ | 5693/41250 [13:45:39<86:02:42, 8.71s/it] {'loss': 0.1884, 'grad_norm': 1.659735918045044, 'learning_rate': 3.878889439173694e-05, 'epoch': 1.38} 14%|█▍ | 5693/41250 [13:45:39<86:02:42, 8.71s/it][2025-04-25 21:43:22,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:43:22,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.26 | bwd_microstep: 5714.66 | bwd_inner_microstep: 5701.92 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.49 [2025-04-25 21:43:22,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.26 | bwd: 5714.67 | bwd_inner: 5701.92 | bwd_allreduce: 12.71 | step: 18.50 14%|█▍ | 5694/41250 [13:45:48<85:51:39, 8.69s/it] {'loss': 0.1433, 'grad_norm': 2.13973069190979, 'learning_rate': 3.878835618261311e-05, 'epoch': 1.38} 14%|█▍ | 5694/41250 [13:45:48<85:51:39, 8.69s/it][2025-04-25 21:43:31,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:43:31,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.16 | bwd_microstep: 5709.34 | bwd_inner_microstep: 5650.25 | bwd_allreduce_microstep: 59.05 | step_microstep: 18.18 [2025-04-25 21:43:31,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.16 | bwd: 5709.36 | bwd_inner: 5650.25 | bwd_allreduce: 59.07 | step: 18.18 14%|█▍ | 5695/41250 [13:45:56<85:39:21, 8.67s/it] {'loss': 0.2706, 'grad_norm': 3.4817028045654297, 'learning_rate': 3.878781785766276e-05, 'epoch': 1.38} 14%|█▍ | 5695/41250 [13:45:56<85:39:21, 8.67s/it][2025-04-25 21:43:40,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 21:43:40,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.48 | bwd_microstep: 5766.57 | 
bwd_inner_microstep: 5652.01 | bwd_allreduce_microstep: 114.51 | step_microstep: 19.25 [2025-04-25 21:43:40,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.48 | bwd: 5766.58 | bwd_inner: 5652.01 | bwd_allreduce: 114.53 | step: 19.25 14%|█▍ | 5696/41250 [13:46:05<85:41:05, 8.68s/it] {'loss': 0.074, 'grad_norm': 1.452658772468567, 'learning_rate': 3.87872794168892e-05, 'epoch': 1.38} 14%|█▍ | 5696/41250 [13:46:05<85:41:05, 8.68s/it][2025-04-25 21:43:48,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:43:48,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.38 | bwd_microstep: 5726.08 | bwd_inner_microstep: 5663.39 | bwd_allreduce_microstep: 62.64 | step_microstep: 18.53 [2025-04-25 21:43:48,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.38 | bwd: 5726.09 | bwd_inner: 5663.39 | bwd_allreduce: 62.66 | step: 18.53 14%|█▍ | 5697/41250 [13:46:14<85:36:34, 8.67s/it] {'loss': 0.0797, 'grad_norm': 1.38713538646698, 'learning_rate': 3.878674086029577e-05, 'epoch': 1.38} 14%|█▍ | 5697/41250 [13:46:14<85:36:34, 8.67s/it][2025-04-25 21:43:57,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 0.89 [2025-04-25 21:43:57,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.40 | bwd_microstep: 5687.39 | bwd_inner_microstep: 5663.21 | bwd_allreduce_microstep: 24.12 | step_microstep: 18.88 [2025-04-25 21:43:57,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.40 | bwd: 5687.40 | bwd_inner: 5663.21 | bwd_allreduce: 24.14 | step: 18.88 14%|█▍ | 5698/41250 [13:46:22<85:27:27, 8.65s/it] {'loss': 0.1831, 'grad_norm': 2.263737916946411, 'learning_rate': 3.8786202187885776e-05, 'epoch': 1.38} 14%|█▍ | 5698/41250 [13:46:22<85:27:27, 8.65s/it][2025-04-25 21:44:06,262] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:44:06,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.98 | bwd_microstep: 5778.62 | bwd_inner_microstep: 5722.88 | bwd_allreduce_microstep: 55.70 | step_microstep: 18.89 [2025-04-25 21:44:06,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.98 | bwd: 5778.63 | bwd_inner: 5722.88 | bwd_allreduce: 55.71 | step: 18.90 14%|█▍ | 5699/41250 [13:46:31<85:39:01, 8.67s/it] {'loss': 0.169, 'grad_norm': 1.8436477184295654, 'learning_rate': 3.878566339966254e-05, 'epoch': 1.38} 14%|█▍ | 5699/41250 [13:46:31<85:39:01, 8.67s/it][2025-04-25 21:44:14,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 21:44:14,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.77 | bwd_microstep: 5772.90 | bwd_inner_microstep: 5715.81 | bwd_allreduce_microstep: 57.05 | step_microstep: 19.14 [2025-04-25 21:44:14,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.77 | bwd: 5772.91 | bwd_inner: 5715.81 | bwd_allreduce: 57.06 | step: 19.14 14%|█▍ | 5700/41250 [13:46:40<85:45:20, 8.68s/it] {'loss': 0.0331, 'grad_norm': 0.571779727935791, 'learning_rate': 3.878512449562938e-05, 'epoch': 1.38} 14%|█▍ | 5700/41250 [13:46:40<85:45:20, 8.68s/it][2025-04-25 21:44:23,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:44:23,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.45 | bwd_microstep: 5710.07 | bwd_inner_microstep: 5697.09 | bwd_allreduce_microstep: 12.94 | step_microstep: 18.67 [2025-04-25 21:44:23,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.45 | bwd: 5710.08 | bwd_inner: 5697.09 | bwd_allreduce: 12.95 | step: 18.67 14%|█▍ | 5701/41250 
[13:46:48<85:37:18, 8.67s/it] {'loss': 0.1822, 'grad_norm': 1.72425377368927, 'learning_rate': 3.878458547578962e-05, 'epoch': 1.38} 14%|█▍ | 5701/41250 [13:46:48<85:37:18, 8.67s/it][2025-04-25 21:44:32,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 21:44:32,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.56 | bwd_microstep: 5718.51 | bwd_inner_microstep: 5706.16 | bwd_allreduce_microstep: 12.30 | step_microstep: 18.32 [2025-04-25 21:44:32,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.56 | bwd: 5718.52 | bwd_inner: 5706.16 | bwd_allreduce: 12.32 | step: 18.32 14%|█▍ | 5702/41250 [13:46:57<85:33:52, 8.67s/it] {'loss': 0.0599, 'grad_norm': 1.4293957948684692, 'learning_rate': 3.878404634014659e-05, 'epoch': 1.38} 14%|█▍ | 5702/41250 [13:46:57<85:33:52, 8.67s/it][2025-04-25 21:44:40,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.14 [2025-04-25 21:44:40,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.61 | bwd_microstep: 5721.50 | bwd_inner_microstep: 5676.91 | bwd_allreduce_microstep: 44.55 | step_microstep: 19.23 [2025-04-25 21:44:40,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.61 | bwd: 5721.52 | bwd_inner: 5676.91 | bwd_allreduce: 44.57 | step: 19.23 14%|█▍ | 5703/41250 [13:47:06<85:28:52, 8.66s/it] {'loss': 0.0855, 'grad_norm': 1.9913661479949951, 'learning_rate': 3.878350708870361e-05, 'epoch': 1.38} 14%|█▍ | 5703/41250 [13:47:06<85:28:52, 8.66s/it][2025-04-25 21:44:49,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:44:49,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.58 | bwd_microstep: 5730.93 | bwd_inner_microstep: 5718.23 | 
bwd_allreduce_microstep: 12.65 | step_microstep: 18.92 [2025-04-25 21:44:49,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.58 | bwd: 5730.94 | bwd_inner: 5718.23 | bwd_allreduce: 12.67 | step: 18.93 14%|█▍ | 5704/41250 [13:47:14<85:31:23, 8.66s/it] {'loss': 0.0399, 'grad_norm': 0.6355468034744263, 'learning_rate': 3.8782967721464e-05, 'epoch': 1.38} 14%|█▍ | 5704/41250 [13:47:14<85:31:23, 8.66s/it][2025-04-25 21:44:58,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:44:58,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.81 | bwd_microstep: 5786.34 | bwd_inner_microstep: 5773.58 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.70 [2025-04-25 21:44:58,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.81 | bwd: 5786.36 | bwd_inner: 5773.58 | bwd_allreduce: 12.74 | step: 18.70 14%|█▍ | 5705/41250 [13:47:23<85:48:12, 8.69s/it] {'loss': 0.1853, 'grad_norm': 1.3517807722091675, 'learning_rate': 3.878242823843109e-05, 'epoch': 1.38} 14%|█▍ | 5705/41250 [13:47:23<85:48:12, 8.69s/it][2025-04-25 21:45:07,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:45:07,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.73 | bwd_microstep: 5797.74 | bwd_inner_microstep: 5671.83 | bwd_allreduce_microstep: 125.86 | step_microstep: 18.93 [2025-04-25 21:45:07,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.73 | bwd: 5797.75 | bwd_inner: 5671.83 | bwd_allreduce: 125.88 | step: 18.93 14%|█▍ | 5706/41250 [13:47:32<85:52:35, 8.70s/it] {'loss': 0.2022, 'grad_norm': 2.8608899116516113, 'learning_rate': 3.87818886396082e-05, 'epoch': 1.38} 14%|█▍ | 5706/41250 [13:47:32<85:52:35, 8.70s/it][2025-04-25 21:45:15,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 4.94 | optimizer_gradients: 1.23 | optimizer_step: 0.93 [2025-04-25 21:45:15,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.96 | bwd_microstep: 5720.41 | bwd_inner_microstep: 5676.64 | bwd_allreduce_microstep: 43.73 | step_microstep: 19.08 [2025-04-25 21:45:15,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.96 | bwd: 5720.42 | bwd_inner: 5676.64 | bwd_allreduce: 43.75 | step: 19.08 14%|█▍ | 5707/41250 [13:47:41<85:41:39, 8.68s/it] {'loss': 0.0279, 'grad_norm': 0.5598965287208557, 'learning_rate': 3.878134892499866e-05, 'epoch': 1.38} 14%|█▍ | 5707/41250 [13:47:41<85:41:39, 8.68s/it][2025-04-25 21:45:24,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 1.02 [2025-04-25 21:45:24,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.94 | bwd_microstep: 5746.56 | bwd_inner_microstep: 5708.17 | bwd_allreduce_microstep: 38.35 | step_microstep: 18.72 [2025-04-25 21:45:24,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.94 | bwd: 5746.57 | bwd_inner: 5708.17 | bwd_allreduce: 38.37 | step: 18.72 14%|█▍ | 5708/41250 [13:47:49<85:42:02, 8.68s/it] {'loss': 0.2436, 'grad_norm': 1.7796581983566284, 'learning_rate': 3.878080909460581e-05, 'epoch': 1.38} 14%|█▍ | 5708/41250 [13:47:49<85:42:02, 8.68s/it][2025-04-25 21:45:33,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:45:33,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.70 | bwd_microstep: 5790.40 | bwd_inner_microstep: 5661.18 | bwd_allreduce_microstep: 129.17 | step_microstep: 18.96 [2025-04-25 21:45:33,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.70 | bwd: 5790.41 | bwd_inner: 5661.18 | bwd_allreduce: 129.19 | step: 18.96 14%|█▍ | 5709/41250 [13:47:58<85:48:58, 8.69s/it] 
{'loss': 0.0784, 'grad_norm': 0.7603529691696167, 'learning_rate': 3.8780269148432956e-05, 'epoch': 1.38} 14%|█▍ | 5709/41250 [13:47:58<85:48:58, 8.69s/it][2025-04-25 21:45:41,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 21:45:41,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.58 | bwd_microstep: 5873.33 | bwd_inner_microstep: 5716.93 | bwd_allreduce_microstep: 156.36 | step_microstep: 19.34 [2025-04-25 21:45:41,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.58 | bwd: 5873.35 | bwd_inner: 5716.93 | bwd_allreduce: 156.38 | step: 19.34 14%|█▍ | 5710/41250 [13:48:07<86:09:44, 8.73s/it] {'loss': 0.0848, 'grad_norm': 1.5750917196273804, 'learning_rate': 3.877972908648344e-05, 'epoch': 1.38} 14%|█▍ | 5710/41250 [13:48:07<86:09:44, 8.73s/it][2025-04-25 21:45:50,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 21:45:50,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.36 | bwd_microstep: 5730.16 | bwd_inner_microstep: 5717.35 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.68 [2025-04-25 21:45:50,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.36 | bwd: 5730.17 | bwd_inner: 5717.35 | bwd_allreduce: 12.78 | step: 18.68 14%|█▍ | 5711/41250 [13:48:15<86:00:38, 8.71s/it] {'loss': 0.0549, 'grad_norm': 0.9476281404495239, 'learning_rate': 3.8779188908760585e-05, 'epoch': 1.38} 14%|█▍ | 5711/41250 [13:48:15<86:00:38, 8.71s/it][2025-04-25 21:45:59,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:45:59,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.17 | bwd_microstep: 5762.76 | bwd_inner_microstep: 5710.60 | bwd_allreduce_microstep: 52.12 | 
step_microstep: 18.79 [2025-04-25 21:45:59,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.17 | bwd: 5762.77 | bwd_inner: 5710.60 | bwd_allreduce: 52.13 | step: 18.80 14%|█▍ | 5712/41250 [13:48:24<85:57:32, 8.71s/it] {'loss': 0.1242, 'grad_norm': 1.7636487483978271, 'learning_rate': 3.877864861526773e-05, 'epoch': 1.38} 14%|█▍ | 5712/41250 [13:48:24<85:57:32, 8.71s/it][2025-04-25 21:46:07,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:46:07,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.44 | bwd_microstep: 5693.97 | bwd_inner_microstep: 5668.84 | bwd_allreduce_microstep: 25.08 | step_microstep: 18.65 [2025-04-25 21:46:07,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.44 | bwd: 5693.98 | bwd_inner: 5668.84 | bwd_allreduce: 25.10 | step: 18.65 14%|█▍ | 5713/41250 [13:48:33<85:39:57, 8.68s/it] {'loss': 0.1109, 'grad_norm': 1.8546826839447021, 'learning_rate': 3.87781082060082e-05, 'epoch': 1.38} 14%|█▍ | 5713/41250 [13:48:33<85:39:57, 8.68s/it][2025-04-25 21:46:16,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-25 21:46:16,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.74 | bwd_microstep: 5716.64 | bwd_inner_microstep: 5703.94 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.67 [2025-04-25 21:46:16,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.74 | bwd: 5716.65 | bwd_inner: 5703.94 | bwd_allreduce: 12.66 | step: 18.68 14%|█▍ | 5714/41250 [13:48:41<85:36:53, 8.67s/it] {'loss': 0.1722, 'grad_norm': 5.103842735290527, 'learning_rate': 3.877756768098532e-05, 'epoch': 1.39} 14%|█▍ | 5714/41250 [13:48:41<85:36:53, 8.67s/it][2025-04-25 21:46:25,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | 
optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:46:25,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.22 | bwd_microstep: 5753.00 | bwd_inner_microstep: 5711.76 | bwd_allreduce_microstep: 41.19 | step_microstep: 18.83 [2025-04-25 21:46:25,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.22 | bwd: 5753.02 | bwd_inner: 5711.77 | bwd_allreduce: 41.21 | step: 18.83 14%|█▍ | 5715/41250 [13:48:50<85:41:02, 8.68s/it] {'loss': 0.0553, 'grad_norm': 1.152808666229248, 'learning_rate': 3.877702704020243e-05, 'epoch': 1.39} 14%|█▍ | 5715/41250 [13:48:50<85:41:02, 8.68s/it][2025-04-25 21:46:33,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 21:46:33,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.81 | bwd_microstep: 5764.27 | bwd_inner_microstep: 5656.71 | bwd_allreduce_microstep: 107.52 | step_microstep: 18.55 [2025-04-25 21:46:33,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.81 | bwd: 5764.29 | bwd_inner: 5656.71 | bwd_allreduce: 107.54 | step: 18.55 14%|█▍ | 5716/41250 [13:48:59<85:40:22, 8.68s/it] {'loss': 0.1067, 'grad_norm': 1.8836474418640137, 'learning_rate': 3.877648628366286e-05, 'epoch': 1.39} 14%|█▍ | 5716/41250 [13:48:59<85:40:22, 8.68s/it][2025-04-25 21:46:42,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 0.95 [2025-04-25 21:46:42,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.77 | bwd_microstep: 5715.80 | bwd_inner_microstep: 5702.89 | bwd_allreduce_microstep: 12.86 | step_microstep: 19.25 [2025-04-25 21:46:42,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.77 | bwd: 5715.81 | bwd_inner: 5702.89 | bwd_allreduce: 12.88 | step: 19.25 14%|█▍ | 5717/41250 [13:49:07<85:34:20, 8.67s/it] {'loss': 0.0911, 'grad_norm': 
1.4370940923690796, 'learning_rate': 3.877594541136995e-05, 'epoch': 1.39} 14%|█▍ | 5717/41250 [13:49:07<85:34:20, 8.67s/it]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x18ff2480] moov atom not found
[21:46:44] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Allvideos/Pyramid/01174.mp4, Invalid data found when processing input
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
Error reading /home/wangjiarui/AIGV6K/Allvideos/Pyramid/01174.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Allvideos/Pyramid/01174.mp4...
[2025-04-25 21:46:51,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 21:46:51,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.67 | bwd_microstep: 5696.29 | bwd_inner_microstep: 5644.68 | bwd_allreduce_microstep: 51.56 | step_microstep: 18.71 [2025-04-25 21:46:51,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.67 | bwd: 5696.30 | bwd_inner: 5644.68 | bwd_allreduce: 51.58 | step: 18.71 14%|█▍ | 5718/41250 [13:49:16<85:23:26, 8.65s/it] {'loss': 0.036, 'grad_norm': 0.5774238705635071, 'learning_rate': 3.877540442332703e-05, 'epoch': 1.39} 14%|█▍ | 5718/41250 [13:49:16<85:23:26, 8.65s/it][2025-04-25 21:46:59,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 21:46:59,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.53 | bwd_microstep: 5736.50 | bwd_inner_microstep: 5685.37 | bwd_allreduce_microstep: 51.09 | step_microstep: 18.64 [2025-04-25 21:46:59,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.53 | bwd: 5736.52 | bwd_inner: 5685.37 | bwd_allreduce: 51.10 | step: 18.64 14%|█▍ | 5719/41250 [13:49:25<85:25:03, 8.65s/it] {'loss': 0.2643, 'grad_norm': 1.9334242343902588, 'learning_rate': 3.877486331953743e-05, 'epoch': 1.39} 14%|█▍ | 5719/41250 [13:49:25<85:25:03, 8.65s/it][2025-04-25 21:47:08,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:47:08,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.99 | bwd_microstep: 5734.76 | bwd_inner_microstep: 5710.43 | bwd_allreduce_microstep: 24.28 | step_microstep: 19.08 [2025-04-25 21:47:08,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.99 | bwd: 5734.78 | bwd_inner: 5710.43 | 
bwd_allreduce: 24.30 | step: 19.08 14%|█▍ | 5720/41250 [13:49:33<85:27:02, 8.66s/it] {'loss': 0.0731, 'grad_norm': 2.510830879211426, 'learning_rate': 3.877432210000449e-05, 'epoch': 1.39} 14%|█▍ | 5720/41250 [13:49:33<85:27:02, 8.66s/it][2025-04-25 21:47:17,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:47:17,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.63 | bwd_microstep: 5770.26 | bwd_inner_microstep: 5657.00 | bwd_allreduce_microstep: 113.22 | step_microstep: 18.65 [2025-04-25 21:47:17,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.63 | bwd: 5770.27 | bwd_inner: 5657.00 | bwd_allreduce: 113.23 | step: 18.66 14%|█▍ | 5721/41250 [13:49:42<85:32:14, 8.67s/it] {'loss': 0.2478, 'grad_norm': 3.187695026397705, 'learning_rate': 3.877378076473156e-05, 'epoch': 1.39} 14%|█▍ | 5721/41250 [13:49:42<85:32:14, 8.67s/it][2025-04-25 21:47:25,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:47:25,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.97 | bwd_microstep: 5751.67 | bwd_inner_microstep: 5688.45 | bwd_allreduce_microstep: 63.17 | step_microstep: 18.91 [2025-04-25 21:47:25,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.97 | bwd: 5751.68 | bwd_inner: 5688.45 | bwd_allreduce: 63.19 | step: 18.91 14%|█▍ | 5722/41250 [13:49:51<85:34:45, 8.67s/it] {'loss': 0.2536, 'grad_norm': 2.5074808597564697, 'learning_rate': 3.8773239313721945e-05, 'epoch': 1.39} 14%|█▍ | 5722/41250 [13:49:51<85:34:45, 8.67s/it][2025-04-25 21:47:34,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 21:47:34,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.36 | 
bwd_microstep: 5722.01 | bwd_inner_microstep: 5693.78 | bwd_allreduce_microstep: 28.18 | step_microstep: 19.21 [2025-04-25 21:47:34,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.36 | bwd: 5722.02 | bwd_inner: 5693.78 | bwd_allreduce: 28.20 | step: 19.21 14%|█▍ | 5723/41250 [13:49:59<85:32:33, 8.67s/it] {'loss': 0.2854, 'grad_norm': 4.297073841094971, 'learning_rate': 3.877269774697901e-05, 'epoch': 1.39} 14%|█▍ | 5723/41250 [13:49:59<85:32:33, 8.67s/it][2025-04-25 21:47:43,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:47:43,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.50 | bwd_microstep: 5703.92 | bwd_inner_microstep: 5650.13 | bwd_allreduce_microstep: 53.75 | step_microstep: 18.83 [2025-04-25 21:47:43,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.50 | bwd: 5703.94 | bwd_inner: 5650.13 | bwd_allreduce: 53.76 | step: 18.83 14%|█▍ | 5724/41250 [13:50:08<85:22:13, 8.65s/it] {'loss': 0.0795, 'grad_norm': 1.662947416305542, 'learning_rate': 3.8772156064506086e-05, 'epoch': 1.39} 14%|█▍ | 5724/41250 [13:50:08<85:22:13, 8.65s/it][2025-04-25 21:47:52,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.03 | optimizer_step: 1.08 [2025-04-25 21:47:52,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.27 | bwd_microstep: 5997.45 | bwd_inner_microstep: 5703.28 | bwd_allreduce_microstep: 294.12 | step_microstep: 19.29 [2025-04-25 21:47:52,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.27 | bwd: 5997.46 | bwd_inner: 5703.28 | bwd_allreduce: 294.14 | step: 19.29 14%|█▍ | 5725/41250 [13:50:17<86:11:23, 8.73s/it] {'loss': 0.0127, 'grad_norm': 0.17695143818855286, 'learning_rate': 3.877161426630652e-05, 'epoch': 1.39} 14%|█▍ | 5725/41250 [13:50:17<86:11:23, 8.73s/it][2025-04-25 21:48:00,753] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.10 | optimizer_step: 0.95 [2025-04-25 21:48:00,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.53 | bwd_microstep: 5747.95 | bwd_inner_microstep: 5704.44 | bwd_allreduce_microstep: 43.47 | step_microstep: 19.07 [2025-04-25 21:48:00,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.54 | bwd: 5747.96 | bwd_inner: 5704.44 | bwd_allreduce: 43.48 | step: 19.08 14%|█▍ | 5726/41250 [13:50:26<86:02:20, 8.72s/it] {'loss': 0.03, 'grad_norm': 0.795855700969696, 'learning_rate': 3.877107235238363e-05, 'epoch': 1.39} 14%|█▍ | 5726/41250 [13:50:26<86:02:20, 8.72s/it][2025-04-25 21:48:09,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 21:48:09,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.06 | bwd_microstep: 5746.23 | bwd_inner_microstep: 5688.77 | bwd_allreduce_microstep: 57.41 | step_microstep: 18.63 [2025-04-25 21:48:09,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.07 | bwd: 5746.24 | bwd_inner: 5688.77 | bwd_allreduce: 57.43 | step: 18.63 14%|█▍ | 5727/41250 [13:50:34<85:53:56, 8.71s/it] {'loss': 0.0582, 'grad_norm': 1.7485995292663574, 'learning_rate': 3.877053032274078e-05, 'epoch': 1.39} 14%|█▍ | 5727/41250 [13:50:34<85:53:56, 8.71s/it][2025-04-25 21:48:18,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 21:48:18,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.32 | bwd_microstep: 5691.68 | bwd_inner_microstep: 5678.89 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.79 [2025-04-25 21:48:18,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.33 | bwd: 5691.70 | bwd_inner: 5678.89 | bwd_allreduce: 12.77 | step: 18.79 
14%|█▍ | 5728/41250 [13:50:43<85:37:55, 8.68s/it] {'loss': 0.0589, 'grad_norm': 1.2237637042999268, 'learning_rate': 3.876998817738131e-05, 'epoch': 1.39} 14%|█▍ | 5728/41250 [13:50:43<85:37:55, 8.68s/it][2025-04-25 21:48:26,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 1.02 [2025-04-25 21:48:26,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.19 | bwd_microstep: 5718.13 | bwd_inner_microstep: 5705.41 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.67 [2025-04-25 21:48:26,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.19 | bwd: 5718.14 | bwd_inner: 5705.41 | bwd_allreduce: 12.69 | step: 18.68 14%|█▍ | 5729/41250 [13:50:52<85:32:14, 8.67s/it] {'loss': 0.1944, 'grad_norm': 1.9234812259674072, 'learning_rate': 3.876944591630854e-05, 'epoch': 1.39} 14%|█▍ | 5729/41250 [13:50:52<85:32:14, 8.67s/it][2025-04-25 21:48:35,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:48:35,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.17 | bwd_microstep: 5709.76 | bwd_inner_microstep: 5696.99 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.77 [2025-04-25 21:48:35,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.17 | bwd: 5709.77 | bwd_inner: 5696.99 | bwd_allreduce: 12.74 | step: 18.77 14%|█▍ | 5730/41250 [13:51:00<85:26:27, 8.66s/it] {'loss': 0.0695, 'grad_norm': 2.868527412414551, 'learning_rate': 3.8768903539525824e-05, 'epoch': 1.39} 14%|█▍ | 5730/41250 [13:51:00<85:26:27, 8.66s/it][2025-04-25 21:48:44,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.05 | optimizer_step: 1.08 [2025-04-25 21:48:44,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.73 | bwd_microstep: 5983.28 | bwd_inner_microstep: 
5705.75 | bwd_allreduce_microstep: 277.48 | step_microstep: 19.46 [2025-04-25 21:48:44,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.73 | bwd: 5983.30 | bwd_inner: 5705.75 | bwd_allreduce: 277.50 | step: 19.47 14%|█▍ | 5731/41250 [13:51:09<86:12:05, 8.74s/it] {'loss': 0.08, 'grad_norm': 2.5555636882781982, 'learning_rate': 3.8768361047036525e-05, 'epoch': 1.39} 14%|█▍ | 5731/41250 [13:51:09<86:12:05, 8.74s/it][2025-04-25 21:48:52,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.12 | optimizer_step: 1.00 [2025-04-25 21:48:52,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.97 | bwd_microstep: 5729.61 | bwd_inner_microstep: 5685.65 | bwd_allreduce_microstep: 43.91 | step_microstep: 19.07 [2025-04-25 21:48:52,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.97 | bwd: 5729.62 | bwd_inner: 5685.65 | bwd_allreduce: 43.93 | step: 19.08 14%|█▍ | 5732/41250 [13:51:18<85:59:36, 8.72s/it] {'loss': 0.051, 'grad_norm': 0.6453459858894348, 'learning_rate': 3.876781843884396e-05, 'epoch': 1.39} 14%|█▍ | 5732/41250 [13:51:18<85:59:36, 8.72s/it][2025-04-25 21:49:01,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:49:01,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.68 | bwd_microstep: 5698.40 | bwd_inner_microstep: 5685.55 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.40 [2025-04-25 21:49:01,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.68 | bwd: 5698.41 | bwd_inner: 5685.55 | bwd_allreduce: 12.82 | step: 18.40 14%|█▍ | 5733/41250 [13:51:26<85:42:51, 8.69s/it] {'loss': 0.0564, 'grad_norm': 1.4197590351104736, 'learning_rate': 3.876727571495149e-05, 'epoch': 1.39} 14%|█▍ | 5733/41250 [13:51:26<85:42:51, 8.69s/it][2025-04-25 21:49:10,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:49:10,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.70 | bwd_microstep: 5722.28 | bwd_inner_microstep: 5674.83 | bwd_allreduce_microstep: 47.41 | step_microstep: 18.77 [2025-04-25 21:49:10,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.70 | bwd: 5722.30 | bwd_inner: 5674.83 | bwd_allreduce: 47.42 | step: 18.77 14%|█▍ | 5734/41250 [13:51:35<85:38:49, 8.68s/it] {'loss': 0.111, 'grad_norm': 1.3781685829162598, 'learning_rate': 3.876673287536246e-05, 'epoch': 1.39} 14%|█▍ | 5734/41250 [13:51:35<85:38:49, 8.68s/it][2025-04-25 21:49:18,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.04 | optimizer_step: 0.99 [2025-04-25 21:49:18,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.65 | bwd_microstep: 5744.77 | bwd_inner_microstep: 5650.88 | bwd_allreduce_microstep: 93.84 | step_microstep: 19.05 [2025-04-25 21:49:18,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.65 | bwd: 5744.79 | bwd_inner: 5650.88 | bwd_allreduce: 93.86 | step: 19.06 14%|█▍ | 5735/41250 [13:51:44<85:34:04, 8.67s/it] {'loss': 0.1382, 'grad_norm': 1.264591932296753, 'learning_rate': 3.876618992008021e-05, 'epoch': 1.39} 14%|█▍ | 5735/41250 [13:51:44<85:34:04, 8.67s/it][2025-04-25 21:49:27,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:49:27,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.68 | bwd_microstep: 5764.84 | bwd_inner_microstep: 5645.43 | bwd_allreduce_microstep: 119.37 | step_microstep: 18.51 [2025-04-25 21:49:27,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.68 | bwd: 5764.85 | bwd_inner: 5645.43 | bwd_allreduce: 119.38 | step: 18.52 14%|█▍ | 5736/41250 [13:51:52<85:34:28, 8.67s/it] 
{'loss': 0.1935, 'grad_norm': 2.544330358505249, 'learning_rate': 3.8765646849108096e-05, 'epoch': 1.39} 14%|█▍ | 5736/41250 [13:51:52<85:34:28, 8.67s/it][2025-04-25 21:49:36,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 21:49:36,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.81 | bwd_microstep: 5696.60 | bwd_inner_microstep: 5683.90 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.56 [2025-04-25 21:49:36,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.81 | bwd: 5696.61 | bwd_inner: 5683.90 | bwd_allreduce: 12.67 | step: 18.57 14%|█▍ | 5737/41250 [13:52:01<85:26:54, 8.66s/it] {'loss': 0.2428, 'grad_norm': 2.290504217147827, 'learning_rate': 3.876510366244945e-05, 'epoch': 1.39} 14%|█▍ | 5737/41250 [13:52:01<85:26:54, 8.66s/it][2025-04-25 21:49:44,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:49:44,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.27 | bwd_microstep: 5767.28 | bwd_inner_microstep: 5637.03 | bwd_allreduce_microstep: 130.21 | step_microstep: 18.35 [2025-04-25 21:49:44,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.27 | bwd: 5767.30 | bwd_inner: 5637.03 | bwd_allreduce: 130.23 | step: 18.35 14%|█▍ | 5738/41250 [13:52:10<85:29:13, 8.67s/it] {'loss': 0.1397, 'grad_norm': 3.4501497745513916, 'learning_rate': 3.876456036010764e-05, 'epoch': 1.39} 14%|█▍ | 5738/41250 [13:52:10<85:29:13, 8.67s/it][2025-04-25 21:49:53,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:49:53,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.70 | bwd_microstep: 5713.11 | bwd_inner_microstep: 5685.48 | bwd_allreduce_microstep: 27.59 | 
step_microstep: 18.20 [2025-04-25 21:49:53,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.70 | bwd: 5713.13 | bwd_inner: 5685.48 | bwd_allreduce: 27.61 | step: 18.20 14%|█▍ | 5739/41250 [13:52:18<85:26:55, 8.66s/it] {'loss': 0.084, 'grad_norm': 1.0185654163360596, 'learning_rate': 3.8764016942085996e-05, 'epoch': 1.39} 14%|█▍ | 5739/41250 [13:52:18<85:26:55, 8.66s/it][2025-04-25 21:50:02,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:50:02,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.29 | bwd_microstep: 5709.54 | bwd_inner_microstep: 5696.46 | bwd_allreduce_microstep: 13.03 | step_microstep: 18.63 [2025-04-25 21:50:02,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.29 | bwd: 5709.55 | bwd_inner: 5696.46 | bwd_allreduce: 13.05 | step: 18.63 14%|█▍ | 5740/41250 [13:52:27<85:25:44, 8.66s/it] {'loss': 0.3283, 'grad_norm': 4.299954414367676, 'learning_rate': 3.8763473408387884e-05, 'epoch': 1.39} 14%|█▍ | 5740/41250 [13:52:27<85:25:44, 8.66s/it][2025-04-25 21:50:10,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 21:50:10,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.48 | bwd_microstep: 5735.84 | bwd_inner_microstep: 5691.70 | bwd_allreduce_microstep: 44.10 | step_microstep: 18.98 [2025-04-25 21:50:10,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.48 | bwd: 5735.85 | bwd_inner: 5691.70 | bwd_allreduce: 44.12 | step: 18.99 14%|█▍ | 5741/41250 [13:52:36<85:27:52, 8.66s/it] {'loss': 0.0626, 'grad_norm': 1.2254819869995117, 'learning_rate': 3.876292975901665e-05, 'epoch': 1.39} 14%|█▍ | 5741/41250 [13:52:36<85:27:52, 8.66s/it][2025-04-25 21:50:19,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | 
optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 21:50:19,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2930.15 | bwd_microstep: 5888.54 | bwd_inner_microstep: 5875.69 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.15 [2025-04-25 21:50:19,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2930.15 | bwd: 5888.56 | bwd_inner: 5875.69 | bwd_allreduce: 12.82 | step: 19.15 14%|█▍ | 5742/41250 [13:52:45<86:11:23, 8.74s/it] {'loss': 0.0956, 'grad_norm': 1.398473858833313, 'learning_rate': 3.8762385993975646e-05, 'epoch': 1.39} 14%|█▍ | 5742/41250 [13:52:45<86:11:23, 8.74s/it][2025-04-25 21:50:28,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:50:28,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.22 | bwd_microstep: 5879.03 | bwd_inner_microstep: 5650.07 | bwd_allreduce_microstep: 228.92 | step_microstep: 18.69 [2025-04-25 21:50:28,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.22 | bwd: 5879.04 | bwd_inner: 5650.07 | bwd_allreduce: 228.93 | step: 18.70 14%|█▍ | 5743/41250 [13:52:53<86:20:05, 8.75s/it] {'loss': 0.2321, 'grad_norm': 1.5424368381500244, 'learning_rate': 3.8761842113268214e-05, 'epoch': 1.39} 14%|█▍ | 5743/41250 [13:52:53<86:20:05, 8.75s/it][2025-04-25 21:50:37,132] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:50:37,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.52 | bwd_microstep: 5686.94 | bwd_inner_microstep: 5658.38 | bwd_allreduce_microstep: 28.52 | step_microstep: 18.46 [2025-04-25 21:50:37,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.52 | bwd: 5686.95 | bwd_inner: 5658.38 | bwd_allreduce: 28.53 | step: 18.46 14%|█▍ | 5744/41250 [13:53:02<85:54:22, 8.71s/it] {'loss': 0.0877, 'grad_norm': 
1.6143277883529663, 'learning_rate': 3.876129811689772e-05, 'epoch': 1.39} 14%|█▍ | 5744/41250 [13:53:02<85:54:22, 8.71s/it][2025-04-25 21:50:45,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:50:45,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.16 | bwd_microstep: 5787.58 | bwd_inner_microstep: 5661.36 | bwd_allreduce_microstep: 126.17 | step_microstep: 18.79 [2025-04-25 21:50:45,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.16 | bwd: 5787.60 | bwd_inner: 5661.36 | bwd_allreduce: 126.19 | step: 18.79 14%|█▍ | 5745/41250 [13:53:11<85:53:19, 8.71s/it] {'loss': 0.0882, 'grad_norm': 2.8928706645965576, 'learning_rate': 3.8760754004867516e-05, 'epoch': 1.39} 14%|█▍ | 5745/41250 [13:53:11<85:53:19, 8.71s/it][2025-04-25 21:50:54,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 21:50:54,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.92 | bwd_microstep: 5707.90 | bwd_inner_microstep: 5646.92 | bwd_allreduce_microstep: 60.93 | step_microstep: 19.02 [2025-04-25 21:50:54,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.92 | bwd: 5707.91 | bwd_inner: 5646.92 | bwd_allreduce: 60.95 | step: 19.03 14%|█▍ | 5746/41250 [13:53:19<85:37:21, 8.68s/it] {'loss': 0.18, 'grad_norm': 4.241868495941162, 'learning_rate': 3.876020977718096e-05, 'epoch': 1.39} 14%|█▍ | 5746/41250 [13:53:19<85:37:21, 8.68s/it][2025-04-25 21:51:03,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 0.89 [2025-04-25 21:51:03,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.03 | bwd_microstep: 5718.59 | bwd_inner_microstep: 5650.51 | bwd_allreduce_microstep: 68.03 | step_microstep: 19.18 [2025-04-25 
21:51:03,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.03 | bwd: 5718.61 | bwd_inner: 5650.51 | bwd_allreduce: 68.05 | step: 19.19 14%|█▍ | 5747/41250 [13:53:28<85:28:50, 8.67s/it] {'loss': 0.0812, 'grad_norm': 1.0823923349380493, 'learning_rate': 3.875966543384139e-05, 'epoch': 1.39} 14%|█▍ | 5747/41250 [13:53:28<85:28:50, 8.67s/it][2025-04-25 21:51:11,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-25 21:51:11,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.54 | bwd_microstep: 5744.20 | bwd_inner_microstep: 5699.80 | bwd_allreduce_microstep: 44.36 | step_microstep: 19.03 [2025-04-25 21:51:11,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.54 | bwd: 5744.22 | bwd_inner: 5699.80 | bwd_allreduce: 44.38 | step: 19.03 14%|█▍ | 5748/41250 [13:53:37<85:31:03, 8.67s/it] {'loss': 0.0396, 'grad_norm': 1.451864242553711, 'learning_rate': 3.875912097485218e-05, 'epoch': 1.39} 14%|█▍ | 5748/41250 [13:53:37<85:31:03, 8.67s/it][2025-04-25 21:51:20,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.02 | optimizer_step: 1.11 [2025-04-25 21:51:20,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.92 | bwd_microstep: 5708.05 | bwd_inner_microstep: 5694.76 | bwd_allreduce_microstep: 13.24 | step_microstep: 19.28 [2025-04-25 21:51:20,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.92 | bwd: 5708.06 | bwd_inner: 5694.76 | bwd_allreduce: 13.26 | step: 19.28 14%|█▍ | 5749/41250 [13:53:45<85:25:23, 8.66s/it] {'loss': 0.0301, 'grad_norm': 0.800682783126831, 'learning_rate': 3.875857640021667e-05, 'epoch': 1.39} 14%|█▍ | 5749/41250 [13:53:45<85:25:23, 8.66s/it][2025-04-25 21:51:29,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.89 
[2025-04-25 21:51:29,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.26 | bwd_microstep: 5753.46 | bwd_inner_microstep: 5656.96 | bwd_allreduce_microstep: 96.45 | step_microstep: 18.72 [2025-04-25 21:51:29,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.26 | bwd: 5753.47 | bwd_inner: 5656.96 | bwd_allreduce: 96.47 | step: 18.72 14%|█▍ | 5750/41250 [13:53:54<85:26:36, 8.66s/it] {'loss': 0.0717, 'grad_norm': 1.20710289478302, 'learning_rate': 3.8758031709938234e-05, 'epoch': 1.39} 14%|█▍ | 5750/41250 [13:53:54<85:26:36, 8.66s/it][2025-04-25 21:51:37,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:51:37,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.12 | bwd_microstep: 5730.47 | bwd_inner_microstep: 5717.52 | bwd_allreduce_microstep: 12.90 | step_microstep: 19.06 [2025-04-25 21:51:37,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.12 | bwd: 5730.48 | bwd_inner: 5717.52 | bwd_allreduce: 12.92 | step: 19.06 14%|█▍ | 5751/41250 [13:54:03<85:27:27, 8.67s/it] {'loss': 0.2405, 'grad_norm': 3.2214772701263428, 'learning_rate': 3.875748690402022e-05, 'epoch': 1.39} 14%|█▍ | 5751/41250 [13:54:03<85:27:27, 8.67s/it][2025-04-25 21:51:46,438] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:51:46,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.68 | bwd_microstep: 5732.70 | bwd_inner_microstep: 5720.02 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.59 [2025-04-25 21:51:46,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.68 | bwd: 5732.72 | bwd_inner: 5720.02 | bwd_allreduce: 12.66 | step: 18.60 14%|█▍ | 5752/41250 [13:54:11<85:30:15, 8.67s/it] {'loss': 0.2253, 'grad_norm': 2.9906158447265625, 'learning_rate': 
3.875694198246598e-05, 'epoch': 1.39} 14%|█▍ | 5752/41250 [13:54:11<85:30:15, 8.67s/it][2025-04-25 21:51:55,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:51:55,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.76 | bwd_microstep: 5775.93 | bwd_inner_microstep: 5651.81 | bwd_allreduce_microstep: 124.07 | step_microstep: 18.98 [2025-04-25 21:51:55,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.76 | bwd: 5775.94 | bwd_inner: 5651.81 | bwd_allreduce: 124.09 | step: 18.98 14%|█▍ | 5753/41250 [13:54:20<85:34:22, 8.68s/it] {'loss': 0.1889, 'grad_norm': 1.5983004570007324, 'learning_rate': 3.875639694527889e-05, 'epoch': 1.39} 14%|█▍ | 5753/41250 [13:54:20<85:34:22, 8.68s/it][2025-04-25 21:52:03,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 21:52:03,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.82 | bwd_microstep: 5786.86 | bwd_inner_microstep: 5661.79 | bwd_allreduce_microstep: 125.03 | step_microstep: 18.90 [2025-04-25 21:52:03,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.82 | bwd: 5786.88 | bwd_inner: 5661.79 | bwd_allreduce: 125.04 | step: 18.90 14%|█▍ | 5754/41250 [13:54:29<85:40:38, 8.69s/it] {'loss': 0.1078, 'grad_norm': 1.0637422800064087, 'learning_rate': 3.8755851792462306e-05, 'epoch': 1.39} 14%|█▍ | 5754/41250 [13:54:29<85:40:38, 8.69s/it][2025-04-25 21:52:12,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:52:12,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.17 | bwd_microstep: 5753.53 | bwd_inner_microstep: 5700.34 | bwd_allreduce_microstep: 53.15 | step_microstep: 18.34 [2025-04-25 21:52:12,548] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.17 | bwd: 5753.54 | bwd_inner: 5700.34 | bwd_allreduce: 53.16 | step: 18.35 14%|█▍ | 5755/41250 [13:54:37<85:42:18, 8.69s/it] {'loss': 0.3498, 'grad_norm': 2.4493143558502197, 'learning_rate': 3.8755306524019584e-05, 'epoch': 1.4} 14%|█▍ | 5755/41250 [13:54:37<85:42:18, 8.69s/it][2025-04-25 21:52:21,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 21:52:21,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.24 | bwd_microstep: 5908.15 | bwd_inner_microstep: 5658.30 | bwd_allreduce_microstep: 249.80 | step_microstep: 18.99 [2025-04-25 21:52:21,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.24 | bwd: 5908.17 | bwd_inner: 5658.30 | bwd_allreduce: 249.82 | step: 19.00 14%|█▍ | 5756/41250 [13:54:46<86:07:09, 8.73s/it] {'loss': 0.0504, 'grad_norm': 0.8119387626647949, 'learning_rate': 3.875476113995408e-05, 'epoch': 1.4} 14%|█▍ | 5756/41250 [13:54:46<86:07:09, 8.73s/it][2025-04-25 21:52:30,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 21:52:30,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.45 | bwd_microstep: 5758.26 | bwd_inner_microstep: 5692.54 | bwd_allreduce_microstep: 65.67 | step_microstep: 19.02 [2025-04-25 21:52:30,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.45 | bwd: 5758.28 | bwd_inner: 5692.54 | bwd_allreduce: 65.69 | step: 19.02 14%|█▍ | 5757/41250 [13:54:55<85:59:06, 8.72s/it] {'loss': 0.1856, 'grad_norm': 1.8181382417678833, 'learning_rate': 3.8754215640269166e-05, 'epoch': 1.4} 14%|█▍ | 5757/41250 [13:54:55<85:59:06, 8.72s/it][2025-04-25 21:52:38,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 
21:52:38,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.66 | bwd_microstep: 5713.83 | bwd_inner_microstep: 5668.95 | bwd_allreduce_microstep: 44.83 | step_microstep: 18.84 [2025-04-25 21:52:38,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.66 | bwd: 5713.85 | bwd_inner: 5668.95 | bwd_allreduce: 44.85 | step: 18.84 14%|█▍ | 5758/41250 [13:55:04<85:43:13, 8.69s/it] {'loss': 0.0515, 'grad_norm': 0.9847307801246643, 'learning_rate': 3.87536700249682e-05, 'epoch': 1.4} 14%|█▍ | 5758/41250 [13:55:04<85:43:13, 8.69s/it][2025-04-25 21:52:47,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 21:52:47,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.98 | bwd_microstep: 5738.80 | bwd_inner_microstep: 5725.80 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.99 [2025-04-25 21:52:47,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.98 | bwd: 5738.81 | bwd_inner: 5725.80 | bwd_allreduce: 12.97 | step: 18.99 14%|█▍ | 5759/41250 [13:55:12<85:41:28, 8.69s/it] {'loss': 0.1692, 'grad_norm': 3.4113454818725586, 'learning_rate': 3.875312429405454e-05, 'epoch': 1.4} 14%|█▍ | 5759/41250 [13:55:12<85:41:28, 8.69s/it][2025-04-25 21:52:56,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.04 | optimizer_step: 1.13 [2025-04-25 21:52:56,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.36 | bwd_microstep: 5784.08 | bwd_inner_microstep: 5710.54 | bwd_allreduce_microstep: 73.49 | step_microstep: 19.47 [2025-04-25 21:52:56,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.36 | bwd: 5784.10 | bwd_inner: 5710.54 | bwd_allreduce: 73.51 | step: 19.47 14%|█▍ | 5760/41250 [13:55:21<85:46:53, 8.70s/it] {'loss': 0.0676, 'grad_norm': 2.1067605018615723, 'learning_rate': 3.8752578447531564e-05, 
'epoch': 1.4} 14%|█▍ | 5760/41250 [13:55:21<85:46:53, 8.70s/it][2025-04-25 21:53:04,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 21:53:04,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.08 | bwd_microstep: 5729.25 | bwd_inner_microstep: 5716.45 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.99 [2025-04-25 21:53:04,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.08 | bwd: 5729.27 | bwd_inner: 5716.45 | bwd_allreduce: 12.78 | step: 18.99 14%|█▍ | 5761/41250 [13:55:30<85:40:59, 8.69s/it] {'loss': 0.0359, 'grad_norm': 1.0150946378707886, 'learning_rate': 3.875203248540263e-05, 'epoch': 1.4} 14%|█▍ | 5761/41250 [13:55:30<85:40:59, 8.69s/it][2025-04-25 21:53:13,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.16 | optimizer_step: 0.92 [2025-04-25 21:53:13,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.60 | bwd_microstep: 5860.90 | bwd_inner_microstep: 5712.39 | bwd_allreduce_microstep: 148.46 | step_microstep: 19.02 [2025-04-25 21:53:13,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.61 | bwd: 5860.92 | bwd_inner: 5712.39 | bwd_allreduce: 148.48 | step: 19.02 14%|█▍ | 5762/41250 [13:55:38<85:59:53, 8.72s/it] {'loss': 0.1584, 'grad_norm': 1.7788664102554321, 'learning_rate': 3.875148640767111e-05, 'epoch': 1.4} 14%|█▍ | 5762/41250 [13:55:38<85:59:53, 8.72s/it][2025-04-25 21:53:22,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-25 21:53:22,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.21 | bwd_microstep: 5705.77 | bwd_inner_microstep: 5693.00 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.03 [2025-04-25 21:53:22,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2851.21 | bwd: 5705.79 | bwd_inner: 5693.00 | bwd_allreduce: 12.73 | step: 19.03 14%|█▍ | 5763/41250 [13:55:47<85:45:00, 8.70s/it] {'loss': 0.0879, 'grad_norm': 0.9363600015640259, 'learning_rate': 3.875094021434036e-05, 'epoch': 1.4} 14%|█▍ | 5763/41250 [13:55:47<85:45:00, 8.70s/it][2025-04-25 21:53:30,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:53:30,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.99 | bwd_microstep: 5748.50 | bwd_inner_microstep: 5703.54 | bwd_allreduce_microstep: 44.91 | step_microstep: 18.52 [2025-04-25 21:53:30,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.99 | bwd: 5748.51 | bwd_inner: 5703.54 | bwd_allreduce: 44.93 | step: 18.52 14%|█▍ | 5764/41250 [13:55:56<85:41:42, 8.69s/it] {'loss': 0.0964, 'grad_norm': 2.0029497146606445, 'learning_rate': 3.8750393905413746e-05, 'epoch': 1.4} 14%|█▍ | 5764/41250 [13:55:56<85:41:42, 8.69s/it][2025-04-25 21:53:39,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-25 21:53:39,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.05 | bwd_microstep: 5783.14 | bwd_inner_microstep: 5651.08 | bwd_allreduce_microstep: 132.01 | step_microstep: 19.32 [2025-04-25 21:53:39,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5783.15 | bwd_inner: 5651.08 | bwd_allreduce: 132.03 | step: 19.32 14%|█▍ | 5765/41250 [13:56:04<85:41:08, 8.69s/it] {'loss': 0.0474, 'grad_norm': 1.3364062309265137, 'learning_rate': 3.874984748089465e-05, 'epoch': 1.4} 14%|█▍ | 5765/41250 [13:56:04<85:41:08, 8.69s/it][2025-04-25 21:53:48,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:53:48,225] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2832.45 | bwd_microstep: 5715.18 | bwd_inner_microstep: 5663.30 | bwd_allreduce_microstep: 51.84 | step_microstep: 18.60 [2025-04-25 21:53:48,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.45 | bwd: 5715.20 | bwd_inner: 5663.30 | bwd_allreduce: 51.86 | step: 18.60 14%|█▍ | 5766/41250 [13:56:13<85:29:47, 8.67s/it] {'loss': 0.229, 'grad_norm': 1.9999562501907349, 'learning_rate': 3.874930094078643e-05, 'epoch': 1.4} 14%|█▍ | 5766/41250 [13:56:13<85:29:47, 8.67s/it][2025-04-25 21:53:56,914] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-25 21:53:56,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.86 | bwd_microstep: 5773.76 | bwd_inner_microstep: 5649.41 | bwd_allreduce_microstep: 124.30 | step_microstep: 19.12 [2025-04-25 21:53:56,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.86 | bwd: 5773.77 | bwd_inner: 5649.41 | bwd_allreduce: 124.32 | step: 19.13 14%|█▍ | 5767/41250 [13:56:22<85:32:33, 8.68s/it] {'loss': 0.2812, 'grad_norm': 2.878133773803711, 'learning_rate': 3.874875428509246e-05, 'epoch': 1.4} 14%|█▍ | 5767/41250 [13:56:22<85:32:33, 8.68s/it][2025-04-25 21:54:05,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:54:05,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.08 | bwd_microstep: 5697.84 | bwd_inner_microstep: 5661.31 | bwd_allreduce_microstep: 36.49 | step_microstep: 19.04 [2025-04-25 21:54:05,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.08 | bwd: 5697.86 | bwd_inner: 5661.31 | bwd_allreduce: 36.51 | step: 19.05 14%|█▍ | 5768/41250 [13:56:30<85:20:19, 8.66s/it] {'loss': 0.1274, 'grad_norm': 1.839848279953003, 'learning_rate': 3.87482075138161e-05, 'epoch': 1.4} 14%|█▍ | 5768/41250 [13:56:30<85:20:19, 
8.66s/it][2025-04-25 21:54:14,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 21:54:14,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.20 | bwd_microstep: 5717.75 | bwd_inner_microstep: 5704.84 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.68 [2025-04-25 21:54:14,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.20 | bwd: 5717.77 | bwd_inner: 5704.84 | bwd_allreduce: 12.88 | step: 18.68 14%|█▍ | 5769/41250 [13:56:39<85:20:05, 8.66s/it] {'loss': 0.1088, 'grad_norm': 0.9682438969612122, 'learning_rate': 3.874766062696073e-05, 'epoch': 1.4} 14%|█▍ | 5769/41250 [13:56:39<85:20:05, 8.66s/it][2025-04-25 21:54:22,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.04 | optimizer_step: 1.09 [2025-04-25 21:54:22,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.72 | bwd_microstep: 5763.50 | bwd_inner_microstep: 5681.93 | bwd_allreduce_microstep: 81.52 | step_microstep: 19.17 [2025-04-25 21:54:22,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.72 | bwd: 5763.51 | bwd_inner: 5681.93 | bwd_allreduce: 81.54 | step: 19.17 14%|█▍ | 5770/41250 [13:56:48<85:27:45, 8.67s/it] {'loss': 0.3608, 'grad_norm': 2.564032793045044, 'learning_rate': 3.874711362452972e-05, 'epoch': 1.4} 14%|█▍ | 5770/41250 [13:56:48<85:27:45, 8.67s/it][2025-04-25 21:54:31,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 21:54:31,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.21 | bwd_microstep: 5749.01 | bwd_inner_microstep: 5706.32 | bwd_allreduce_microstep: 42.64 | step_microstep: 18.84 [2025-04-25 21:54:31,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.21 | bwd: 5749.02 | bwd_inner: 5706.32 | 
bwd_allreduce: 42.66 | step: 18.84 14%|█▍ | 5771/41250 [13:56:56<85:30:18, 8.68s/it] {'loss': 0.0236, 'grad_norm': 0.5178430676460266, 'learning_rate': 3.874656650652644e-05, 'epoch': 1.4} 14%|█▍ | 5771/41250 [13:56:56<85:30:18, 8.68s/it][2025-04-25 21:54:40,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.19 | optimizer_step: 0.92 [2025-04-25 21:54:40,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.28 | bwd_microstep: 5773.28 | bwd_inner_microstep: 5655.88 | bwd_allreduce_microstep: 117.35 | step_microstep: 19.30 [2025-04-25 21:54:40,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.28 | bwd: 5773.30 | bwd_inner: 5655.88 | bwd_allreduce: 117.37 | step: 19.30 14%|█▍ | 5772/41250 [13:57:05<85:33:45, 8.68s/it] {'loss': 0.1811, 'grad_norm': 1.289467692375183, 'learning_rate': 3.874601927295427e-05, 'epoch': 1.4} 14%|█▍ | 5772/41250 [13:57:05<85:33:45, 8.68s/it][2025-04-25 21:54:48,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.23 | optimizer_step: 0.90 [2025-04-25 21:54:48,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.70 | bwd_microstep: 5720.92 | bwd_inner_microstep: 5646.57 | bwd_allreduce_microstep: 74.31 | step_microstep: 19.29 [2025-04-25 21:54:48,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.70 | bwd: 5720.94 | bwd_inner: 5646.57 | bwd_allreduce: 74.32 | step: 19.29 14%|█▍ | 5773/41250 [13:57:14<85:25:47, 8.67s/it] {'loss': 0.2867, 'grad_norm': 1.769667625427246, 'learning_rate': 3.8745471923816576e-05, 'epoch': 1.4} 14%|█▍ | 5773/41250 [13:57:14<85:25:47, 8.67s/it][2025-04-25 21:54:57,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-25 21:54:57,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.97 | bwd_microstep: 
5698.52 | bwd_inner_microstep: 5685.67 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.97 [2025-04-25 21:54:57,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.97 | bwd: 5698.53 | bwd_inner: 5685.67 | bwd_allreduce: 12.82 | step: 18.98 14%|█▍ | 5774/41250 [13:57:22<85:17:29, 8.66s/it] {'loss': 0.1252, 'grad_norm': 1.562077283859253, 'learning_rate': 3.8744924459116734e-05, 'epoch': 1.4} 14%|█▍ | 5774/41250 [13:57:22<85:17:29, 8.66s/it][2025-04-25 21:55:06,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-25 21:55:06,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.40 | bwd_microstep: 5749.40 | bwd_inner_microstep: 5688.34 | bwd_allreduce_microstep: 61.01 | step_microstep: 18.60 [2025-04-25 21:55:06,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.40 | bwd: 5749.41 | bwd_inner: 5688.34 | bwd_allreduce: 61.03 | step: 18.61 14%|█▍ | 5775/41250 [13:57:31<85:22:03, 8.66s/it] {'loss': 0.1899, 'grad_norm': 1.6367706060409546, 'learning_rate': 3.874437687885812e-05, 'epoch': 1.4} 14%|█▍ | 5775/41250 [13:57:31<85:22:03, 8.66s/it][2025-04-25 21:55:14,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 21:55:14,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.33 | bwd_microstep: 5733.77 | bwd_inner_microstep: 5698.09 | bwd_allreduce_microstep: 35.63 | step_microstep: 19.10 [2025-04-25 21:55:14,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.33 | bwd: 5733.78 | bwd_inner: 5698.09 | bwd_allreduce: 35.65 | step: 19.10 14%|█▍ | 5776/41250 [13:57:40<85:21:55, 8.66s/it] {'loss': 0.0277, 'grad_norm': 0.8317041397094727, 'learning_rate': 3.874382918304411e-05, 'epoch': 1.4} 14%|█▍ | 5776/41250 [13:57:40<85:21:55, 8.66s/it][2025-04-25 21:55:23,545] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 21:55:23,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.18 | bwd_microstep: 5737.21 | bwd_inner_microstep: 5709.80 | bwd_allreduce_microstep: 27.36 | step_microstep: 19.34 [2025-04-25 21:55:23,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.18 | bwd: 5737.23 | bwd_inner: 5709.80 | bwd_allreduce: 27.38 | step: 19.34 14%|█▍ | 5777/41250 [13:57:48<85:23:24, 8.67s/it] {'loss': 0.0394, 'grad_norm': 0.4655708372592926, 'learning_rate': 3.8743281371678085e-05, 'epoch': 1.4} 14%|█▍ | 5777/41250 [13:57:48<85:23:24, 8.67s/it][2025-04-25 21:55:32,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 21:55:32,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.85 | bwd_microstep: 5736.03 | bwd_inner_microstep: 5645.96 | bwd_allreduce_microstep: 90.02 | step_microstep: 18.93 [2025-04-25 21:55:32,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.85 | bwd: 5736.05 | bwd_inner: 5645.96 | bwd_allreduce: 90.04 | step: 18.93 14%|█▍ | 5778/41250 [13:57:57<85:21:06, 8.66s/it] {'loss': 0.2015, 'grad_norm': 2.456530809402466, 'learning_rate': 3.874273344476341e-05, 'epoch': 1.4} 14%|█▍ | 5778/41250 [13:57:57<85:21:06, 8.66s/it][2025-04-25 21:55:40,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 21:55:40,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.63 | bwd_microstep: 5677.37 | bwd_inner_microstep: 5651.32 | bwd_allreduce_microstep: 26.01 | step_microstep: 18.72 [2025-04-25 21:55:40,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.63 | bwd: 5677.38 | bwd_inner: 5651.32 | bwd_allreduce: 26.02 | step: 18.73 14%|█▍ | 
5779/41250 [13:58:06<85:10:56, 8.65s/it] {'loss': 0.1342, 'grad_norm': 1.7136582136154175, 'learning_rate': 3.874218540230347e-05, 'epoch': 1.4} 14%|█▍ | 5779/41250 [13:58:06<85:10:56, 8.65s/it][2025-04-25 21:55:49,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 21:55:49,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.49 | bwd_microstep: 5740.87 | bwd_inner_microstep: 5644.43 | bwd_allreduce_microstep: 96.40 | step_microstep: 18.96 [2025-04-25 21:55:49,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.49 | bwd: 5740.89 | bwd_inner: 5644.43 | bwd_allreduce: 96.42 | step: 18.97 14%|█▍ | 5780/41250 [13:58:14<85:12:20, 8.65s/it] {'loss': 0.2544, 'grad_norm': 2.474437952041626, 'learning_rate': 3.8741637244301646e-05, 'epoch': 1.4} 14%|█▍ | 5780/41250 [13:58:14<85:12:20, 8.65s/it][2025-04-25 21:55:58,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 21:55:58,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.50 | bwd_microstep: 5740.86 | bwd_inner_microstep: 5700.82 | bwd_allreduce_microstep: 39.99 | step_microstep: 19.00 [2025-04-25 21:55:58,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.50 | bwd: 5740.87 | bwd_inner: 5700.82 | bwd_allreduce: 40.01 | step: 19.00 14%|█▍ | 5781/41250 [13:58:23<85:17:19, 8.66s/it] {'loss': 0.0387, 'grad_norm': 0.7303967475891113, 'learning_rate': 3.8741088970761314e-05, 'epoch': 1.4} 14%|█▍ | 5781/41250 [13:58:23<85:17:19, 8.66s/it][2025-04-25 21:56:06,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 21:56:06,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.14 | bwd_microstep: 5738.99 | bwd_inner_microstep: 5689.55 | 
bwd_allreduce_microstep: 49.39 | step_microstep: 18.83 [2025-04-25 21:56:06,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.14 | bwd: 5739.00 | bwd_inner: 5689.55 | bwd_allreduce: 49.41 | step: 18.83 14%|█▍ | 5782/41250 [13:58:32<85:19:21, 8.66s/it] {'loss': 0.0477, 'grad_norm': 0.9105989933013916, 'learning_rate': 3.874054058168585e-05, 'epoch': 1.4} 14%|█▍ | 5782/41250 [13:58:32<85:19:21, 8.66s/it][2025-04-25 21:56:15,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 1.07 [2025-04-25 21:56:15,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.00 | bwd_microstep: 5699.39 | bwd_inner_microstep: 5686.80 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.58 [2025-04-25 21:56:15,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.00 | bwd: 5699.40 | bwd_inner: 5686.80 | bwd_allreduce: 12.56 | step: 18.58 14%|█▍ | 5783/41250 [13:58:40<85:13:41, 8.65s/it] {'loss': 0.0724, 'grad_norm': 0.7996006011962891, 'learning_rate': 3.873999207707865e-05, 'epoch': 1.4} 14%|█▍ | 5783/41250 [13:58:40<85:13:41, 8.65s/it][2025-04-25 21:56:24,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 21:56:24,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.27 | bwd_microstep: 5715.12 | bwd_inner_microstep: 5702.53 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.48 [2025-04-25 21:56:24,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.27 | bwd: 5715.13 | bwd_inner: 5702.53 | bwd_allreduce: 12.56 | step: 18.49 14%|█▍ | 5784/41250 [13:58:49<85:14:09, 8.65s/it] {'loss': 0.2017, 'grad_norm': 3.4448628425598145, 'learning_rate': 3.873944345694308e-05, 'epoch': 1.4} 14%|█▍ | 5784/41250 [13:58:49<85:14:09, 8.65s/it][2025-04-25 21:56:32,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 21:56:32,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.20 | bwd_microstep: 5712.04 | bwd_inner_microstep: 5699.07 | bwd_allreduce_microstep: 12.93 | step_microstep: 18.60 [2025-04-25 21:56:32,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.20 | bwd: 5712.05 | bwd_inner: 5699.07 | bwd_allreduce: 12.94 | step: 18.60 14%|█▍ | 5785/41250 [13:58:58<85:12:10, 8.65s/it] {'loss': 0.1224, 'grad_norm': 2.808861255645752, 'learning_rate': 3.8738894721282524e-05, 'epoch': 1.4} 14%|█▍ | 5785/41250 [13:58:58<85:12:10, 8.65s/it][2025-04-25 21:56:41,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:56:41,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.10 | bwd_microstep: 5771.09 | bwd_inner_microstep: 5631.34 | bwd_allreduce_microstep: 139.70 | step_microstep: 18.62 [2025-04-25 21:56:41,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.10 | bwd: 5771.11 | bwd_inner: 5631.34 | bwd_allreduce: 139.72 | step: 18.62 14%|█▍ | 5786/41250 [13:59:06<85:17:21, 8.66s/it] {'loss': 0.208, 'grad_norm': 2.737283945083618, 'learning_rate': 3.873834587010037e-05, 'epoch': 1.4} 14%|█▍ | 5786/41250 [13:59:06<85:17:21, 8.66s/it][2025-04-25 21:56:50,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-25 21:56:50,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.74 | bwd_microstep: 5756.00 | bwd_inner_microstep: 5658.86 | bwd_allreduce_microstep: 97.09 | step_microstep: 19.13 [2025-04-25 21:56:50,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.74 | bwd: 5756.02 | bwd_inner: 5658.86 | bwd_allreduce: 97.11 | step: 19.13 14%|█▍ | 5787/41250 [13:59:15<85:18:26, 8.66s/it] 
{'loss': 0.0705, 'grad_norm': 1.140403151512146, 'learning_rate': 3.87377969034e-05, 'epoch': 1.4} 14%|█▍ | 5787/41250 [13:59:15<85:18:26, 8.66s/it][2025-04-25 21:56:58,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:56:58,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.02 | bwd_microstep: 5716.72 | bwd_inner_microstep: 5704.04 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.78 [2025-04-25 21:56:58,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.02 | bwd: 5716.73 | bwd_inner: 5704.04 | bwd_allreduce: 12.65 | step: 18.78 14%|█▍ | 5788/41250 [13:59:24<85:16:40, 8.66s/it] {'loss': 0.1487, 'grad_norm': 1.8643600940704346, 'learning_rate': 3.873724782118479e-05, 'epoch': 1.4} 14%|█▍ | 5788/41250 [13:59:24<85:16:40, 8.66s/it][2025-04-25 21:57:07,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.05 | optimizer_step: 1.09 [2025-04-25 21:57:07,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.74 | bwd_microstep: 5730.90 | bwd_inner_microstep: 5683.37 | bwd_allreduce_microstep: 47.48 | step_microstep: 19.02 [2025-04-25 21:57:07,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.74 | bwd: 5730.92 | bwd_inner: 5683.37 | bwd_allreduce: 47.50 | step: 19.02 14%|█▍ | 5789/41250 [13:59:32<85:15:50, 8.66s/it] {'loss': 0.1807, 'grad_norm': 2.8208658695220947, 'learning_rate': 3.873669862345814e-05, 'epoch': 1.4} 14%|█▍ | 5789/41250 [13:59:32<85:15:50, 8.66s/it][2025-04-25 21:57:15,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:57:15,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.23 | bwd_microstep: 5682.11 | bwd_inner_microstep: 5651.02 | bwd_allreduce_microstep: 31.04 | step_microstep: 
19.15 [2025-04-25 21:57:15,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.23 | bwd: 5682.12 | bwd_inner: 5651.02 | bwd_allreduce: 31.05 | step: 19.15 14%|█▍ | 5790/41250 [13:59:41<85:04:34, 8.64s/it] {'loss': 0.1729, 'grad_norm': 5.7255988121032715, 'learning_rate': 3.873614931022343e-05, 'epoch': 1.4} 14%|█▍ | 5790/41250 [13:59:41<85:04:34, 8.64s/it][2025-04-25 21:57:24,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 21:57:24,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.83 | bwd_microstep: 5756.62 | bwd_inner_microstep: 5655.24 | bwd_allreduce_microstep: 101.33 | step_microstep: 18.57 [2025-04-25 21:57:24,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.83 | bwd: 5756.64 | bwd_inner: 5655.24 | bwd_allreduce: 101.35 | step: 18.57 14%|█▍ | 5791/41250 [13:59:49<85:11:53, 8.65s/it] {'loss': 0.2557, 'grad_norm': 1.2293674945831299, 'learning_rate': 3.873559988148404e-05, 'epoch': 1.4} 14%|█▍ | 5791/41250 [13:59:49<85:11:53, 8.65s/it][2025-04-25 21:57:33,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.08 | optimizer_step: 1.04 [2025-04-25 21:57:33,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.71 | bwd_microstep: 5759.88 | bwd_inner_microstep: 5658.70 | bwd_allreduce_microstep: 101.13 | step_microstep: 19.85 [2025-04-25 21:57:33,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.71 | bwd: 5759.90 | bwd_inner: 5658.70 | bwd_allreduce: 101.15 | step: 19.85 14%|█▍ | 5792/41250 [13:59:58<85:16:26, 8.66s/it] {'loss': 0.1377, 'grad_norm': 2.3128035068511963, 'learning_rate': 3.873505033724336e-05, 'epoch': 1.4} 14%|█▍ | 5792/41250 [13:59:58<85:16:26, 8.66s/it][2025-04-25 21:57:41,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.11 | 
optimizer_step: 1.05 [2025-04-25 21:57:41,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.23 | bwd_microstep: 5701.61 | bwd_inner_microstep: 5687.93 | bwd_allreduce_microstep: 13.62 | step_microstep: 19.43 [2025-04-25 21:57:41,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.23 | bwd: 5701.63 | bwd_inner: 5687.93 | bwd_allreduce: 13.65 | step: 19.43 14%|█▍ | 5793/41250 [14:00:07<85:13:28, 8.65s/it] {'loss': 0.0897, 'grad_norm': 4.790627479553223, 'learning_rate': 3.8734500677504785e-05, 'epoch': 1.4} 14%|█▍ | 5793/41250 [14:00:07<85:13:28, 8.65s/it][2025-04-25 21:57:50,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.23 | optimizer_step: 0.90 [2025-04-25 21:57:50,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.85 | bwd_microstep: 5670.67 | bwd_inner_microstep: 5657.99 | bwd_allreduce_microstep: 12.63 | step_microstep: 19.25 [2025-04-25 21:57:50,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.85 | bwd: 5670.68 | bwd_inner: 5657.99 | bwd_allreduce: 12.65 | step: 19.25 14%|█▍ | 5794/41250 [14:00:15<85:05:37, 8.64s/it] {'loss': 0.1319, 'grad_norm': 2.966675043106079, 'learning_rate': 3.8733950902271695e-05, 'epoch': 1.4} 14%|█▍ | 5794/41250 [14:00:15<85:05:37, 8.64s/it][2025-04-25 21:57:59,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.25 | optimizer_step: 0.90 [2025-04-25 21:57:59,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.04 | bwd_microstep: 5708.27 | bwd_inner_microstep: 5648.59 | bwd_allreduce_microstep: 59.62 | step_microstep: 19.62 [2025-04-25 21:57:59,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.04 | bwd: 5708.29 | bwd_inner: 5648.59 | bwd_allreduce: 59.65 | step: 19.62 14%|█▍ | 5795/41250 [14:00:24<85:01:15, 8.63s/it] {'loss': 0.263, 'grad_norm': 3.1326234340667725, 
'learning_rate': 3.8733401011547485e-05, 'epoch': 1.4} 14%|█▍ | 5795/41250 [14:00:24<85:01:15, 8.63s/it][2025-04-25 21:58:07,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.22 | optimizer_step: 1.11 [2025-04-25 21:58:07,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.60 | bwd_microstep: 5754.47 | bwd_inner_microstep: 5701.20 | bwd_allreduce_microstep: 53.19 | step_microstep: 20.95 [2025-04-25 21:58:07,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.60 | bwd: 5754.50 | bwd_inner: 5701.20 | bwd_allreduce: 53.23 | step: 20.94 14%|█▍ | 5796/41250 [14:00:33<85:11:56, 8.65s/it] {'loss': 0.0841, 'grad_norm': 2.0961413383483887, 'learning_rate': 3.8732851005335544e-05, 'epoch': 1.41} 14%|█▍ | 5796/41250 [14:00:33<85:11:56, 8.65s/it][2025-04-25 21:58:16,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 21:58:16,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.62 | bwd_microstep: 5718.91 | bwd_inner_microstep: 5706.17 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.92 [2025-04-25 21:58:16,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.62 | bwd: 5718.92 | bwd_inner: 5706.17 | bwd_allreduce: 12.71 | step: 18.93 14%|█▍ | 5797/41250 [14:00:41<85:13:45, 8.65s/it] {'loss': 0.1503, 'grad_norm': 2.870396375656128, 'learning_rate': 3.873230088363926e-05, 'epoch': 1.41} 14%|█▍ | 5797/41250 [14:00:41<85:13:45, 8.65s/it][2025-04-25 21:58:25,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 21:58:25,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.59 | bwd_microstep: 5692.96 | bwd_inner_microstep: 5661.31 | bwd_allreduce_microstep: 31.60 | step_microstep: 18.79 [2025-04-25 21:58:25,156] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.59 | bwd: 5692.98 | bwd_inner: 5661.31 | bwd_allreduce: 31.62 | step: 18.79 14%|█▍ | 5798/41250 [14:00:50<85:04:43, 8.64s/it] {'loss': 0.0365, 'grad_norm': 0.35562247037887573, 'learning_rate': 3.873175064646202e-05, 'epoch': 1.41} 14%|█▍ | 5798/41250 [14:00:50<85:04:43, 8.64s/it][2025-04-25 21:58:33,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-25 21:58:33,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.61 | bwd_microstep: 5778.16 | bwd_inner_microstep: 5665.95 | bwd_allreduce_microstep: 112.17 | step_microstep: 19.00 [2025-04-25 21:58:33,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.61 | bwd: 5778.18 | bwd_inner: 5665.95 | bwd_allreduce: 112.19 | step: 19.00 14%|█▍ | 5799/41250 [14:00:59<85:15:51, 8.66s/it] {'loss': 0.0656, 'grad_norm': 1.1915571689605713, 'learning_rate': 3.873120029380723e-05, 'epoch': 1.41} 14%|█▍ | 5799/41250 [14:00:59<85:15:51, 8.66s/it][2025-04-25 21:58:42,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 21:58:42,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.96 | bwd_microstep: 5749.83 | bwd_inner_microstep: 5707.25 | bwd_allreduce_microstep: 42.52 | step_microstep: 19.02 [2025-04-25 21:58:42,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.96 | bwd: 5749.84 | bwd_inner: 5707.25 | bwd_allreduce: 42.54 | step: 19.02 14%|█▍ | 5800/41250 [14:01:07<85:21:23, 8.67s/it] {'loss': 0.2048, 'grad_norm': 2.5068509578704834, 'learning_rate': 3.873064982567827e-05, 'epoch': 1.41} 14%|█▍ | 5800/41250 [14:01:07<85:21:23, 8.67s/it][2025-04-25 21:58:51,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.05 | optimizer_step: 1.03 [2025-04-25 
21:58:51,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.19 | bwd_microstep: 5700.13 | bwd_inner_microstep: 5672.43 | bwd_allreduce_microstep: 27.66 | step_microstep: 19.53 [2025-04-25 21:58:51,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.19 | bwd: 5700.15 | bwd_inner: 5672.43 | bwd_allreduce: 27.68 | step: 19.53 14%|█▍ | 5801/41250 [14:01:16<85:13:46, 8.66s/it] {'loss': 0.1041, 'grad_norm': 1.9750382900238037, 'learning_rate': 3.8730099242078534e-05, 'epoch': 1.41} 14%|█▍ | 5801/41250 [14:01:16<85:13:46, 8.66s/it][2025-04-25 21:58:59,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.04 | optimizer_step: 0.96 [2025-04-25 21:58:59,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.44 | bwd_microstep: 5773.08 | bwd_inner_microstep: 5697.28 | bwd_allreduce_microstep: 75.75 | step_microstep: 19.33 [2025-04-25 21:58:59,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.44 | bwd: 5773.09 | bwd_inner: 5697.28 | bwd_allreduce: 75.77 | step: 19.34 14%|█▍ | 5802/41250 [14:01:25<85:23:35, 8.67s/it] {'loss': 0.381, 'grad_norm': 2.4362690448760986, 'learning_rate': 3.8729548543011423e-05, 'epoch': 1.41} 14%|█▍ | 5802/41250 [14:01:25<85:23:35, 8.67s/it][2025-04-25 21:59:08,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.05 | optimizer_step: 1.01 [2025-04-25 21:59:08,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.65 | bwd_microstep: 5809.42 | bwd_inner_microstep: 5796.44 | bwd_allreduce_microstep: 12.94 | step_microstep: 19.41 [2025-04-25 21:59:08,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.65 | bwd: 5809.44 | bwd_inner: 5796.44 | bwd_allreduce: 12.96 | step: 19.41 14%|█▍ | 5803/41250 [14:01:33<85:44:15, 8.71s/it] {'loss': 0.1431, 'grad_norm': 2.0251810550689697, 'learning_rate': 3.872899772848033e-05, 
'epoch': 1.41} 14%|█▍ | 5803/41250 [14:01:34<85:44:15, 8.71s/it][2025-04-25 21:59:17,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.06 | optimizer_step: 0.91 [2025-04-25 21:59:17,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.07 | bwd_microstep: 5717.71 | bwd_inner_microstep: 5660.49 | bwd_allreduce_microstep: 57.16 | step_microstep: 19.42 [2025-04-25 21:59:17,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.07 | bwd: 5717.73 | bwd_inner: 5660.49 | bwd_allreduce: 57.19 | step: 19.44 14%|█▍ | 5804/41250 [14:01:42<85:30:36, 8.68s/it] {'loss': 0.1567, 'grad_norm': 1.5209641456604004, 'learning_rate': 3.8728446798488655e-05, 'epoch': 1.41} 14%|█▍ | 5804/41250 [14:01:42<85:30:36, 8.68s/it][2025-04-25 21:59:25,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 21:59:25,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.02 | bwd_microstep: 5729.68 | bwd_inner_microstep: 5716.77 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.71 [2025-04-25 21:59:25,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.02 | bwd: 5729.69 | bwd_inner: 5716.77 | bwd_allreduce: 12.88 | step: 18.72 14%|█▍ | 5805/41250 [14:01:51<85:27:38, 8.68s/it] {'loss': 0.4979, 'grad_norm': 3.0175983905792236, 'learning_rate': 3.872789575303978e-05, 'epoch': 1.41} 14%|█▍ | 5805/41250 [14:01:51<85:27:38, 8.68s/it][2025-04-25 21:59:34,638] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 1.08 [2025-04-25 21:59:34,639] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.73 | bwd_microstep: 5719.52 | bwd_inner_microstep: 5694.17 | bwd_allreduce_microstep: 25.30 | step_microstep: 19.49 [2025-04-25 21:59:34,639] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2859.73 | bwd: 5719.54 | bwd_inner: 5694.16 | bwd_allreduce: 25.33 | step: 19.49 14%|█▍ | 5806/41250 [14:01:59<85:24:37, 8.68s/it] {'loss': 0.0555, 'grad_norm': 0.8129034638404846, 'learning_rate': 3.8727344592137116e-05, 'epoch': 1.41} 14%|█▍ | 5806/41250 [14:01:59<85:24:37, 8.68s/it][2025-04-25 21:59:43,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-25 21:59:43,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.21 | bwd_microstep: 6009.68 | bwd_inner_microstep: 5724.81 | bwd_allreduce_microstep: 284.82 | step_microstep: 18.85 [2025-04-25 21:59:43,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.21 | bwd: 6009.69 | bwd_inner: 5724.81 | bwd_allreduce: 284.84 | step: 18.85 14%|█▍ | 5807/41250 [14:02:08<86:13:45, 8.76s/it] {'loss': 0.4203, 'grad_norm': 3.721673011779785, 'learning_rate': 3.872679331578406e-05, 'epoch': 1.41} 14%|█▍ | 5807/41250 [14:02:08<86:13:45, 8.76s/it][2025-04-25 21:59:52,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 21:59:52,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.70 | bwd_microstep: 5796.91 | bwd_inner_microstep: 5648.59 | bwd_allreduce_microstep: 148.27 | step_microstep: 18.88 [2025-04-25 21:59:52,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.70 | bwd: 5796.93 | bwd_inner: 5648.59 | bwd_allreduce: 148.29 | step: 18.88 14%|█▍ | 5808/41250 [14:02:17<86:04:13, 8.74s/it] {'loss': 0.0282, 'grad_norm': 0.31457871198654175, 'learning_rate': 3.8726241923984e-05, 'epoch': 1.41} 14%|█▍ | 5808/41250 [14:02:17<86:04:13, 8.74s/it][2025-04-25 22:00:00,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:00:00,916] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.84 | bwd_microstep: 5708.92 | bwd_inner_microstep: 5658.77 | bwd_allreduce_microstep: 50.10 | step_microstep: 18.72 [2025-04-25 22:00:00,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.84 | bwd: 5708.93 | bwd_inner: 5658.77 | bwd_allreduce: 50.12 | step: 18.72 14%|█▍ | 5809/41250 [14:02:26<85:41:59, 8.71s/it] {'loss': 0.2274, 'grad_norm': 2.065526008605957, 'learning_rate': 3.8725690416740346e-05, 'epoch': 1.41} 14%|█▍ | 5809/41250 [14:02:26<85:41:59, 8.71s/it][2025-04-25 22:00:09,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:00:09,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.12 | bwd_microstep: 5714.13 | bwd_inner_microstep: 5677.88 | bwd_allreduce_microstep: 36.21 | step_microstep: 18.84 [2025-04-25 22:00:09,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.12 | bwd: 5714.15 | bwd_inner: 5677.88 | bwd_allreduce: 36.22 | step: 18.84 14%|█▍ | 5810/41250 [14:02:34<85:28:26, 8.68s/it] {'loss': 0.1031, 'grad_norm': 1.5965031385421753, 'learning_rate': 3.872513879405649e-05, 'epoch': 1.41} 14%|█▍ | 5810/41250 [14:02:34<85:28:26, 8.68s/it][2025-04-25 22:00:18,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.32 | optimizer_step: 1.05 [2025-04-25 22:00:18,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.15 | bwd_microstep: 5772.41 | bwd_inner_microstep: 5667.78 | bwd_allreduce_microstep: 104.57 | step_microstep: 19.98 [2025-04-25 22:00:18,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.15 | bwd: 5772.43 | bwd_inner: 5667.78 | bwd_allreduce: 104.60 | step: 19.98 14%|█▍ | 5811/41250 [14:02:43<85:30:50, 8.69s/it] {'loss': 0.1219, 'grad_norm': 3.568040609359741, 'learning_rate': 3.8724587055935836e-05, 'epoch': 1.41} 14%|█▍ 
| 5811/41250 [14:02:43<85:30:50, 8.69s/it][2025-04-25 22:00:27,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-25 22:00:27,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.92 | bwd_microstep: 5859.38 | bwd_inner_microstep: 5713.43 | bwd_allreduce_microstep: 145.89 | step_microstep: 19.11 [2025-04-25 22:00:27,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.92 | bwd: 5859.39 | bwd_inner: 5713.43 | bwd_allreduce: 145.91 | step: 19.11 14%|█▍ | 5812/41250 [14:02:52<85:51:18, 8.72s/it] {'loss': 0.089, 'grad_norm': 1.3775241374969482, 'learning_rate': 3.872403520238179e-05, 'epoch': 1.41} 14%|█▍ | 5812/41250 [14:02:52<85:51:18, 8.72s/it][2025-04-25 22:00:35,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-25 22:00:35,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.32 | bwd_microstep: 5787.89 | bwd_inner_microstep: 5704.02 | bwd_allreduce_microstep: 83.82 | step_microstep: 19.25 [2025-04-25 22:00:35,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.32 | bwd: 5787.90 | bwd_inner: 5704.02 | bwd_allreduce: 83.84 | step: 19.25 14%|█▍ | 5813/41250 [14:03:01<85:51:31, 8.72s/it] {'loss': 0.1359, 'grad_norm': 1.4103896617889404, 'learning_rate': 3.872348323339774e-05, 'epoch': 1.41} 14%|█▍ | 5813/41250 [14:03:01<85:51:31, 8.72s/it][2025-04-25 22:00:44,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 22:00:44,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.71 | bwd_microstep: 5714.80 | bwd_inner_microstep: 5702.11 | bwd_allreduce_microstep: 12.64 | step_microstep: 19.43 [2025-04-25 22:00:44,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.71 | 
bwd: 5714.81 | bwd_inner: 5702.11 | bwd_allreduce: 12.66 | step: 19.44 14%|█▍ | 5814/41250 [14:03:09<85:38:27, 8.70s/it] {'loss': 0.023, 'grad_norm': 0.3041434586048126, 'learning_rate': 3.872293114898711e-05, 'epoch': 1.41} 14%|█▍ | 5814/41250 [14:03:09<85:38:27, 8.70s/it][2025-04-25 22:00:53,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 1.10 [2025-04-25 22:00:53,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.58 | bwd_microstep: 5731.51 | bwd_inner_microstep: 5718.16 | bwd_allreduce_microstep: 13.30 | step_microstep: 18.91 [2025-04-25 22:00:53,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.58 | bwd: 5731.52 | bwd_inner: 5718.16 | bwd_allreduce: 13.32 | step: 18.91 14%|█▍ | 5815/41250 [14:03:18<85:34:33, 8.69s/it] {'loss': 0.1467, 'grad_norm': 2.096825361251831, 'learning_rate': 3.872237894915329e-05, 'epoch': 1.41} 14%|█▍ | 5815/41250 [14:03:18<85:34:33, 8.69s/it][2025-04-25 22:01:01,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:01:01,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.20 | bwd_microstep: 5713.76 | bwd_inner_microstep: 5700.98 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.89 [2025-04-25 22:01:01,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.20 | bwd: 5713.77 | bwd_inner: 5700.98 | bwd_allreduce: 12.75 | step: 18.89 14%|█▍ | 5816/41250 [14:03:27<85:25:56, 8.68s/it] {'loss': 0.0616, 'grad_norm': 0.8939443230628967, 'learning_rate': 3.872182663389969e-05, 'epoch': 1.41} 14%|█▍ | 5816/41250 [14:03:27<85:25:56, 8.68s/it][2025-04-25 22:01:10,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:01:10,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2857.13 | bwd_microstep: 5737.53 | bwd_inner_microstep: 5709.79 | bwd_allreduce_microstep: 27.69 | step_microstep: 18.53 [2025-04-25 22:01:10,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.13 | bwd: 5737.54 | bwd_inner: 5709.79 | bwd_allreduce: 27.71 | step: 18.53 14%|█▍ | 5817/41250 [14:03:35<85:27:50, 8.68s/it] {'loss': 0.1168, 'grad_norm': 2.2168638706207275, 'learning_rate': 3.87212742032297e-05, 'epoch': 1.41} 14%|█▍ | 5817/41250 [14:03:35<85:27:50, 8.68s/it][2025-04-25 22:01:19,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 1.17 [2025-04-25 22:01:19,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.45 | bwd_microstep: 5696.07 | bwd_inner_microstep: 5663.11 | bwd_allreduce_microstep: 32.91 | step_microstep: 19.19 [2025-04-25 22:01:19,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.45 | bwd: 5696.08 | bwd_inner: 5663.11 | bwd_allreduce: 32.93 | step: 19.19 14%|█▍ | 5818/41250 [14:03:44<85:16:56, 8.66s/it] {'loss': 0.2891, 'grad_norm': 2.666503667831421, 'learning_rate': 3.872072165714674e-05, 'epoch': 1.41} 14%|█▍ | 5818/41250 [14:03:44<85:16:56, 8.66s/it][2025-04-25 22:01:27,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 22:01:27,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.40 | bwd_microstep: 5726.64 | bwd_inner_microstep: 5713.90 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.70 [2025-04-25 22:01:27,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.40 | bwd: 5726.65 | bwd_inner: 5713.90 | bwd_allreduce: 12.71 | step: 18.71 14%|█▍ | 5819/41250 [14:03:53<85:18:29, 8.67s/it] {'loss': 0.143, 'grad_norm': 2.4554970264434814, 'learning_rate': 3.872016899565422e-05, 'epoch': 1.41} 14%|█▍ | 5819/41250 [14:03:53<85:18:29, 
8.67s/it][2025-04-25 22:01:36,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:01:36,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.13 | bwd_microstep: 5737.80 | bwd_inner_microstep: 5706.16 | bwd_allreduce_microstep: 31.59 | step_microstep: 18.69 [2025-04-25 22:01:36,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.13 | bwd: 5737.81 | bwd_inner: 5706.16 | bwd_allreduce: 31.61 | step: 18.69 14%|█▍ | 5820/41250 [14:04:01<85:18:10, 8.67s/it] {'loss': 0.6133, 'grad_norm': 2.0198183059692383, 'learning_rate': 3.8719616218755533e-05, 'epoch': 1.41} 14%|█▍ | 5820/41250 [14:04:01<85:18:10, 8.67s/it][2025-04-25 22:01:45,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:01:45,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.73 | bwd_microstep: 5726.77 | bwd_inner_microstep: 5697.33 | bwd_allreduce_microstep: 29.40 | step_microstep: 18.40 [2025-04-25 22:01:45,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.73 | bwd: 5726.78 | bwd_inner: 5697.33 | bwd_allreduce: 29.42 | step: 18.40 14%|█▍ | 5821/41250 [14:04:10<85:16:57, 8.67s/it] {'loss': 0.105, 'grad_norm': 1.589539647102356, 'learning_rate': 3.87190633264541e-05, 'epoch': 1.41} 14%|█▍ | 5821/41250 [14:04:10<85:16:57, 8.67s/it][2025-04-25 22:01:53,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:01:53,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.18 | bwd_microstep: 5852.85 | bwd_inner_microstep: 5678.73 | bwd_allreduce_microstep: 174.06 | step_microstep: 18.47 [2025-04-25 22:01:53,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.18 | bwd: 5852.87 | bwd_inner: 5678.73 | 
bwd_allreduce: 174.08 | step: 18.47 14%|█▍ | 5822/41250 [14:04:19<85:38:20, 8.70s/it] {'loss': 0.2914, 'grad_norm': 3.5780904293060303, 'learning_rate': 3.871851031875332e-05, 'epoch': 1.41} 14%|█▍ | 5822/41250 [14:04:19<85:38:20, 8.70s/it][2025-04-25 22:02:02,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.05 | optimizer_step: 1.04 [2025-04-25 22:02:02,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.26 | bwd_microstep: 5703.45 | bwd_inner_microstep: 5650.35 | bwd_allreduce_microstep: 53.05 | step_microstep: 19.21 [2025-04-25 22:02:02,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.26 | bwd: 5703.46 | bwd_inner: 5650.35 | bwd_allreduce: 53.07 | step: 19.21 14%|█▍ | 5823/41250 [14:04:27<85:22:39, 8.68s/it] {'loss': 0.1485, 'grad_norm': 1.3786659240722656, 'learning_rate': 3.87179571956566e-05, 'epoch': 1.41} 14%|█▍ | 5823/41250 [14:04:27<85:22:39, 8.68s/it][2025-04-25 22:02:11,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.05 | optimizer_step: 0.93 [2025-04-25 22:02:11,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.23 | bwd_microstep: 5724.39 | bwd_inner_microstep: 5711.48 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.95 [2025-04-25 22:02:11,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.23 | bwd: 5724.40 | bwd_inner: 5711.48 | bwd_allreduce: 12.88 | step: 18.95 14%|█▍ | 5824/41250 [14:04:36<85:20:08, 8.67s/it] {'loss': 0.1346, 'grad_norm': 1.1440273523330688, 'learning_rate': 3.871740395716737e-05, 'epoch': 1.41} 14%|█▍ | 5824/41250 [14:04:36<85:20:08, 8.67s/it][2025-04-25 22:02:19,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 1.02 [2025-04-25 22:02:19,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.59 | 
bwd_microstep: 5748.54 | bwd_inner_microstep: 5696.84 | bwd_allreduce_microstep: 51.65 | step_microstep: 18.93 [2025-04-25 22:02:19,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.59 | bwd: 5748.56 | bwd_inner: 5696.84 | bwd_allreduce: 51.67 | step: 18.93 14%|█▍ | 5825/41250 [14:04:45<85:22:19, 8.68s/it] {'loss': 0.1584, 'grad_norm': 2.0580742359161377, 'learning_rate': 3.871685060328901e-05, 'epoch': 1.41} 14%|█▍ | 5825/41250 [14:04:45<85:22:19, 8.68s/it][2025-04-25 22:02:28,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:02:28,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.40 | bwd_microstep: 5694.05 | bwd_inner_microstep: 5648.77 | bwd_allreduce_microstep: 45.23 | step_microstep: 18.77 [2025-04-25 22:02:28,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.40 | bwd: 5694.06 | bwd_inner: 5648.76 | bwd_allreduce: 45.25 | step: 18.77 14%|█▍ | 5826/41250 [14:04:53<85:09:22, 8.65s/it] {'loss': 0.0996, 'grad_norm': 1.2529923915863037, 'learning_rate': 3.871629713402496e-05, 'epoch': 1.41} 14%|█▍ | 5826/41250 [14:04:53<85:09:22, 8.65s/it][2025-04-25 22:02:37,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 22:02:37,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.05 | bwd_microstep: 5736.94 | bwd_inner_microstep: 5709.37 | bwd_allreduce_microstep: 27.51 | step_microstep: 18.92 [2025-04-25 22:02:37,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.05 | bwd: 5736.96 | bwd_inner: 5709.37 | bwd_allreduce: 27.54 | step: 18.92 14%|█▍ | 5827/41250 [14:05:02<85:13:06, 8.66s/it] {'loss': 0.0453, 'grad_norm': 0.562140166759491, 'learning_rate': 3.871574354937861e-05, 'epoch': 1.41} 14%|█▍ | 5827/41250 [14:05:02<85:13:06, 8.66s/it][2025-04-25 22:02:45,684] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-25 22:02:45,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.12 | bwd_microstep: 5685.21 | bwd_inner_microstep: 5655.63 | bwd_allreduce_microstep: 29.53 | step_microstep: 19.16 [2025-04-25 22:02:45,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.12 | bwd: 5685.22 | bwd_inner: 5655.63 | bwd_allreduce: 29.55 | step: 19.16 14%|█▍ | 5828/41250 [14:05:11<85:01:27, 8.64s/it] {'loss': 0.0434, 'grad_norm': 0.6718298196792603, 'learning_rate': 3.871518984935339e-05, 'epoch': 1.41} 14%|█▍ | 5828/41250 [14:05:11<85:01:27, 8.64s/it][2025-04-25 22:02:54,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:02:54,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.23 | bwd_microstep: 5725.70 | bwd_inner_microstep: 5680.36 | bwd_allreduce_microstep: 45.29 | step_microstep: 18.38 [2025-04-25 22:02:54,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.23 | bwd: 5725.71 | bwd_inner: 5680.36 | bwd_allreduce: 45.31 | step: 18.39 14%|█▍ | 5829/41250 [14:05:19<85:03:03, 8.64s/it] {'loss': 0.1285, 'grad_norm': 0.8264704942703247, 'learning_rate': 3.87146360339527e-05, 'epoch': 1.41} 14%|█▍ | 5829/41250 [14:05:19<85:03:03, 8.64s/it][2025-04-25 22:03:02,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:03:02,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.19 | bwd_microstep: 5738.35 | bwd_inner_microstep: 5658.85 | bwd_allreduce_microstep: 79.45 | step_microstep: 18.34 [2025-04-25 22:03:02,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.19 | bwd: 5738.36 | bwd_inner: 5658.85 | bwd_allreduce: 79.47 | step: 18.34 14%|█▍ | 
5830/41250 [14:05:28<85:03:36, 8.65s/it] {'loss': 0.016, 'grad_norm': 0.2515619397163391, 'learning_rate': 3.8714082103179956e-05, 'epoch': 1.41} 14%|█▍ | 5830/41250 [14:05:28<85:03:36, 8.65s/it][2025-04-25 22:03:11,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:03:11,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.53 | bwd_microstep: 5693.55 | bwd_inner_microstep: 5644.68 | bwd_allreduce_microstep: 48.83 | step_microstep: 18.80 [2025-04-25 22:03:11,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.53 | bwd: 5693.57 | bwd_inner: 5644.68 | bwd_allreduce: 48.85 | step: 18.80 14%|█▍ | 5831/41250 [14:05:36<84:55:17, 8.63s/it] {'loss': 0.0951, 'grad_norm': 1.327409267425537, 'learning_rate': 3.871352805703859e-05, 'epoch': 1.41} 14%|█▍ | 5831/41250 [14:05:36<84:55:17, 8.63s/it][2025-04-25 22:03:20,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 22:03:20,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.60 | bwd_microstep: 5728.83 | bwd_inner_microstep: 5686.99 | bwd_allreduce_microstep: 41.80 | step_microstep: 18.78 [2025-04-25 22:03:20,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.60 | bwd: 5728.85 | bwd_inner: 5686.99 | bwd_allreduce: 41.82 | step: 18.78 14%|█▍ | 5832/41250 [14:05:45<84:59:52, 8.64s/it] {'loss': 0.064, 'grad_norm': 1.007927417755127, 'learning_rate': 3.8712973895531993e-05, 'epoch': 1.41} 14%|█▍ | 5832/41250 [14:05:45<84:59:52, 8.64s/it][2025-04-25 22:03:28,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:03:28,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.41 | bwd_microstep: 5704.13 | bwd_inner_microstep: 5654.61 | 
bwd_allreduce_microstep: 49.46 | step_microstep: 18.60 [2025-04-25 22:03:28,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.41 | bwd: 5704.14 | bwd_inner: 5654.61 | bwd_allreduce: 49.49 | step: 18.60 14%|█▍ | 5833/41250 [14:05:54<84:55:05, 8.63s/it] {'loss': 0.2102, 'grad_norm': 3.49479079246521, 'learning_rate': 3.87124196186636e-05, 'epoch': 1.41} 14%|█▍ | 5833/41250 [14:05:54<84:55:05, 8.63s/it][2025-04-25 22:03:37,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:03:37,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.18 | bwd_microstep: 5706.92 | bwd_inner_microstep: 5694.16 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.81 [2025-04-25 22:03:37,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.18 | bwd: 5706.93 | bwd_inner: 5694.16 | bwd_allreduce: 12.73 | step: 18.81 14%|█▍ | 5834/41250 [14:06:02<84:55:57, 8.63s/it] {'loss': 0.1024, 'grad_norm': 1.0134410858154297, 'learning_rate': 3.871186522643681e-05, 'epoch': 1.41} 14%|█▍ | 5834/41250 [14:06:02<84:55:57, 8.63s/it][2025-04-25 22:03:46,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:03:46,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.36 | bwd_microstep: 5751.65 | bwd_inner_microstep: 5697.20 | bwd_allreduce_microstep: 54.40 | step_microstep: 18.57 [2025-04-25 22:03:46,186] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.36 | bwd: 5751.66 | bwd_inner: 5697.20 | bwd_allreduce: 54.42 | step: 18.58 14%|█▍ | 5835/41250 [14:06:11<85:06:27, 8.65s/it] {'loss': 0.0532, 'grad_norm': 1.0244821310043335, 'learning_rate': 3.871131071885507e-05, 'epoch': 1.41} 14%|█▍ | 5835/41250 [14:06:11<85:06:27, 8.65s/it][2025-04-25 22:03:54,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.22 | optimizer_gradients: 1.06 | optimizer_step: 1.20 [2025-04-25 22:03:54,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.91 | bwd_microstep: 5712.16 | bwd_inner_microstep: 5698.86 | bwd_allreduce_microstep: 13.25 | step_microstep: 19.96 [2025-04-25 22:03:54,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.91 | bwd: 5712.18 | bwd_inner: 5698.86 | bwd_allreduce: 13.27 | step: 19.97 14%|█▍ | 5836/41250 [14:06:20<85:05:01, 8.65s/it] {'loss': 0.019, 'grad_norm': 0.19857332110404968, 'learning_rate': 3.871075609592177e-05, 'epoch': 1.41} 14%|█▍ | 5836/41250 [14:06:20<85:05:01, 8.65s/it][2025-04-25 22:04:03,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:04:03,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.72 | bwd_microstep: 5707.17 | bwd_inner_microstep: 5694.62 | bwd_allreduce_microstep: 12.51 | step_microstep: 18.57 [2025-04-25 22:04:03,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.72 | bwd: 5707.18 | bwd_inner: 5694.62 | bwd_allreduce: 12.53 | step: 18.57 14%|█▍ | 5837/41250 [14:06:28<85:03:07, 8.65s/it] {'loss': 0.1136, 'grad_norm': 1.5644125938415527, 'learning_rate': 3.8710201357640326e-05, 'epoch': 1.42} 14%|█▍ | 5837/41250 [14:06:28<85:03:07, 8.65s/it][2025-04-25 22:04:12,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:04:12,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.61 | bwd_microstep: 5759.62 | bwd_inner_microstep: 5693.56 | bwd_allreduce_microstep: 66.02 | step_microstep: 18.89 [2025-04-25 22:04:12,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.61 | bwd: 5759.64 | bwd_inner: 5693.56 | bwd_allreduce: 66.04 | step: 18.89 14%|█▍ | 5838/41250 [14:06:37<85:09:48, 8.66s/it] 
{'loss': 0.0678, 'grad_norm': 1.9971327781677246, 'learning_rate': 3.870964650401418e-05, 'epoch': 1.42} 14%|█▍ | 5838/41250 [14:06:37<85:09:48, 8.66s/it][2025-04-25 22:04:21,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:04:21,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2935.94 | bwd_microstep: 5882.63 | bwd_inner_microstep: 5869.82 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.70 [2025-04-25 22:04:21,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2935.94 | bwd: 5882.64 | bwd_inner: 5869.82 | bwd_allreduce: 12.78 | step: 18.71 14%|█▍ | 5839/41250 [14:06:46<85:52:47, 8.73s/it] {'loss': 0.1048, 'grad_norm': 2.275702953338623, 'learning_rate': 3.870909153504675e-05, 'epoch': 1.42} 14%|█▍ | 5839/41250 [14:06:46<85:52:47, 8.73s/it][2025-04-25 22:04:29,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.10 | optimizer_step: 0.95 [2025-04-25 22:04:29,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.33 | bwd_microstep: 5730.38 | bwd_inner_microstep: 5690.89 | bwd_allreduce_microstep: 39.44 | step_microstep: 18.91 [2025-04-25 22:04:29,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.33 | bwd: 5730.39 | bwd_inner: 5690.89 | bwd_allreduce: 39.46 | step: 18.91 14%|█▍ | 5840/41250 [14:06:55<85:41:23, 8.71s/it] {'loss': 0.1769, 'grad_norm': 2.6905784606933594, 'learning_rate': 3.870853645074144e-05, 'epoch': 1.42} 14%|█▍ | 5840/41250 [14:06:55<85:41:23, 8.71s/it][2025-04-25 22:04:38,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:04:38,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.10 | bwd_microstep: 5693.78 | bwd_inner_microstep: 5660.49 | bwd_allreduce_microstep: 33.25 | 
step_microstep: 18.70 [2025-04-25 22:04:38,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.10 | bwd: 5693.80 | bwd_inner: 5660.49 | bwd_allreduce: 33.26 | step: 18.71 14%|█▍ | 5841/41250 [14:07:03<85:22:12, 8.68s/it] {'loss': 0.0705, 'grad_norm': 1.7422692775726318, 'learning_rate': 3.8707981251101695e-05, 'epoch': 1.42} 14%|█▍ | 5841/41250 [14:07:03<85:22:12, 8.68s/it][2025-04-25 22:04:46,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.08 | optimizer_step: 1.22 [2025-04-25 22:04:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.29 | bwd_microstep: 5697.22 | bwd_inner_microstep: 5658.31 | bwd_allreduce_microstep: 38.86 | step_microstep: 19.59 [2025-04-25 22:04:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.29 | bwd: 5697.23 | bwd_inner: 5658.31 | bwd_allreduce: 38.89 | step: 19.59 14%|█▍ | 5842/41250 [14:07:12<85:10:10, 8.66s/it] {'loss': 0.3238, 'grad_norm': 3.001801013946533, 'learning_rate': 3.8707425936130916e-05, 'epoch': 1.42} 14%|█▍ | 5842/41250 [14:07:12<85:10:10, 8.66s/it][2025-04-25 22:04:55,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-25 22:04:55,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.32 | bwd_microstep: 5754.35 | bwd_inner_microstep: 5695.87 | bwd_allreduce_microstep: 58.43 | step_microstep: 18.93 [2025-04-25 22:04:55,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.32 | bwd: 5754.36 | bwd_inner: 5695.87 | bwd_allreduce: 58.45 | step: 18.93 14%|█▍ | 5843/41250 [14:07:20<85:15:54, 8.67s/it] {'loss': 0.1998, 'grad_norm': 2.065520763397217, 'learning_rate': 3.8706870505832536e-05, 'epoch': 1.42} 14%|█▍ | 5843/41250 [14:07:20<85:15:54, 8.67s/it][2025-04-25 22:05:04,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | 
optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 22:05:04,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.67 | bwd_microstep: 5756.01 | bwd_inner_microstep: 5700.97 | bwd_allreduce_microstep: 54.99 | step_microstep: 19.15 [2025-04-25 22:05:04,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.67 | bwd: 5756.03 | bwd_inner: 5700.97 | bwd_allreduce: 55.01 | step: 19.15 14%|█▍ | 5844/41250 [14:07:29<85:19:44, 8.68s/it] {'loss': 0.1756, 'grad_norm': 1.3237673044204712, 'learning_rate': 3.8706314960209984e-05, 'epoch': 1.42} 14%|█▍ | 5844/41250 [14:07:29<85:19:44, 8.68s/it][2025-04-25 22:05:13,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:05:13,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.93 | bwd_microstep: 5773.90 | bwd_inner_microstep: 5657.10 | bwd_allreduce_microstep: 116.76 | step_microstep: 18.62 [2025-04-25 22:05:13,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.93 | bwd: 5773.91 | bwd_inner: 5657.10 | bwd_allreduce: 116.78 | step: 18.63 14%|█▍ | 5845/41250 [14:07:38<85:21:00, 8.68s/it] {'loss': 0.131, 'grad_norm': 3.102499008178711, 'learning_rate': 3.8705759299266673e-05, 'epoch': 1.42} 14%|█▍ | 5845/41250 [14:07:38<85:21:00, 8.68s/it][2025-04-25 22:05:21,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:05:21,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.57 | bwd_microstep: 5756.75 | bwd_inner_microstep: 5663.91 | bwd_allreduce_microstep: 92.80 | step_microstep: 18.62 [2025-04-25 22:05:21,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.57 | bwd: 5756.77 | bwd_inner: 5663.91 | bwd_allreduce: 92.82 | step: 18.62 14%|█▍ | 5846/41250 [14:07:47<85:20:31, 8.68s/it] {'loss': 0.1817, 'grad_norm': 
1.7901272773742676, 'learning_rate': 3.870520352300604e-05, 'epoch': 1.42} 14%|█▍ | 5846/41250 [14:07:47<85:20:31, 8.68s/it][2025-04-25 22:05:30,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-25 22:05:30,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.22 | bwd_microstep: 5783.67 | bwd_inner_microstep: 5668.51 | bwd_allreduce_microstep: 115.11 | step_microstep: 18.90 [2025-04-25 22:05:30,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.22 | bwd: 5783.68 | bwd_inner: 5668.51 | bwd_allreduce: 115.13 | step: 18.91 14%|█▍ | 5847/41250 [14:07:55<85:24:49, 8.69s/it] {'loss': 0.0375, 'grad_norm': 0.5672743320465088, 'learning_rate': 3.870464763143151e-05, 'epoch': 1.42} 14%|█▍ | 5847/41250 [14:07:55<85:24:49, 8.69s/it][2025-04-25 22:05:39,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.62 | optimizer_step: 1.01 [2025-04-25 22:05:39,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.44 | bwd_microstep: 5893.41 | bwd_inner_microstep: 5664.21 | bwd_allreduce_microstep: 229.14 | step_microstep: 20.84 [2025-04-25 22:05:39,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.44 | bwd: 5893.43 | bwd_inner: 5664.21 | bwd_allreduce: 229.17 | step: 20.84 14%|█▍ | 5848/41250 [14:08:04<85:48:20, 8.73s/it] {'loss': 0.1334, 'grad_norm': 1.2383677959442139, 'learning_rate': 3.87040916245465e-05, 'epoch': 1.42} 14%|█▍ | 5848/41250 [14:08:04<85:48:20, 8.73s/it][2025-04-25 22:05:48,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 22:05:48,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.59 | bwd_microstep: 6069.24 | bwd_inner_microstep: 5655.86 | bwd_allreduce_microstep: 413.33 | step_microstep: 19.09 [2025-04-25 
22:05:48,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.59 | bwd: 6069.25 | bwd_inner: 5655.86 | bwd_allreduce: 413.35 | step: 19.09 14%|█▍ | 5849/41250 [14:08:13<86:33:37, 8.80s/it] {'loss': 0.109, 'grad_norm': 0.9964556694030762, 'learning_rate': 3.870353550235445e-05, 'epoch': 1.42} 14%|█▍ | 5849/41250 [14:08:13<86:33:37, 8.80s/it][2025-04-25 22:05:56,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:05:56,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.02 | bwd_microstep: 5785.62 | bwd_inner_microstep: 5664.72 | bwd_allreduce_microstep: 120.86 | step_microstep: 19.25 [2025-04-25 22:05:56,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.02 | bwd: 5785.64 | bwd_inner: 5664.72 | bwd_allreduce: 120.88 | step: 19.25 14%|█▍ | 5850/41250 [14:08:22<86:15:19, 8.77s/it] {'loss': 0.3356, 'grad_norm': 2.9847264289855957, 'learning_rate': 3.8702979264858774e-05, 'epoch': 1.42} 14%|█▍ | 5850/41250 [14:08:22<86:15:19, 8.77s/it][2025-04-25 22:06:05,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.30 | optimizer_step: 1.06 [2025-04-25 22:06:05,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.08 | bwd_microstep: 5767.91 | bwd_inner_microstep: 5716.84 | bwd_allreduce_microstep: 51.01 | step_microstep: 19.88 [2025-04-25 22:06:05,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.08 | bwd: 5767.92 | bwd_inner: 5716.84 | bwd_allreduce: 51.03 | step: 19.89 14%|█▍ | 5851/41250 [14:08:30<86:04:11, 8.75s/it] {'loss': 0.0658, 'grad_norm': 1.6175215244293213, 'learning_rate': 3.870242291206291e-05, 'epoch': 1.42} 14%|█▍ | 5851/41250 [14:08:30<86:04:11, 8.75s/it][2025-04-25 22:06:14,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 
0.89 [2025-04-25 22:06:14,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.26 | bwd_microstep: 5776.29 | bwd_inner_microstep: 5667.79 | bwd_allreduce_microstep: 108.46 | step_microstep: 18.63 [2025-04-25 22:06:14,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.26 | bwd: 5776.31 | bwd_inner: 5667.79 | bwd_allreduce: 108.47 | step: 18.64 14%|█▍ | 5852/41250 [14:08:39<85:53:17, 8.73s/it] {'loss': 0.083, 'grad_norm': 1.007958173751831, 'learning_rate': 3.870186644397029e-05, 'epoch': 1.42} 14%|█▍ | 5852/41250 [14:08:39<85:53:17, 8.73s/it][2025-04-25 22:06:22,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:06:22,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.02 | bwd_microstep: 5702.54 | bwd_inner_microstep: 5669.45 | bwd_allreduce_microstep: 33.04 | step_microstep: 18.88 [2025-04-25 22:06:22,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.02 | bwd: 5702.55 | bwd_inner: 5669.45 | bwd_allreduce: 33.06 | step: 18.89 14%|█▍ | 5853/41250 [14:08:48<85:33:39, 8.70s/it] {'loss': 0.0571, 'grad_norm': 1.5164377689361572, 'learning_rate': 3.870130986058434e-05, 'epoch': 1.42} 14%|█▍ | 5853/41250 [14:08:48<85:33:39, 8.70s/it][2025-04-25 22:06:31,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:06:31,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.04 | bwd_microstep: 5706.90 | bwd_inner_microstep: 5673.94 | bwd_allreduce_microstep: 32.92 | step_microstep: 18.48 [2025-04-25 22:06:31,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.04 | bwd: 5706.92 | bwd_inner: 5673.94 | bwd_allreduce: 32.94 | step: 18.49 14%|█▍ | 5854/41250 [14:08:56<85:19:22, 8.68s/it] {'loss': 0.095, 'grad_norm': 2.077878475189209, 'learning_rate': 
3.8700753161908497e-05, 'epoch': 1.42} 14%|█▍ | 5854/41250 [14:08:56<85:19:22, 8.68s/it][2025-04-25 22:06:40,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:06:40,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.51 | bwd_microstep: 5771.99 | bwd_inner_microstep: 5702.80 | bwd_allreduce_microstep: 69.13 | step_microstep: 18.99 [2025-04-25 22:06:40,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.51 | bwd: 5772.00 | bwd_inner: 5702.80 | bwd_allreduce: 69.15 | step: 18.99 14%|█▍ | 5855/41250 [14:09:05<85:26:33, 8.69s/it] {'loss': 0.0831, 'grad_norm': 1.235107660293579, 'learning_rate': 3.870019634794619e-05, 'epoch': 1.42} 14%|█▍ | 5855/41250 [14:09:05<85:26:33, 8.69s/it][2025-04-25 22:06:48,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 1.05 [2025-04-25 22:06:48,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.17 | bwd_microstep: 5731.37 | bwd_inner_microstep: 5718.61 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.17 [2025-04-25 22:06:48,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.17 | bwd: 5731.38 | bwd_inner: 5718.61 | bwd_allreduce: 12.73 | step: 19.17 14%|█▍ | 5856/41250 [14:09:14<85:22:31, 8.68s/it] {'loss': 0.3052, 'grad_norm': 2.3226277828216553, 'learning_rate': 3.8699639418700845e-05, 'epoch': 1.42} 14%|█▍ | 5856/41250 [14:09:14<85:22:31, 8.68s/it][2025-04-25 22:06:57,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:06:57,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.09 | bwd_microstep: 5732.11 | bwd_inner_microstep: 5719.45 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.84 [2025-04-25 22:06:57,604] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.09 | bwd: 5732.12 | bwd_inner: 5719.45 | bwd_allreduce: 12.62 | step: 18.84 14%|█▍ | 5857/41250 [14:09:22<85:21:33, 8.68s/it] {'loss': 0.2001, 'grad_norm': 2.172253370285034, 'learning_rate': 3.8699082374175904e-05, 'epoch': 1.42} 14%|█▍ | 5857/41250 [14:09:22<85:21:33, 8.68s/it][2025-04-25 22:07:06,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:07:06,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.76 | bwd_microstep: 5704.58 | bwd_inner_microstep: 5660.84 | bwd_allreduce_microstep: 43.69 | step_microstep: 18.49 [2025-04-25 22:07:06,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.76 | bwd: 5704.59 | bwd_inner: 5660.84 | bwd_allreduce: 43.71 | step: 18.49 14%|█▍ | 5858/41250 [14:09:31<85:11:10, 8.66s/it] {'loss': 0.1731, 'grad_norm': 1.618879795074463, 'learning_rate': 3.86985252143748e-05, 'epoch': 1.42} 14%|█▍ | 5858/41250 [14:09:31<85:11:10, 8.66s/it][2025-04-25 22:07:14,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 22:07:14,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.25 | bwd_microstep: 5772.64 | bwd_inner_microstep: 5675.61 | bwd_allreduce_microstep: 96.99 | step_microstep: 18.98 [2025-04-25 22:07:14,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.25 | bwd: 5772.66 | bwd_inner: 5675.61 | bwd_allreduce: 97.00 | step: 18.98 14%|█▍ | 5859/41250 [14:09:40<85:15:15, 8.67s/it] {'loss': 0.0044, 'grad_norm': 0.053636614233255386, 'learning_rate': 3.869796793930096e-05, 'epoch': 1.42} 14%|█▍ | 5859/41250 [14:09:40<85:15:15, 8.67s/it][2025-04-25 22:07:23,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 
22:07:23,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.68 | bwd_microstep: 5716.45 | bwd_inner_microstep: 5665.21 | bwd_allreduce_microstep: 51.19 | step_microstep: 18.91 [2025-04-25 22:07:23,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.68 | bwd: 5716.47 | bwd_inner: 5665.21 | bwd_allreduce: 51.21 | step: 18.92 14%|█▍ | 5860/41250 [14:09:48<85:08:14, 8.66s/it] {'loss': 0.0565, 'grad_norm': 1.1256051063537598, 'learning_rate': 3.869741054895784e-05, 'epoch': 1.42} 14%|█▍ | 5860/41250 [14:09:48<85:08:14, 8.66s/it][2025-04-25 22:07:32,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.66 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:07:32,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2868.33 | bwd_microstep: 5729.90 | bwd_inner_microstep: 5716.97 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.05 [2025-04-25 22:07:32,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2868.33 | bwd: 5729.91 | bwd_inner: 5716.97 | bwd_allreduce: 12.90 | step: 19.06 14%|█▍ | 5861/41250 [14:09:57<85:13:07, 8.67s/it] {'loss': 0.1415, 'grad_norm': 3.0619754791259766, 'learning_rate': 3.8696853043348846e-05, 'epoch': 1.42} 14%|█▍ | 5861/41250 [14:09:57<85:13:07, 8.67s/it][2025-04-25 22:07:40,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-25 22:07:40,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.83 | bwd_microstep: 5710.62 | bwd_inner_microstep: 5697.69 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.39 [2025-04-25 22:07:40,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.83 | bwd: 5710.64 | bwd_inner: 5697.69 | bwd_allreduce: 12.90 | step: 19.40 14%|█▍ | 5862/41250 [14:10:06<85:08:47, 8.66s/it] {'loss': 0.1131, 'grad_norm': 2.320702075958252, 'learning_rate': 3.869629542247744e-05, 
'epoch': 1.42} 14%|█▍ | 5862/41250 [14:10:06<85:08:47, 8.66s/it][2025-04-25 22:07:49,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 22:07:49,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.04 | bwd_microstep: 5715.86 | bwd_inner_microstep: 5703.27 | bwd_allreduce_microstep: 12.54 | step_microstep: 19.00 [2025-04-25 22:07:49,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.04 | bwd: 5715.87 | bwd_inner: 5703.27 | bwd_allreduce: 12.56 | step: 19.00 14%|█▍ | 5863/41250 [14:10:14<85:05:59, 8.66s/it] {'loss': 0.276, 'grad_norm': 2.7636325359344482, 'learning_rate': 3.8695737686347034e-05, 'epoch': 1.42} 14%|█▍ | 5863/41250 [14:10:14<85:05:59, 8.66s/it][2025-04-25 22:07:58,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:07:58,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.51 | bwd_microstep: 5821.24 | bwd_inner_microstep: 5649.68 | bwd_allreduce_microstep: 171.52 | step_microstep: 18.68 [2025-04-25 22:07:58,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.51 | bwd: 5821.26 | bwd_inner: 5649.68 | bwd_allreduce: 171.54 | step: 18.69 14%|█▍ | 5864/41250 [14:10:23<85:19:17, 8.68s/it] {'loss': 0.0798, 'grad_norm': 1.2966474294662476, 'learning_rate': 3.86951798349611e-05, 'epoch': 1.42} 14%|█▍ | 5864/41250 [14:10:23<85:19:17, 8.68s/it][2025-04-25 22:08:06,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.15 | optimizer_step: 0.90 [2025-04-25 22:08:06,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.22 | bwd_microstep: 5711.39 | bwd_inner_microstep: 5663.51 | bwd_allreduce_microstep: 47.83 | step_microstep: 18.98 [2025-04-25 22:08:06,888] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2829.22 | bwd: 5711.40 | bwd_inner: 5663.51 | bwd_allreduce: 47.85 | step: 18.98 14%|█▍ | 5865/41250 [14:10:32<85:09:01, 8.66s/it] {'loss': 0.0398, 'grad_norm': 0.5650381445884705, 'learning_rate': 3.869462186832305e-05, 'epoch': 1.42} 14%|█▍ | 5865/41250 [14:10:32<85:09:01, 8.66s/it][2025-04-25 22:08:15,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:08:15,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.31 | bwd_microstep: 5724.33 | bwd_inner_microstep: 5711.54 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.83 [2025-04-25 22:08:15,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.31 | bwd: 5724.35 | bwd_inner: 5711.54 | bwd_allreduce: 12.77 | step: 18.83 14%|█▍ | 5866/41250 [14:10:40<85:08:19, 8.66s/it] {'loss': 0.2435, 'grad_norm': 1.778333067893982, 'learning_rate': 3.869406378643634e-05, 'epoch': 1.42} 14%|█▍ | 5866/41250 [14:10:40<85:08:19, 8.66s/it][2025-04-25 22:08:24,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:08:24,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.77 | bwd_microstep: 5779.89 | bwd_inner_microstep: 5650.69 | bwd_allreduce_microstep: 129.16 | step_microstep: 18.76 [2025-04-25 22:08:24,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.77 | bwd: 5779.91 | bwd_inner: 5650.69 | bwd_allreduce: 129.18 | step: 18.76 14%|█▍ | 5867/41250 [14:10:49<85:13:08, 8.67s/it] {'loss': 0.0373, 'grad_norm': 0.8243726491928101, 'learning_rate': 3.86935055893044e-05, 'epoch': 1.42} 14%|█▍ | 5867/41250 [14:10:49<85:13:08, 8.67s/it][2025-04-25 22:08:33,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-25 22:08:33,193] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2846.40 | bwd_microstep: 6025.79 | bwd_inner_microstep: 5698.72 | bwd_allreduce_microstep: 327.02 | step_microstep: 19.11 [2025-04-25 22:08:33,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.40 | bwd: 6025.80 | bwd_inner: 5698.72 | bwd_allreduce: 327.04 | step: 19.11 14%|█▍ | 5868/41250 [14:10:58<86:03:30, 8.76s/it] {'loss': 0.0899, 'grad_norm': 1.4782840013504028, 'learning_rate': 3.869294727693067e-05, 'epoch': 1.42} 14%|█▍ | 5868/41250 [14:10:58<86:03:30, 8.76s/it][2025-04-25 22:08:41,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 22:08:41,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.04 | bwd_microstep: 5692.31 | bwd_inner_microstep: 5653.29 | bwd_allreduce_microstep: 38.97 | step_microstep: 18.83 [2025-04-25 22:08:41,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.04 | bwd: 5692.33 | bwd_inner: 5653.29 | bwd_allreduce: 38.99 | step: 18.83 14%|█▍ | 5869/41250 [14:11:07<85:37:00, 8.71s/it] {'loss': 0.0486, 'grad_norm': 2.0385007858276367, 'learning_rate': 3.8692388849318604e-05, 'epoch': 1.42} 14%|█▍ | 5869/41250 [14:11:07<85:37:00, 8.71s/it][2025-04-25 22:08:50,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 22:08:50,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.80 | bwd_microstep: 5783.85 | bwd_inner_microstep: 5652.90 | bwd_allreduce_microstep: 130.90 | step_microstep: 19.10 [2025-04-25 22:08:50,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.80 | bwd: 5783.87 | bwd_inner: 5652.90 | bwd_allreduce: 130.92 | step: 19.11 14%|█▍ | 5870/41250 [14:11:15<85:33:46, 8.71s/it] {'loss': 0.0883, 'grad_norm': 3.974820613861084, 'learning_rate': 3.869183030647164e-05, 'epoch': 1.42} 14%|█▍ | 5870/41250 
[14:11:15<85:33:46, 8.71s/it][2025-04-25 22:08:59,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-25 22:08:59,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.66 | bwd_microstep: 5688.14 | bwd_inner_microstep: 5650.88 | bwd_allreduce_microstep: 37.21 | step_microstep: 18.86 [2025-04-25 22:08:59,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.66 | bwd: 5688.15 | bwd_inner: 5650.88 | bwd_allreduce: 37.23 | step: 18.87 14%|█▍ | 5871/41250 [14:11:24<85:16:07, 8.68s/it] {'loss': 0.3802, 'grad_norm': 4.343644142150879, 'learning_rate': 3.869127164839321e-05, 'epoch': 1.42} 14%|█▍ | 5871/41250 [14:11:24<85:16:07, 8.68s/it][2025-04-25 22:09:07,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 22:09:07,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.76 | bwd_microstep: 5899.31 | bwd_inner_microstep: 5646.80 | bwd_allreduce_microstep: 252.46 | step_microstep: 19.00 [2025-04-25 22:09:07,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.76 | bwd: 5899.33 | bwd_inner: 5646.80 | bwd_allreduce: 252.48 | step: 19.00 14%|█▍ | 5872/41250 [14:11:33<85:38:18, 8.71s/it] {'loss': 0.1942, 'grad_norm': 2.052727222442627, 'learning_rate': 3.869071287508678e-05, 'epoch': 1.42} 14%|█▍ | 5872/41250 [14:11:33<85:38:18, 8.71s/it][2025-04-25 22:09:16,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 22:09:16,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.16 | bwd_microstep: 5704.12 | bwd_inner_microstep: 5690.97 | bwd_allreduce_microstep: 13.10 | step_microstep: 19.45 [2025-04-25 22:09:16,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.16 | bwd: 5704.14 | 
bwd_inner: 5690.97 | bwd_allreduce: 13.12 | step: 19.46 14%|█▍ | 5873/41250 [14:11:41<85:25:15, 8.69s/it] {'loss': 0.1636, 'grad_norm': 1.5939736366271973, 'learning_rate': 3.869015398655576e-05, 'epoch': 1.42} 14%|█▍ | 5873/41250 [14:11:41<85:25:15, 8.69s/it][2025-04-25 22:09:25,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 22:09:25,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.96 | bwd_microstep: 5744.09 | bwd_inner_microstep: 5647.75 | bwd_allreduce_microstep: 96.28 | step_microstep: 19.12 [2025-04-25 22:09:25,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.96 | bwd: 5744.10 | bwd_inner: 5647.75 | bwd_allreduce: 96.30 | step: 19.13 14%|█▍ | 5874/41250 [14:11:50<85:18:18, 8.68s/it] {'loss': 0.2495, 'grad_norm': 2.8331356048583984, 'learning_rate': 3.8689594982803635e-05, 'epoch': 1.42} 14%|█▍ | 5874/41250 [14:11:50<85:18:18, 8.68s/it][2025-04-25 22:09:33,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 22:09:33,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.09 | bwd_microstep: 5748.82 | bwd_inner_microstep: 5651.86 | bwd_allreduce_microstep: 96.91 | step_microstep: 18.71 [2025-04-25 22:09:33,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.09 | bwd: 5748.83 | bwd_inner: 5651.86 | bwd_allreduce: 96.93 | step: 18.72 14%|█▍ | 5875/41250 [14:11:59<85:14:23, 8.67s/it] {'loss': 0.1116, 'grad_norm': 5.107389450073242, 'learning_rate': 3.868903586383382e-05, 'epoch': 1.42} 14%|█▍ | 5875/41250 [14:11:59<85:14:23, 8.67s/it][2025-04-25 22:09:42,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:09:42,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2852.69 | bwd_microstep: 5721.96 | bwd_inner_microstep: 5709.21 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.85 [2025-04-25 22:09:42,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.69 | bwd: 5721.98 | bwd_inner: 5709.21 | bwd_allreduce: 12.72 | step: 18.85 14%|█▍ | 5876/41250 [14:12:07<85:11:37, 8.67s/it] {'loss': 0.0433, 'grad_norm': 0.6804834604263306, 'learning_rate': 3.868847662964978e-05, 'epoch': 1.42} 14%|█▍ | 5876/41250 [14:12:07<85:11:37, 8.67s/it][2025-04-25 22:09:51,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.04 | optimizer_step: 1.07 [2025-04-25 22:09:51,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.72 | bwd_microstep: 5736.57 | bwd_inner_microstep: 5675.31 | bwd_allreduce_microstep: 61.21 | step_microstep: 19.44 [2025-04-25 22:09:51,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.72 | bwd: 5736.58 | bwd_inner: 5675.31 | bwd_allreduce: 61.23 | step: 19.44 14%|█▍ | 5877/41250 [14:12:16<85:10:55, 8.67s/it] {'loss': 0.1588, 'grad_norm': 2.6494460105895996, 'learning_rate': 3.8687917280254965e-05, 'epoch': 1.42} 14%|█▍ | 5877/41250 [14:12:16<85:10:55, 8.67s/it][2025-04-25 22:09:59,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.96 | optimizer_step: 0.99 [2025-04-25 22:09:59,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.79 | bwd_microstep: 5764.22 | bwd_inner_microstep: 5751.65 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.66 [2025-04-25 22:09:59,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.79 | bwd: 5764.24 | bwd_inner: 5751.65 | bwd_allreduce: 12.54 | step: 18.66 14%|█▍ | 5878/41250 [14:12:25<85:22:47, 8.69s/it] {'loss': 0.171, 'grad_norm': 2.324094533920288, 'learning_rate': 3.868735781565281e-05, 'epoch': 1.42} 14%|█▍ | 5878/41250 [14:12:25<85:22:47, 8.69s/it][2025-04-25 
22:10:08,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:10:08,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.98 | bwd_microstep: 5714.29 | bwd_inner_microstep: 5692.34 | bwd_allreduce_microstep: 21.90 | step_microstep: 19.04 [2025-04-25 22:10:08,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.98 | bwd: 5714.30 | bwd_inner: 5692.34 | bwd_allreduce: 21.91 | step: 19.04 14%|█▍ | 5879/41250 [14:12:33<85:14:54, 8.68s/it] {'loss': 0.1494, 'grad_norm': 1.583552598953247, 'learning_rate': 3.868679823584677e-05, 'epoch': 1.43} 14%|█▍ | 5879/41250 [14:12:33<85:14:54, 8.68s/it][2025-04-25 22:10:17,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-25 22:10:17,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.94 | bwd_microstep: 5705.69 | bwd_inner_microstep: 5693.23 | bwd_allreduce_microstep: 12.42 | step_microstep: 18.34 [2025-04-25 22:10:17,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.94 | bwd: 5705.71 | bwd_inner: 5693.23 | bwd_allreduce: 12.44 | step: 18.34 14%|█▍ | 5880/41250 [14:12:42<85:08:01, 8.67s/it] {'loss': 0.0275, 'grad_norm': 0.6912130117416382, 'learning_rate': 3.868623854084029e-05, 'epoch': 1.43} 14%|█▍ | 5880/41250 [14:12:42<85:08:01, 8.67s/it][2025-04-25 22:10:25,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 22:10:25,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.11 | bwd_microstep: 5766.05 | bwd_inner_microstep: 5696.47 | bwd_allreduce_microstep: 69.53 | step_microstep: 18.69 [2025-04-25 22:10:25,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.11 | bwd: 5766.07 | bwd_inner: 5696.47 | bwd_allreduce: 69.55 | 
step: 18.69 14%|█▍ | 5881/41250 [14:12:51<85:13:04, 8.67s/it] {'loss': 0.0141, 'grad_norm': 0.25424790382385254, 'learning_rate': 3.868567873063683e-05, 'epoch': 1.43} 14%|█▍ | 5881/41250 [14:12:51<85:13:04, 8.67s/it][2025-04-25 22:10:34,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:10:34,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.92 | bwd_microstep: 5714.25 | bwd_inner_microstep: 5701.36 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.10 [2025-04-25 22:10:34,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.92 | bwd: 5714.26 | bwd_inner: 5701.36 | bwd_allreduce: 12.86 | step: 19.10 14%|█▍ | 5882/41250 [14:12:59<85:08:17, 8.67s/it] {'loss': 0.1724, 'grad_norm': 2.563837766647339, 'learning_rate': 3.8685118805239836e-05, 'epoch': 1.43} 14%|█▍ | 5882/41250 [14:12:59<85:08:17, 8.67s/it][2025-04-25 22:10:43,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.05 | optimizer_step: 1.05 [2025-04-25 22:10:43,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.58 | bwd_microstep: 5694.00 | bwd_inner_microstep: 5651.81 | bwd_allreduce_microstep: 42.13 | step_microstep: 19.68 [2025-04-25 22:10:43,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.58 | bwd: 5694.01 | bwd_inner: 5651.81 | bwd_allreduce: 42.16 | step: 19.68 14%|█▍ | 5883/41250 [14:13:08<84:57:14, 8.65s/it] {'loss': 0.0701, 'grad_norm': 1.7607955932617188, 'learning_rate': 3.8684558764652756e-05, 'epoch': 1.43} 14%|█▍ | 5883/41250 [14:13:08<84:57:14, 8.65s/it][2025-04-25 22:10:51,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.08 | optimizer_step: 1.02 [2025-04-25 22:10:51,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.04 | bwd_microstep: 5788.62 | 
bwd_inner_microstep: 5640.07 | bwd_allreduce_microstep: 148.50 | step_microstep: 19.31 [2025-04-25 22:10:51,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.04 | bwd: 5788.64 | bwd_inner: 5640.07 | bwd_allreduce: 148.52 | step: 19.31 14%|█▍ | 5884/41250 [14:13:17<85:05:16, 8.66s/it] {'loss': 0.2652, 'grad_norm': 2.559706211090088, 'learning_rate': 3.868399860887905e-05, 'epoch': 1.43} 14%|█▍ | 5884/41250 [14:13:17<85:05:16, 8.66s/it][2025-04-25 22:11:00,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-25 22:11:00,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.50 | bwd_microstep: 5697.35 | bwd_inner_microstep: 5654.34 | bwd_allreduce_microstep: 42.95 | step_microstep: 19.07 [2025-04-25 22:11:00,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.50 | bwd: 5697.37 | bwd_inner: 5654.34 | bwd_allreduce: 42.98 | step: 19.06 14%|█▍ | 5885/41250 [14:13:25<84:55:14, 8.64s/it] {'loss': 0.249, 'grad_norm': 2.973750591278076, 'learning_rate': 3.868343833792216e-05, 'epoch': 1.43} 14%|█▍ | 5885/41250 [14:13:25<84:55:14, 8.64s/it][2025-04-25 22:11:09,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:11:09,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.57 | bwd_microstep: 5694.18 | bwd_inner_microstep: 5651.81 | bwd_allreduce_microstep: 42.31 | step_microstep: 18.85 [2025-04-25 22:11:09,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.57 | bwd: 5694.19 | bwd_inner: 5651.81 | bwd_allreduce: 42.33 | step: 18.85 14%|█▍ | 5886/41250 [14:13:34<84:47:38, 8.63s/it] {'loss': 0.1041, 'grad_norm': 1.9636603593826294, 'learning_rate': 3.868287795178556e-05, 'epoch': 1.43} 14%|█▍ | 5886/41250 [14:13:34<84:47:38, 8.63s/it][2025-04-25 22:11:17,690] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 22:11:17,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.28 | bwd_microstep: 5702.17 | bwd_inner_microstep: 5689.16 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.68 [2025-04-25 22:11:17,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.28 | bwd: 5702.18 | bwd_inner: 5689.16 | bwd_allreduce: 12.98 | step: 18.69 14%|█▍ | 5887/41250 [14:13:43<84:47:53, 8.63s/it] {'loss': 0.0369, 'grad_norm': 0.760072648525238, 'learning_rate': 3.868231745047268e-05, 'epoch': 1.43} 14%|█▍ | 5887/41250 [14:13:43<84:47:53, 8.63s/it][2025-04-25 22:11:26,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-25 22:11:26,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.79 | bwd_microstep: 5786.29 | bwd_inner_microstep: 5773.32 | bwd_allreduce_microstep: 12.93 | step_microstep: 19.23 [2025-04-25 22:11:26,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.79 | bwd: 5786.31 | bwd_inner: 5773.32 | bwd_allreduce: 12.95 | step: 19.23 14%|█▍ | 5888/41250 [14:13:51<85:09:08, 8.67s/it] {'loss': 0.2406, 'grad_norm': 1.9756300449371338, 'learning_rate': 3.8681756833987e-05, 'epoch': 1.43} 14%|█▍ | 5888/41250 [14:13:51<85:09:08, 8.67s/it][2025-04-25 22:11:35,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.18 | optimizer_step: 1.03 [2025-04-25 22:11:35,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.01 | bwd_microstep: 5739.83 | bwd_inner_microstep: 5684.14 | bwd_allreduce_microstep: 55.64 | step_microstep: 19.61 [2025-04-25 22:11:35,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.01 | bwd: 5739.84 | bwd_inner: 5684.14 | bwd_allreduce: 55.66 | step: 19.61 14%|█▍ | 5889/41250 
[14:14:00<85:09:30, 8.67s/it] {'loss': 0.1033, 'grad_norm': 1.810861587524414, 'learning_rate': 3.868119610233195e-05, 'epoch': 1.43} 14%|█▍ | 5889/41250 [14:14:00<85:09:30, 8.67s/it][2025-04-25 22:11:43,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-25 22:11:43,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.40 | bwd_microstep: 5737.11 | bwd_inner_microstep: 5690.19 | bwd_allreduce_microstep: 46.87 | step_microstep: 18.68 [2025-04-25 22:11:43,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.40 | bwd: 5737.12 | bwd_inner: 5690.19 | bwd_allreduce: 46.89 | step: 18.68 14%|█▍ | 5890/41250 [14:14:09<85:08:59, 8.67s/it] {'loss': 0.073, 'grad_norm': 3.261234760284424, 'learning_rate': 3.868063525551101e-05, 'epoch': 1.43} 14%|█▍ | 5890/41250 [14:14:09<85:08:59, 8.67s/it][2025-04-25 22:11:52,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-25 22:11:52,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.55 | bwd_microstep: 5753.31 | bwd_inner_microstep: 5684.07 | bwd_allreduce_microstep: 69.18 | step_microstep: 18.98 [2025-04-25 22:11:52,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.55 | bwd: 5753.32 | bwd_inner: 5684.07 | bwd_allreduce: 69.20 | step: 18.99 14%|█▍ | 5891/41250 [14:14:17<85:10:28, 8.67s/it] {'loss': 0.2021, 'grad_norm': 2.3671507835388184, 'learning_rate': 3.8680074293527624e-05, 'epoch': 1.43} 14%|█▍ | 5891/41250 [14:14:17<85:10:28, 8.67s/it][2025-04-25 22:12:01,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:12:01,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.29 | bwd_microstep: 5787.60 | bwd_inner_microstep: 5774.48 | 
bwd_allreduce_microstep: 13.08 | step_microstep: 18.75 [2025-04-25 22:12:01,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.29 | bwd: 5787.62 | bwd_inner: 5774.48 | bwd_allreduce: 13.09 | step: 18.76 14%|█▍ | 5892/41250 [14:14:26<85:24:43, 8.70s/it] {'loss': 0.0534, 'grad_norm': 1.0790445804595947, 'learning_rate': 3.8679513216385256e-05, 'epoch': 1.43} 14%|█▍ | 5892/41250 [14:14:26<85:24:43, 8.70s/it][2025-04-25 22:12:09,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-25 22:12:09,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.36 | bwd_microstep: 5760.36 | bwd_inner_microstep: 5645.62 | bwd_allreduce_microstep: 114.68 | step_microstep: 19.30 [2025-04-25 22:12:09,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.36 | bwd: 5760.37 | bwd_inner: 5645.62 | bwd_allreduce: 114.71 | step: 19.31 14%|█▍ | 5893/41250 [14:14:35<85:20:17, 8.69s/it] {'loss': 0.1718, 'grad_norm': 3.6917927265167236, 'learning_rate': 3.867895202408736e-05, 'epoch': 1.43} 14%|█▍ | 5893/41250 [14:14:35<85:20:17, 8.69s/it][2025-04-25 22:12:18,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:12:18,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.20 | bwd_microstep: 5749.98 | bwd_inner_microstep: 5657.77 | bwd_allreduce_microstep: 92.17 | step_microstep: 18.77 [2025-04-25 22:12:18,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.20 | bwd: 5750.00 | bwd_inner: 5657.77 | bwd_allreduce: 92.19 | step: 18.77 14%|█▍ | 5894/41250 [14:14:43<85:15:15, 8.68s/it] {'loss': 0.102, 'grad_norm': 0.9546758532524109, 'learning_rate': 3.86783907166374e-05, 'epoch': 1.43} 14%|█▍ | 5894/41250 [14:14:43<85:15:15, 8.68s/it][2025-04-25 22:12:27,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.58 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:12:27,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.70 | bwd_microstep: 5702.18 | bwd_inner_microstep: 5653.57 | bwd_allreduce_microstep: 48.57 | step_microstep: 18.86 [2025-04-25 22:12:27,159] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.70 | bwd: 5702.20 | bwd_inner: 5653.57 | bwd_allreduce: 48.59 | step: 18.86 14%|█▍ | 5895/41250 [14:14:52<85:02:42, 8.66s/it] {'loss': 0.1558, 'grad_norm': 2.3560545444488525, 'learning_rate': 3.867782929403884e-05, 'epoch': 1.43} 14%|█▍ | 5895/41250 [14:14:52<85:02:42, 8.66s/it][2025-04-25 22:12:35,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:12:35,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.00 | bwd_microstep: 5757.25 | bwd_inner_microstep: 5651.70 | bwd_allreduce_microstep: 105.52 | step_microstep: 18.81 [2025-04-25 22:12:35,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.00 | bwd: 5757.27 | bwd_inner: 5651.70 | bwd_allreduce: 105.53 | step: 18.82 14%|█▍ | 5896/41250 [14:15:01<85:03:30, 8.66s/it] {'loss': 0.0996, 'grad_norm': 4.357958793640137, 'learning_rate': 3.867726775629513e-05, 'epoch': 1.43} 14%|█▍ | 5896/41250 [14:15:01<85:03:30, 8.66s/it][2025-04-25 22:12:44,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:12:44,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.95 | bwd_microstep: 5792.94 | bwd_inner_microstep: 5644.69 | bwd_allreduce_microstep: 148.20 | step_microstep: 18.83 [2025-04-25 22:12:44,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.95 | bwd: 5792.95 | bwd_inner: 5644.69 | bwd_allreduce: 148.22 | step: 18.83 14%|█▍ | 5897/41250 [14:15:09<85:10:10, 8.67s/it] 
{'loss': 0.252, 'grad_norm': 2.911107063293457, 'learning_rate': 3.867670610340974e-05, 'epoch': 1.43} 14%|█▍ | 5897/41250 [14:15:09<85:10:10, 8.67s/it][2025-04-25 22:12:53,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 22:12:53,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.02 | bwd_microstep: 5751.96 | bwd_inner_microstep: 5709.23 | bwd_allreduce_microstep: 42.68 | step_microstep: 18.89 [2025-04-25 22:12:53,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.02 | bwd: 5751.97 | bwd_inner: 5709.23 | bwd_allreduce: 42.70 | step: 18.89 14%|█▍ | 5898/41250 [14:15:18<85:12:08, 8.68s/it] {'loss': 0.1467, 'grad_norm': 1.9213134050369263, 'learning_rate': 3.867614433538613e-05, 'epoch': 1.43} 14%|█▍ | 5898/41250 [14:15:18<85:12:08, 8.68s/it][2025-04-25 22:13:01,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:13:01,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.13 | bwd_microstep: 5780.87 | bwd_inner_microstep: 5698.39 | bwd_allreduce_microstep: 82.42 | step_microstep: 18.46 [2025-04-25 22:13:01,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.13 | bwd: 5780.88 | bwd_inner: 5698.39 | bwd_allreduce: 82.44 | step: 18.46 14%|█▍ | 5899/41250 [14:15:27<85:19:07, 8.69s/it] {'loss': 0.1016, 'grad_norm': 1.6278600692749023, 'learning_rate': 3.8675582452227764e-05, 'epoch': 1.43} 14%|█▍ | 5899/41250 [14:15:27<85:19:07, 8.69s/it][2025-04-25 22:13:10,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.01 | optimizer_step: 1.14 [2025-04-25 22:13:10,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2930.41 | bwd_microstep: 5877.89 | bwd_inner_microstep: 5865.08 | bwd_allreduce_microstep: 12.76 | 
step_microstep: 19.42 [2025-04-25 22:13:10,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2930.41 | bwd: 5877.91 | bwd_inner: 5865.08 | bwd_allreduce: 12.79 | step: 19.42 14%|█▍ | 5900/41250 [14:15:36<85:55:10, 8.75s/it] {'loss': 0.0621, 'grad_norm': 3.8436145782470703, 'learning_rate': 3.867502045393811e-05, 'epoch': 1.43} 14%|█▍ | 5900/41250 [14:15:36<85:55:10, 8.75s/it][2025-04-25 22:13:19,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:13:19,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.52 | bwd_microstep: 5789.56 | bwd_inner_microstep: 5776.73 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.60 [2025-04-25 22:13:19,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.52 | bwd: 5789.57 | bwd_inner: 5776.73 | bwd_allreduce: 12.80 | step: 18.60 14%|█▍ | 5901/41250 [14:15:44<85:56:02, 8.75s/it] {'loss': 0.4661, 'grad_norm': 3.337528705596924, 'learning_rate': 3.867445834052062e-05, 'epoch': 1.43} 14%|█▍ | 5901/41250 [14:15:44<85:56:02, 8.75s/it][2025-04-25 22:13:28,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:13:28,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.17 | bwd_microstep: 5895.77 | bwd_inner_microstep: 5669.47 | bwd_allreduce_microstep: 226.26 | step_microstep: 18.67 [2025-04-25 22:13:28,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.17 | bwd: 5895.78 | bwd_inner: 5669.47 | bwd_allreduce: 226.28 | step: 18.67 14%|█▍ | 5902/41250 [14:15:53<86:05:46, 8.77s/it] {'loss': 0.1562, 'grad_norm': 2.041858673095703, 'learning_rate': 3.867389611197877e-05, 'epoch': 1.43} 14%|█▍ | 5902/41250 [14:15:53<86:05:46, 8.77s/it][2025-04-25 22:13:37,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | 
optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-25 22:13:37,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.95 | bwd_microstep: 5721.47 | bwd_inner_microstep: 5708.20 | bwd_allreduce_microstep: 13.22 | step_microstep: 20.27 [2025-04-25 22:13:37,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.95 | bwd: 5721.48 | bwd_inner: 5708.20 | bwd_allreduce: 13.24 | step: 20.27 14%|█▍ | 5903/41250 [14:16:02<85:46:32, 8.74s/it] {'loss': 0.5524, 'grad_norm': 5.1037797927856445, 'learning_rate': 3.8673333768316025e-05, 'epoch': 1.43} 14%|█▍ | 5903/41250 [14:16:02<85:46:32, 8.74s/it][2025-04-25 22:13:45,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.97 | optimizer_step: 1.01 [2025-04-25 22:13:45,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.82 | bwd_microstep: 5772.68 | bwd_inner_microstep: 5720.24 | bwd_allreduce_microstep: 52.39 | step_microstep: 18.55 [2025-04-25 22:13:45,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.82 | bwd: 5772.69 | bwd_inner: 5720.24 | bwd_allreduce: 52.41 | step: 18.55 14%|█▍ | 5904/41250 [14:16:11<85:42:23, 8.73s/it] {'loss': 0.1854, 'grad_norm': 1.8685697317123413, 'learning_rate': 3.8672771309535845e-05, 'epoch': 1.43} 14%|█▍ | 5904/41250 [14:16:11<85:42:23, 8.73s/it][2025-04-25 22:13:54,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-25 22:13:54,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.38 | bwd_microstep: 5835.44 | bwd_inner_microstep: 5782.51 | bwd_allreduce_microstep: 52.88 | step_microstep: 18.74 [2025-04-25 22:13:54,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.38 | bwd: 5835.45 | bwd_inner: 5782.51 | bwd_allreduce: 52.90 | step: 18.75 14%|█▍ | 5905/41250 [14:16:19<85:56:57, 8.75s/it] {'loss': 0.3468, 'grad_norm': 
3.194444179534912, 'learning_rate': 3.867220873564171e-05, 'epoch': 1.43} 14%|█▍ | 5905/41250 [14:16:19<85:56:57, 8.75s/it][2025-04-25 22:14:03,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 1.09 [2025-04-25 22:14:03,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.10 | bwd_microstep: 5714.83 | bwd_inner_microstep: 5664.20 | bwd_allreduce_microstep: 50.58 | step_microstep: 19.07 [2025-04-25 22:14:03,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.10 | bwd: 5714.84 | bwd_inner: 5664.19 | bwd_allreduce: 50.60 | step: 19.07 14%|█▍ | 5906/41250 [14:16:28<85:35:08, 8.72s/it] {'loss': 0.0439, 'grad_norm': 0.8200770616531372, 'learning_rate': 3.8671646046637075e-05, 'epoch': 1.43} 14%|█▍ | 5906/41250 [14:16:28<85:35:08, 8.72s/it][2025-04-25 22:14:11,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:14:11,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.93 | bwd_microstep: 5737.59 | bwd_inner_microstep: 5672.23 | bwd_allreduce_microstep: 65.32 | step_microstep: 18.40 [2025-04-25 22:14:11,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.93 | bwd: 5737.60 | bwd_inner: 5672.23 | bwd_allreduce: 65.33 | step: 18.41 14%|█▍ | 5907/41250 [14:16:37<85:23:12, 8.70s/it] {'loss': 0.1478, 'grad_norm': 1.877320408821106, 'learning_rate': 3.8671083242525424e-05, 'epoch': 1.43} 14%|█▍ | 5907/41250 [14:16:37<85:23:12, 8.70s/it][2025-04-25 22:14:20,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:14:20,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.26 | bwd_microstep: 5707.57 | bwd_inner_microstep: 5694.83 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.84 [2025-04-25 
22:14:20,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.26 | bwd: 5707.59 | bwd_inner: 5694.83 | bwd_allreduce: 12.72 | step: 18.85 14%|█▍ | 5908/41250 [14:16:45<85:13:56, 8.68s/it] {'loss': 0.1685, 'grad_norm': 3.549952983856201, 'learning_rate': 3.867052032331021e-05, 'epoch': 1.43} 14%|█▍ | 5908/41250 [14:16:45<85:13:56, 8.68s/it][2025-04-25 22:14:29,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.05 | optimizer_step: 1.04 [2025-04-25 22:14:29,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.44 | bwd_microstep: 5718.26 | bwd_inner_microstep: 5705.04 | bwd_allreduce_microstep: 13.17 | step_microstep: 19.49 [2025-04-25 22:14:29,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.44 | bwd: 5718.27 | bwd_inner: 5705.04 | bwd_allreduce: 13.19 | step: 19.50 14%|█▍ | 5909/41250 [14:16:54<85:08:44, 8.67s/it] {'loss': 0.027, 'grad_norm': 0.34629666805267334, 'learning_rate': 3.866995728899491e-05, 'epoch': 1.43} 14%|█▍ | 5909/41250 [14:16:54<85:08:44, 8.67s/it][2025-04-25 22:14:37,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 22:14:37,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.88 | bwd_microstep: 5722.07 | bwd_inner_microstep: 5709.26 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.74 [2025-04-25 22:14:37,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.88 | bwd: 5722.09 | bwd_inner: 5709.26 | bwd_allreduce: 12.78 | step: 18.74 14%|█▍ | 5910/41250 [14:17:03<85:05:40, 8.67s/it] {'loss': 0.0801, 'grad_norm': 1.1805609464645386, 'learning_rate': 3.8669394139582996e-05, 'epoch': 1.43} 14%|█▍ | 5910/41250 [14:17:03<85:05:40, 8.67s/it][2025-04-25 22:14:46,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.03 | optimizer_step: 
1.01 [2025-04-25 22:14:46,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.00 | bwd_microstep: 5768.91 | bwd_inner_microstep: 5712.88 | bwd_allreduce_microstep: 55.98 | step_microstep: 19.20 [2025-04-25 22:14:46,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.00 | bwd: 5768.93 | bwd_inner: 5712.88 | bwd_allreduce: 56.00 | step: 19.20 14%|█▍ | 5911/41250 [14:17:11<85:13:28, 8.68s/it] {'loss': 0.28, 'grad_norm': 2.4993250370025635, 'learning_rate': 3.866883087507794e-05, 'epoch': 1.43} 14%|█▍ | 5911/41250 [14:17:11<85:13:28, 8.68s/it][2025-04-25 22:14:55,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:14:55,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.39 | bwd_microstep: 5794.32 | bwd_inner_microstep: 5781.47 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.89 [2025-04-25 22:14:55,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.39 | bwd: 5794.33 | bwd_inner: 5781.47 | bwd_allreduce: 12.82 | step: 18.89 14%|█▍ | 5912/41250 [14:17:20<85:29:34, 8.71s/it] {'loss': 0.2481, 'grad_norm': 3.1441564559936523, 'learning_rate': 3.866826749548321e-05, 'epoch': 1.43} 14%|█▍ | 5912/41250 [14:17:20<85:29:34, 8.71s/it][2025-04-25 22:15:03,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.24 | optimizer_step: 0.96 [2025-04-25 22:15:03,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.72 | bwd_microstep: 5756.28 | bwd_inner_microstep: 5696.46 | bwd_allreduce_microstep: 59.77 | step_microstep: 19.45 [2025-04-25 22:15:03,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.72 | bwd: 5756.29 | bwd_inner: 5696.46 | bwd_allreduce: 59.79 | step: 19.45 14%|█▍ | 5913/41250 [14:17:29<85:26:33, 8.70s/it] {'loss': 0.1069, 'grad_norm': 1.441693663597107, 'learning_rate': 
3.866770400080229e-05, 'epoch': 1.43} 14%|█▍ | 5913/41250 [14:17:29<85:26:33, 8.70s/it][2025-04-25 22:15:12,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 22:15:12,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.64 | bwd_microstep: 5813.30 | bwd_inner_microstep: 5656.78 | bwd_allreduce_microstep: 156.47 | step_microstep: 18.78 [2025-04-25 22:15:12,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.64 | bwd: 5813.31 | bwd_inner: 5656.78 | bwd_allreduce: 156.48 | step: 18.78 14%|█▍ | 5914/41250 [14:17:38<85:29:57, 8.71s/it] {'loss': 0.0912, 'grad_norm': 1.7000855207443237, 'learning_rate': 3.8667140391038646e-05, 'epoch': 1.43} 14%|█▍ | 5914/41250 [14:17:38<85:29:57, 8.71s/it][2025-04-25 22:15:21,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:15:21,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.90 | bwd_microstep: 5715.92 | bwd_inner_microstep: 5702.95 | bwd_allreduce_microstep: 12.93 | step_microstep: 18.89 [2025-04-25 22:15:21,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.90 | bwd: 5715.94 | bwd_inner: 5702.95 | bwd_allreduce: 12.94 | step: 18.89 14%|█▍ | 5915/41250 [14:17:46<85:18:40, 8.69s/it] {'loss': 0.1017, 'grad_norm': 3.4430594444274902, 'learning_rate': 3.866657666619575e-05, 'epoch': 1.43} 14%|█▍ | 5915/41250 [14:17:46<85:18:40, 8.69s/it][2025-04-25 22:15:30,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.11 | optimizer_step: 1.03 [2025-04-25 22:15:30,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.60 | bwd_microstep: 5762.74 | bwd_inner_microstep: 5698.73 | bwd_allreduce_microstep: 63.96 | step_microstep: 19.98 [2025-04-25 22:15:30,055] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.60 | bwd: 5762.76 | bwd_inner: 5698.73 | bwd_allreduce: 63.98 | step: 19.98 14%|█▍ | 5916/41250 [14:17:55<85:20:08, 8.69s/it] {'loss': 0.0914, 'grad_norm': 0.9425786137580872, 'learning_rate': 3.8666012826277074e-05, 'epoch': 1.43} 14%|█▍ | 5916/41250 [14:17:55<85:20:08, 8.69s/it][2025-04-25 22:15:38,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:15:38,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.79 | bwd_microstep: 5715.32 | bwd_inner_microstep: 5702.49 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.01 [2025-04-25 22:15:38,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.79 | bwd: 5715.33 | bwd_inner: 5702.49 | bwd_allreduce: 12.80 | step: 19.01 14%|█▍ | 5917/41250 [14:18:04<85:12:20, 8.68s/it] {'loss': 0.211, 'grad_norm': 2.4640305042266846, 'learning_rate': 3.866544887128611e-05, 'epoch': 1.43} 14%|█▍ | 5917/41250 [14:18:04<85:12:20, 8.68s/it][2025-04-25 22:15:47,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 22:15:47,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.51 | bwd_microstep: 5762.88 | bwd_inner_microstep: 5689.41 | bwd_allreduce_microstep: 73.43 | step_microstep: 18.66 [2025-04-25 22:15:47,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.51 | bwd: 5762.89 | bwd_inner: 5689.41 | bwd_allreduce: 73.44 | step: 18.66 14%|█▍ | 5918/41250 [14:18:12<85:13:29, 8.68s/it] {'loss': 0.0716, 'grad_norm': 1.3453577756881714, 'learning_rate': 3.866488480122632e-05, 'epoch': 1.43} 14%|█▍ | 5918/41250 [14:18:12<85:13:29, 8.68s/it][2025-04-25 22:15:56,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 
22:15:56,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.58 | bwd_microstep: 5760.75 | bwd_inner_microstep: 5698.77 | bwd_allreduce_microstep: 61.93 | step_microstep: 18.89 [2025-04-25 22:15:56,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.58 | bwd: 5760.76 | bwd_inner: 5698.77 | bwd_allreduce: 61.94 | step: 18.90 14%|█▍ | 5919/41250 [14:18:21<85:14:23, 8.69s/it] {'loss': 0.182, 'grad_norm': 3.023306131362915, 'learning_rate': 3.866432061610119e-05, 'epoch': 1.43} 14%|█▍ | 5919/41250 [14:18:21<85:14:23, 8.69s/it][2025-04-25 22:16:04,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 22:16:04,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.93 | bwd_microstep: 5736.57 | bwd_inner_microstep: 5705.07 | bwd_allreduce_microstep: 31.45 | step_microstep: 18.88 [2025-04-25 22:16:04,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.93 | bwd: 5736.58 | bwd_inner: 5705.07 | bwd_allreduce: 31.47 | step: 18.88 14%|█▍ | 5920/41250 [14:18:30<85:11:59, 8.68s/it] {'loss': 0.0952, 'grad_norm': 2.215611457824707, 'learning_rate': 3.866375631591419e-05, 'epoch': 1.44} 14%|█▍ | 5920/41250 [14:18:30<85:11:59, 8.68s/it][2025-04-25 22:16:13,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.25 | optimizer_step: 0.90 [2025-04-25 22:16:13,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.84 | bwd_microstep: 5775.88 | bwd_inner_microstep: 5662.15 | bwd_allreduce_microstep: 113.69 | step_microstep: 19.22 [2025-04-25 22:16:13,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.84 | bwd: 5775.90 | bwd_inner: 5662.15 | bwd_allreduce: 113.71 | step: 19.23 14%|█▍ | 5921/41250 [14:18:38<85:12:35, 8.68s/it] {'loss': 0.0641, 'grad_norm': 1.576524257659912, 'learning_rate': 3.86631919006688e-05, 
'epoch': 1.44} 14%|█▍ | 5921/41250 [14:18:38<85:12:35, 8.68s/it][2025-04-25 22:16:22,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:16:22,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.35 | bwd_microstep: 5694.55 | bwd_inner_microstep: 5648.42 | bwd_allreduce_microstep: 46.08 | step_microstep: 18.84 [2025-04-25 22:16:22,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.35 | bwd: 5694.57 | bwd_inner: 5648.42 | bwd_allreduce: 46.11 | step: 18.84 14%|█▍ | 5922/41250 [14:18:47<84:59:33, 8.66s/it] {'loss': 0.0936, 'grad_norm': 1.9162911176681519, 'learning_rate': 3.8662627370368516e-05, 'epoch': 1.44} 14%|█▍ | 5922/41250 [14:18:47<84:59:33, 8.66s/it][2025-04-25 22:16:30,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.25 | optimizer_step: 0.97 [2025-04-25 22:16:30,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.45 | bwd_microstep: 5793.32 | bwd_inner_microstep: 5658.18 | bwd_allreduce_microstep: 135.08 | step_microstep: 19.62 [2025-04-25 22:16:30,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.45 | bwd: 5793.33 | bwd_inner: 5658.18 | bwd_allreduce: 135.11 | step: 19.62 14%|█▍ | 5923/41250 [14:18:56<85:07:03, 8.67s/it] {'loss': 0.1433, 'grad_norm': 3.1753737926483154, 'learning_rate': 3.86620627250168e-05, 'epoch': 1.44} 14%|█▍ | 5923/41250 [14:18:56<85:07:03, 8.67s/it][2025-04-25 22:16:39,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.06 | optimizer_step: 0.92 [2025-04-25 22:16:39,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.02 | bwd_microstep: 5736.48 | bwd_inner_microstep: 5704.43 | bwd_allreduce_microstep: 32.01 | step_microstep: 19.00 [2025-04-25 22:16:39,432] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2850.02 | bwd: 5736.50 | bwd_inner: 5704.43 | bwd_allreduce: 32.02 | step: 19.00 14%|█▍ | 5924/41250 [14:19:04<85:06:39, 8.67s/it] {'loss': 0.286, 'grad_norm': 2.4158575534820557, 'learning_rate': 3.8661497964617134e-05, 'epoch': 1.44} 14%|█▍ | 5924/41250 [14:19:04<85:06:39, 8.67s/it][2025-04-25 22:16:48,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.05 | optimizer_step: 0.99 [2025-04-25 22:16:48,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.31 | bwd_microstep: 5817.21 | bwd_inner_microstep: 5639.80 | bwd_allreduce_microstep: 177.36 | step_microstep: 18.87 [2025-04-25 22:16:48,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.31 | bwd: 5817.22 | bwd_inner: 5639.80 | bwd_allreduce: 177.38 | step: 18.87 14%|█▍ | 5925/41250 [14:19:13<85:15:21, 8.69s/it] {'loss': 0.2645, 'grad_norm': 2.4800374507904053, 'learning_rate': 3.8660933089173004e-05, 'epoch': 1.44} 14%|█▍ | 5925/41250 [14:19:13<85:15:21, 8.69s/it][2025-04-25 22:16:56,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-25 22:16:56,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.24 | bwd_microstep: 5697.86 | bwd_inner_microstep: 5684.81 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.82 [2025-04-25 22:16:56,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.24 | bwd: 5697.87 | bwd_inner: 5684.81 | bwd_allreduce: 13.02 | step: 18.82 14%|█▍ | 5926/41250 [14:19:22<85:04:04, 8.67s/it] {'loss': 0.2241, 'grad_norm': 2.828735113143921, 'learning_rate': 3.8660368098687896e-05, 'epoch': 1.44} 14%|█▍ | 5926/41250 [14:19:22<85:04:04, 8.67s/it][2025-04-25 22:17:05,511] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:17:05,512] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.63 | bwd_microstep: 5766.90 | bwd_inner_microstep: 5754.04 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.03 [2025-04-25 22:17:05,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.63 | bwd: 5766.92 | bwd_inner: 5754.04 | bwd_allreduce: 12.84 | step: 19.04 14%|█▍ | 5927/41250 [14:19:30<85:14:34, 8.69s/it] {'loss': 0.0594, 'grad_norm': 1.8917381763458252, 'learning_rate': 3.8659802993165285e-05, 'epoch': 1.44} 14%|█▍ | 5927/41250 [14:19:30<85:14:34, 8.69s/it][2025-04-25 22:17:14,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.03 | optimizer_step: 1.06 [2025-04-25 22:17:14,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.45 | bwd_microstep: 5709.10 | bwd_inner_microstep: 5643.42 | bwd_allreduce_microstep: 65.62 | step_microstep: 18.97 [2025-04-25 22:17:14,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.45 | bwd: 5709.12 | bwd_inner: 5643.42 | bwd_allreduce: 65.65 | step: 18.97 14%|█▍ | 5928/41250 [14:19:39<85:02:17, 8.67s/it] {'loss': 0.0853, 'grad_norm': 1.8385581970214844, 'learning_rate': 3.865923777260866e-05, 'epoch': 1.44} 14%|█▍ | 5928/41250 [14:19:39<85:02:17, 8.67s/it][2025-04-25 22:17:22,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 0.96 [2025-04-25 22:17:22,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.64 | bwd_microstep: 5699.48 | bwd_inner_microstep: 5641.40 | bwd_allreduce_microstep: 58.04 | step_microstep: 19.25 [2025-04-25 22:17:22,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.64 | bwd: 5699.50 | bwd_inner: 5641.40 | bwd_allreduce: 58.06 | step: 19.25 14%|█▍ | 5929/41250 [14:19:48<84:51:13, 8.65s/it] {'loss': 0.1192, 'grad_norm': 0.979945182800293, 'learning_rate': 3.86586724370215e-05, 'epoch': 1.44} 14%|█▍ | 
5929/41250 [14:19:48<84:51:13, 8.65s/it][2025-04-25 22:17:31,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:17:31,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.97 | bwd_microstep: 5725.67 | bwd_inner_microstep: 5712.94 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.84 [2025-04-25 22:17:31,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.97 | bwd: 5725.68 | bwd_inner: 5712.94 | bwd_allreduce: 12.70 | step: 18.85 14%|█▍ | 5930/41250 [14:19:56<84:54:17, 8.65s/it] {'loss': 0.1176, 'grad_norm': 1.4870407581329346, 'learning_rate': 3.8658106986407304e-05, 'epoch': 1.44} 14%|█▍ | 5930/41250 [14:19:56<84:54:17, 8.65s/it][2025-04-25 22:17:40,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.22 | optimizer_step: 0.93 [2025-04-25 22:17:40,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.90 | bwd_microstep: 5897.97 | bwd_inner_microstep: 5654.53 | bwd_allreduce_microstep: 243.39 | step_microstep: 19.25 [2025-04-25 22:17:40,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.90 | bwd: 5897.99 | bwd_inner: 5654.53 | bwd_allreduce: 243.42 | step: 19.26 14%|█▍ | 5931/41250 [14:20:05<85:21:45, 8.70s/it] {'loss': 0.1146, 'grad_norm': 2.5337350368499756, 'learning_rate': 3.865754142076954e-05, 'epoch': 1.44} 14%|█▍ | 5931/41250 [14:20:05<85:21:45, 8.70s/it][2025-04-25 22:17:48,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:17:48,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.82 | bwd_microstep: 5706.68 | bwd_inner_microstep: 5694.01 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.76 [2025-04-25 22:17:48,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.82 | 
bwd: 5706.70 | bwd_inner: 5694.01 | bwd_allreduce: 12.65 | step: 18.76 14%|█▍ | 5932/41250 [14:20:14<85:11:14, 8.68s/it] {'loss': 0.1371, 'grad_norm': 1.6217522621154785, 'learning_rate': 3.8656975740111705e-05, 'epoch': 1.44} 14%|█▍ | 5932/41250 [14:20:14<85:11:14, 8.68s/it][2025-04-25 22:17:57,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:17:57,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.45 | bwd_microstep: 5692.93 | bwd_inner_microstep: 5680.26 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.32 [2025-04-25 22:17:57,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.45 | bwd: 5692.95 | bwd_inner: 5680.26 | bwd_allreduce: 12.64 | step: 18.32 14%|█▍ | 5933/41250 [14:20:22<85:00:36, 8.67s/it] {'loss': 0.0414, 'grad_norm': 1.659089207649231, 'learning_rate': 3.865640994443729e-05, 'epoch': 1.44} 14%|█▍ | 5933/41250 [14:20:22<85:00:36, 8.67s/it][2025-04-25 22:18:06,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 22:18:06,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.69 | bwd_microstep: 5788.07 | bwd_inner_microstep: 5774.78 | bwd_allreduce_microstep: 13.24 | step_microstep: 19.05 [2025-04-25 22:18:06,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.69 | bwd: 5788.08 | bwd_inner: 5774.78 | bwd_allreduce: 13.26 | step: 19.06 14%|█▍ | 5934/41250 [14:20:31<85:17:24, 8.69s/it] {'loss': 0.0759, 'grad_norm': 1.2036981582641602, 'learning_rate': 3.865584403374977e-05, 'epoch': 1.44} 14%|█▍ | 5934/41250 [14:20:31<85:17:24, 8.69s/it][2025-04-25 22:18:14,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.04 | optimizer_step: 0.94 [2025-04-25 22:18:14,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2843.78 | bwd_microstep: 5750.15 | bwd_inner_microstep: 5686.12 | bwd_allreduce_microstep: 63.98 | step_microstep: 19.01 [2025-04-25 22:18:14,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.78 | bwd: 5750.16 | bwd_inner: 5686.12 | bwd_allreduce: 63.99 | step: 19.01 14%|█▍ | 5935/41250 [14:20:40<85:14:33, 8.69s/it] {'loss': 0.0821, 'grad_norm': 1.5328139066696167, 'learning_rate': 3.8655278008052636e-05, 'epoch': 1.44} 14%|█▍ | 5935/41250 [14:20:40<85:14:33, 8.69s/it][2025-04-25 22:18:23,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:18:23,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.12 | bwd_microstep: 5697.69 | bwd_inner_microstep: 5684.83 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.02 [2025-04-25 22:18:23,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.12 | bwd: 5697.71 | bwd_inner: 5684.83 | bwd_allreduce: 12.84 | step: 19.03 14%|█▍ | 5936/41250 [14:20:48<85:02:41, 8.67s/it] {'loss': 0.0757, 'grad_norm': 1.5966695547103882, 'learning_rate': 3.865471186734939e-05, 'epoch': 1.44} 14%|█▍ | 5936/41250 [14:20:48<85:02:41, 8.67s/it][2025-04-25 22:18:32,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 22:18:32,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.65 | bwd_microstep: 5693.48 | bwd_inner_microstep: 5680.71 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.73 [2025-04-25 22:18:32,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.65 | bwd: 5693.49 | bwd_inner: 5680.71 | bwd_allreduce: 12.74 | step: 18.73 14%|█▍ | 5937/41250 [14:20:57<84:55:41, 8.66s/it] {'loss': 0.1249, 'grad_norm': 1.4220380783081055, 'learning_rate': 3.865414561164351e-05, 'epoch': 1.44} 14%|█▍ | 5937/41250 [14:20:57<84:55:41, 
8.66s/it][2025-04-25 22:18:40,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:18:40,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.65 | bwd_microstep: 5754.42 | bwd_inner_microstep: 5650.23 | bwd_allreduce_microstep: 104.14 | step_microstep: 18.94 [2025-04-25 22:18:40,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.65 | bwd: 5754.43 | bwd_inner: 5650.23 | bwd_allreduce: 104.16 | step: 18.94 14%|█▍ | 5938/41250 [14:21:06<84:56:24, 8.66s/it] {'loss': 0.1465, 'grad_norm': 1.4781441688537598, 'learning_rate': 3.86535792409385e-05, 'epoch': 1.44} 14%|█▍ | 5938/41250 [14:21:06<84:56:24, 8.66s/it][2025-04-25 22:18:49,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.10 [2025-04-25 22:18:49,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.36 | bwd_microstep: 5741.68 | bwd_inner_microstep: 5693.11 | bwd_allreduce_microstep: 48.52 | step_microstep: 19.40 [2025-04-25 22:18:49,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.36 | bwd: 5741.70 | bwd_inner: 5693.11 | bwd_allreduce: 48.54 | step: 19.41 14%|█▍ | 5939/41250 [14:21:14<84:57:35, 8.66s/it] {'loss': 0.0234, 'grad_norm': 0.35413485765457153, 'learning_rate': 3.8653012755237836e-05, 'epoch': 1.44} 14%|█▍ | 5939/41250 [14:21:14<84:57:35, 8.66s/it][2025-04-25 22:18:58,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:18:58,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.68 | bwd_microstep: 5720.32 | bwd_inner_microstep: 5652.72 | bwd_allreduce_microstep: 67.55 | step_microstep: 18.75 [2025-04-25 22:18:58,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.68 | bwd: 5720.33 | bwd_inner: 5652.72 
| bwd_allreduce: 67.57 | step: 18.75 14%|█▍ | 5940/41250 [14:21:23<84:51:48, 8.65s/it] {'loss': 0.2304, 'grad_norm': 6.402402877807617, 'learning_rate': 3.865244615454502e-05, 'epoch': 1.44} 14%|█▍ | 5940/41250 [14:21:23<84:51:48, 8.65s/it][2025-04-25 22:19:06,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 1.06 [2025-04-25 22:19:06,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.93 | bwd_microstep: 5789.92 | bwd_inner_microstep: 5777.14 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.96 [2025-04-25 22:19:06,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.93 | bwd: 5789.93 | bwd_inner: 5777.14 | bwd_allreduce: 12.75 | step: 18.96 14%|█▍ | 5941/41250 [14:21:32<85:11:39, 8.69s/it] {'loss': 0.3099, 'grad_norm': 2.423943281173706, 'learning_rate': 3.865187943886354e-05, 'epoch': 1.44} 14%|█▍ | 5941/41250 [14:21:32<85:11:39, 8.69s/it][2025-04-25 22:19:15,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:19:15,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.70 | bwd_microstep: 5691.57 | bwd_inner_microstep: 5678.79 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.82 [2025-04-25 22:19:15,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.70 | bwd: 5691.59 | bwd_inner: 5678.79 | bwd_allreduce: 12.75 | step: 18.83 14%|█▍ | 5942/41250 [14:21:40<85:01:31, 8.67s/it] {'loss': 0.1458, 'grad_norm': 2.434711217880249, 'learning_rate': 3.8651312608196897e-05, 'epoch': 1.44} 14%|█▍ | 5942/41250 [14:21:40<85:01:31, 8.67s/it][2025-04-25 22:19:24,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:19:24,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.53 | 
bwd_microstep: 5756.52 | bwd_inner_microstep: 5661.30 | bwd_allreduce_microstep: 95.17 | step_microstep: 18.74 [2025-04-25 22:19:24,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.53 | bwd: 5756.53 | bwd_inner: 5661.30 | bwd_allreduce: 95.19 | step: 18.74 14%|█▍ | 5943/41250 [14:21:49<85:01:23, 8.67s/it] {'loss': 0.1229, 'grad_norm': 3.243161201477051, 'learning_rate': 3.865074566254858e-05, 'epoch': 1.44} 14%|█▍ | 5943/41250 [14:21:49<85:01:23, 8.67s/it][2025-04-25 22:19:33,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:19:33,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.73 | bwd_microstep: 5872.47 | bwd_inner_microstep: 5690.70 | bwd_allreduce_microstep: 181.73 | step_microstep: 18.47 [2025-04-25 22:19:33,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.73 | bwd: 5872.49 | bwd_inner: 5690.70 | bwd_allreduce: 181.75 | step: 18.47 14%|█▍ | 5944/41250 [14:21:58<85:25:34, 8.71s/it] {'loss': 0.1333, 'grad_norm': 1.3955222368240356, 'learning_rate': 3.8650178601922074e-05, 'epoch': 1.44} 14%|█▍ | 5944/41250 [14:21:58<85:25:34, 8.71s/it][2025-04-25 22:19:41,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:19:41,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.78 | bwd_microstep: 5714.23 | bwd_inner_microstep: 5701.43 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.75 [2025-04-25 22:19:41,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.78 | bwd: 5714.25 | bwd_inner: 5701.43 | bwd_allreduce: 12.77 | step: 18.75 14%|█▍ | 5945/41250 [14:22:06<85:16:44, 8.70s/it] {'loss': 0.435, 'grad_norm': 2.375783681869507, 'learning_rate': 3.86496114263209e-05, 'epoch': 1.44} 14%|█▍ | 5945/41250 [14:22:06<85:16:44, 8.70s/it][2025-04-25 22:19:50,478] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.13 | optimizer_step: 1.06 [2025-04-25 22:19:50,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.45 | bwd_microstep: 5870.50 | bwd_inner_microstep: 5698.95 | bwd_allreduce_microstep: 171.50 | step_microstep: 19.29 [2025-04-25 22:19:50,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.45 | bwd: 5870.51 | bwd_inner: 5698.95 | bwd_allreduce: 171.52 | step: 19.30 14%|█▍ | 5946/41250 [14:22:15<85:37:23, 8.73s/it] {'loss': 0.2476, 'grad_norm': 2.703193187713623, 'learning_rate': 3.8649044135748525e-05, 'epoch': 1.44} 14%|█▍ | 5946/41250 [14:22:15<85:37:23, 8.73s/it][2025-04-25 22:19:59,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.06 | optimizer_step: 1.08 [2025-04-25 22:19:59,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.66 | bwd_microstep: 5701.00 | bwd_inner_microstep: 5687.75 | bwd_allreduce_microstep: 13.20 | step_microstep: 19.32 [2025-04-25 22:19:59,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.66 | bwd: 5701.01 | bwd_inner: 5687.75 | bwd_allreduce: 13.22 | step: 19.33 14%|█▍ | 5947/41250 [14:22:24<85:20:21, 8.70s/it] {'loss': 0.142, 'grad_norm': 3.522315502166748, 'learning_rate': 3.8648476730208464e-05, 'epoch': 1.44} 14%|█▍ | 5947/41250 [14:22:24<85:20:21, 8.70s/it][2025-04-25 22:20:07,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:20:07,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.29 | bwd_microstep: 5803.79 | bwd_inner_microstep: 5661.28 | bwd_allreduce_microstep: 142.47 | step_microstep: 18.88 [2025-04-25 22:20:07,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.29 | bwd: 5803.81 | bwd_inner: 5661.28 | bwd_allreduce: 142.49 | step: 18.88 
14%|█▍ | 5948/41250 [14:22:33<85:23:27, 8.71s/it] {'loss': 0.1381, 'grad_norm': 2.9432857036590576, 'learning_rate': 3.864790920970422e-05, 'epoch': 1.44} 14%|█▍ | 5948/41250 [14:22:33<85:23:27, 8.71s/it][2025-04-25 22:20:16,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:20:16,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3084.37 | bwd_microstep: 5699.57 | bwd_inner_microstep: 5686.95 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.45 [2025-04-25 22:20:16,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3084.37 | bwd: 5699.58 | bwd_inner: 5686.95 | bwd_allreduce: 12.60 | step: 18.46 14%|█▍ | 5949/41250 [14:22:42<85:50:59, 8.75s/it] {'loss': 0.1469, 'grad_norm': 1.0792123079299927, 'learning_rate': 3.864734157423928e-05, 'epoch': 1.44} 14%|█▍ | 5949/41250 [14:22:42<85:50:59, 8.75s/it][2025-04-25 22:20:25,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:20:25,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.10 | bwd_microstep: 5770.70 | bwd_inner_microstep: 5660.03 | bwd_allreduce_microstep: 110.62 | step_microstep: 18.45 [2025-04-25 22:20:25,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.10 | bwd: 5770.71 | bwd_inner: 5660.03 | bwd_allreduce: 110.64 | step: 18.45 14%|█▍ | 5950/41250 [14:22:50<85:38:42, 8.73s/it] {'loss': 0.1501, 'grad_norm': 3.167292594909668, 'learning_rate': 3.8646773823817147e-05, 'epoch': 1.44} 14%|█▍ | 5950/41250 [14:22:50<85:38:42, 8.73s/it][2025-04-25 22:20:34,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 22:20:34,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.13 | bwd_microstep: 5725.97 | bwd_inner_microstep: 
5665.40 | bwd_allreduce_microstep: 60.52 | step_microstep: 18.48 [2025-04-25 22:20:34,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.13 | bwd: 5725.98 | bwd_inner: 5665.40 | bwd_allreduce: 60.54 | step: 18.49 14%|█▍ | 5951/41250 [14:22:59<85:21:15, 8.70s/it] {'loss': 0.0657, 'grad_norm': 1.445813536643982, 'learning_rate': 3.8646205958441324e-05, 'epoch': 1.44} 14%|█▍ | 5951/41250 [14:22:59<85:21:15, 8.70s/it][2025-04-25 22:20:42,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.08 | optimizer_step: 1.02 [2025-04-25 22:20:42,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.03 | bwd_microstep: 5759.84 | bwd_inner_microstep: 5708.33 | bwd_allreduce_microstep: 51.45 | step_microstep: 19.50 [2025-04-25 22:20:42,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.03 | bwd: 5759.85 | bwd_inner: 5708.33 | bwd_allreduce: 51.48 | step: 19.50 14%|█▍ | 5952/41250 [14:23:08<85:19:24, 8.70s/it] {'loss': 0.1367, 'grad_norm': 3.08185076713562, 'learning_rate': 3.86456379781153e-05, 'epoch': 1.44} 14%|█▍ | 5952/41250 [14:23:08<85:19:24, 8.70s/it][2025-04-25 22:20:51,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.24 | optimizer_step: 0.95 [2025-04-25 22:20:51,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.82 | bwd_microstep: 5755.44 | bwd_inner_microstep: 5699.15 | bwd_allreduce_microstep: 56.24 | step_microstep: 19.57 [2025-04-25 22:20:51,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.82 | bwd: 5755.45 | bwd_inner: 5699.15 | bwd_allreduce: 56.26 | step: 19.57 14%|█▍ | 5953/41250 [14:23:16<85:17:36, 8.70s/it] {'loss': 0.1856, 'grad_norm': 2.8464760780334473, 'learning_rate': 3.864506988284259e-05, 'epoch': 1.44} 14%|█▍ | 5953/41250 [14:23:16<85:17:36, 8.70s/it][2025-04-25 22:21:00,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:21:00,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.73 | bwd_microstep: 5745.31 | bwd_inner_microstep: 5715.25 | bwd_allreduce_microstep: 30.01 | step_microstep: 18.88 [2025-04-25 22:21:00,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.73 | bwd: 5745.32 | bwd_inner: 5715.25 | bwd_allreduce: 30.03 | step: 18.89 14%|█▍ | 5954/41250 [14:23:25<85:14:56, 8.69s/it] {'loss': 0.3636, 'grad_norm': 2.4434759616851807, 'learning_rate': 3.86445016726267e-05, 'epoch': 1.44} 14%|█▍ | 5954/41250 [14:23:25<85:14:56, 8.69s/it][2025-04-25 22:21:08,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 22:21:08,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.35 | bwd_microstep: 5779.04 | bwd_inner_microstep: 5701.07 | bwd_allreduce_microstep: 77.92 | step_microstep: 18.78 [2025-04-25 22:21:08,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.35 | bwd: 5779.05 | bwd_inner: 5701.07 | bwd_allreduce: 77.94 | step: 18.78 14%|█▍ | 5955/41250 [14:23:34<85:18:20, 8.70s/it] {'loss': 0.1265, 'grad_norm': 2.035027265548706, 'learning_rate': 3.864393334747111e-05, 'epoch': 1.44} 14%|█▍ | 5955/41250 [14:23:34<85:18:20, 8.70s/it][2025-04-25 22:21:17,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:21:17,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2870.60 | bwd_microstep: 5717.97 | bwd_inner_microstep: 5705.19 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.33 [2025-04-25 22:21:17,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2870.60 | bwd: 5717.98 | bwd_inner: 5705.19 | bwd_allreduce: 12.75 | step: 18.34 14%|█▍ | 5956/41250 [14:23:42<85:13:23, 8.69s/it] 
{'loss': 0.0921, 'grad_norm': 1.7291606664657593, 'learning_rate': 3.864336490737935e-05, 'epoch': 1.44} 14%|█▍ | 5956/41250 [14:23:42<85:13:23, 8.69s/it][2025-04-25 22:21:26,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 22:21:26,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.78 | bwd_microstep: 5785.89 | bwd_inner_microstep: 5671.28 | bwd_allreduce_microstep: 114.57 | step_microstep: 18.35 [2025-04-25 22:21:26,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.78 | bwd: 5785.91 | bwd_inner: 5671.28 | bwd_allreduce: 114.59 | step: 18.35 14%|█▍ | 5957/41250 [14:23:51<85:14:07, 8.69s/it] {'loss': 0.188, 'grad_norm': 2.4686014652252197, 'learning_rate': 3.86427963523549e-05, 'epoch': 1.44} 14%|█▍ | 5957/41250 [14:23:51<85:14:07, 8.69s/it][2025-04-25 22:21:34,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:21:34,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.56 | bwd_microstep: 5768.80 | bwd_inner_microstep: 5708.76 | bwd_allreduce_microstep: 60.00 | step_microstep: 18.15 [2025-04-25 22:21:34,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.56 | bwd: 5768.82 | bwd_inner: 5708.76 | bwd_allreduce: 60.02 | step: 18.15 14%|█▍ | 5958/41250 [14:24:00<85:16:34, 8.70s/it] {'loss': 0.0257, 'grad_norm': 0.2589578628540039, 'learning_rate': 3.864222768240129e-05, 'epoch': 1.44} 14%|█▍ | 5958/41250 [14:24:00<85:16:34, 8.70s/it][2025-04-25 22:21:43,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.21 [2025-04-25 22:21:43,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.59 | bwd_microstep: 5721.71 | bwd_inner_microstep: 5709.21 | bwd_allreduce_microstep: 12.46 | 
step_microstep: 19.80 [2025-04-25 22:21:43,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.59 | bwd: 5721.73 | bwd_inner: 5709.21 | bwd_allreduce: 12.48 | step: 19.80 14%|█▍ | 5959/41250 [14:24:08<85:12:06, 8.69s/it] {'loss': 0.1147, 'grad_norm': 1.7898619174957275, 'learning_rate': 3.8641658897522e-05, 'epoch': 1.44} 14%|█▍ | 5959/41250 [14:24:08<85:12:06, 8.69s/it][2025-04-25 22:21:52,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.07 | optimizer_step: 1.06 [2025-04-25 22:21:52,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.91 | bwd_microstep: 5772.66 | bwd_inner_microstep: 5661.35 | bwd_allreduce_microstep: 111.26 | step_microstep: 19.79 [2025-04-25 22:21:52,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.91 | bwd: 5772.67 | bwd_inner: 5661.35 | bwd_allreduce: 111.28 | step: 19.79 14%|█▍ | 5960/41250 [14:24:17<85:12:52, 8.69s/it] {'loss': 0.2601, 'grad_norm': 4.500431537628174, 'learning_rate': 3.864108999772056e-05, 'epoch': 1.44} 14%|█▍ | 5960/41250 [14:24:17<85:12:52, 8.69s/it][2025-04-25 22:22:00,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 22:22:00,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.07 | bwd_microstep: 5753.05 | bwd_inner_microstep: 5688.84 | bwd_allreduce_microstep: 64.16 | step_microstep: 19.06 [2025-04-25 22:22:00,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.07 | bwd: 5753.06 | bwd_inner: 5688.84 | bwd_allreduce: 64.18 | step: 19.06 14%|█▍ | 5961/41250 [14:24:26<85:12:47, 8.69s/it] {'loss': 0.0892, 'grad_norm': 1.3783509731292725, 'learning_rate': 3.8640520983000457e-05, 'epoch': 1.45} 14%|█▍ | 5961/41250 [14:24:26<85:12:47, 8.69s/it][2025-04-25 22:22:09,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | 
optimizer_gradients: 1.25 | optimizer_step: 0.91 [2025-04-25 22:22:09,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.92 | bwd_microstep: 5733.16 | bwd_inner_microstep: 5654.82 | bwd_allreduce_microstep: 78.28 | step_microstep: 19.19 [2025-04-25 22:22:09,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.92 | bwd: 5733.18 | bwd_inner: 5654.82 | bwd_allreduce: 78.31 | step: 19.19 14%|█▍ | 5962/41250 [14:24:34<85:04:28, 8.68s/it] {'loss': 0.0597, 'grad_norm': 0.5523503422737122, 'learning_rate': 3.863995185336522e-05, 'epoch': 1.45} 14%|█▍ | 5962/41250 [14:24:34<85:04:28, 8.68s/it][2025-04-25 22:22:18,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.06 | optimizer_step: 0.97 [2025-04-25 22:22:18,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.00 | bwd_microstep: 5779.92 | bwd_inner_microstep: 5699.38 | bwd_allreduce_microstep: 80.49 | step_microstep: 19.06 [2025-04-25 22:22:18,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.00 | bwd: 5779.93 | bwd_inner: 5699.38 | bwd_allreduce: 80.51 | step: 19.07 14%|█▍ | 5963/41250 [14:24:43<85:10:15, 8.69s/it] {'loss': 0.0634, 'grad_norm': 0.8338819742202759, 'learning_rate': 3.863938260881834e-05, 'epoch': 1.45} 14%|█▍ | 5963/41250 [14:24:43<85:10:15, 8.69s/it][2025-04-25 22:22:26,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 22:22:26,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.45 | bwd_microstep: 5770.01 | bwd_inner_microstep: 5662.39 | bwd_allreduce_microstep: 107.57 | step_microstep: 18.91 [2025-04-25 22:22:26,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.45 | bwd: 5770.02 | bwd_inner: 5662.39 | bwd_allreduce: 107.59 | step: 18.91 14%|█▍ | 5964/41250 [14:24:52<85:09:10, 8.69s/it] {'loss': 0.2165, 'grad_norm': 
2.267787218093872, 'learning_rate': 3.863881324936333e-05, 'epoch': 1.45} 14%|█▍ | 5964/41250 [14:24:52<85:09:10, 8.69s/it][2025-04-25 22:22:35,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:22:35,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.57 | bwd_microstep: 5789.45 | bwd_inner_microstep: 5658.89 | bwd_allreduce_microstep: 130.51 | step_microstep: 18.91 [2025-04-25 22:22:35,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.57 | bwd: 5789.46 | bwd_inner: 5658.89 | bwd_allreduce: 130.53 | step: 18.91 14%|█▍ | 5965/41250 [14:25:01<85:12:00, 8.69s/it] {'loss': 0.2801, 'grad_norm': 3.9203174114227295, 'learning_rate': 3.863824377500371e-05, 'epoch': 1.45} 14%|█▍ | 5965/41250 [14:25:01<85:12:00, 8.69s/it][2025-04-25 22:22:44,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:22:44,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.93 | bwd_microstep: 5776.54 | bwd_inner_microstep: 5660.47 | bwd_allreduce_microstep: 116.03 | step_microstep: 18.79 [2025-04-25 22:22:44,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.93 | bwd: 5776.56 | bwd_inner: 5660.47 | bwd_allreduce: 116.04 | step: 18.79 14%|█▍ | 5966/41250 [14:25:09<85:11:21, 8.69s/it] {'loss': 0.029, 'grad_norm': 0.3702692687511444, 'learning_rate': 3.8637674185742975e-05, 'epoch': 1.45} 14%|█▍ | 5966/41250 [14:25:09<85:11:21, 8.69s/it][2025-04-25 22:22:53,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.34 | optimizer_step: 1.04 [2025-04-25 22:22:53,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.35 | bwd_microstep: 5934.88 | bwd_inner_microstep: 5684.91 | bwd_allreduce_microstep: 249.91 | step_microstep: 19.94 [2025-04-25 
22:22:53,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.35 | bwd: 5934.89 | bwd_inner: 5684.91 | bwd_allreduce: 249.93 | step: 19.95 14%|█▍ | 5967/41250 [14:25:18<85:41:52, 8.74s/it] {'loss': 0.2104, 'grad_norm': 4.268947124481201, 'learning_rate': 3.863710448158465e-05, 'epoch': 1.45} 14%|█▍ | 5967/41250 [14:25:18<85:41:52, 8.74s/it][2025-04-25 22:23:02,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-25 22:23:02,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.88 | bwd_microstep: 6025.76 | bwd_inner_microstep: 5657.95 | bwd_allreduce_microstep: 367.77 | step_microstep: 18.94 [2025-04-25 22:23:02,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.88 | bwd: 6025.78 | bwd_inner: 5657.95 | bwd_allreduce: 367.79 | step: 18.95 14%|█▍ | 5968/41250 [14:25:27<86:15:42, 8.80s/it] {'loss': 0.1063, 'grad_norm': 1.1939221620559692, 'learning_rate': 3.863653466253223e-05, 'epoch': 1.45} 14%|█▍ | 5968/41250 [14:25:27<86:15:42, 8.80s/it][2025-04-25 22:23:10,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:23:10,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.60 | bwd_microstep: 5791.16 | bwd_inner_microstep: 5778.40 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.74 [2025-04-25 22:23:10,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.60 | bwd: 5791.18 | bwd_inner: 5778.40 | bwd_allreduce: 12.73 | step: 18.74 14%|█▍ | 5969/41250 [14:25:36<86:08:42, 8.79s/it] {'loss': 0.0637, 'grad_norm': 1.0410579442977905, 'learning_rate': 3.8635964728589256e-05, 'epoch': 1.45} 14%|█▍ | 5969/41250 [14:25:36<86:08:42, 8.79s/it][2025-04-25 22:23:19,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.97 | optimizer_step: 
0.89 [2025-04-25 22:23:19,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.65 | bwd_microstep: 5750.54 | bwd_inner_microstep: 5686.12 | bwd_allreduce_microstep: 64.37 | step_microstep: 18.44 [2025-04-25 22:23:19,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.65 | bwd: 5750.55 | bwd_inner: 5686.12 | bwd_allreduce: 64.39 | step: 18.44 14%|█▍ | 5970/41250 [14:25:44<85:49:52, 8.76s/it] {'loss': 0.0637, 'grad_norm': 1.261069655418396, 'learning_rate': 3.863539467975922e-05, 'epoch': 1.45} 14%|█▍ | 5970/41250 [14:25:44<85:49:52, 8.76s/it][2025-04-25 22:23:28,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:23:28,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.51 | bwd_microstep: 5792.52 | bwd_inner_microstep: 5660.68 | bwd_allreduce_microstep: 131.79 | step_microstep: 18.77 [2025-04-25 22:23:28,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.51 | bwd: 5792.53 | bwd_inner: 5660.68 | bwd_allreduce: 131.80 | step: 18.77 14%|█▍ | 5971/41250 [14:25:53<85:40:19, 8.74s/it] {'loss': 0.1405, 'grad_norm': 2.610260486602783, 'learning_rate': 3.863482451604563e-05, 'epoch': 1.45} 14%|█▍ | 5971/41250 [14:25:53<85:40:19, 8.74s/it][2025-04-25 22:23:37,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 22:23:37,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.86 | bwd_microstep: 5840.40 | bwd_inner_microstep: 5690.47 | bwd_allreduce_microstep: 149.88 | step_microstep: 18.72 [2025-04-25 22:23:37,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.86 | bwd: 5840.42 | bwd_inner: 5690.47 | bwd_allreduce: 149.90 | step: 18.72 14%|█▍ | 5972/41250 [14:26:02<85:45:40, 8.75s/it] {'loss': 0.2438, 'grad_norm': 2.577087163925171, 'learning_rate': 
3.8634254237452024e-05, 'epoch': 1.45} 14%|█▍ | 5972/41250 [14:26:02<85:45:40, 8.75s/it][2025-04-25 22:23:45,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-25 22:23:45,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.96 | bwd_microstep: 5697.56 | bwd_inner_microstep: 5684.44 | bwd_allreduce_microstep: 13.08 | step_microstep: 18.67 [2025-04-25 22:23:45,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.96 | bwd: 5697.58 | bwd_inner: 5684.44 | bwd_allreduce: 13.10 | step: 18.67 14%|█▍ | 5973/41250 [14:26:11<85:22:36, 8.71s/it] {'loss': 0.1084, 'grad_norm': 1.05701744556427, 'learning_rate': 3.86336838439819e-05, 'epoch': 1.45} 14%|█▍ | 5973/41250 [14:26:11<85:22:36, 8.71s/it][2025-04-25 22:23:54,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 22:23:54,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.88 | bwd_microstep: 5701.66 | bwd_inner_microstep: 5656.23 | bwd_allreduce_microstep: 45.37 | step_microstep: 18.88 [2025-04-25 22:23:54,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.89 | bwd: 5701.67 | bwd_inner: 5656.23 | bwd_allreduce: 45.40 | step: 18.88 14%|█▍ | 5974/41250 [14:26:19<85:04:35, 8.68s/it] {'loss': 0.0934, 'grad_norm': 1.773788571357727, 'learning_rate': 3.863311333563879e-05, 'epoch': 1.45} 14%|█▍ | 5974/41250 [14:26:19<85:04:35, 8.68s/it][2025-04-25 22:24:03,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 22:24:03,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2934.29 | bwd_microstep: 5863.49 | bwd_inner_microstep: 5850.53 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.31 [2025-04-25 22:24:03,235] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2934.29 | bwd: 5863.51 | bwd_inner: 5850.53 | bwd_allreduce: 12.94 | step: 19.31 14%|█▍ | 5975/41250 [14:26:28<85:39:34, 8.74s/it] {'loss': 0.2093, 'grad_norm': 2.0757343769073486, 'learning_rate': 3.863254271242619e-05, 'epoch': 1.45} 14%|█▍ | 5975/41250 [14:26:28<85:39:34, 8.74s/it][2025-04-25 22:24:11,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:24:11,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.95 | bwd_microstep: 5749.86 | bwd_inner_microstep: 5677.73 | bwd_allreduce_microstep: 72.09 | step_microstep: 18.68 [2025-04-25 22:24:11,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.95 | bwd: 5749.88 | bwd_inner: 5677.73 | bwd_allreduce: 72.11 | step: 18.68 14%|█▍ | 5976/41250 [14:26:37<85:28:02, 8.72s/it] {'loss': 0.25, 'grad_norm': 2.1790034770965576, 'learning_rate': 3.8631971974347637e-05, 'epoch': 1.45} 14%|█▍ | 5976/41250 [14:26:37<85:28:02, 8.72s/it][2025-04-25 22:24:20,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 22:24:20,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.00 | bwd_microstep: 5775.03 | bwd_inner_microstep: 5761.66 | bwd_allreduce_microstep: 13.33 | step_microstep: 19.16 [2025-04-25 22:24:20,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.00 | bwd: 5775.05 | bwd_inner: 5761.66 | bwd_allreduce: 13.35 | step: 19.17 14%|█▍ | 5977/41250 [14:26:45<85:31:00, 8.73s/it] {'loss': 0.1381, 'grad_norm': 1.943813443183899, 'learning_rate': 3.8631401121406635e-05, 'epoch': 1.45} 14%|█▍ | 5977/41250 [14:26:45<85:31:00, 8.73s/it][2025-04-25 22:24:29,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 
22:24:29,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.28 | bwd_microstep: 5769.96 | bwd_inner_microstep: 5649.09 | bwd_allreduce_microstep: 120.82 | step_microstep: 18.55 [2025-04-25 22:24:29,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.28 | bwd: 5769.97 | bwd_inner: 5649.09 | bwd_allreduce: 120.84 | step: 18.55 14%|█▍ | 5978/41250 [14:26:54<85:22:45, 8.71s/it] {'loss': 0.1852, 'grad_norm': 1.6303398609161377, 'learning_rate': 3.863083015360672e-05, 'epoch': 1.45} 14%|█▍ | 5978/41250 [14:26:54<85:22:45, 8.71s/it][2025-04-25 22:24:37,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 22:24:37,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.91 | bwd_microstep: 5717.55 | bwd_inner_microstep: 5704.84 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.57 [2025-04-25 22:24:37,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.91 | bwd: 5717.56 | bwd_inner: 5704.84 | bwd_allreduce: 12.68 | step: 18.57 14%|█▍ | 5979/41250 [14:27:03<85:11:10, 8.69s/it] {'loss': 0.393, 'grad_norm': 5.184871673583984, 'learning_rate': 3.8630259070951386e-05, 'epoch': 1.45} 14%|█▍ | 5979/41250 [14:27:03<85:11:10, 8.69s/it][2025-04-25 22:24:46,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:24:46,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.25 | bwd_microstep: 5704.45 | bwd_inner_microstep: 5691.85 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.78 [2025-04-25 22:24:46,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.25 | bwd: 5704.46 | bwd_inner: 5691.85 | bwd_allreduce: 12.57 | step: 18.78 14%|█▍ | 5980/41250 [14:27:11<85:01:21, 8.68s/it] {'loss': 0.0341, 'grad_norm': 1.246018409729004, 'learning_rate': 3.862968787344418e-05, 
'epoch': 1.45} 14%|█▍ | 5980/41250 [14:27:11<85:01:21, 8.68s/it][2025-04-25 22:24:55,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:24:55,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.67 | bwd_microstep: 5709.39 | bwd_inner_microstep: 5696.51 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.75 [2025-04-25 22:24:55,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.67 | bwd: 5709.40 | bwd_inner: 5696.51 | bwd_allreduce: 12.85 | step: 18.76 14%|█▍ | 5981/41250 [14:27:20<84:54:20, 8.67s/it] {'loss': 0.0859, 'grad_norm': 2.345658540725708, 'learning_rate': 3.862911656108861e-05, 'epoch': 1.45} 14%|█▍ | 5981/41250 [14:27:20<84:54:20, 8.67s/it][2025-04-25 22:25:03,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:25:03,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.65 | bwd_microstep: 5751.93 | bwd_inner_microstep: 5653.06 | bwd_allreduce_microstep: 98.83 | step_microstep: 18.69 [2025-04-25 22:25:03,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.65 | bwd: 5751.95 | bwd_inner: 5653.06 | bwd_allreduce: 98.85 | step: 18.70 15%|█▍ | 5982/41250 [14:27:29<84:53:37, 8.67s/it] {'loss': 0.1086, 'grad_norm': 5.272680282592773, 'learning_rate': 3.86285451338882e-05, 'epoch': 1.45} 15%|█▍ | 5982/41250 [14:27:29<84:53:37, 8.67s/it][2025-04-25 22:25:12,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.97 | optimizer_step: 0.91 [2025-04-25 22:25:12,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.46 | bwd_microstep: 5748.72 | bwd_inner_microstep: 5699.01 | bwd_allreduce_microstep: 49.66 | step_microstep: 18.45 [2025-04-25 22:25:12,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd: 2845.46 | bwd: 5748.74 | bwd_inner: 5699.01 | bwd_allreduce: 49.68 | step: 18.45 15%|█▍ | 5983/41250 [14:27:37<84:55:26, 8.67s/it] {'loss': 0.0437, 'grad_norm': 0.667789876461029, 'learning_rate': 3.8627973591846474e-05, 'epoch': 1.45} 15%|█▍ | 5983/41250 [14:27:37<84:55:26, 8.67s/it][2025-04-25 22:25:21,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 0.96 [2025-04-25 22:25:21,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.68 | bwd_microstep: 5700.28 | bwd_inner_microstep: 5687.51 | bwd_allreduce_microstep: 12.73 | step_microstep: 19.09 [2025-04-25 22:25:21,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.68 | bwd: 5700.30 | bwd_inner: 5687.51 | bwd_allreduce: 12.74 | step: 19.09 15%|█▍ | 5984/41250 [14:27:46<84:48:29, 8.66s/it] {'loss': 0.0895, 'grad_norm': 2.0766124725341797, 'learning_rate': 3.862740193496695e-05, 'epoch': 1.45} 15%|█▍ | 5984/41250 [14:27:46<84:48:29, 8.66s/it][2025-04-25 22:25:29,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.04 | optimizer_step: 0.91 [2025-04-25 22:25:29,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.43 | bwd_microstep: 5741.90 | bwd_inner_microstep: 5689.38 | bwd_allreduce_microstep: 52.47 | step_microstep: 19.22 [2025-04-25 22:25:29,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.43 | bwd: 5741.91 | bwd_inner: 5689.38 | bwd_allreduce: 52.49 | step: 19.22 15%|█▍ | 5985/41250 [14:27:55<84:51:07, 8.66s/it] {'loss': 0.2955, 'grad_norm': 3.746577024459839, 'learning_rate': 3.862683016325316e-05, 'epoch': 1.45} 15%|█▍ | 5985/41250 [14:27:55<84:51:07, 8.66s/it][2025-04-25 22:25:38,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:25:38,570] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2846.58 | bwd_microstep: 5732.00 | bwd_inner_microstep: 5691.88 | bwd_allreduce_microstep: 40.07 | step_microstep: 18.78 [2025-04-25 22:25:38,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.58 | bwd: 5732.01 | bwd_inner: 5691.88 | bwd_allreduce: 40.09 | step: 18.78 15%|█▍ | 5986/41250 [14:28:03<84:51:17, 8.66s/it] {'loss': 0.066, 'grad_norm': 1.641679286956787, 'learning_rate': 3.862625827670863e-05, 'epoch': 1.45} 15%|█▍ | 5986/41250 [14:28:03<84:51:17, 8.66s/it][2025-04-25 22:25:47,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:25:47,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.57 | bwd_microstep: 5742.71 | bwd_inner_microstep: 5689.17 | bwd_allreduce_microstep: 53.49 | step_microstep: 18.80 [2025-04-25 22:25:47,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.57 | bwd: 5742.72 | bwd_inner: 5689.16 | bwd_allreduce: 53.52 | step: 18.80 15%|█▍ | 5987/41250 [14:28:12<84:52:39, 8.67s/it] {'loss': 0.1472, 'grad_norm': 1.0787123441696167, 'learning_rate': 3.862568627533688e-05, 'epoch': 1.45} 15%|█▍ | 5987/41250 [14:28:12<84:52:39, 8.67s/it][2025-04-25 22:25:55,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:25:55,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.29 | bwd_microstep: 5732.88 | bwd_inner_microstep: 5691.50 | bwd_allreduce_microstep: 41.33 | step_microstep: 18.86 [2025-04-25 22:25:55,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.29 | bwd: 5732.89 | bwd_inner: 5691.50 | bwd_allreduce: 41.35 | step: 18.86 15%|█▍ | 5988/41250 [14:28:21<84:53:13, 8.67s/it] {'loss': 0.1178, 'grad_norm': 3.8970842361450195, 'learning_rate': 3.862511415914144e-05, 'epoch': 1.45} 15%|█▍ | 5988/41250 [14:28:21<84:53:13, 
8.67s/it][2025-04-25 22:26:04,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.08 | optimizer_step: 0.96 [2025-04-25 22:26:04,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.31 | bwd_microstep: 5765.29 | bwd_inner_microstep: 5650.33 | bwd_allreduce_microstep: 114.91 | step_microstep: 19.43 [2025-04-25 22:26:04,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.31 | bwd: 5765.30 | bwd_inner: 5650.33 | bwd_allreduce: 114.93 | step: 19.43 15%|█▍ | 5989/41250 [14:28:29<84:54:13, 8.67s/it] {'loss': 0.3216, 'grad_norm': 4.374232292175293, 'learning_rate': 3.862454192812583e-05, 'epoch': 1.45} 15%|█▍ | 5989/41250 [14:28:29<84:54:13, 8.67s/it][2025-04-25 22:26:13,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 22:26:13,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.41 | bwd_microstep: 5761.72 | bwd_inner_microstep: 5655.51 | bwd_allreduce_microstep: 106.16 | step_microstep: 18.83 [2025-04-25 22:26:13,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.41 | bwd: 5761.73 | bwd_inner: 5655.51 | bwd_allreduce: 106.18 | step: 18.83 15%|█▍ | 5990/41250 [14:28:38<84:55:01, 8.67s/it] {'loss': 0.0313, 'grad_norm': 0.5697267055511475, 'learning_rate': 3.862396958229358e-05, 'epoch': 1.45} 15%|█▍ | 5990/41250 [14:28:38<84:55:01, 8.67s/it][2025-04-25 22:26:22,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-25 22:26:22,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2934.04 | bwd_microstep: 5887.06 | bwd_inner_microstep: 5874.31 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.83 [2025-04-25 22:26:22,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2934.04 | bwd: 5887.08 | bwd_inner: 5874.31 
| bwd_allreduce: 12.73 | step: 18.83 15%|█▍ | 5991/41250 [14:28:47<85:36:19, 8.74s/it] {'loss': 0.2402, 'grad_norm': 4.361567497253418, 'learning_rate': 3.862339712164822e-05, 'epoch': 1.45} 15%|█▍ | 5991/41250 [14:28:47<85:36:19, 8.74s/it][2025-04-25 22:26:31,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.58 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 22:26:31,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2929.19 | bwd_microstep: 5891.53 | bwd_inner_microstep: 5878.67 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.19 [2025-04-25 22:26:31,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2929.19 | bwd: 5891.54 | bwd_inner: 5878.67 | bwd_allreduce: 12.83 | step: 19.19 15%|█▍ | 5992/41250 [14:28:56<86:05:11, 8.79s/it] {'loss': 0.4158, 'grad_norm': 4.031471252441406, 'learning_rate': 3.862282454619329e-05, 'epoch': 1.45} 15%|█▍ | 5992/41250 [14:28:56<86:05:11, 8.79s/it][2025-04-25 22:26:39,716] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 22:26:39,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.08 | bwd_microstep: 5712.94 | bwd_inner_microstep: 5700.14 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.69 [2025-04-25 22:26:39,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.08 | bwd: 5712.95 | bwd_inner: 5700.14 | bwd_allreduce: 12.77 | step: 18.70 15%|█▍ | 5993/41250 [14:29:05<85:40:18, 8.75s/it] {'loss': 0.3106, 'grad_norm': 5.127185344696045, 'learning_rate': 3.862225185593229e-05, 'epoch': 1.45} 15%|█▍ | 5993/41250 [14:29:05<85:40:18, 8.75s/it][2025-04-25 22:26:48,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 22:26:48,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.95 | 
bwd_microstep: 5762.41 | bwd_inner_microstep: 5646.14 | bwd_allreduce_microstep: 116.22 | step_microstep: 19.08 [2025-04-25 22:26:48,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.95 | bwd: 5762.42 | bwd_inner: 5646.14 | bwd_allreduce: 116.24 | step: 19.08 15%|█▍ | 5994/41250 [14:29:13<85:27:04, 8.73s/it] {'loss': 0.0551, 'grad_norm': 0.5558867454528809, 'learning_rate': 3.862167905086879e-05, 'epoch': 1.45} 15%|█▍ | 5994/41250 [14:29:13<85:27:04, 8.73s/it][2025-04-25 22:26:57,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.08 | optimizer_step: 1.06 [2025-04-25 22:26:57,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.44 | bwd_microstep: 5846.59 | bwd_inner_microstep: 5693.32 | bwd_allreduce_microstep: 153.22 | step_microstep: 19.56 [2025-04-25 22:26:57,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.44 | bwd: 5846.61 | bwd_inner: 5693.32 | bwd_allreduce: 153.24 | step: 19.56 15%|█▍ | 5995/41250 [14:29:22<85:35:45, 8.74s/it] {'loss': 0.2408, 'grad_norm': 1.8026617765426636, 'learning_rate': 3.8621106131006295e-05, 'epoch': 1.45} 15%|█▍ | 5995/41250 [14:29:22<85:35:45, 8.74s/it][2025-04-25 22:27:05,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 22:27:05,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.94 | bwd_microstep: 5708.81 | bwd_inner_microstep: 5692.16 | bwd_allreduce_microstep: 16.61 | step_microstep: 18.64 [2025-04-25 22:27:05,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.94 | bwd: 5708.82 | bwd_inner: 5692.16 | bwd_allreduce: 16.62 | step: 18.64 15%|█▍ | 5996/41250 [14:29:31<85:17:53, 8.71s/it] {'loss': 0.1117, 'grad_norm': 1.3722176551818848, 'learning_rate': 3.862053309634834e-05, 'epoch': 1.45} 15%|█▍ | 5996/41250 [14:29:31<85:17:53, 8.71s/it][2025-04-25 22:27:14,501] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:27:14,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.08 | bwd_microstep: 5763.88 | bwd_inner_microstep: 5695.81 | bwd_allreduce_microstep: 68.03 | step_microstep: 18.71 [2025-04-25 22:27:14,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.08 | bwd: 5763.89 | bwd_inner: 5695.81 | bwd_allreduce: 68.04 | step: 18.72 15%|█▍ | 5997/41250 [14:29:39<85:15:10, 8.71s/it] {'loss': 0.1395, 'grad_norm': 2.2690796852111816, 'learning_rate': 3.8619959946898465e-05, 'epoch': 1.45} 15%|█▍ | 5997/41250 [14:29:39<85:15:10, 8.71s/it][2025-04-25 22:27:23,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:27:23,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.82 | bwd_microstep: 5695.62 | bwd_inner_microstep: 5658.74 | bwd_allreduce_microstep: 36.84 | step_microstep: 18.69 [2025-04-25 22:27:23,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.82 | bwd: 5695.64 | bwd_inner: 5658.74 | bwd_allreduce: 36.86 | step: 18.69 15%|█▍ | 5998/41250 [14:29:48<84:57:22, 8.68s/it] {'loss': 0.1772, 'grad_norm': 7.9394378662109375, 'learning_rate': 3.86193866826602e-05, 'epoch': 1.45} 15%|█▍ | 5998/41250 [14:29:48<84:57:22, 8.68s/it][2025-04-25 22:27:32,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:27:32,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.66 | bwd_microstep: 6025.85 | bwd_inner_microstep: 5666.17 | bwd_allreduce_microstep: 359.64 | step_microstep: 18.67 [2025-04-25 22:27:32,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.66 | bwd: 6025.87 | bwd_inner: 5666.17 | bwd_allreduce: 359.65 | step: 
18.67 15%|█▍ | 5999/41250 [14:29:57<85:44:08, 8.76s/it] {'loss': 0.2833, 'grad_norm': 3.2787697315216064, 'learning_rate': 3.861881330363708e-05, 'epoch': 1.45} 15%|█▍ | 5999/41250 [14:29:57<85:44:08, 8.76s/it][2025-04-25 22:27:40,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 22:27:40,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.63 | bwd_microstep: 5765.30 | bwd_inner_microstep: 5703.66 | bwd_allreduce_microstep: 61.59 | step_microstep: 18.77 [2025-04-25 22:27:40,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.63 | bwd: 5765.31 | bwd_inner: 5703.66 | bwd_allreduce: 61.61 | step: 18.78 15%|█▍ | 6000/41250 [14:30:06<85:34:09, 8.74s/it] {'loss': 0.0896, 'grad_norm': 1.4204987287521362, 'learning_rate': 3.861823980983263e-05, 'epoch': 1.45} 15%|█▍ | 6000/41250 [14:30:06<85:34:09, 8.74s/it][2025-04-25 22:27:49,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:27:49,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.19 | bwd_microstep: 5706.12 | bwd_inner_microstep: 5663.04 | bwd_allreduce_microstep: 43.04 | step_microstep: 18.28 [2025-04-25 22:27:49,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.19 | bwd: 5706.13 | bwd_inner: 5663.04 | bwd_allreduce: 43.05 | step: 18.28 15%|█▍ | 6001/41250 [14:30:14<85:13:30, 8.70s/it] {'loss': 0.5359, 'grad_norm': 3.469266176223755, 'learning_rate': 3.86176662012504e-05, 'epoch': 1.45} 15%|█▍ | 6001/41250 [14:30:14<85:13:30, 8.70s/it][2025-04-25 22:27:58,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 22:27:58,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.53 | bwd_microstep: 5770.51 | 
bwd_inner_microstep: 5706.76 | bwd_allreduce_microstep: 63.70 | step_microstep: 18.54 [2025-04-25 22:27:58,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.53 | bwd: 5770.52 | bwd_inner: 5706.76 | bwd_allreduce: 63.72 | step: 18.54 15%|█▍ | 6002/41250 [14:30:23<85:13:07, 8.70s/it] {'loss': 0.2288, 'grad_norm': 2.8164730072021484, 'learning_rate': 3.861709247789391e-05, 'epoch': 1.46} 15%|█▍ | 6002/41250 [14:30:23<85:13:07, 8.70s/it][2025-04-25 22:28:06,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:28:06,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.89 | bwd_microstep: 5710.93 | bwd_inner_microstep: 5697.92 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.71 [2025-04-25 22:28:06,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.89 | bwd: 5710.94 | bwd_inner: 5697.92 | bwd_allreduce: 12.98 | step: 18.71 15%|█▍ | 6003/41250 [14:30:32<85:01:32, 8.68s/it] {'loss': 0.2511, 'grad_norm': 1.9134937524795532, 'learning_rate': 3.861651863976672e-05, 'epoch': 1.46} 15%|█▍ | 6003/41250 [14:30:32<85:01:32, 8.68s/it][2025-04-25 22:28:15,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 22:28:15,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.51 | bwd_microstep: 5785.98 | bwd_inner_microstep: 5675.66 | bwd_allreduce_microstep: 110.28 | step_microstep: 19.17 [2025-04-25 22:28:15,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.51 | bwd: 5785.99 | bwd_inner: 5675.66 | bwd_allreduce: 110.30 | step: 19.17 15%|█▍ | 6004/41250 [14:30:40<85:04:19, 8.69s/it] {'loss': 0.0346, 'grad_norm': 3.3747308254241943, 'learning_rate': 3.8615944686872344e-05, 'epoch': 1.46} 15%|█▍ | 6004/41250 [14:30:40<85:04:19, 8.69s/it][2025-04-25 22:28:24,055] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.07 | optimizer_step: 0.95 [2025-04-25 22:28:24,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.28 | bwd_microstep: 5728.49 | bwd_inner_microstep: 5659.84 | bwd_allreduce_microstep: 68.59 | step_microstep: 19.85 [2025-04-25 22:28:24,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.28 | bwd: 5728.50 | bwd_inner: 5659.84 | bwd_allreduce: 68.61 | step: 19.85 15%|█▍ | 6005/41250 [14:30:49<84:55:38, 8.67s/it] {'loss': 0.0664, 'grad_norm': 1.5825631618499756, 'learning_rate': 3.8615370619214336e-05, 'epoch': 1.46} 15%|█▍ | 6005/41250 [14:30:49<84:55:38, 8.67s/it][2025-04-25 22:28:32,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 0.89 [2025-04-25 22:28:32,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.98 | bwd_microstep: 5731.13 | bwd_inner_microstep: 5664.43 | bwd_allreduce_microstep: 66.65 | step_microstep: 19.10 [2025-04-25 22:28:32,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.98 | bwd: 5731.15 | bwd_inner: 5664.43 | bwd_allreduce: 66.67 | step: 19.10 15%|█▍ | 6006/41250 [14:30:58<84:50:43, 8.67s/it] {'loss': 0.2139, 'grad_norm': 3.776607036590576, 'learning_rate': 3.8614796436796226e-05, 'epoch': 1.46} 15%|█▍ | 6006/41250 [14:30:58<84:50:43, 8.67s/it][2025-04-25 22:28:41,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 0.97 [2025-04-25 22:28:41,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.52 | bwd_microstep: 5728.28 | bwd_inner_microstep: 5653.25 | bwd_allreduce_microstep: 74.99 | step_microstep: 18.45 [2025-04-25 22:28:41,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.52 | bwd: 5728.29 | bwd_inner: 5653.25 | bwd_allreduce: 75.00 | step: 18.45 15%|█▍ 
| 6007/41250 [14:31:06<84:46:54, 8.66s/it] {'loss': 0.5148, 'grad_norm': 3.078779697418213, 'learning_rate': 3.861422213962156e-05, 'epoch': 1.46} 15%|█▍ | 6007/41250 [14:31:06<84:46:54, 8.66s/it][2025-04-25 22:28:50,013] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:28:50,013] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.25 | bwd_microstep: 5744.52 | bwd_inner_microstep: 5664.01 | bwd_allreduce_microstep: 80.46 | step_microstep: 18.47 [2025-04-25 22:28:50,013] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.25 | bwd: 5744.53 | bwd_inner: 5664.01 | bwd_allreduce: 80.48 | step: 18.48 15%|█▍ | 6008/41250 [14:31:15<84:47:31, 8.66s/it] {'loss': 0.0372, 'grad_norm': 0.718842625617981, 'learning_rate': 3.861364772769388e-05, 'epoch': 1.46} 15%|█▍ | 6008/41250 [14:31:15<84:47:31, 8.66s/it][2025-04-25 22:28:58,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:28:58,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.60 | bwd_microstep: 5739.21 | bwd_inner_microstep: 5663.33 | bwd_allreduce_microstep: 75.84 | step_microstep: 18.56 [2025-04-25 22:28:58,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.60 | bwd: 5739.22 | bwd_inner: 5663.33 | bwd_allreduce: 75.86 | step: 18.57 15%|█▍ | 6009/41250 [14:31:23<84:45:54, 8.66s/it] {'loss': 0.0494, 'grad_norm': 1.043354868888855, 'learning_rate': 3.861307320101672e-05, 'epoch': 1.46} 15%|█▍ | 6009/41250 [14:31:23<84:45:54, 8.66s/it][2025-04-25 22:29:07,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:29:07,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.88 | bwd_microstep: 5719.59 | bwd_inner_microstep: 5661.41 | 
bwd_allreduce_microstep: 58.13 | step_microstep: 18.55 [2025-04-25 22:29:07,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.88 | bwd: 5719.60 | bwd_inner: 5661.41 | bwd_allreduce: 58.15 | step: 18.55 15%|█▍ | 6010/41250 [14:31:32<84:41:13, 8.65s/it] {'loss': 0.1703, 'grad_norm': 2.671112060546875, 'learning_rate': 3.861249855959362e-05, 'epoch': 1.46} 15%|█▍ | 6010/41250 [14:31:32<84:41:13, 8.65s/it][2025-04-25 22:29:16,012] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.02 | optimizer_step: 1.13 [2025-04-25 22:29:16,013] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.17 | bwd_microstep: 5777.33 | bwd_inner_microstep: 5708.24 | bwd_allreduce_microstep: 69.05 | step_microstep: 18.94 [2025-04-25 22:29:16,013] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.17 | bwd: 5777.35 | bwd_inner: 5708.24 | bwd_allreduce: 69.07 | step: 18.94 15%|█▍ | 6011/41250 [14:31:41<84:51:58, 8.67s/it] {'loss': 0.0468, 'grad_norm': 0.6078846454620361, 'learning_rate': 3.861192380342813e-05, 'epoch': 1.46} 15%|█▍ | 6011/41250 [14:31:41<84:51:58, 8.67s/it][2025-04-25 22:29:24,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 22:29:24,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.98 | bwd_microstep: 5775.31 | bwd_inner_microstep: 5657.54 | bwd_allreduce_microstep: 117.71 | step_microstep: 19.10 [2025-04-25 22:29:24,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.98 | bwd: 5775.32 | bwd_inner: 5657.54 | bwd_allreduce: 117.73 | step: 19.11 15%|█▍ | 6012/41250 [14:31:50<84:55:12, 8.68s/it] {'loss': 0.1012, 'grad_norm': 1.638817548751831, 'learning_rate': 3.861134893252379e-05, 'epoch': 1.46} 15%|█▍ | 6012/41250 [14:31:50<84:55:12, 8.68s/it][2025-04-25 22:29:33,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.07 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:29:33,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.95 | bwd_microstep: 5802.32 | bwd_inner_microstep: 5650.64 | bwd_allreduce_microstep: 151.64 | step_microstep: 18.43 [2025-04-25 22:29:33,417] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.95 | bwd: 5802.34 | bwd_inner: 5650.64 | bwd_allreduce: 151.66 | step: 18.44 15%|█▍ | 6013/41250 [14:31:58<85:01:57, 8.69s/it] {'loss': 0.0414, 'grad_norm': 0.3184720277786255, 'learning_rate': 3.8610773946884145e-05, 'epoch': 1.46} 15%|█▍ | 6013/41250 [14:31:58<85:01:57, 8.69s/it][2025-04-25 22:29:42,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:29:42,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.74 | bwd_microstep: 5766.33 | bwd_inner_microstep: 5657.65 | bwd_allreduce_microstep: 108.62 | step_microstep: 18.75 [2025-04-25 22:29:42,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.74 | bwd: 5766.34 | bwd_inner: 5657.65 | bwd_allreduce: 108.65 | step: 18.75 15%|█▍ | 6014/41250 [14:32:07<84:59:15, 8.68s/it] {'loss': 0.1876, 'grad_norm': 4.004354953765869, 'learning_rate': 3.861019884651274e-05, 'epoch': 1.46} 15%|█▍ | 6014/41250 [14:32:07<84:59:15, 8.68s/it][2025-04-25 22:29:50,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:29:50,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.97 | bwd_microstep: 5769.03 | bwd_inner_microstep: 5703.17 | bwd_allreduce_microstep: 65.82 | step_microstep: 18.67 [2025-04-25 22:29:50,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.97 | bwd: 5769.05 | bwd_inner: 5703.17 | bwd_allreduce: 65.84 | step: 18.67 15%|█▍ | 6015/41250 [14:32:16<85:02:21, 8.69s/it] 
{'loss': 0.1337, 'grad_norm': 3.2183620929718018, 'learning_rate': 3.860962363141312e-05, 'epoch': 1.46} 15%|█▍ | 6015/41250 [14:32:16<85:02:21, 8.69s/it][2025-04-25 22:29:59,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:29:59,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5726.39 | bwd_inner_microstep: 5713.76 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.70 [2025-04-25 22:29:59,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5726.41 | bwd_inner: 5713.76 | bwd_allreduce: 12.60 | step: 18.71 15%|█▍ | 6016/41250 [14:32:24<84:57:30, 8.68s/it] {'loss': 0.1351, 'grad_norm': 1.0827490091323853, 'learning_rate': 3.860904830158883e-05, 'epoch': 1.46} 15%|█▍ | 6016/41250 [14:32:24<84:57:30, 8.68s/it][2025-04-25 22:30:08,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:30:08,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.56 | bwd_microstep: 5721.98 | bwd_inner_microstep: 5709.48 | bwd_allreduce_microstep: 12.45 | step_microstep: 18.63 [2025-04-25 22:30:08,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.56 | bwd: 5722.00 | bwd_inner: 5709.48 | bwd_allreduce: 12.47 | step: 18.64 15%|█▍ | 6017/41250 [14:32:33<84:53:32, 8.67s/it] {'loss': 0.0677, 'grad_norm': 2.362746477127075, 'learning_rate': 3.860847285704342e-05, 'epoch': 1.46} 15%|█▍ | 6017/41250 [14:32:33<84:53:32, 8.67s/it][2025-04-25 22:30:16,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 22:30:16,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.71 | bwd_microstep: 5698.73 | bwd_inner_microstep: 5685.87 | bwd_allreduce_microstep: 12.81 | 
step_microstep: 19.53 [2025-04-25 22:30:16,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.71 | bwd: 5698.75 | bwd_inner: 5685.87 | bwd_allreduce: 12.83 | step: 19.53 15%|█▍ | 6018/41250 [14:32:42<84:45:46, 8.66s/it] {'loss': 0.0881, 'grad_norm': 1.0314360857009888, 'learning_rate': 3.860789729778043e-05, 'epoch': 1.46} 15%|█▍ | 6018/41250 [14:32:42<84:45:46, 8.66s/it][2025-04-25 22:30:25,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.25 | optimizer_step: 0.96 [2025-04-25 22:30:25,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.58 | bwd_microstep: 5757.33 | bwd_inner_microstep: 5698.79 | bwd_allreduce_microstep: 58.49 | step_microstep: 19.78 [2025-04-25 22:30:25,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.58 | bwd: 5757.34 | bwd_inner: 5698.79 | bwd_allreduce: 58.51 | step: 19.79 15%|█▍ | 6019/41250 [14:32:50<84:50:47, 8.67s/it] {'loss': 0.0673, 'grad_norm': 1.0969582796096802, 'learning_rate': 3.8607321623803414e-05, 'epoch': 1.46} 15%|█▍ | 6019/41250 [14:32:50<84:50:47, 8.67s/it][2025-04-25 22:30:34,100] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:30:34,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.67 | bwd_microstep: 5756.52 | bwd_inner_microstep: 5649.61 | bwd_allreduce_microstep: 106.86 | step_microstep: 18.81 [2025-04-25 22:30:34,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.67 | bwd: 5756.53 | bwd_inner: 5649.61 | bwd_allreduce: 106.88 | step: 18.81 15%|█▍ | 6020/41250 [14:32:59<84:50:30, 8.67s/it] {'loss': 0.3519, 'grad_norm': 3.006700277328491, 'learning_rate': 3.860674583511592e-05, 'epoch': 1.46} 15%|█▍ | 6020/41250 [14:32:59<84:50:30, 8.67s/it][2025-04-25 22:30:42,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-25 22:30:42,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.35 | bwd_microstep: 5783.64 | bwd_inner_microstep: 5645.91 | bwd_allreduce_microstep: 137.69 | step_microstep: 19.07 [2025-04-25 22:30:42,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.35 | bwd: 5783.65 | bwd_inner: 5645.91 | bwd_allreduce: 137.70 | step: 19.07 15%|█▍ | 6021/41250 [14:33:08<84:54:23, 8.68s/it] {'loss': 0.0874, 'grad_norm': 2.1098389625549316, 'learning_rate': 3.86061699317215e-05, 'epoch': 1.46} 15%|█▍ | 6021/41250 [14:33:08<84:54:23, 8.68s/it][2025-04-25 22:30:51,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 22:30:51,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.37 | bwd_microstep: 5782.65 | bwd_inner_microstep: 5769.68 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.93 [2025-04-25 22:30:51,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.37 | bwd: 5782.66 | bwd_inner: 5769.68 | bwd_allreduce: 12.93 | step: 18.93 15%|█▍ | 6022/41250 [14:33:16<85:08:40, 8.70s/it] {'loss': 0.211, 'grad_norm': 2.4093122482299805, 'learning_rate': 3.86055939136237e-05, 'epoch': 1.46} 15%|█▍ | 6022/41250 [14:33:16<85:08:40, 8.70s/it][2025-04-25 22:31:00,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.02 | optimizer_step: 1.07 [2025-04-25 22:31:00,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.28 | bwd_microstep: 5688.59 | bwd_inner_microstep: 5662.89 | bwd_allreduce_microstep: 25.66 | step_microstep: 19.28 [2025-04-25 22:31:00,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.28 | bwd: 5688.60 | bwd_inner: 5662.89 | bwd_allreduce: 25.67 | step: 19.28 15%|█▍ | 6023/41250 [14:33:25<84:51:15, 8.67s/it] {'loss': 0.1648, 'grad_norm': 
0.9961243271827698, 'learning_rate': 3.8605017780826074e-05, 'epoch': 1.46} 15%|█▍ | 6023/41250 [14:33:25<84:51:15, 8.67s/it][2025-04-25 22:31:08,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:31:08,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.26 | bwd_microstep: 5700.98 | bwd_inner_microstep: 5688.07 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.98 [2025-04-25 22:31:08,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.26 | bwd: 5701.00 | bwd_inner: 5688.07 | bwd_allreduce: 12.88 | step: 18.99 15%|█▍ | 6024/41250 [14:33:34<84:43:51, 8.66s/it] {'loss': 0.1807, 'grad_norm': 2.960679769515991, 'learning_rate': 3.860444153333217e-05, 'epoch': 1.46} 15%|█▍ | 6024/41250 [14:33:34<84:43:51, 8.66s/it][2025-04-25 22:31:17,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.08 | optimizer_step: 1.06 [2025-04-25 22:31:17,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.32 | bwd_microstep: 5749.12 | bwd_inner_microstep: 5702.97 | bwd_allreduce_microstep: 46.09 | step_microstep: 19.44 [2025-04-25 22:31:17,467] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.32 | bwd: 5749.13 | bwd_inner: 5702.97 | bwd_allreduce: 46.11 | step: 19.45 15%|█▍ | 6025/41250 [14:33:42<84:47:47, 8.67s/it] {'loss': 0.1161, 'grad_norm': 2.1977336406707764, 'learning_rate': 3.860386517114555e-05, 'epoch': 1.46} 15%|█▍ | 6025/41250 [14:33:42<84:47:47, 8.67s/it][2025-04-25 22:31:26,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.19 | optimizer_step: 1.01 [2025-04-25 22:31:26,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.63 | bwd_microstep: 5747.31 | bwd_inner_microstep: 5684.21 | bwd_allreduce_microstep: 63.05 | step_microstep: 19.23 [2025-04-25 
22:31:26,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.63 | bwd: 5747.32 | bwd_inner: 5684.21 | bwd_allreduce: 63.07 | step: 19.23 15%|█▍ | 6026/41250 [14:33:51<84:48:49, 8.67s/it] {'loss': 0.0669, 'grad_norm': 2.1447784900665283, 'learning_rate': 3.860328869426975e-05, 'epoch': 1.46} 15%|█▍ | 6026/41250 [14:33:51<84:48:49, 8.67s/it][2025-04-25 22:31:34,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:31:34,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.71 | bwd_microstep: 5721.76 | bwd_inner_microstep: 5709.07 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.65 [2025-04-25 22:31:34,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.71 | bwd: 5721.77 | bwd_inner: 5709.07 | bwd_allreduce: 12.66 | step: 18.65 15%|█▍ | 6027/41250 [14:34:00<84:47:02, 8.67s/it] {'loss': 0.379, 'grad_norm': 7.965217590332031, 'learning_rate': 3.860271210270835e-05, 'epoch': 1.46} 15%|█▍ | 6027/41250 [14:34:00<84:47:02, 8.67s/it][2025-04-25 22:31:43,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 22:31:43,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.01 | bwd_microstep: 5711.46 | bwd_inner_microstep: 5698.86 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.71 [2025-04-25 22:31:43,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.01 | bwd: 5711.47 | bwd_inner: 5698.86 | bwd_allreduce: 12.57 | step: 18.71 15%|█▍ | 6028/41250 [14:34:08<84:42:14, 8.66s/it] {'loss': 0.0819, 'grad_norm': 1.471078634262085, 'learning_rate': 3.8602135396464874e-05, 'epoch': 1.46} 15%|█▍ | 6028/41250 [14:34:08<84:42:14, 8.66s/it][2025-04-25 22:31:52,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 1.04 
[2025-04-25 22:31:52,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.03 | bwd_microstep: 5753.20 | bwd_inner_microstep: 5652.85 | bwd_allreduce_microstep: 100.31 | step_microstep: 18.57 [2025-04-25 22:31:52,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.03 | bwd: 5753.21 | bwd_inner: 5652.85 | bwd_allreduce: 100.33 | step: 18.57 15%|█▍ | 6029/41250 [14:34:17<84:43:32, 8.66s/it] {'loss': 0.1149, 'grad_norm': 1.5126733779907227, 'learning_rate': 3.86015585755429e-05, 'epoch': 1.46} 15%|█▍ | 6029/41250 [14:34:17<84:43:32, 8.66s/it][2025-04-25 22:32:00,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:32:00,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.25 | bwd_microstep: 5683.63 | bwd_inner_microstep: 5671.02 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.63 [2025-04-25 22:32:00,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.26 | bwd: 5683.65 | bwd_inner: 5671.02 | bwd_allreduce: 12.58 | step: 18.63 15%|█▍ | 6030/41250 [14:34:26<84:33:31, 8.64s/it] {'loss': 0.1516, 'grad_norm': 1.4321759939193726, 'learning_rate': 3.860098163994597e-05, 'epoch': 1.46} 15%|█▍ | 6030/41250 [14:34:26<84:33:31, 8.64s/it][2025-04-25 22:32:09,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:32:09,357] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.88 | bwd_microstep: 5739.55 | bwd_inner_microstep: 5655.79 | bwd_allreduce_microstep: 83.72 | step_microstep: 18.44 [2025-04-25 22:32:09,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.88 | bwd: 5739.56 | bwd_inner: 5655.79 | bwd_allreduce: 83.74 | step: 18.44 15%|█▍ | 6031/41250 [14:34:34<84:34:05, 8.64s/it] {'loss': 0.4045, 'grad_norm': 3.781477689743042, 'learning_rate': 
3.8600404589677655e-05, 'epoch': 1.46} 15%|█▍ | 6031/41250 [14:34:34<84:34:05, 8.64s/it][2025-04-25 22:32:18,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:32:18,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.04 | bwd_microstep: 5721.78 | bwd_inner_microstep: 5709.23 | bwd_allreduce_microstep: 12.51 | step_microstep: 18.66 [2025-04-25 22:32:18,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.04 | bwd: 5721.80 | bwd_inner: 5709.23 | bwd_allreduce: 12.52 | step: 18.66 15%|█▍ | 6032/41250 [14:34:43<84:35:36, 8.65s/it] {'loss': 0.1428, 'grad_norm': 1.0697413682937622, 'learning_rate': 3.8599827424741497e-05, 'epoch': 1.46} 15%|█▍ | 6032/41250 [14:34:43<84:35:36, 8.65s/it][2025-04-25 22:32:26,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:32:26,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.12 | bwd_microstep: 5710.43 | bwd_inner_microstep: 5697.77 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.70 [2025-04-25 22:32:26,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.12 | bwd: 5710.44 | bwd_inner: 5697.77 | bwd_allreduce: 12.64 | step: 18.70 15%|█▍ | 6033/41250 [14:34:51<84:34:14, 8.65s/it] {'loss': 0.342, 'grad_norm': 1.3906996250152588, 'learning_rate': 3.8599250145141065e-05, 'epoch': 1.46} 15%|█▍ | 6033/41250 [14:34:51<84:34:14, 8.65s/it][2025-04-25 22:32:35,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:32:35,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.97 | bwd_microstep: 5724.01 | bwd_inner_microstep: 5646.30 | bwd_allreduce_microstep: 77.67 | step_microstep: 18.32 [2025-04-25 22:32:35,294] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.97 | bwd: 5724.03 | bwd_inner: 5646.30 | bwd_allreduce: 77.68 | step: 18.32
 15%|█▍ | 6034/41250 [14:35:00<84:33:34, 8.64s/it]
{'loss': 0.1929, 'grad_norm': 2.4597911834716797, 'learning_rate': 3.859867275087991e-05, 'epoch': 1.46}
 15%|█▍ | 6034/41250 [14:35:00<84:33:34, 8.64s/it]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x111d3b80] moov atom not found
[22:32:36] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Allvideos/ZeroScope/02039.mp4, Invalid data found when processing input
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
Error reading /home/wangjiarui/AIGV6K/Allvideos/ZeroScope/02039.mp4...
sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Allvideos/ZeroScope/02039.mp4...
[2025-04-25 22:32:43,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:32:43,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.30 | bwd_microstep: 5690.19 | bwd_inner_microstep: 5677.50 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.31 [2025-04-25 22:32:43,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.30 | bwd: 5690.20 | bwd_inner: 5677.50 | bwd_allreduce: 12.66 | step: 18.31 15%|█▍ | 6035/41250 [14:35:09<84:27:22, 8.63s/it] {'loss': 0.1523, 'grad_norm': 1.336107850074768, 'learning_rate': 3.85980952419616e-05, 'epoch': 1.46} 15%|█▍ | 6035/41250 [14:35:09<84:27:22, 8.63s/it][2025-04-25 22:32:52,568] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:32:52,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.42 | bwd_microstep: 5758.52 | bwd_inner_microstep: 5633.42 | bwd_allreduce_microstep: 125.06 | step_microstep: 18.38 [2025-04-25 22:32:52,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.42 | bwd: 5758.53 | bwd_inner: 5633.42 | bwd_allreduce: 125.07 | step: 18.39 15%|█▍ | 6036/41250 [14:35:17<84:32:47, 8.64s/it] {'loss': 0.0864, 'grad_norm': 1.113574504852295, 'learning_rate': 3.8597517618389696e-05, 'epoch': 1.46} 15%|█▍ | 6036/41250 [14:35:17<84:32:47, 8.64s/it][2025-04-25 22:33:01,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.95 | optimizer_step: 1.02 [2025-04-25 22:33:01,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.69 | bwd_microstep: 6017.83 | bwd_inner_microstep: 5648.42 | bwd_allreduce_microstep: 369.36 | step_microstep: 18.20 [2025-04-25 22:33:01,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.69 | bwd: 6017.84 | bwd_inner: 5648.42 | 
bwd_allreduce: 369.38 | step: 18.21 15%|█▍ | 6037/41250 [14:35:26<85:22:03, 8.73s/it] {'loss': 0.1654, 'grad_norm': 1.218349814414978, 'learning_rate': 3.859693988016775e-05, 'epoch': 1.46} 15%|█▍ | 6037/41250 [14:35:26<85:22:03, 8.73s/it][2025-04-25 22:33:10,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.94 | optimizer_step: 0.93 [2025-04-25 22:33:10,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.29 | bwd_microstep: 5750.74 | bwd_inner_microstep: 5678.89 | bwd_allreduce_microstep: 71.80 | step_microstep: 18.40 [2025-04-25 22:33:10,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.29 | bwd: 5750.75 | bwd_inner: 5678.89 | bwd_allreduce: 71.82 | step: 18.40 15%|█▍ | 6038/41250 [14:35:35<85:12:47, 8.71s/it] {'loss': 0.0382, 'grad_norm': 0.44539186358451843, 'learning_rate': 3.859636202729932e-05, 'epoch': 1.46} 15%|█▍ | 6038/41250 [14:35:35<85:12:47, 8.71s/it][2025-04-25 22:33:18,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:33:18,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.72 | bwd_microstep: 5746.02 | bwd_inner_microstep: 5642.96 | bwd_allreduce_microstep: 103.01 | step_microstep: 18.29 [2025-04-25 22:33:18,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.72 | bwd: 5746.03 | bwd_inner: 5642.96 | bwd_allreduce: 103.02 | step: 18.29 15%|█▍ | 6039/41250 [14:35:44<85:02:33, 8.69s/it] {'loss': 0.1243, 'grad_norm': 2.9664084911346436, 'learning_rate': 3.859578405978797e-05, 'epoch': 1.46} 15%|█▍ | 6039/41250 [14:35:44<85:02:33, 8.69s/it][2025-04-25 22:33:27,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.24 | optimizer_step: 0.99 [2025-04-25 22:33:27,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.36 | 
bwd_microstep: 5761.81 | bwd_inner_microstep: 5706.53 | bwd_allreduce_microstep: 55.22 | step_microstep: 19.33 [2025-04-25 22:33:27,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.36 | bwd: 5761.82 | bwd_inner: 5706.53 | bwd_allreduce: 55.24 | step: 19.33 15%|█▍ | 6040/41250 [14:35:52<85:02:43, 8.70s/it] {'loss': 0.2261, 'grad_norm': 3.1006715297698975, 'learning_rate': 3.859520597763728e-05, 'epoch': 1.46} 15%|█▍ | 6040/41250 [14:35:52<85:02:43, 8.70s/it][2025-04-25 22:33:36,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 22:33:36,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.31 | bwd_microstep: 5784.50 | bwd_inner_microstep: 5637.79 | bwd_allreduce_microstep: 146.66 | step_microstep: 18.77 [2025-04-25 22:33:36,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.31 | bwd: 5784.51 | bwd_inner: 5637.79 | bwd_allreduce: 146.68 | step: 18.78 15%|█▍ | 6041/41250 [14:36:01<85:01:38, 8.69s/it] {'loss': 0.1902, 'grad_norm': 2.7475080490112305, 'learning_rate': 3.8594627780850806e-05, 'epoch': 1.46} 15%|█▍ | 6041/41250 [14:36:01<85:01:38, 8.69s/it][2025-04-25 22:33:44,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:33:44,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.21 | bwd_microstep: 5737.45 | bwd_inner_microstep: 5706.24 | bwd_allreduce_microstep: 31.17 | step_microstep: 18.80 [2025-04-25 22:33:44,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.21 | bwd: 5737.46 | bwd_inner: 5706.24 | bwd_allreduce: 31.19 | step: 18.80 15%|█▍ | 6042/41250 [14:36:10<84:56:40, 8.69s/it] {'loss': 0.0812, 'grad_norm': 1.3141539096832275, 'learning_rate': 3.859404946943211e-05, 'epoch': 1.46} 15%|█▍ | 6042/41250 [14:36:10<84:56:40, 8.69s/it][2025-04-25 22:33:53,475] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:33:53,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.40 | bwd_microstep: 5693.59 | bwd_inner_microstep: 5656.03 | bwd_allreduce_microstep: 37.52 | step_microstep: 18.44 [2025-04-25 22:33:53,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.40 | bwd: 5693.61 | bwd_inner: 5656.03 | bwd_allreduce: 37.54 | step: 18.44 15%|█▍ | 6043/41250 [14:36:18<84:41:23, 8.66s/it] {'loss': 0.1987, 'grad_norm': 2.712811231613159, 'learning_rate': 3.859347104338475e-05, 'epoch': 1.46} 15%|█▍ | 6043/41250 [14:36:18<84:41:23, 8.66s/it][2025-04-25 22:34:02,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:34:02,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.40 | bwd_microstep: 5769.42 | bwd_inner_microstep: 5652.62 | bwd_allreduce_microstep: 116.76 | step_microstep: 18.51 [2025-04-25 22:34:02,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.40 | bwd: 5769.43 | bwd_inner: 5652.62 | bwd_allreduce: 116.77 | step: 18.51 15%|█▍ | 6044/41250 [14:36:27<84:44:33, 8.67s/it] {'loss': 0.2535, 'grad_norm': 1.9107818603515625, 'learning_rate': 3.8592892502712305e-05, 'epoch': 1.47} 15%|█▍ | 6044/41250 [14:36:27<84:44:33, 8.67s/it][2025-04-25 22:34:10,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:34:10,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.16 | bwd_microstep: 5728.44 | bwd_inner_microstep: 5707.54 | bwd_allreduce_microstep: 20.86 | step_microstep: 18.50 [2025-04-25 22:34:10,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.16 | bwd: 5728.45 | bwd_inner: 5707.54 | bwd_allreduce: 20.87 | step: 
18.50 15%|█▍ | 6045/41250 [14:36:36<84:44:08, 8.66s/it] {'loss': 0.0547, 'grad_norm': 0.8077385425567627, 'learning_rate': 3.859231384741833e-05, 'epoch': 1.47} 15%|█▍ | 6045/41250 [14:36:36<84:44:08, 8.66s/it][2025-04-25 22:34:19,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:34:19,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.28 | bwd_microstep: 5757.34 | bwd_inner_microstep: 5654.13 | bwd_allreduce_microstep: 103.16 | step_microstep: 18.49 [2025-04-25 22:34:19,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.28 | bwd: 5757.35 | bwd_inner: 5654.13 | bwd_allreduce: 103.18 | step: 18.49 15%|█▍ | 6046/41250 [14:36:44<84:45:46, 8.67s/it] {'loss': 0.2543, 'grad_norm': 3.6806631088256836, 'learning_rate': 3.8591735077506396e-05, 'epoch': 1.47} 15%|█▍ | 6046/41250 [14:36:44<84:45:46, 8.67s/it][2025-04-25 22:34:28,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:34:28,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.52 | bwd_microstep: 5753.65 | bwd_inner_microstep: 5700.42 | bwd_allreduce_microstep: 53.19 | step_microstep: 18.37 [2025-04-25 22:34:28,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.52 | bwd: 5753.66 | bwd_inner: 5700.42 | bwd_allreduce: 53.20 | step: 18.38 15%|█▍ | 6047/41250 [14:36:53<84:48:16, 8.67s/it] {'loss': 0.2885, 'grad_norm': 2.596372127532959, 'learning_rate': 3.859115619298007e-05, 'epoch': 1.47} 15%|█▍ | 6047/41250 [14:36:53<84:48:16, 8.67s/it][2025-04-25 22:34:36,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:34:36,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.04 | bwd_microstep: 5777.37 | 
bwd_inner_microstep: 5655.39 | bwd_allreduce_microstep: 121.95 | step_microstep: 18.30 [2025-04-25 22:34:36,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.04 | bwd: 5777.39 | bwd_inner: 5655.39 | bwd_allreduce: 121.96 | step: 18.30 15%|█▍ | 6048/41250 [14:37:02<84:49:49, 8.68s/it] {'loss': 0.1754, 'grad_norm': 1.7823270559310913, 'learning_rate': 3.859057719384292e-05, 'epoch': 1.47} 15%|█▍ | 6048/41250 [14:37:02<84:49:49, 8.68s/it][2025-04-25 22:34:45,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:34:45,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.61 | bwd_microstep: 5712.54 | bwd_inner_microstep: 5699.96 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.42 [2025-04-25 22:34:45,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.61 | bwd: 5712.56 | bwd_inner: 5699.96 | bwd_allreduce: 12.55 | step: 18.42 15%|█▍ | 6049/41250 [14:37:10<84:44:08, 8.67s/it] {'loss': 0.1769, 'grad_norm': 1.8995798826217651, 'learning_rate': 3.858999808009853e-05, 'epoch': 1.47} 15%|█▍ | 6049/41250 [14:37:10<84:44:08, 8.67s/it][2025-04-25 22:34:54,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:34:54,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.43 | bwd_microstep: 5718.82 | bwd_inner_microstep: 5705.94 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.53 [2025-04-25 22:34:54,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.43 | bwd: 5718.83 | bwd_inner: 5705.94 | bwd_allreduce: 12.86 | step: 18.53 15%|█▍ | 6050/41250 [14:37:19<84:41:14, 8.66s/it] {'loss': 0.0852, 'grad_norm': 1.1659653186798096, 'learning_rate': 3.858941885175045e-05, 'epoch': 1.47} 15%|█▍ | 6050/41250 [14:37:19<84:41:14, 8.66s/it][2025-04-25 22:35:02,846] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.95 | optimizer_step: 0.97 [2025-04-25 22:35:02,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.11 | bwd_microstep: 5784.78 | bwd_inner_microstep: 5644.57 | bwd_allreduce_microstep: 140.17 | step_microstep: 18.26 [2025-04-25 22:35:02,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.11 | bwd: 5784.79 | bwd_inner: 5644.57 | bwd_allreduce: 140.18 | step: 18.27 15%|█▍ | 6051/41250 [14:37:28<84:46:51, 8.67s/it] {'loss': 0.1277, 'grad_norm': 1.1411675214767456, 'learning_rate': 3.858883950880226e-05, 'epoch': 1.47} 15%|█▍ | 6051/41250 [14:37:28<84:46:51, 8.67s/it][2025-04-25 22:35:11,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.07 | optimizer_step: 1.00 [2025-04-25 22:35:11,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.69 | bwd_microstep: 5772.25 | bwd_inner_microstep: 5648.49 | bwd_allreduce_microstep: 123.71 | step_microstep: 19.35 [2025-04-25 22:35:11,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.69 | bwd: 5772.27 | bwd_inner: 5648.49 | bwd_allreduce: 123.73 | step: 19.35 15%|█▍ | 6052/41250 [14:37:36<84:49:55, 8.68s/it] {'loss': 0.2397, 'grad_norm': 2.151775598526001, 'learning_rate': 3.858826005125753e-05, 'epoch': 1.47} 15%|█▍ | 6052/41250 [14:37:36<84:49:55, 8.68s/it][2025-04-25 22:35:20,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:35:20,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2905.68 | bwd_microstep: 5798.22 | bwd_inner_microstep: 5785.49 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.45 [2025-04-25 22:35:20,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2905.68 | bwd: 5798.23 | bwd_inner: 5785.49 | bwd_allreduce: 12.71 | step: 18.45 
15%|█▍ | 6053/41250 [14:37:45<85:10:16, 8.71s/it] {'loss': 0.1041, 'grad_norm': 1.1700177192687988, 'learning_rate': 3.858768047911984e-05, 'epoch': 1.47} 15%|█▍ | 6053/41250 [14:37:45<85:10:16, 8.71s/it][2025-04-25 22:35:28,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 1.03 [2025-04-25 22:35:28,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.12 | bwd_microstep: 5725.10 | bwd_inner_microstep: 5712.39 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.49 [2025-04-25 22:35:28,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.12 | bwd: 5725.11 | bwd_inner: 5712.39 | bwd_allreduce: 12.68 | step: 18.50 15%|█▍ | 6054/41250 [14:37:54<85:02:06, 8.70s/it] {'loss': 0.1303, 'grad_norm': 1.2000117301940918, 'learning_rate': 3.8587100792392744e-05, 'epoch': 1.47} 15%|█▍ | 6054/41250 [14:37:54<85:02:06, 8.70s/it][2025-04-25 22:35:37,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:35:37,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.86 | bwd_microstep: 5774.54 | bwd_inner_microstep: 5712.08 | bwd_allreduce_microstep: 62.41 | step_microstep: 18.95 [2025-04-25 22:35:37,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.86 | bwd: 5774.55 | bwd_inner: 5712.08 | bwd_allreduce: 62.43 | step: 18.95 15%|█▍ | 6055/41250 [14:38:03<85:05:09, 8.70s/it] {'loss': 0.2119, 'grad_norm': 1.4694048166275024, 'learning_rate': 3.858652099107984e-05, 'epoch': 1.47} 15%|█▍ | 6055/41250 [14:38:03<85:05:09, 8.70s/it][2025-04-25 22:35:46,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 22:35:46,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.24 | bwd_microstep: 5760.24 | bwd_inner_microstep: 
5694.14 | bwd_allreduce_microstep: 66.05 | step_microstep: 19.28 [2025-04-25 22:35:46,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.24 | bwd: 5760.26 | bwd_inner: 5694.14 | bwd_allreduce: 66.07 | step: 19.28 15%|█▍ | 6056/41250 [14:38:11<85:03:46, 8.70s/it] {'loss': 0.2578, 'grad_norm': 2.9190850257873535, 'learning_rate': 3.8585941075184675e-05, 'epoch': 1.47} 15%|█▍ | 6056/41250 [14:38:11<85:03:46, 8.70s/it][2025-04-25 22:35:55,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:35:55,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.08 | bwd_microstep: 5718.51 | bwd_inner_microstep: 5690.02 | bwd_allreduce_microstep: 28.44 | step_microstep: 19.02 [2025-04-25 22:35:55,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.08 | bwd: 5718.52 | bwd_inner: 5690.02 | bwd_allreduce: 28.46 | step: 19.02 15%|█▍ | 6057/41250 [14:38:20<84:55:14, 8.69s/it] {'loss': 0.0708, 'grad_norm': 0.7106594443321228, 'learning_rate': 3.8585361044710856e-05, 'epoch': 1.47} 15%|█▍ | 6057/41250 [14:38:20<84:55:14, 8.69s/it][2025-04-25 22:36:03,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:36:03,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.65 | bwd_microstep: 5720.39 | bwd_inner_microstep: 5671.41 | bwd_allreduce_microstep: 48.93 | step_microstep: 19.23 [2025-04-25 22:36:03,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.65 | bwd: 5720.40 | bwd_inner: 5671.41 | bwd_allreduce: 48.95 | step: 19.23 15%|█▍ | 6058/41250 [14:38:29<84:46:41, 8.67s/it] {'loss': 0.1476, 'grad_norm': 0.9464519619941711, 'learning_rate': 3.858478089966193e-05, 'epoch': 1.47} 15%|█▍ | 6058/41250 [14:38:29<84:46:41, 8.67s/it][2025-04-25 22:36:12,393] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:36:12,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.12 | bwd_microstep: 5780.46 | bwd_inner_microstep: 5652.72 | bwd_allreduce_microstep: 127.69 | step_microstep: 18.55 [2025-04-25 22:36:12,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.12 | bwd: 5780.48 | bwd_inner: 5652.72 | bwd_allreduce: 127.71 | step: 18.56 15%|█▍ | 6059/41250 [14:38:37<84:50:40, 8.68s/it] {'loss': 0.2547, 'grad_norm': 3.0924010276794434, 'learning_rate': 3.858420064004149e-05, 'epoch': 1.47} 15%|█▍ | 6059/41250 [14:38:37<84:50:40, 8.68s/it][2025-04-25 22:36:21,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:36:21,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.23 | bwd_microstep: 5772.10 | bwd_inner_microstep: 5699.12 | bwd_allreduce_microstep: 72.93 | step_microstep: 18.70 [2025-04-25 22:36:21,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.23 | bwd: 5772.11 | bwd_inner: 5699.12 | bwd_allreduce: 72.95 | step: 18.71 15%|█▍ | 6060/41250 [14:38:46<84:54:09, 8.69s/it] {'loss': 0.2941, 'grad_norm': 2.818307876586914, 'learning_rate': 3.8583620265853104e-05, 'epoch': 1.47} 15%|█▍ | 6060/41250 [14:38:46<84:54:09, 8.69s/it][2025-04-25 22:36:29,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:36:29,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.89 | bwd_microstep: 5784.01 | bwd_inner_microstep: 5659.09 | bwd_allreduce_microstep: 124.88 | step_microstep: 18.44 [2025-04-25 22:36:29,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.89 | bwd: 5784.03 | bwd_inner: 5659.09 | bwd_allreduce: 124.90 | step: 18.44 15%|█▍ | 6061/41250 [14:38:55<84:55:13, 
8.69s/it] {'loss': 0.1607, 'grad_norm': 3.5408012866973877, 'learning_rate': 3.8583039777100355e-05, 'epoch': 1.47} 15%|█▍ | 6061/41250 [14:38:55<84:55:13, 8.69s/it][2025-04-25 22:36:38,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 22:36:38,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.45 | bwd_microstep: 5721.79 | bwd_inner_microstep: 5660.24 | bwd_allreduce_microstep: 61.51 | step_microstep: 18.53 [2025-04-25 22:36:38,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.45 | bwd: 5721.81 | bwd_inner: 5660.24 | bwd_allreduce: 61.53 | step: 18.53 15%|█▍ | 6062/41250 [14:39:03<84:45:55, 8.67s/it] {'loss': 0.3557, 'grad_norm': 3.9034135341644287, 'learning_rate': 3.858245917378683e-05, 'epoch': 1.47} 15%|█▍ | 6062/41250 [14:39:03<84:45:55, 8.67s/it][2025-04-25 22:36:47,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.34 | optimizer_step: 1.05 [2025-04-25 22:36:47,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.41 | bwd_microstep: 5755.87 | bwd_inner_microstep: 5712.07 | bwd_allreduce_microstep: 43.73 | step_microstep: 20.32 [2025-04-25 22:36:47,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.41 | bwd: 5755.88 | bwd_inner: 5712.07 | bwd_allreduce: 43.76 | step: 20.32 15%|█▍ | 6063/41250 [14:39:12<84:50:58, 8.68s/it] {'loss': 0.0898, 'grad_norm': 1.1424237489700317, 'learning_rate': 3.8581878455916084e-05, 'epoch': 1.47} 15%|█▍ | 6063/41250 [14:39:12<84:50:58, 8.68s/it][2025-04-25 22:36:55,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 22:36:55,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.48 | bwd_microstep: 5721.64 | bwd_inner_microstep: 5657.15 | bwd_allreduce_microstep: 
64.44 | step_microstep: 18.93 [2025-04-25 22:36:55,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.48 | bwd: 5721.66 | bwd_inner: 5657.15 | bwd_allreduce: 64.46 | step: 18.94 15%|█▍ | 6064/41250 [14:39:21<84:42:00, 8.67s/it] {'loss': 0.128, 'grad_norm': 1.6786420345306396, 'learning_rate': 3.8581297623491724e-05, 'epoch': 1.47} 15%|█▍ | 6064/41250 [14:39:21<84:42:00, 8.67s/it][2025-04-25 22:37:04,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-25 22:37:04,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.63 | bwd_microstep: 5723.34 | bwd_inner_microstep: 5710.11 | bwd_allreduce_microstep: 13.18 | step_microstep: 19.37 [2025-04-25 22:37:04,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.63 | bwd: 5723.35 | bwd_inner: 5710.11 | bwd_allreduce: 13.20 | step: 19.37 15%|█▍ | 6065/41250 [14:39:29<84:40:32, 8.66s/it] {'loss': 0.0617, 'grad_norm': 1.8239099979400635, 'learning_rate': 3.858071667651731e-05, 'epoch': 1.47} 15%|█▍ | 6065/41250 [14:39:29<84:40:32, 8.66s/it][2025-04-25 22:37:13,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.04 | optimizer_step: 1.08 [2025-04-25 22:37:13,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.51 | bwd_microstep: 5747.13 | bwd_inner_microstep: 5686.79 | bwd_allreduce_microstep: 60.30 | step_microstep: 18.70 [2025-04-25 22:37:13,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.51 | bwd: 5747.15 | bwd_inner: 5686.79 | bwd_allreduce: 60.31 | step: 18.70 15%|█▍ | 6066/41250 [14:39:38<84:42:16, 8.67s/it] {'loss': 0.1621, 'grad_norm': 1.8253917694091797, 'learning_rate': 3.858013561499644e-05, 'epoch': 1.47} 15%|█▍ | 6066/41250 [14:39:38<84:42:16, 8.67s/it][2025-04-25 22:37:21,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | 
optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:37:21,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.87 | bwd_microstep: 5770.37 | bwd_inner_microstep: 5658.90 | bwd_allreduce_microstep: 111.43 | step_microstep: 18.62 [2025-04-25 22:37:21,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.87 | bwd: 5770.39 | bwd_inner: 5658.90 | bwd_allreduce: 111.44 | step: 18.62 15%|█▍ | 6067/41250 [14:39:47<84:45:16, 8.67s/it] {'loss': 0.0482, 'grad_norm': 0.9281389713287354, 'learning_rate': 3.857955443893269e-05, 'epoch': 1.47} 15%|█▍ | 6067/41250 [14:39:47<84:45:16, 8.67s/it][2025-04-25 22:37:30,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 1.09 [2025-04-25 22:37:30,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.99 | bwd_microstep: 5720.71 | bwd_inner_microstep: 5658.36 | bwd_allreduce_microstep: 62.30 | step_microstep: 18.74 [2025-04-25 22:37:30,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.99 | bwd: 5720.73 | bwd_inner: 5658.36 | bwd_allreduce: 62.32 | step: 18.74 15%|█▍ | 6068/41250 [14:39:55<84:38:00, 8.66s/it] {'loss': 0.0365, 'grad_norm': 0.5670573711395264, 'learning_rate': 3.8578973148329635e-05, 'epoch': 1.47} 15%|█▍ | 6068/41250 [14:39:55<84:38:00, 8.66s/it][2025-04-25 22:37:39,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:37:39,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.96 | bwd_microstep: 5793.54 | bwd_inner_microstep: 5660.38 | bwd_allreduce_microstep: 133.11 | step_microstep: 18.68 [2025-04-25 22:37:39,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.96 | bwd: 5793.56 | bwd_inner: 5660.38 | bwd_allreduce: 133.13 | step: 18.68 15%|█▍ | 6069/41250 [14:40:04<84:45:50, 8.67s/it] {'loss': 0.1937, 
'grad_norm': 1.8588240146636963, 'learning_rate': 3.857839174319087e-05, 'epoch': 1.47} 15%|█▍ | 6069/41250 [14:40:04<84:45:50, 8.67s/it][2025-04-25 22:37:47,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.05 | optimizer_step: 1.11 [2025-04-25 22:37:47,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.31 | bwd_microstep: 5776.95 | bwd_inner_microstep: 5661.55 | bwd_allreduce_microstep: 115.35 | step_microstep: 19.29 [2025-04-25 22:37:47,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.31 | bwd: 5776.96 | bwd_inner: 5661.55 | bwd_allreduce: 115.37 | step: 19.29 15%|█▍ | 6070/41250 [14:40:13<84:48:27, 8.68s/it] {'loss': 0.1793, 'grad_norm': 1.4817287921905518, 'learning_rate': 3.857781022351997e-05, 'epoch': 1.47} 15%|█▍ | 6070/41250 [14:40:13<84:48:27, 8.68s/it][2025-04-25 22:37:56,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.28 | optimizer_step: 0.94 [2025-04-25 22:37:56,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.88 | bwd_microstep: 5758.06 | bwd_inner_microstep: 5658.44 | bwd_allreduce_microstep: 99.57 | step_microstep: 19.45 [2025-04-25 22:37:56,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.88 | bwd: 5758.07 | bwd_inner: 5658.43 | bwd_allreduce: 99.59 | step: 19.45 15%|█▍ | 6071/41250 [14:40:21<84:47:20, 8.68s/it] {'loss': 0.1491, 'grad_norm': 1.4356129169464111, 'learning_rate': 3.857722858932052e-05, 'epoch': 1.47} 15%|█▍ | 6071/41250 [14:40:21<84:47:20, 8.68s/it][2025-04-25 22:38:05,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.18 | optimizer_step: 0.91 [2025-04-25 22:38:05,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.43 | bwd_microstep: 5778.81 | bwd_inner_microstep: 5645.84 | bwd_allreduce_microstep: 132.93 | step_microstep: 18.76 
[2025-04-25 22:38:05,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.43 | bwd: 5778.83 | bwd_inner: 5645.84 | bwd_allreduce: 132.95 | step: 18.77 15%|█▍ | 6072/41250 [14:40:30<84:48:43, 8.68s/it] {'loss': 0.1335, 'grad_norm': 2.444269895553589, 'learning_rate': 3.857664684059612e-05, 'epoch': 1.47} 15%|█▍ | 6072/41250 [14:40:30<84:48:43, 8.68s/it][2025-04-25 22:38:13,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.12 | optimizer_step: 0.91 [2025-04-25 22:38:13,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.79 | bwd_microstep: 5729.94 | bwd_inner_microstep: 5699.29 | bwd_allreduce_microstep: 30.61 | step_microstep: 18.80 [2025-04-25 22:38:13,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.79 | bwd: 5729.96 | bwd_inner: 5699.29 | bwd_allreduce: 30.63 | step: 18.80 15%|█▍ | 6073/41250 [14:40:39<84:46:20, 8.68s/it] {'loss': 0.1738, 'grad_norm': 3.2738027572631836, 'learning_rate': 3.8576064977350335e-05, 'epoch': 1.47} 15%|█▍ | 6073/41250 [14:40:39<84:46:20, 8.68s/it][2025-04-25 22:38:22,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:38:22,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.99 | bwd_microstep: 5715.50 | bwd_inner_microstep: 5703.15 | bwd_allreduce_microstep: 12.31 | step_microstep: 18.77 [2025-04-25 22:38:22,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.99 | bwd: 5715.52 | bwd_inner: 5703.15 | bwd_allreduce: 12.33 | step: 18.78 15%|█▍ | 6074/41250 [14:40:47<84:43:24, 8.67s/it] {'loss': 0.193, 'grad_norm': 2.668656349182129, 'learning_rate': 3.8575482999586774e-05, 'epoch': 1.47} 15%|█▍ | 6074/41250 [14:40:47<84:43:24, 8.67s/it][2025-04-25 22:38:31,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | 
optimizer_step: 0.90 [2025-04-25 22:38:31,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.54 | bwd_microstep: 5695.01 | bwd_inner_microstep: 5637.06 | bwd_allreduce_microstep: 57.90 | step_microstep: 18.76 [2025-04-25 22:38:31,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.54 | bwd: 5695.02 | bwd_inner: 5637.06 | bwd_allreduce: 57.92 | step: 18.76 15%|█▍ | 6075/41250 [14:40:56<84:31:37, 8.65s/it] {'loss': 0.0554, 'grad_norm': 0.7869809865951538, 'learning_rate': 3.857490090730901e-05, 'epoch': 1.47} 15%|█▍ | 6075/41250 [14:40:56<84:31:37, 8.65s/it][2025-04-25 22:38:39,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 22:38:39,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.17 | bwd_microstep: 5751.72 | bwd_inner_microstep: 5650.42 | bwd_allreduce_microstep: 101.25 | step_microstep: 19.19 [2025-04-25 22:38:39,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.17 | bwd: 5751.74 | bwd_inner: 5650.42 | bwd_allreduce: 101.27 | step: 19.19 15%|█▍ | 6076/41250 [14:41:05<84:32:20, 8.65s/it] {'loss': 0.0821, 'grad_norm': 0.7256941199302673, 'learning_rate': 3.857431870052063e-05, 'epoch': 1.47} 15%|█▍ | 6076/41250 [14:41:05<84:32:20, 8.65s/it][2025-04-25 22:38:48,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-25 22:38:48,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2896.23 | bwd_microstep: 5778.90 | bwd_inner_microstep: 5765.61 | bwd_allreduce_microstep: 13.25 | step_microstep: 18.97 [2025-04-25 22:38:48,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2896.23 | bwd: 5778.92 | bwd_inner: 5765.60 | bwd_allreduce: 13.27 | step: 18.97 15%|█▍ | 6077/41250 [14:41:13<84:51:10, 8.68s/it] {'loss': 0.1527, 'grad_norm': 1.6713612079620361, 
'learning_rate': 3.8573736379225234e-05, 'epoch': 1.47} 15%|█▍ | 6077/41250 [14:41:13<84:51:10, 8.68s/it][2025-04-25 22:38:57,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:38:57,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.52 | bwd_microstep: 5694.98 | bwd_inner_microstep: 5649.30 | bwd_allreduce_microstep: 45.64 | step_microstep: 18.51 [2025-04-25 22:38:57,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.52 | bwd: 5695.00 | bwd_inner: 5649.30 | bwd_allreduce: 45.65 | step: 18.51 15%|█▍ | 6078/41250 [14:41:22<84:37:36, 8.66s/it] {'loss': 0.2342, 'grad_norm': 1.2728734016418457, 'learning_rate': 3.8573153943426405e-05, 'epoch': 1.47} 15%|█▍ | 6078/41250 [14:41:22<84:37:36, 8.66s/it][2025-04-25 22:39:05,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:39:05,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.57 | bwd_microstep: 5732.86 | bwd_inner_microstep: 5702.53 | bwd_allreduce_microstep: 30.29 | step_microstep: 18.43 [2025-04-25 22:39:05,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.57 | bwd: 5732.88 | bwd_inner: 5702.53 | bwd_allreduce: 30.31 | step: 18.43 15%|█▍ | 6079/41250 [14:41:31<84:39:07, 8.66s/it] {'loss': 0.16, 'grad_norm': 2.0672128200531006, 'learning_rate': 3.8572571393127735e-05, 'epoch': 1.47} 15%|█▍ | 6079/41250 [14:41:31<84:39:07, 8.66s/it][2025-04-25 22:39:14,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:39:14,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.03 | bwd_microstep: 5675.21 | bwd_inner_microstep: 5643.76 | bwd_allreduce_microstep: 31.41 | step_microstep: 18.59 [2025-04-25 22:39:14,382] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.03 | bwd: 5675.22 | bwd_inner: 5643.76 | bwd_allreduce: 31.42 | step: 18.59 15%|█▍ | 6080/41250 [14:41:39<84:27:01, 8.64s/it] {'loss': 0.2636, 'grad_norm': 2.1917002201080322, 'learning_rate': 3.8571988728332816e-05, 'epoch': 1.47} 15%|█▍ | 6080/41250 [14:41:39<84:27:01, 8.64s/it][2025-04-25 22:39:23,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:39:23,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.15 | bwd_microstep: 5696.07 | bwd_inner_microstep: 5683.28 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.68 [2025-04-25 22:39:23,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.15 | bwd: 5696.08 | bwd_inner: 5683.28 | bwd_allreduce: 12.76 | step: 18.68 15%|█▍ | 6081/41250 [14:41:48<84:24:07, 8.64s/it] {'loss': 0.201, 'grad_norm': 3.958906650543213, 'learning_rate': 3.857140594904525e-05, 'epoch': 1.47} 15%|█▍ | 6081/41250 [14:41:48<84:24:07, 8.64s/it][2025-04-25 22:39:31,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 22:39:31,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.56 | bwd_microstep: 5689.98 | bwd_inner_microstep: 5643.94 | bwd_allreduce_microstep: 46.00 | step_microstep: 19.02 [2025-04-25 22:39:31,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.56 | bwd: 5690.00 | bwd_inner: 5643.94 | bwd_allreduce: 46.01 | step: 19.02 15%|█▍ | 6082/41250 [14:41:56<84:16:36, 8.63s/it] {'loss': 0.1709, 'grad_norm': 1.649725079536438, 'learning_rate': 3.8570823055268605e-05, 'epoch': 1.47} 15%|█▍ | 6082/41250 [14:41:56<84:16:36, 8.63s/it][2025-04-25 22:39:40,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 
22:39:40,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.27 | bwd_microstep: 5688.23 | bwd_inner_microstep: 5648.63 | bwd_allreduce_microstep: 39.56 | step_microstep: 18.62 [2025-04-25 22:39:40,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.27 | bwd: 5688.24 | bwd_inner: 5648.63 | bwd_allreduce: 39.57 | step: 18.62 15%|█▍ | 6083/41250 [14:42:05<84:12:22, 8.62s/it] {'loss': 0.1508, 'grad_norm': 2.7458348274230957, 'learning_rate': 3.857024004700649e-05, 'epoch': 1.47} 15%|█▍ | 6083/41250 [14:42:05<84:12:22, 8.62s/it][2025-04-25 22:39:48,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:39:48,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.01 | bwd_microstep: 5767.70 | bwd_inner_microstep: 5633.02 | bwd_allreduce_microstep: 134.64 | step_microstep: 18.43 [2025-04-25 22:39:48,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.01 | bwd: 5767.71 | bwd_inner: 5633.02 | bwd_allreduce: 134.65 | step: 18.43 15%|█▍ | 6084/41250 [14:42:14<84:23:07, 8.64s/it] {'loss': 0.0374, 'grad_norm': 1.0035048723220825, 'learning_rate': 3.8569656924262495e-05, 'epoch': 1.47} 15%|█▍ | 6084/41250 [14:42:14<84:23:07, 8.64s/it][2025-04-25 22:39:57,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.05 [2025-04-25 22:39:57,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.62 | bwd_microstep: 5727.29 | bwd_inner_microstep: 5683.32 | bwd_allreduce_microstep: 43.93 | step_microstep: 18.50 [2025-04-25 22:39:57,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.62 | bwd: 5727.31 | bwd_inner: 5683.32 | bwd_allreduce: 43.95 | step: 18.50 15%|█▍ | 6085/41250 [14:42:22<84:27:47, 8.65s/it] {'loss': 0.1282, 'grad_norm': 3.2309632301330566, 'learning_rate': 3.8569073687040224e-05, 
'epoch': 1.48} 15%|█▍ | 6085/41250 [14:42:22<84:27:47, 8.65s/it][2025-04-25 22:40:06,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:40:06,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.51 | bwd_microstep: 5721.28 | bwd_inner_microstep: 5699.69 | bwd_allreduce_microstep: 21.54 | step_microstep: 18.45 [2025-04-25 22:40:06,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.51 | bwd: 5721.29 | bwd_inner: 5699.69 | bwd_allreduce: 21.56 | step: 18.45 15%|█▍ | 6086/41250 [14:42:31<84:28:30, 8.65s/it] {'loss': 0.1492, 'grad_norm': 2.697903871536255, 'learning_rate': 3.8568490335343266e-05, 'epoch': 1.48} 15%|█▍ | 6086/41250 [14:42:31<84:28:30, 8.65s/it][2025-04-25 22:40:14,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:40:14,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.66 | bwd_microstep: 5844.05 | bwd_inner_microstep: 5644.69 | bwd_allreduce_microstep: 199.32 | step_microstep: 18.47 [2025-04-25 22:40:14,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.66 | bwd: 5844.07 | bwd_inner: 5644.69 | bwd_allreduce: 199.34 | step: 18.48 15%|█▍ | 6087/41250 [14:42:40<84:47:48, 8.68s/it] {'loss': 0.0581, 'grad_norm': 1.035782814025879, 'learning_rate': 3.856790686917522e-05, 'epoch': 1.48} 15%|█▍ | 6087/41250 [14:42:40<84:47:48, 8.68s/it][2025-04-25 22:40:23,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:40:23,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.62 | bwd_microstep: 5718.73 | bwd_inner_microstep: 5683.32 | bwd_allreduce_microstep: 35.36 | step_microstep: 18.43 [2025-04-25 22:40:23,623] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2849.62 | bwd: 5718.74 | bwd_inner: 5683.32 | bwd_allreduce: 35.37 | step: 18.43 15%|█▍ | 6088/41250 [14:42:48<84:42:37, 8.67s/it] {'loss': 0.1835, 'grad_norm': 2.374643564224243, 'learning_rate': 3.856732328853967e-05, 'epoch': 1.48} 15%|█▍ | 6088/41250 [14:42:48<84:42:37, 8.67s/it][2025-04-25 22:40:32,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-25 22:40:32,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.32 | bwd_microstep: 5684.77 | bwd_inner_microstep: 5657.82 | bwd_allreduce_microstep: 26.90 | step_microstep: 19.61 [2025-04-25 22:40:32,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.32 | bwd: 5684.78 | bwd_inner: 5657.82 | bwd_allreduce: 26.92 | step: 19.62 15%|█▍ | 6089/41250 [14:42:57<84:32:16, 8.66s/it] {'loss': 0.1522, 'grad_norm': 3.0391266345977783, 'learning_rate': 3.856673959344023e-05, 'epoch': 1.48} 15%|█▍ | 6089/41250 [14:42:57<84:32:16, 8.66s/it][2025-04-25 22:40:40,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 22:40:40,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.79 | bwd_microstep: 5767.73 | bwd_inner_microstep: 5650.25 | bwd_allreduce_microstep: 117.42 | step_microstep: 18.99 [2025-04-25 22:40:40,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.79 | bwd: 5767.74 | bwd_inner: 5650.25 | bwd_allreduce: 117.44 | step: 18.99 15%|█▍ | 6090/41250 [14:43:06<84:36:22, 8.66s/it] {'loss': 0.1691, 'grad_norm': 3.3485734462738037, 'learning_rate': 3.856615578388049e-05, 'epoch': 1.48} 15%|█▍ | 6090/41250 [14:43:06<84:36:22, 8.66s/it][2025-04-25 22:40:49,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:40:49,594] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2826.14 | bwd_microstep: 5767.48 | bwd_inner_microstep: 5662.74 | bwd_allreduce_microstep: 104.69 | step_microstep: 18.81 [2025-04-25 22:40:49,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.14 | bwd: 5767.49 | bwd_inner: 5662.74 | bwd_allreduce: 104.71 | step: 18.81 15%|█▍ | 6091/41250 [14:43:14<84:38:32, 8.67s/it] {'loss': 0.145, 'grad_norm': 1.572174310684204, 'learning_rate': 3.8565571859864056e-05, 'epoch': 1.48} 15%|█▍ | 6091/41250 [14:43:14<84:38:32, 8.67s/it][2025-04-25 22:40:58,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:40:58,553] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.84 | bwd_microstep: 6046.41 | bwd_inner_microstep: 5649.51 | bwd_allreduce_microstep: 396.85 | step_microstep: 18.53 [2025-04-25 22:40:58,553] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.84 | bwd: 6046.42 | bwd_inner: 5649.51 | bwd_allreduce: 396.86 | step: 18.53 15%|█▍ | 6092/41250 [14:43:23<85:29:33, 8.75s/it] {'loss': 0.1116, 'grad_norm': 2.338496208190918, 'learning_rate': 3.856498782139451e-05, 'epoch': 1.48} 15%|█▍ | 6092/41250 [14:43:23<85:29:33, 8.75s/it][2025-04-25 22:41:07,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:41:07,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.95 | bwd_microstep: 5705.03 | bwd_inner_microstep: 5692.35 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.78 [2025-04-25 22:41:07,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.95 | bwd: 5705.05 | bwd_inner: 5692.35 | bwd_allreduce: 12.65 | step: 18.79 15%|█▍ | 6093/41250 [14:43:32<85:09:28, 8.72s/it] {'loss': 0.0723, 'grad_norm': 1.5163829326629639, 'learning_rate': 3.856440366847548e-05, 'epoch': 1.48} 15%|█▍ | 6093/41250 
[14:43:32<85:09:28, 8.72s/it][2025-04-25 22:41:15,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.96 | optimizer_step: 1.03 [2025-04-25 22:41:15,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.05 | bwd_microstep: 5759.75 | bwd_inner_microstep: 5701.10 | bwd_allreduce_microstep: 58.60 | step_microstep: 18.22 [2025-04-25 22:41:15,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.05 | bwd: 5759.76 | bwd_inner: 5701.10 | bwd_allreduce: 58.62 | step: 18.22 15%|█▍ | 6094/41250 [14:43:41<85:04:34, 8.71s/it] {'loss': 0.1879, 'grad_norm': 3.0695433616638184, 'learning_rate': 3.856381940111055e-05, 'epoch': 1.48} 15%|█▍ | 6094/41250 [14:43:41<85:04:34, 8.71s/it][2025-04-25 22:41:24,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:41:24,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.90 | bwd_microstep: 5883.84 | bwd_inner_microstep: 5715.89 | bwd_allreduce_microstep: 167.91 | step_microstep: 18.44 [2025-04-25 22:41:24,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.90 | bwd: 5883.85 | bwd_inner: 5715.89 | bwd_allreduce: 167.92 | step: 18.44 15%|█▍ | 6095/41250 [14:43:50<85:24:09, 8.75s/it] {'loss': 0.0202, 'grad_norm': 0.48207953572273254, 'learning_rate': 3.856323501930333e-05, 'epoch': 1.48} 15%|█▍ | 6095/41250 [14:43:50<85:24:09, 8.75s/it][2025-04-25 22:41:33,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:41:33,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.86 | bwd_microstep: 5770.02 | bwd_inner_microstep: 5699.12 | bwd_allreduce_microstep: 70.86 | step_microstep: 18.38 [2025-04-25 22:41:33,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.86 | bwd: 5770.04 
| bwd_inner: 5699.12 | bwd_allreduce: 70.87 | step: 18.38 15%|█▍ | 6096/41250 [14:43:58<85:16:12, 8.73s/it] {'loss': 0.1408, 'grad_norm': 1.3682284355163574, 'learning_rate': 3.856265052305741e-05, 'epoch': 1.48} 15%|█▍ | 6096/41250 [14:43:58<85:16:12, 8.73s/it][2025-04-25 22:41:42,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:41:42,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.32 | bwd_microstep: 5791.61 | bwd_inner_microstep: 5665.66 | bwd_allreduce_microstep: 125.90 | step_microstep: 18.45 [2025-04-25 22:41:42,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.32 | bwd: 5791.62 | bwd_inner: 5665.66 | bwd_allreduce: 125.92 | step: 18.45 15%|█▍ | 6097/41250 [14:44:07<85:11:23, 8.72s/it] {'loss': 0.038, 'grad_norm': 0.5344656109809875, 'learning_rate': 3.85620659123764e-05, 'epoch': 1.48} 15%|█▍ | 6097/41250 [14:44:07<85:11:23, 8.72s/it][2025-04-25 22:41:50,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-25 22:41:50,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.92 | bwd_microstep: 5702.04 | bwd_inner_microstep: 5659.54 | bwd_allreduce_microstep: 42.47 | step_microstep: 18.79 [2025-04-25 22:41:50,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.92 | bwd: 5702.06 | bwd_inner: 5659.54 | bwd_allreduce: 42.48 | step: 18.78 15%|█▍ | 6098/41250 [14:44:16<84:52:41, 8.69s/it] {'loss': 0.3558, 'grad_norm': 1.6231334209442139, 'learning_rate': 3.85614811872639e-05, 'epoch': 1.48} 15%|█▍ | 6098/41250 [14:44:16<84:52:41, 8.69s/it][2025-04-25 22:41:59,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.23 | optimizer_step: 0.92 [2025-04-25 22:41:59,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2857.57 | bwd_microstep: 5754.65 | bwd_inner_microstep: 5694.78 | bwd_allreduce_microstep: 59.83 | step_microstep: 19.45 [2025-04-25 22:41:59,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.57 | bwd: 5754.67 | bwd_inner: 5694.78 | bwd_allreduce: 59.85 | step: 19.45 15%|█▍ | 6099/41250 [14:44:24<84:53:11, 8.69s/it] {'loss': 0.384, 'grad_norm': 3.0569567680358887, 'learning_rate': 3.856089634772353e-05, 'epoch': 1.48} 15%|█▍ | 6099/41250 [14:44:24<84:53:11, 8.69s/it][2025-04-25 22:42:08,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:42:08,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.13 | bwd_microstep: 5763.85 | bwd_inner_microstep: 5691.10 | bwd_allreduce_microstep: 72.71 | step_microstep: 18.29 [2025-04-25 22:42:08,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.13 | bwd: 5763.86 | bwd_inner: 5691.10 | bwd_allreduce: 72.72 | step: 18.29 15%|█▍ | 6100/41250 [14:44:33<84:52:44, 8.69s/it] {'loss': 0.2871, 'grad_norm': 2.9677653312683105, 'learning_rate': 3.856031139375888e-05, 'epoch': 1.48} 15%|█▍ | 6100/41250 [14:44:33<84:52:44, 8.69s/it][2025-04-25 22:42:16,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.21 | optimizer_step: 0.94 [2025-04-25 22:42:16,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.35 | bwd_microstep: 5718.22 | bwd_inner_microstep: 5705.59 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.76 [2025-04-25 22:42:16,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.35 | bwd: 5718.24 | bwd_inner: 5705.59 | bwd_allreduce: 12.61 | step: 18.77 15%|█▍ | 6101/41250 [14:44:42<84:46:54, 8.68s/it] {'loss': 0.1104, 'grad_norm': 1.8857102394104004, 'learning_rate': 3.855972632537356e-05, 'epoch': 1.48} 15%|█▍ | 6101/41250 [14:44:42<84:46:54, 8.68s/it][2025-04-25 
22:42:25,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.15 | optimizer_step: 1.07 [2025-04-25 22:42:25,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2944.25 | bwd_microstep: 5890.13 | bwd_inner_microstep: 5876.16 | bwd_allreduce_microstep: 13.91 | step_microstep: 20.43 [2025-04-25 22:42:25,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2944.25 | bwd: 5890.14 | bwd_inner: 5876.16 | bwd_allreduce: 13.94 | step: 20.43 15%|█▍ | 6102/41250 [14:44:51<85:28:21, 8.75s/it] {'loss': 0.2015, 'grad_norm': 5.8595075607299805, 'learning_rate': 3.855914114257118e-05, 'epoch': 1.48} 15%|█▍ | 6102/41250 [14:44:51<85:28:21, 8.75s/it][2025-04-25 22:42:34,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:42:34,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.19 | bwd_microstep: 5756.95 | bwd_inner_microstep: 5722.10 | bwd_allreduce_microstep: 34.80 | step_microstep: 18.92 [2025-04-25 22:42:34,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.19 | bwd: 5756.97 | bwd_inner: 5722.10 | bwd_allreduce: 34.82 | step: 18.92 15%|█▍ | 6103/41250 [14:44:59<85:18:03, 8.74s/it] {'loss': 0.0304, 'grad_norm': 0.7226480841636658, 'learning_rate': 3.855855584535534e-05, 'epoch': 1.48} 15%|█▍ | 6103/41250 [14:44:59<85:18:03, 8.74s/it][2025-04-25 22:42:43,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:42:43,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.12 | bwd_microstep: 5781.94 | bwd_inner_microstep: 5665.79 | bwd_allreduce_microstep: 116.10 | step_microstep: 18.98 [2025-04-25 22:42:43,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.12 | bwd: 5781.95 | bwd_inner: 5665.79 | bwd_allreduce: 
116.12 | step: 18.99 15%|█▍ | 6104/41250 [14:45:08<85:14:13, 8.73s/it] {'loss': 0.465, 'grad_norm': 6.394726753234863, 'learning_rate': 3.855797043372966e-05, 'epoch': 1.48} 15%|█▍ | 6104/41250 [14:45:08<85:14:13, 8.73s/it][2025-04-25 22:42:51,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 22:42:51,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.29 | bwd_microstep: 5767.22 | bwd_inner_microstep: 5690.96 | bwd_allreduce_microstep: 76.21 | step_microstep: 18.72 [2025-04-25 22:42:51,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.29 | bwd: 5767.23 | bwd_inner: 5690.96 | bwd_allreduce: 76.23 | step: 18.73 15%|█▍ | 6105/41250 [14:45:17<85:09:41, 8.72s/it] {'loss': 0.195, 'grad_norm': 1.3097726106643677, 'learning_rate': 3.855738490769774e-05, 'epoch': 1.48} 15%|█▍ | 6105/41250 [14:45:17<85:09:41, 8.72s/it][2025-04-25 22:43:00,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.95 | optimizer_step: 1.00 [2025-04-25 22:43:00,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.70 | bwd_microstep: 5765.32 | bwd_inner_microstep: 5700.05 | bwd_allreduce_microstep: 65.23 | step_microstep: 18.52 [2025-04-25 22:43:00,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.70 | bwd: 5765.33 | bwd_inner: 5700.05 | bwd_allreduce: 65.25 | step: 18.52 15%|█▍ | 6106/41250 [14:45:25<85:04:36, 8.71s/it] {'loss': 0.1152, 'grad_norm': 4.100393295288086, 'learning_rate': 3.8556799267263195e-05, 'epoch': 1.48} 15%|█▍ | 6106/41250 [14:45:25<85:04:36, 8.71s/it][2025-04-25 22:43:09,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:43:09,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.66 | bwd_microstep: 5823.64 | 
bwd_inner_microstep: 5661.29 | bwd_allreduce_microstep: 162.31 | step_microstep: 18.47 [2025-04-25 22:43:09,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.66 | bwd: 5823.65 | bwd_inner: 5661.29 | bwd_allreduce: 162.32 | step: 18.47 15%|█▍ | 6107/41250 [14:45:34<85:09:07, 8.72s/it] {'loss': 0.1244, 'grad_norm': 2.3923091888427734, 'learning_rate': 3.855621351242963e-05, 'epoch': 1.48} 15%|█▍ | 6107/41250 [14:45:34<85:09:07, 8.72s/it][2025-04-25 22:43:17,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.06 | optimizer_step: 0.98 [2025-04-25 22:43:17,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.97 | bwd_microstep: 5768.31 | bwd_inner_microstep: 5707.89 | bwd_allreduce_microstep: 60.38 | step_microstep: 19.10 [2025-04-25 22:43:17,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.97 | bwd: 5768.32 | bwd_inner: 5707.89 | bwd_allreduce: 60.39 | step: 19.10 15%|█▍ | 6108/41250 [14:45:43<85:07:38, 8.72s/it] {'loss': 0.0737, 'grad_norm': 1.604665994644165, 'learning_rate': 3.855562764320065e-05, 'epoch': 1.48} 15%|█▍ | 6108/41250 [14:45:43<85:07:38, 8.72s/it][2025-04-25 22:43:26,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-25 22:43:26,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.87 | bwd_microstep: 5795.05 | bwd_inner_microstep: 5782.17 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.18 [2025-04-25 22:43:26,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.87 | bwd: 5795.07 | bwd_inner: 5782.17 | bwd_allreduce: 12.86 | step: 19.19 15%|█▍ | 6109/41250 [14:45:52<85:16:58, 8.74s/it] {'loss': 0.1094, 'grad_norm': 1.5783054828643799, 'learning_rate': 3.8555041659579885e-05, 'epoch': 1.48} 15%|█▍ | 6109/41250 [14:45:52<85:16:58, 8.74s/it][2025-04-25 22:43:35,389] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:43:35,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.48 | bwd_microstep: 5710.22 | bwd_inner_microstep: 5693.35 | bwd_allreduce_microstep: 16.83 | step_microstep: 18.84 [2025-04-25 22:43:35,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.48 | bwd: 5710.23 | bwd_inner: 5693.35 | bwd_allreduce: 16.84 | step: 18.84 15%|█▍ | 6110/41250 [14:46:00<84:59:52, 8.71s/it] {'loss': 0.0493, 'grad_norm': 0.6803320646286011, 'learning_rate': 3.855445556157093e-05, 'epoch': 1.48} 15%|█▍ | 6110/41250 [14:46:00<84:59:52, 8.71s/it][2025-04-25 22:43:44,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.17 | optimizer_step: 0.93 [2025-04-25 22:43:44,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.82 | bwd_microstep: 5778.77 | bwd_inner_microstep: 5695.62 | bwd_allreduce_microstep: 83.10 | step_microstep: 19.01 [2025-04-25 22:43:44,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.82 | bwd: 5778.78 | bwd_inner: 5695.62 | bwd_allreduce: 83.12 | step: 19.01 15%|█▍ | 6111/41250 [14:46:09<85:00:14, 8.71s/it] {'loss': 0.2111, 'grad_norm': 2.2007675170898438, 'learning_rate': 3.855386934917742e-05, 'epoch': 1.48} 15%|█▍ | 6111/41250 [14:46:09<85:00:14, 8.71s/it][2025-04-25 22:43:52,773] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:43:52,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.00 | bwd_microstep: 5734.79 | bwd_inner_microstep: 5708.25 | bwd_allreduce_microstep: 26.50 | step_microstep: 18.81 [2025-04-25 22:43:52,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.00 | bwd: 5734.80 | bwd_inner: 5708.25 | bwd_allreduce: 26.51 | step: 18.81 15%|█▍ 
| 6112/41250 [14:46:18<84:54:11, 8.70s/it] {'loss': 0.2271, 'grad_norm': 2.8475306034088135, 'learning_rate': 3.8553283022402947e-05, 'epoch': 1.48} 15%|█▍ | 6112/41250 [14:46:18<84:54:11, 8.70s/it][2025-04-25 22:44:01,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 1.03 [2025-04-25 22:44:01,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.31 | bwd_microstep: 5997.78 | bwd_inner_microstep: 5768.26 | bwd_allreduce_microstep: 229.47 | step_microstep: 19.45 [2025-04-25 22:44:01,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.31 | bwd: 5997.79 | bwd_inner: 5768.26 | bwd_allreduce: 229.49 | step: 19.45 15%|█▍ | 6113/41250 [14:46:27<85:41:53, 8.78s/it] {'loss': 0.1954, 'grad_norm': 2.0391738414764404, 'learning_rate': 3.8552696581251136e-05, 'epoch': 1.48} 15%|█▍ | 6113/41250 [14:46:27<85:41:53, 8.78s/it][2025-04-25 22:44:10,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 22:44:10,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.07 | bwd_microstep: 5762.78 | bwd_inner_microstep: 5711.34 | bwd_allreduce_microstep: 51.40 | step_microstep: 18.82 [2025-04-25 22:44:10,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.07 | bwd: 5762.80 | bwd_inner: 5711.34 | bwd_allreduce: 51.42 | step: 18.82 15%|█▍ | 6114/41250 [14:46:35<85:28:00, 8.76s/it] {'loss': 0.0779, 'grad_norm': 1.6984562873840332, 'learning_rate': 3.855211002572559e-05, 'epoch': 1.48} 15%|█▍ | 6114/41250 [14:46:35<85:28:00, 8.76s/it][2025-04-25 22:44:19,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 22:44:19,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.41 | bwd_microstep: 5792.90 | bwd_inner_microstep: 
5655.53 | bwd_allreduce_microstep: 137.32 | step_microstep: 19.30 [2025-04-25 22:44:19,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.41 | bwd: 5792.91 | bwd_inner: 5655.53 | bwd_allreduce: 137.34 | step: 19.30 15%|█▍ | 6115/41250 [14:46:44<85:18:55, 8.74s/it] {'loss': 0.1118, 'grad_norm': 1.7360550165176392, 'learning_rate': 3.8551523355829943e-05, 'epoch': 1.48} 15%|█▍ | 6115/41250 [14:46:44<85:18:55, 8.74s/it][2025-04-25 22:44:27,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:44:27,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.02 | bwd_microstep: 5698.59 | bwd_inner_microstep: 5654.95 | bwd_allreduce_microstep: 43.59 | step_microstep: 18.59 [2025-04-25 22:44:27,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.02 | bwd: 5698.60 | bwd_inner: 5654.95 | bwd_allreduce: 43.60 | step: 18.60 15%|█▍ | 6116/41250 [14:46:53<84:55:43, 8.70s/it] {'loss': 0.116, 'grad_norm': 3.732700824737549, 'learning_rate': 3.8550936571567806e-05, 'epoch': 1.48} 15%|█▍ | 6116/41250 [14:46:53<84:55:43, 8.70s/it][2025-04-25 22:44:36,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-25 22:44:36,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.92 | bwd_microstep: 5764.93 | bwd_inner_microstep: 5655.96 | bwd_allreduce_microstep: 108.93 | step_microstep: 18.74 [2025-04-25 22:44:36,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.92 | bwd: 5764.94 | bwd_inner: 5655.96 | bwd_allreduce: 108.94 | step: 18.74 15%|█▍ | 6117/41250 [14:47:01<84:51:42, 8.70s/it] {'loss': 0.037, 'grad_norm': 1.5926142930984497, 'learning_rate': 3.8550349672942784e-05, 'epoch': 1.48} 15%|█▍ | 6117/41250 [14:47:01<84:51:42, 8.70s/it][2025-04-25 22:44:45,094] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.92 [2025-04-25 22:44:45,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.72 | bwd_microstep: 5713.61 | bwd_inner_microstep: 5700.97 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.77 [2025-04-25 22:44:45,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.72 | bwd: 5713.62 | bwd_inner: 5700.97 | bwd_allreduce: 12.61 | step: 18.77 15%|█▍ | 6118/41250 [14:47:10<84:43:34, 8.68s/it] {'loss': 0.1315, 'grad_norm': 2.694427728652954, 'learning_rate': 3.854976265995851e-05, 'epoch': 1.48} 15%|█▍ | 6118/41250 [14:47:10<84:43:34, 8.68s/it][2025-04-25 22:44:53,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-25 22:44:53,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.09 | bwd_microstep: 5686.10 | bwd_inner_microstep: 5653.60 | bwd_allreduce_microstep: 32.46 | step_microstep: 19.17 [2025-04-25 22:44:53,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.09 | bwd: 5686.11 | bwd_inner: 5653.60 | bwd_allreduce: 32.47 | step: 19.17 15%|█▍ | 6119/41250 [14:47:19<84:29:00, 8.66s/it] {'loss': 0.2185, 'grad_norm': 2.397686719894409, 'learning_rate': 3.85491755326186e-05, 'epoch': 1.48} 15%|█▍ | 6119/41250 [14:47:19<84:29:00, 8.66s/it][2025-04-25 22:45:02,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:45:02,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.29 | bwd_microstep: 5718.11 | bwd_inner_microstep: 5696.65 | bwd_allreduce_microstep: 21.42 | step_microstep: 18.79 [2025-04-25 22:45:02,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.29 | bwd: 5718.12 | bwd_inner: 5696.65 | bwd_allreduce: 21.43 | step: 18.80 15%|█▍ | 6120/41250 [14:47:27<84:27:34, 8.66s/it] 
{'loss': 0.2087, 'grad_norm': 2.3097383975982666, 'learning_rate': 3.854858829092667e-05, 'epoch': 1.48} 15%|█▍ | 6120/41250 [14:47:27<84:27:34, 8.66s/it][2025-04-25 22:45:11,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:45:11,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.60 | bwd_microstep: 5745.22 | bwd_inner_microstep: 5687.13 | bwd_allreduce_microstep: 58.05 | step_microstep: 18.86 [2025-04-25 22:45:11,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.60 | bwd: 5745.24 | bwd_inner: 5687.12 | bwd_allreduce: 58.07 | step: 18.86 15%|█▍ | 6121/41250 [14:47:36<84:30:26, 8.66s/it] {'loss': 0.0744, 'grad_norm': 1.47463858127594, 'learning_rate': 3.854800093488634e-05, 'epoch': 1.48} 15%|█▍ | 6121/41250 [14:47:36<84:30:26, 8.66s/it][2025-04-25 22:45:19,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:45:19,698] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.10 | bwd_microstep: 5747.75 | bwd_inner_microstep: 5697.34 | bwd_allreduce_microstep: 50.36 | step_microstep: 18.88 [2025-04-25 22:45:19,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.10 | bwd: 5747.77 | bwd_inner: 5697.34 | bwd_allreduce: 50.38 | step: 18.88 15%|█▍ | 6122/41250 [14:47:45<84:34:02, 8.67s/it] {'loss': 0.1605, 'grad_norm': 4.118828773498535, 'learning_rate': 3.854741346450124e-05, 'epoch': 1.48} 15%|█▍ | 6122/41250 [14:47:45<84:34:02, 8.67s/it][2025-04-25 22:45:28,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 22:45:28,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.91 | bwd_microstep: 5780.32 | bwd_inner_microstep: 5651.66 | bwd_allreduce_microstep: 128.62 | 
step_microstep: 18.60 [2025-04-25 22:45:28,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.91 | bwd: 5780.34 | bwd_inner: 5651.66 | bwd_allreduce: 128.64 | step: 18.60 15%|█▍ | 6123/41250 [14:47:53<84:37:16, 8.67s/it] {'loss': 0.1076, 'grad_norm': 2.025045394897461, 'learning_rate': 3.8546825879774975e-05, 'epoch': 1.48} 15%|█▍ | 6123/41250 [14:47:53<84:37:16, 8.67s/it][2025-04-25 22:45:37,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 22:45:37,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.45 | bwd_microstep: 5697.41 | bwd_inner_microstep: 5684.64 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.83 [2025-04-25 22:45:37,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.45 | bwd: 5697.43 | bwd_inner: 5684.64 | bwd_allreduce: 12.74 | step: 18.83 15%|█▍ | 6124/41250 [14:48:02<84:28:08, 8.66s/it] {'loss': 0.1178, 'grad_norm': 1.3667101860046387, 'learning_rate': 3.854623818071118e-05, 'epoch': 1.48} 15%|█▍ | 6124/41250 [14:48:02<84:28:08, 8.66s/it][2025-04-25 22:45:45,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:45:45,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.96 | bwd_microstep: 5703.50 | bwd_inner_microstep: 5655.88 | bwd_allreduce_microstep: 47.58 | step_microstep: 18.40 [2025-04-25 22:45:45,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.96 | bwd: 5703.51 | bwd_inner: 5655.88 | bwd_allreduce: 47.59 | step: 18.40 15%|█▍ | 6125/41250 [14:48:10<84:20:45, 8.64s/it] {'loss': 0.17, 'grad_norm': 1.346262812614441, 'learning_rate': 3.8545650367313475e-05, 'epoch': 1.48} 15%|█▍ | 6125/41250 [14:48:10<84:20:45, 8.64s/it][2025-04-25 22:45:54,304] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | 
optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:45:54,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.07 | bwd_microstep: 5767.83 | bwd_inner_microstep: 5646.24 | bwd_allreduce_microstep: 121.54 | step_microstep: 18.53 [2025-04-25 22:45:54,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.07 | bwd: 5767.84 | bwd_inner: 5646.24 | bwd_allreduce: 121.56 | step: 18.53 15%|█▍ | 6126/41250 [14:48:19<84:27:18, 8.66s/it] {'loss': 0.0358, 'grad_norm': 1.34906005859375, 'learning_rate': 3.8545062439585475e-05, 'epoch': 1.49} 15%|█▍ | 6126/41250 [14:48:19<84:27:18, 8.66s/it][2025-04-25 22:46:02,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 22:46:02,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.48 | bwd_microstep: 5759.85 | bwd_inner_microstep: 5646.98 | bwd_allreduce_microstep: 112.83 | step_microstep: 18.50 [2025-04-25 22:46:02,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.48 | bwd: 5759.86 | bwd_inner: 5646.98 | bwd_allreduce: 112.84 | step: 18.50 15%|█▍ | 6127/41250 [14:48:28<84:30:23, 8.66s/it] {'loss': 0.1306, 'grad_norm': 1.5474921464920044, 'learning_rate': 3.854447439753082e-05, 'epoch': 1.49} 15%|█▍ | 6127/41250 [14:48:28<84:30:23, 8.66s/it][2025-04-25 22:46:11,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 22:46:11,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.23 | bwd_microstep: 5682.38 | bwd_inner_microstep: 5653.92 | bwd_allreduce_microstep: 28.41 | step_microstep: 18.59 [2025-04-25 22:46:11,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.23 | bwd: 5682.40 | bwd_inner: 5653.92 | bwd_allreduce: 28.43 | step: 18.60 15%|█▍ | 6128/41250 [14:48:36<84:18:01, 8.64s/it] {'loss': 0.2267, 'grad_norm': 
3.750621795654297, 'learning_rate': 3.854388624115313e-05, 'epoch': 1.49} 15%|█▍ | 6128/41250 [14:48:36<84:18:01, 8.64s/it][2025-04-25 22:46:20,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 22:46:20,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.42 | bwd_microstep: 5715.04 | bwd_inner_microstep: 5702.41 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.34 [2025-04-25 22:46:20,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.42 | bwd: 5715.05 | bwd_inner: 5702.41 | bwd_allreduce: 12.60 | step: 18.34 15%|█▍ | 6129/41250 [14:48:45<84:20:12, 8.64s/it] {'loss': 0.0796, 'grad_norm': 1.6302876472473145, 'learning_rate': 3.854329797045603e-05, 'epoch': 1.49} 15%|█▍ | 6129/41250 [14:48:45<84:20:12, 8.64s/it][2025-04-25 22:46:28,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:46:28,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.70 | bwd_microstep: 5747.99 | bwd_inner_microstep: 5643.57 | bwd_allreduce_microstep: 104.37 | step_microstep: 18.43 [2025-04-25 22:46:28,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.70 | bwd: 5748.00 | bwd_inner: 5643.57 | bwd_allreduce: 104.39 | step: 18.43 15%|█▍ | 6130/41250 [14:48:54<84:24:32, 8.65s/it] {'loss': 0.1084, 'grad_norm': 1.2418947219848633, 'learning_rate': 3.8542709585443136e-05, 'epoch': 1.49} 15%|█▍ | 6130/41250 [14:48:54<84:24:32, 8.65s/it][2025-04-25 22:46:37,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.14 | optimizer_step: 1.04 [2025-04-25 22:46:37,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.97 | bwd_microstep: 5746.72 | bwd_inner_microstep: 5674.27 | bwd_allreduce_microstep: 72.39 | step_microstep: 20.25 [2025-04-25 
22:46:37,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.97 | bwd: 5746.74 | bwd_inner: 5674.27 | bwd_allreduce: 72.42 | step: 20.25 15%|█▍ | 6131/41250 [14:49:02<84:29:06, 8.66s/it] {'loss': 0.2228, 'grad_norm': 3.5927791595458984, 'learning_rate': 3.854212108611809e-05, 'epoch': 1.49} 15%|█▍ | 6131/41250 [14:49:02<84:29:06, 8.66s/it][2025-04-25 22:46:46,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.08 | optimizer_step: 1.12 [2025-04-25 22:46:46,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.57 | bwd_microstep: 5740.31 | bwd_inner_microstep: 5686.93 | bwd_allreduce_microstep: 53.32 | step_microstep: 19.22 [2025-04-25 22:46:46,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.57 | bwd: 5740.32 | bwd_inner: 5686.93 | bwd_allreduce: 53.34 | step: 19.22 15%|█▍ | 6132/41250 [14:49:11<84:32:34, 8.67s/it] {'loss': 0.3336, 'grad_norm': 2.748608112335205, 'learning_rate': 3.854153247248452e-05, 'epoch': 1.49} 15%|█▍ | 6132/41250 [14:49:11<84:32:34, 8.67s/it][2025-04-25 22:46:54,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.10 | optimizer_step: 0.98 [2025-04-25 22:46:54,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.05 | bwd_microstep: 5772.83 | bwd_inner_microstep: 5759.67 | bwd_allreduce_microstep: 13.11 | step_microstep: 19.29 [2025-04-25 22:46:54,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.05 | bwd: 5772.85 | bwd_inner: 5759.67 | bwd_allreduce: 13.13 | step: 19.29 15%|█▍ | 6133/41250 [14:49:20<84:44:19, 8.69s/it] {'loss': 0.1199, 'grad_norm': 2.1759111881256104, 'learning_rate': 3.854094374454603e-05, 'epoch': 1.49} 15%|█▍ | 6133/41250 [14:49:20<84:44:19, 8.69s/it][2025-04-25 22:47:03,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.33 | optimizer_step: 1.06 
[2025-04-25 22:47:03,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.22 | bwd_microstep: 5767.54 | bwd_inner_microstep: 5657.94 | bwd_allreduce_microstep: 109.53 | step_microstep: 20.16 [2025-04-25 22:47:03,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.22 | bwd: 5767.56 | bwd_inner: 5657.94 | bwd_allreduce: 109.56 | step: 20.16 15%|█▍ | 6134/41250 [14:49:28<84:42:54, 8.68s/it] {'loss': 0.0682, 'grad_norm': 1.709874153137207, 'learning_rate': 3.854035490230629e-05, 'epoch': 1.49} 15%|█▍ | 6134/41250 [14:49:28<84:42:54, 8.68s/it][2025-04-25 22:47:12,363] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:47:12,364] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.69 | bwd_microstep: 5758.52 | bwd_inner_microstep: 5702.25 | bwd_allreduce_microstep: 56.23 | step_microstep: 18.77 [2025-04-25 22:47:12,364] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.69 | bwd: 5758.54 | bwd_inner: 5702.25 | bwd_allreduce: 56.24 | step: 18.77 15%|█▍ | 6135/41250 [14:49:37<84:44:30, 8.69s/it] {'loss': 0.2387, 'grad_norm': 4.299732208251953, 'learning_rate': 3.85397659457689e-05, 'epoch': 1.49} 15%|█▍ | 6135/41250 [14:49:37<84:44:30, 8.69s/it][2025-04-25 22:47:21,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 22:47:21,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.54 | bwd_microstep: 5861.50 | bwd_inner_microstep: 5688.17 | bwd_allreduce_microstep: 173.28 | step_microstep: 19.24 [2025-04-25 22:47:21,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.54 | bwd: 5861.51 | bwd_inner: 5688.17 | bwd_allreduce: 173.30 | step: 19.24 15%|█▍ | 6136/41250 [14:49:46<85:02:28, 8.72s/it] {'loss': 0.2292, 'grad_norm': 3.217113494873047, 'learning_rate': 
3.8539176874937505e-05, 'epoch': 1.49} 15%|█▍ | 6136/41250 [14:49:46<85:02:28, 8.72s/it][2025-04-25 22:47:29,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 22:47:29,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.23 | bwd_microstep: 5787.99 | bwd_inner_microstep: 5775.11 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.80 [2025-04-25 22:47:29,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.23 | bwd: 5788.01 | bwd_inner: 5775.11 | bwd_allreduce: 12.86 | step: 18.80 15%|█▍ | 6137/41250 [14:49:55<85:08:58, 8.73s/it] {'loss': 0.4097, 'grad_norm': 2.867867946624756, 'learning_rate': 3.8538587689815726e-05, 'epoch': 1.49} 15%|█▍ | 6137/41250 [14:49:55<85:08:58, 8.73s/it][2025-04-25 22:47:38,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.05 | optimizer_step: 1.01 [2025-04-25 22:47:38,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.91 | bwd_microstep: 5704.57 | bwd_inner_microstep: 5691.86 | bwd_allreduce_microstep: 12.66 | step_microstep: 19.67 [2025-04-25 22:47:38,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.91 | bwd: 5704.58 | bwd_inner: 5691.86 | bwd_allreduce: 12.68 | step: 19.67 15%|█▍ | 6138/41250 [14:50:03<84:52:40, 8.70s/it] {'loss': 0.0532, 'grad_norm': 1.513671875, 'learning_rate': 3.853799839040719e-05, 'epoch': 1.49} 15%|█▍ | 6138/41250 [14:50:03<84:52:40, 8.70s/it][2025-04-25 22:47:47,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:47:47,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.18 | bwd_microstep: 5871.78 | bwd_inner_microstep: 5679.66 | bwd_allreduce_microstep: 192.07 | step_microstep: 18.75 [2025-04-25 22:47:47,358] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd: 2854.18 | bwd: 5871.79 | bwd_inner: 5679.66 | bwd_allreduce: 192.09 | step: 18.75 15%|█▍ | 6139/41250 [14:50:12<85:11:08, 8.73s/it] {'loss': 0.204, 'grad_norm': 2.087547540664673, 'learning_rate': 3.853740897671556e-05, 'epoch': 1.49} 15%|█▍ | 6139/41250 [14:50:12<85:11:08, 8.73s/it][2025-04-25 22:47:56,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:47:56,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.38 | bwd_microstep: 5770.32 | bwd_inner_microstep: 5656.67 | bwd_allreduce_microstep: 113.60 | step_microstep: 18.89 [2025-04-25 22:47:56,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.38 | bwd: 5770.33 | bwd_inner: 5656.67 | bwd_allreduce: 113.62 | step: 18.89 15%|█▍ | 6140/41250 [14:50:21<85:03:21, 8.72s/it] {'loss': 0.382, 'grad_norm': 4.015090465545654, 'learning_rate': 3.853681944874444e-05, 'epoch': 1.49} 15%|█▍ | 6140/41250 [14:50:21<85:03:21, 8.72s/it][2025-04-25 22:48:04,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:48:04,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.34 | bwd_microstep: 5849.96 | bwd_inner_microstep: 5704.96 | bwd_allreduce_microstep: 144.96 | step_microstep: 18.24 [2025-04-25 22:48:04,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.34 | bwd: 5849.97 | bwd_inner: 5704.95 | bwd_allreduce: 144.97 | step: 18.24 15%|█▍ | 6141/41250 [14:50:30<85:17:22, 8.75s/it] {'loss': 0.1667, 'grad_norm': 2.4314091205596924, 'learning_rate': 3.853622980649747e-05, 'epoch': 1.49} 15%|█▍ | 6141/41250 [14:50:30<85:17:22, 8.75s/it][2025-04-25 22:48:13,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-25 22:48:13,546] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.85 | bwd_microstep: 5770.60 | bwd_inner_microstep: 5653.22 | bwd_allreduce_microstep: 117.33 | step_microstep: 19.06 [2025-04-25 22:48:13,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.04 | bwd: 5770.61 | bwd_inner: 5653.22 | bwd_allreduce: 117.35 | step: 19.07 15%|█▍ | 6142/41250 [14:50:38<85:08:33, 8.73s/it] {'loss': 0.0624, 'grad_norm': 0.9849414825439453, 'learning_rate': 3.85356400499783e-05, 'epoch': 1.49} 15%|█▍ | 6142/41250 [14:50:38<85:08:33, 8.73s/it][2025-04-25 22:48:22,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:48:22,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.59 | bwd_microstep: 5775.52 | bwd_inner_microstep: 5662.70 | bwd_allreduce_microstep: 112.78 | step_microstep: 18.27 [2025-04-25 22:48:22,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.59 | bwd: 5775.53 | bwd_inner: 5662.70 | bwd_allreduce: 112.80 | step: 18.27 15%|█▍ | 6143/41250 [14:50:47<85:02:31, 8.72s/it] {'loss': 0.0735, 'grad_norm': 2.251028537750244, 'learning_rate': 3.8535050179190545e-05, 'epoch': 1.49} 15%|█▍ | 6143/41250 [14:50:47<85:02:31, 8.72s/it][2025-04-25 22:48:30,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.18 | optimizer_step: 0.92 [2025-04-25 22:48:30,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.41 | bwd_microstep: 5759.81 | bwd_inner_microstep: 5707.81 | bwd_allreduce_microstep: 51.95 | step_microstep: 18.81 [2025-04-25 22:48:30,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.41 | bwd: 5759.82 | bwd_inner: 5707.81 | bwd_allreduce: 51.97 | step: 18.81 15%|█▍ | 6144/41250 [14:50:56<84:58:24, 8.71s/it] {'loss': 0.2281, 'grad_norm': 2.9236836433410645, 'learning_rate': 3.853446019413786e-05, 'epoch': 1.49} 15%|█▍ 
| 6144/41250 [14:50:56<84:58:24, 8.71s/it][2025-04-25 22:48:39,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 22:48:39,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.04 | bwd_microstep: 5749.38 | bwd_inner_microstep: 5719.37 | bwd_allreduce_microstep: 29.97 | step_microstep: 18.71 [2025-04-25 22:48:39,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.04 | bwd: 5749.40 | bwd_inner: 5719.37 | bwd_allreduce: 29.99 | step: 18.71 15%|█▍ | 6145/41250 [14:51:04<84:55:32, 8.71s/it] {'loss': 0.2978, 'grad_norm': 4.084099292755127, 'learning_rate': 3.8533870094823866e-05, 'epoch': 1.49} 15%|█▍ | 6145/41250 [14:51:04<84:55:32, 8.71s/it][2025-04-25 22:48:48,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 22:48:48,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.32 | bwd_microstep: 5723.16 | bwd_inner_microstep: 5671.13 | bwd_allreduce_microstep: 51.98 | step_microstep: 18.44 [2025-04-25 22:48:48,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.32 | bwd: 5723.17 | bwd_inner: 5671.13 | bwd_allreduce: 52.00 | step: 18.44 15%|█▍ | 6146/41250 [14:51:13<84:44:13, 8.69s/it] {'loss': 0.3981, 'grad_norm': 2.6619210243225098, 'learning_rate': 3.8533279881252206e-05, 'epoch': 1.49} 15%|█▍ | 6146/41250 [14:51:13<84:44:13, 8.69s/it][2025-04-25 22:48:56,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 1.09 [2025-04-25 22:48:56,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.98 | bwd_microstep: 5750.68 | bwd_inner_microstep: 5708.76 | bwd_allreduce_microstep: 41.88 | step_microstep: 18.98 [2025-04-25 22:48:56,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.98 | 
bwd: 5750.69 | bwd_inner: 5708.76 | bwd_allreduce: 41.89 | step: 18.98 15%|█▍ | 6147/41250 [14:51:22<84:43:57, 8.69s/it] {'loss': 0.1607, 'grad_norm': 3.4278616905212402, 'learning_rate': 3.853268955342653e-05, 'epoch': 1.49} 15%|█▍ | 6147/41250 [14:51:22<84:43:57, 8.69s/it][2025-04-25 22:49:05,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:49:05,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.33 | bwd_microstep: 5791.85 | bwd_inner_microstep: 5660.63 | bwd_allreduce_microstep: 131.17 | step_microstep: 18.80 [2025-04-25 22:49:05,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.33 | bwd: 5791.86 | bwd_inner: 5660.63 | bwd_allreduce: 131.18 | step: 18.80 15%|█▍ | 6148/41250 [14:51:31<84:47:38, 8.70s/it] {'loss': 0.2998, 'grad_norm': 4.13748836517334, 'learning_rate': 3.8532099111350464e-05, 'epoch': 1.49} 15%|█▍ | 6148/41250 [14:51:31<84:47:38, 8.70s/it][2025-04-25 22:49:14,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:49:14,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.31 | bwd_microstep: 5756.27 | bwd_inner_microstep: 5712.08 | bwd_allreduce_microstep: 44.15 | step_microstep: 18.72 [2025-04-25 22:49:14,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.31 | bwd: 5756.29 | bwd_inner: 5712.08 | bwd_allreduce: 44.17 | step: 18.72 15%|█▍ | 6149/41250 [14:51:39<84:49:16, 8.70s/it] {'loss': 0.1861, 'grad_norm': 5.627776145935059, 'learning_rate': 3.8531508555027646e-05, 'epoch': 1.49} 15%|█▍ | 6149/41250 [14:51:39<84:49:16, 8.70s/it][2025-04-25 22:49:23,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:49:23,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2846.33 | bwd_microstep: 5752.84 | bwd_inner_microstep: 5663.08 | bwd_allreduce_microstep: 89.72 | step_microstep: 18.54 [2025-04-25 22:49:23,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.33 | bwd: 5752.86 | bwd_inner: 5663.08 | bwd_allreduce: 89.73 | step: 18.54 15%|█▍ | 6150/41250 [14:51:48<84:47:02, 8.70s/it] {'loss': 0.3029, 'grad_norm': 1.8546802997589111, 'learning_rate': 3.8530917884461735e-05, 'epoch': 1.49} 15%|█▍ | 6150/41250 [14:51:48<84:47:02, 8.70s/it][2025-04-25 22:49:31,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.01 | optimizer_step: 1.16 [2025-04-25 22:49:31,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.42 | bwd_microstep: 5765.21 | bwd_inner_microstep: 5650.58 | bwd_allreduce_microstep: 114.58 | step_microstep: 19.13 [2025-04-25 22:49:31,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.42 | bwd: 5765.23 | bwd_inner: 5650.58 | bwd_allreduce: 114.60 | step: 19.14 15%|█▍ | 6151/41250 [14:51:57<84:44:19, 8.69s/it] {'loss': 0.1358, 'grad_norm': 1.6705126762390137, 'learning_rate': 3.853032709965635e-05, 'epoch': 1.49} 15%|█▍ | 6151/41250 [14:51:57<84:44:19, 8.69s/it][2025-04-25 22:49:40,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 22:49:40,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.18 | bwd_microstep: 5700.15 | bwd_inner_microstep: 5664.36 | bwd_allreduce_microstep: 35.74 | step_microstep: 18.61 [2025-04-25 22:49:40,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.18 | bwd: 5700.16 | bwd_inner: 5664.36 | bwd_allreduce: 35.76 | step: 18.61 15%|█▍ | 6152/41250 [14:52:05<84:32:40, 8.67s/it] {'loss': 0.0958, 'grad_norm': 8.85157585144043, 'learning_rate': 3.852973620061515e-05, 'epoch': 1.49} 15%|█▍ | 6152/41250 [14:52:05<84:32:40, 
8.67s/it][2025-04-25 22:49:49,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:49:49,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.15 | bwd_microstep: 5712.37 | bwd_inner_microstep: 5699.59 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.62 [2025-04-25 22:49:49,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.15 | bwd: 5712.39 | bwd_inner: 5699.59 | bwd_allreduce: 12.75 | step: 18.63 15%|█▍ | 6153/41250 [14:52:14<84:30:03, 8.67s/it] {'loss': 0.0364, 'grad_norm': 0.6150863766670227, 'learning_rate': 3.852914518734176e-05, 'epoch': 1.49} 15%|█▍ | 6153/41250 [14:52:14<84:30:03, 8.67s/it][2025-04-25 22:49:57,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:49:57,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2908.51 | bwd_microstep: 5805.56 | bwd_inner_microstep: 5792.73 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.59 [2025-04-25 22:49:57,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2908.51 | bwd: 5805.57 | bwd_inner: 5792.73 | bwd_allreduce: 12.81 | step: 18.59 15%|█▍ | 6154/41250 [14:52:23<84:52:16, 8.71s/it] {'loss': 0.2872, 'grad_norm': 2.720568895339966, 'learning_rate': 3.8528554059839845e-05, 'epoch': 1.49} 15%|█▍ | 6154/41250 [14:52:23<84:52:16, 8.71s/it][2025-04-25 22:50:06,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:50:06,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.25 | bwd_microstep: 5784.68 | bwd_inner_microstep: 5664.11 | bwd_allreduce_microstep: 120.52 | step_microstep: 18.18 [2025-04-25 22:50:06,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.25 | bwd: 5784.69 | bwd_inner: 5664.11 | 
bwd_allreduce: 120.54 | step: 18.18 15%|█▍ | 6155/41250 [14:52:31<84:51:11, 8.70s/it] {'loss': 0.0746, 'grad_norm': 2.5417208671569824, 'learning_rate': 3.852796281811304e-05, 'epoch': 1.49} 15%|█▍ | 6155/41250 [14:52:31<84:51:11, 8.70s/it][2025-04-25 22:50:15,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:50:15,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.78 | bwd_microstep: 5767.07 | bwd_inner_microstep: 5703.19 | bwd_allreduce_microstep: 63.84 | step_microstep: 18.58 [2025-04-25 22:50:15,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.78 | bwd: 5767.08 | bwd_inner: 5703.19 | bwd_allreduce: 63.85 | step: 18.59 15%|█▍ | 6156/41250 [14:52:40<84:52:04, 8.71s/it] {'loss': 0.0117, 'grad_norm': 0.13042360544204712, 'learning_rate': 3.852737146216498e-05, 'epoch': 1.49} 15%|█▍ | 6156/41250 [14:52:40<84:52:04, 8.71s/it][2025-04-25 22:50:23,934] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:50:23,935] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.28 | bwd_microstep: 5737.54 | bwd_inner_microstep: 5688.87 | bwd_allreduce_microstep: 48.62 | step_microstep: 18.01 [2025-04-25 22:50:23,935] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.28 | bwd: 5737.55 | bwd_inner: 5688.87 | bwd_allreduce: 48.64 | step: 18.01 15%|█▍ | 6157/41250 [14:52:49<84:48:08, 8.70s/it] {'loss': 0.1204, 'grad_norm': 1.9024100303649902, 'learning_rate': 3.852677999199932e-05, 'epoch': 1.49} 15%|█▍ | 6157/41250 [14:52:49<84:48:08, 8.70s/it][2025-04-25 22:50:32,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 22:50:32,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.24 | 
bwd_microstep: 5690.51 | bwd_inner_microstep: 5659.93 | bwd_allreduce_microstep: 30.54 | step_microstep: 18.68 [2025-04-25 22:50:32,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.24 | bwd: 5690.52 | bwd_inner: 5659.93 | bwd_allreduce: 30.56 | step: 18.68 15%|█▍ | 6158/41250 [14:52:57<84:34:18, 8.68s/it] {'loss': 0.0416, 'grad_norm': 1.0778157711029053, 'learning_rate': 3.852618840761971e-05, 'epoch': 1.49} 15%|█▍ | 6158/41250 [14:52:57<84:34:18, 8.68s/it][2025-04-25 22:50:41,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:50:41,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.10 | bwd_microstep: 5790.36 | bwd_inner_microstep: 5777.61 | bwd_allreduce_microstep: 12.71 | step_microstep: 17.67 [2025-04-25 22:50:41,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.10 | bwd: 5790.37 | bwd_inner: 5777.60 | bwd_allreduce: 12.73 | step: 17.68 15%|█▍ | 6159/41250 [14:53:06<84:47:57, 8.70s/it] {'loss': 0.0882, 'grad_norm': 0.8754249215126038, 'learning_rate': 3.852559670902979e-05, 'epoch': 1.49} 15%|█▍ | 6159/41250 [14:53:06<84:47:57, 8.70s/it][2025-04-25 22:50:49,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:50:49,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.78 | bwd_microstep: 5711.23 | bwd_inner_microstep: 5695.29 | bwd_allreduce_microstep: 15.90 | step_microstep: 18.65 [2025-04-25 22:50:49,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.78 | bwd: 5711.25 | bwd_inner: 5695.29 | bwd_allreduce: 15.92 | step: 18.66 15%|█▍ | 6160/41250 [14:53:15<84:40:00, 8.69s/it] {'loss': 0.1788, 'grad_norm': 1.6947002410888672, 'learning_rate': 3.852500489623321e-05, 'epoch': 1.49} 15%|█▍ | 6160/41250 [14:53:15<84:40:00, 8.69s/it][2025-04-25 22:50:58,633] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:50:58,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.61 | bwd_microstep: 5728.27 | bwd_inner_microstep: 5691.94 | bwd_allreduce_microstep: 36.28 | step_microstep: 18.64 [2025-04-25 22:50:58,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.61 | bwd: 5728.28 | bwd_inner: 5691.94 | bwd_allreduce: 36.30 | step: 18.64 15%|█▍ | 6161/41250 [14:53:23<84:36:30, 8.68s/it] {'loss': 0.0349, 'grad_norm': 0.42656180262565613, 'learning_rate': 3.852441296923362e-05, 'epoch': 1.49} 15%|█▍ | 6161/41250 [14:53:23<84:36:30, 8.68s/it][2025-04-25 22:51:07,250] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:51:07,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.90 | bwd_microstep: 5701.48 | bwd_inner_microstep: 5648.38 | bwd_allreduce_microstep: 53.06 | step_microstep: 18.56 [2025-04-25 22:51:07,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.90 | bwd: 5701.49 | bwd_inner: 5648.38 | bwd_allreduce: 53.08 | step: 18.56 15%|█▍ | 6162/41250 [14:53:32<84:25:15, 8.66s/it] {'loss': 0.0871, 'grad_norm': 0.9051409959793091, 'learning_rate': 3.852382092803467e-05, 'epoch': 1.49} 15%|█▍ | 6162/41250 [14:53:32<84:25:15, 8.66s/it][2025-04-25 22:51:15,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:51:15,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.90 | bwd_microstep: 5746.70 | bwd_inner_microstep: 5694.08 | bwd_allreduce_microstep: 52.57 | step_microstep: 18.24 [2025-04-25 22:51:15,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.90 | bwd: 5746.71 | bwd_inner: 5694.08 | bwd_allreduce: 52.58 | step: 18.24 15%|█▍ 
| 6163/41250 [14:53:41<84:28:01, 8.67s/it] {'loss': 0.1088, 'grad_norm': 4.249880313873291, 'learning_rate': 3.852322877264e-05, 'epoch': 1.49} 15%|█▍ | 6163/41250 [14:53:41<84:28:01, 8.67s/it][2025-04-25 22:51:24,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:51:24,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.49 | bwd_microstep: 5723.99 | bwd_inner_microstep: 5685.60 | bwd_allreduce_microstep: 38.35 | step_microstep: 18.43 [2025-04-25 22:51:24,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.50 | bwd: 5724.01 | bwd_inner: 5685.60 | bwd_allreduce: 38.37 | step: 18.44 15%|█▍ | 6164/41250 [14:53:49<84:27:32, 8.67s/it] {'loss': 0.0789, 'grad_norm': 1.435632348060608, 'learning_rate': 3.8522636503053275e-05, 'epoch': 1.49} 15%|█▍ | 6164/41250 [14:53:49<84:27:32, 8.67s/it][2025-04-25 22:51:33,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:51:33,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2898.74 | bwd_microstep: 5793.30 | bwd_inner_microstep: 5780.61 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.22 [2025-04-25 22:51:33,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2898.74 | bwd: 5793.31 | bwd_inner: 5780.61 | bwd_allreduce: 12.66 | step: 18.22 15%|█▍ | 6165/41250 [14:53:58<84:47:13, 8.70s/it] {'loss': 0.0473, 'grad_norm': 2.2895290851593018, 'learning_rate': 3.852204411927814e-05, 'epoch': 1.49} 15%|█▍ | 6165/41250 [14:53:58<84:47:13, 8.70s/it][2025-04-25 22:51:42,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:51:42,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.85 | bwd_microstep: 5719.65 | bwd_inner_microstep: 5699.75 | 
bwd_allreduce_microstep: 19.87 | step_microstep: 18.34 [2025-04-25 22:51:42,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.86 | bwd: 5719.67 | bwd_inner: 5699.74 | bwd_allreduce: 19.88 | step: 18.34 15%|█▍ | 6166/41250 [14:54:07<84:39:13, 8.69s/it] {'loss': 0.0877, 'grad_norm': 2.008397340774536, 'learning_rate': 3.852145162131824e-05, 'epoch': 1.49} 15%|█▍ | 6166/41250 [14:54:07<84:39:13, 8.69s/it][2025-04-25 22:51:50,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 22:51:50,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.96 | bwd_microstep: 5677.54 | bwd_inner_microstep: 5631.88 | bwd_allreduce_microstep: 45.61 | step_microstep: 18.60 [2025-04-25 22:51:50,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.96 | bwd: 5677.55 | bwd_inner: 5631.88 | bwd_allreduce: 45.63 | step: 18.61 15%|█▍ | 6167/41250 [14:54:15<84:21:20, 8.66s/it] {'loss': 0.1657, 'grad_norm': 1.2612799406051636, 'learning_rate': 3.8520859009177236e-05, 'epoch': 1.5} 15%|█▍ | 6167/41250 [14:54:15<84:21:20, 8.66s/it][2025-04-25 22:51:59,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:51:59,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.80 | bwd_microstep: 5734.35 | bwd_inner_microstep: 5680.75 | bwd_allreduce_microstep: 53.56 | step_microstep: 18.19 [2025-04-25 22:51:59,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.80 | bwd: 5734.37 | bwd_inner: 5680.75 | bwd_allreduce: 53.58 | step: 18.19 15%|█▍ | 6168/41250 [14:54:24<84:24:55, 8.66s/it] {'loss': 0.1137, 'grad_norm': 1.6131759881973267, 'learning_rate': 3.852026628285878e-05, 'epoch': 1.5} 15%|█▍ | 6168/41250 [14:54:24<84:24:55, 8.66s/it][2025-04-25 22:52:07,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.42 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:52:07,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.49 | bwd_microstep: 5723.45 | bwd_inner_microstep: 5686.71 | bwd_allreduce_microstep: 36.70 | step_microstep: 18.26 [2025-04-25 22:52:07,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.49 | bwd: 5723.47 | bwd_inner: 5686.71 | bwd_allreduce: 36.71 | step: 18.26 15%|█▍ | 6169/41250 [14:54:33<84:24:50, 8.66s/it] {'loss': 0.292, 'grad_norm': 3.054283618927002, 'learning_rate': 3.8519673442366524e-05, 'epoch': 1.5} 15%|█▍ | 6169/41250 [14:54:33<84:24:50, 8.66s/it][2025-04-25 22:52:16,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:52:16,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.81 | bwd_microstep: 5753.96 | bwd_inner_microstep: 5647.83 | bwd_allreduce_microstep: 106.09 | step_microstep: 18.13 [2025-04-25 22:52:16,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.82 | bwd: 5753.97 | bwd_inner: 5647.83 | bwd_allreduce: 106.11 | step: 18.13 15%|█▍ | 6170/41250 [14:54:41<84:26:03, 8.66s/it] {'loss': 0.0375, 'grad_norm': 0.6268666982650757, 'learning_rate': 3.8519080487704116e-05, 'epoch': 1.5} 15%|█▍ | 6170/41250 [14:54:41<84:26:03, 8.66s/it][2025-04-25 22:52:25,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:52:25,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.27 | bwd_microstep: 5713.26 | bwd_inner_microstep: 5700.48 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.45 [2025-04-25 22:52:25,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.27 | bwd: 5713.27 | bwd_inner: 5700.48 | bwd_allreduce: 12.75 | step: 18.45 15%|█▍ | 6171/41250 [14:54:50<84:23:58, 8.66s/it] 
{'loss': 0.1004, 'grad_norm': 1.8482292890548706, 'learning_rate': 3.851848741887524e-05, 'epoch': 1.5} 15%|█▍ | 6171/41250 [14:54:50<84:23:58, 8.66s/it][2025-04-25 22:52:33,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:52:33,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.34 | bwd_microstep: 5704.74 | bwd_inner_microstep: 5643.92 | bwd_allreduce_microstep: 60.78 | step_microstep: 18.36 [2025-04-25 22:52:33,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.34 | bwd: 5704.76 | bwd_inner: 5643.92 | bwd_allreduce: 60.80 | step: 18.37 15%|█▍ | 6172/41250 [14:54:59<84:17:14, 8.65s/it] {'loss': 0.2443, 'grad_norm': 2.441150665283203, 'learning_rate': 3.8517894235883513e-05, 'epoch': 1.5} 15%|█▍ | 6172/41250 [14:54:59<84:17:14, 8.65s/it][2025-04-25 22:52:42,572] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:52:42,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.17 | bwd_microstep: 5757.52 | bwd_inner_microstep: 5645.00 | bwd_allreduce_microstep: 112.48 | step_microstep: 18.38 [2025-04-25 22:52:42,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.17 | bwd: 5757.54 | bwd_inner: 5645.00 | bwd_allreduce: 112.50 | step: 18.39 15%|█▍ | 6173/41250 [14:55:07<84:20:51, 8.66s/it] {'loss': 0.1564, 'grad_norm': 2.3411781787872314, 'learning_rate': 3.851730093873262e-05, 'epoch': 1.5} 15%|█▍ | 6173/41250 [14:55:07<84:20:51, 8.66s/it][2025-04-25 22:52:51,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 22:52:51,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.84 | bwd_microstep: 5697.70 | bwd_inner_microstep: 5684.83 | bwd_allreduce_microstep: 12.83 | 
step_microstep: 18.38 [2025-04-25 22:52:51,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.84 | bwd: 5697.71 | bwd_inner: 5684.83 | bwd_allreduce: 12.84 | step: 18.38 15%|█▍ | 6174/41250 [14:55:16<84:14:48, 8.65s/it] {'loss': 0.1519, 'grad_norm': 1.1046115159988403, 'learning_rate': 3.85167075274262e-05, 'epoch': 1.5} 15%|█▍ | 6174/41250 [14:55:16<84:14:48, 8.65s/it][2025-04-25 22:52:59,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:52:59,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.78 | bwd_microstep: 5699.22 | bwd_inner_microstep: 5656.85 | bwd_allreduce_microstep: 42.33 | step_microstep: 18.57 [2025-04-25 22:52:59,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.78 | bwd: 5699.24 | bwd_inner: 5656.85 | bwd_allreduce: 42.35 | step: 18.57 15%|█▍ | 6175/41250 [14:55:25<84:10:02, 8.64s/it] {'loss': 0.0642, 'grad_norm': 1.2515075206756592, 'learning_rate': 3.851611400196793e-05, 'epoch': 1.5} 15%|█▍ | 6175/41250 [14:55:25<84:10:02, 8.64s/it][2025-04-25 22:53:08,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:53:08,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.17 | bwd_microstep: 5723.79 | bwd_inner_microstep: 5692.65 | bwd_allreduce_microstep: 31.10 | step_microstep: 18.38 [2025-04-25 22:53:08,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.17 | bwd: 5723.80 | bwd_inner: 5692.65 | bwd_allreduce: 31.12 | step: 18.39 15%|█▍ | 6176/41250 [14:55:33<84:14:45, 8.65s/it] {'loss': 0.0932, 'grad_norm': 1.2805044651031494, 'learning_rate': 3.851552036236145e-05, 'epoch': 1.5} 15%|█▍ | 6176/41250 [14:55:33<84:14:45, 8.65s/it][2025-04-25 22:53:17,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:53:17,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.05 | bwd_microstep: 5711.87 | bwd_inner_microstep: 5661.75 | bwd_allreduce_microstep: 50.08 | step_microstep: 18.17 [2025-04-25 22:53:17,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.05 | bwd: 5711.88 | bwd_inner: 5661.74 | bwd_allreduce: 50.10 | step: 18.17 15%|█▍ | 6177/41250 [14:55:42<84:10:58, 8.64s/it] {'loss': 0.2782, 'grad_norm': 3.670531749725342, 'learning_rate': 3.851492660861043e-05, 'epoch': 1.5} 15%|█▍ | 6177/41250 [14:55:42<84:10:58, 8.64s/it][2025-04-25 22:53:25,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 22:53:25,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.10 | bwd_microstep: 5767.07 | bwd_inner_microstep: 5662.38 | bwd_allreduce_microstep: 104.65 | step_microstep: 18.46 [2025-04-25 22:53:25,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.10 | bwd: 5767.08 | bwd_inner: 5662.38 | bwd_allreduce: 104.66 | step: 18.46 15%|█▍ | 6178/41250 [14:55:51<84:19:57, 8.66s/it] {'loss': 0.1357, 'grad_norm': 4.0106892585754395, 'learning_rate': 3.8514332740718534e-05, 'epoch': 1.5} 15%|█▍ | 6178/41250 [14:55:51<84:19:57, 8.66s/it][2025-04-25 22:53:34,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:53:34,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.92 | bwd_microstep: 5768.76 | bwd_inner_microstep: 5666.39 | bwd_allreduce_microstep: 102.32 | step_microstep: 18.44 [2025-04-25 22:53:34,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.92 | bwd: 5768.77 | bwd_inner: 5666.39 | bwd_allreduce: 102.34 | step: 18.44 15%|█▍ | 6179/41250 [14:55:59<84:23:52, 8.66s/it] {'loss': 0.1105, 'grad_norm': 
1.5755935907363892, 'learning_rate': 3.851373875868942e-05, 'epoch': 1.5} 15%|█▍ | 6179/41250 [14:55:59<84:23:52, 8.66s/it][2025-04-25 22:53:43,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-25 22:53:43,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.11 | bwd_microstep: 5768.47 | bwd_inner_microstep: 5657.11 | bwd_allreduce_microstep: 111.31 | step_microstep: 18.82 [2025-04-25 22:53:43,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.11 | bwd: 5768.48 | bwd_inner: 5657.11 | bwd_allreduce: 111.33 | step: 18.82 15%|█▍ | 6180/41250 [14:56:08<84:27:52, 8.67s/it] {'loss': 0.1273, 'grad_norm': 2.3600151538848877, 'learning_rate': 3.8513144662526745e-05, 'epoch': 1.5} 15%|█▍ | 6180/41250 [14:56:08<84:27:52, 8.67s/it][2025-04-25 22:53:51,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:53:51,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.72 | bwd_microstep: 5769.56 | bwd_inner_microstep: 5661.46 | bwd_allreduce_microstep: 108.05 | step_microstep: 19.06 [2025-04-25 22:53:51,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.72 | bwd: 5769.58 | bwd_inner: 5661.46 | bwd_allreduce: 108.07 | step: 19.06 15%|█▍ | 6181/41250 [14:56:17<84:30:25, 8.68s/it] {'loss': 0.3245, 'grad_norm': 3.9293692111968994, 'learning_rate': 3.8512550452234175e-05, 'epoch': 1.5} 15%|█▍ | 6181/41250 [14:56:17<84:30:25, 8.68s/it][2025-04-25 22:54:00,513] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:54:00,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.07 | bwd_microstep: 5715.67 | bwd_inner_microstep: 5702.97 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.52 [2025-04-25 
22:54:00,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.07 | bwd: 5715.68 | bwd_inner: 5702.97 | bwd_allreduce: 12.67 | step: 18.53 15%|█▍ | 6182/41250 [14:56:25<84:27:37, 8.67s/it] {'loss': 0.2518, 'grad_norm': 5.110779762268066, 'learning_rate': 3.851195612781537e-05, 'epoch': 1.5} 15%|█▍ | 6182/41250 [14:56:25<84:27:37, 8.67s/it][2025-04-25 22:54:09,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-25 22:54:09,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.90 | bwd_microstep: 5713.62 | bwd_inner_microstep: 5663.31 | bwd_allreduce_microstep: 50.26 | step_microstep: 18.89 [2025-04-25 22:54:09,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.90 | bwd: 5713.63 | bwd_inner: 5663.31 | bwd_allreduce: 50.28 | step: 18.89 15%|█▍ | 6183/41250 [14:56:34<84:20:50, 8.66s/it] {'loss': 0.2755, 'grad_norm': 2.525625228881836, 'learning_rate': 3.8511361689274e-05, 'epoch': 1.5} 15%|█▍ | 6183/41250 [14:56:34<84:20:50, 8.66s/it][2025-04-25 22:54:17,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.94 [2025-04-25 22:54:17,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.25 | bwd_microstep: 5752.52 | bwd_inner_microstep: 5689.01 | bwd_allreduce_microstep: 63.47 | step_microstep: 18.74 [2025-04-25 22:54:17,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.25 | bwd: 5752.54 | bwd_inner: 5689.01 | bwd_allreduce: 63.49 | step: 18.74 15%|█▍ | 6184/41250 [14:56:43<84:25:45, 8.67s/it] {'loss': 0.1363, 'grad_norm': 2.4097681045532227, 'learning_rate': 3.851076713661372e-05, 'epoch': 1.5} 15%|█▍ | 6184/41250 [14:56:43<84:25:45, 8.67s/it][2025-04-25 22:54:26,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.93 | optimizer_gradients: 0.97 | optimizer_step: 0.90 
[2025-04-25 22:54:26,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.35 | bwd_microstep: 5790.25 | bwd_inner_microstep: 5644.08 | bwd_allreduce_microstep: 146.13 | step_microstep: 18.05 [2025-04-25 22:54:26,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.35 | bwd: 5790.27 | bwd_inner: 5644.08 | bwd_allreduce: 146.15 | step: 18.05 15%|█▍ | 6185/41250 [14:56:51<84:32:06, 8.68s/it] {'loss': 0.1014, 'grad_norm': 2.7060701847076416, 'learning_rate': 3.851017246983821e-05, 'epoch': 1.5} 15%|█▍ | 6185/41250 [14:56:51<84:32:06, 8.68s/it][2025-04-25 22:54:35,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.93 [2025-04-25 22:54:35,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.62 | bwd_microstep: 5898.22 | bwd_inner_microstep: 5659.24 | bwd_allreduce_microstep: 238.94 | step_microstep: 18.64 [2025-04-25 22:54:35,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.62 | bwd: 5898.24 | bwd_inner: 5659.24 | bwd_allreduce: 238.96 | step: 18.64 15%|█▍ | 6186/41250 [14:57:00<84:58:02, 8.72s/it] {'loss': 0.0554, 'grad_norm': 1.121355652809143, 'learning_rate': 3.850957768895112e-05, 'epoch': 1.5} 15%|█▍ | 6186/41250 [14:57:00<84:58:02, 8.72s/it][2025-04-25 22:54:44,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-25 22:54:44,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.30 | bwd_microstep: 5790.61 | bwd_inner_microstep: 5666.06 | bwd_allreduce_microstep: 124.50 | step_microstep: 18.71 [2025-04-25 22:54:44,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.30 | bwd: 5790.63 | bwd_inner: 5666.06 | bwd_allreduce: 124.52 | step: 18.72 15%|█▍ | 6187/41250 [14:57:09<84:55:18, 8.72s/it] {'loss': 0.2221, 'grad_norm': 1.7660428285598755, 'learning_rate': 
3.8508982793956126e-05, 'epoch': 1.5} 15%|█▍ | 6187/41250 [14:57:09<84:55:18, 8.72s/it][2025-04-25 22:54:52,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.22 | optimizer_step: 0.99 [2025-04-25 22:54:52,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.79 | bwd_microstep: 5729.95 | bwd_inner_microstep: 5716.68 | bwd_allreduce_microstep: 13.22 | step_microstep: 19.63 [2025-04-25 22:54:52,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.79 | bwd: 5729.96 | bwd_inner: 5716.68 | bwd_allreduce: 13.24 | step: 19.63 15%|█▌ | 6188/41250 [14:57:18<84:46:23, 8.70s/it] {'loss': 0.1031, 'grad_norm': 1.3913403749465942, 'learning_rate': 3.85083877848569e-05, 'epoch': 1.5} 15%|█▌ | 6188/41250 [14:57:18<84:46:23, 8.70s/it][2025-04-25 22:55:01,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-25 22:55:01,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.49 | bwd_microstep: 5713.85 | bwd_inner_microstep: 5701.03 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.02 [2025-04-25 22:55:01,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.49 | bwd: 5713.87 | bwd_inner: 5701.02 | bwd_allreduce: 12.80 | step: 19.02 15%|█▌ | 6189/41250 [14:57:26<84:35:55, 8.69s/it] {'loss': 0.1866, 'grad_norm': 2.446626663208008, 'learning_rate': 3.850779266165709e-05, 'epoch': 1.5} 15%|█▌ | 6189/41250 [14:57:26<84:35:55, 8.69s/it][2025-04-25 22:55:10,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.29 | optimizer_step: 0.95 [2025-04-25 22:55:10,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.11 | bwd_microstep: 5785.40 | bwd_inner_microstep: 5706.18 | bwd_allreduce_microstep: 79.17 | step_microstep: 19.48 [2025-04-25 22:55:10,115] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.11 | bwd: 5785.42 | bwd_inner: 5706.18 | bwd_allreduce: 79.20 | step: 19.48 15%|█▌ | 6190/41250 [14:57:35<84:42:32, 8.70s/it] {'loss': 0.1667, 'grad_norm': 1.8829782009124756, 'learning_rate': 3.850719742436039e-05, 'epoch': 1.5} 15%|█▌ | 6190/41250 [14:57:35<84:42:32, 8.70s/it][2025-04-25 22:55:18,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.17 | optimizer_step: 1.11 [2025-04-25 22:55:18,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.73 | bwd_microstep: 5761.81 | bwd_inner_microstep: 5713.20 | bwd_allreduce_microstep: 48.55 | step_microstep: 19.84 [2025-04-25 22:55:18,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.73 | bwd: 5761.82 | bwd_inner: 5713.20 | bwd_allreduce: 48.57 | step: 19.84 15%|█▌ | 6191/41250 [14:57:44<84:43:23, 8.70s/it] {'loss': 0.3206, 'grad_norm': 2.4337284564971924, 'learning_rate': 3.850660207297046e-05, 'epoch': 1.5} 15%|█▌ | 6191/41250 [14:57:44<84:43:23, 8.70s/it][2025-04-25 22:55:27,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 22:55:27,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.62 | bwd_microstep: 5764.58 | bwd_inner_microstep: 5687.50 | bwd_allreduce_microstep: 77.03 | step_microstep: 19.04 [2025-04-25 22:55:27,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.62 | bwd: 5764.59 | bwd_inner: 5687.50 | bwd_allreduce: 77.05 | step: 19.04 15%|█▌ | 6192/41250 [14:57:52<84:43:27, 8.70s/it] {'loss': 0.0805, 'grad_norm': 3.823103904724121, 'learning_rate': 3.850600660749096e-05, 'epoch': 1.5} 15%|█▌ | 6192/41250 [14:57:52<84:43:27, 8.70s/it][2025-04-25 22:55:36,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.36 | optimizer_step: 1.07 [2025-04-25 
22:55:36,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.61 | bwd_microstep: 5729.05 | bwd_inner_microstep: 5697.38 | bwd_allreduce_microstep: 31.60 | step_microstep: 20.46 [2025-04-25 22:55:36,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.61 | bwd: 5729.06 | bwd_inner: 5697.38 | bwd_allreduce: 31.63 | step: 20.46 15%|█▌ | 6193/41250 [14:58:01<84:37:44, 8.69s/it] {'loss': 0.0674, 'grad_norm': 1.1589604616165161, 'learning_rate': 3.8505411027925574e-05, 'epoch': 1.5} 15%|█▌ | 6193/41250 [14:58:01<84:37:44, 8.69s/it][2025-04-25 22:55:44,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:55:44,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.71 | bwd_microstep: 5722.19 | bwd_inner_microstep: 5709.46 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.10 [2025-04-25 22:55:44,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.71 | bwd: 5722.20 | bwd_inner: 5709.46 | bwd_allreduce: 12.70 | step: 19.10 15%|█▌ | 6194/41250 [14:58:10<84:31:22, 8.68s/it] {'loss': 0.2994, 'grad_norm': 1.6273412704467773, 'learning_rate': 3.850481533427797e-05, 'epoch': 1.5} 15%|█▌ | 6194/41250 [14:58:10<84:31:22, 8.68s/it][2025-04-25 22:55:53,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-25 22:55:53,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.47 | bwd_microstep: 5778.57 | bwd_inner_microstep: 5672.10 | bwd_allreduce_microstep: 106.42 | step_microstep: 19.27 [2025-04-25 22:55:53,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.47 | bwd: 5778.58 | bwd_inner: 5672.10 | bwd_allreduce: 106.44 | step: 19.27 15%|█▌ | 6195/41250 [14:58:18<84:35:29, 8.69s/it] {'loss': 0.0507, 'grad_norm': 1.2732257843017578, 'learning_rate': 3.850421952655181e-05, 
'epoch': 1.5} 15%|█▌ | 6195/41250 [14:58:18<84:35:29, 8.69s/it][2025-04-25 22:56:02,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:56:02,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.62 | bwd_microstep: 5780.91 | bwd_inner_microstep: 5657.28 | bwd_allreduce_microstep: 123.59 | step_microstep: 18.66 [2025-04-25 22:56:02,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.62 | bwd: 5780.93 | bwd_inner: 5657.28 | bwd_allreduce: 123.61 | step: 18.67 15%|█▌ | 6196/41250 [14:58:27<84:35:23, 8.69s/it] {'loss': 0.1656, 'grad_norm': 2.6563336849212646, 'learning_rate': 3.850362360475079e-05, 'epoch': 1.5} 15%|█▌ | 6196/41250 [14:58:27<84:35:23, 8.69s/it][2025-04-25 22:56:11,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 22:56:11,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2932.34 | bwd_microstep: 5863.73 | bwd_inner_microstep: 5850.67 | bwd_allreduce_microstep: 13.01 | step_microstep: 19.69 [2025-04-25 22:56:11,114] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2932.34 | bwd: 5863.74 | bwd_inner: 5850.67 | bwd_allreduce: 13.03 | step: 19.69 15%|█▌ | 6197/41250 [14:58:36<85:09:01, 8.75s/it] {'loss': 0.1175, 'grad_norm': 2.104738473892212, 'learning_rate': 3.850302756887856e-05, 'epoch': 1.5} 15%|█▌ | 6197/41250 [14:58:36<85:09:01, 8.75s/it][2025-04-25 22:56:19,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:56:19,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.89 | bwd_microstep: 5725.03 | bwd_inner_microstep: 5712.31 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.77 [2025-04-25 22:56:19,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2855.89 | bwd: 5725.05 | bwd_inner: 5712.31 | bwd_allreduce: 12.70 | step: 18.78 15%|█▌ | 6198/41250 [14:58:45<84:54:38, 8.72s/it] {'loss': 0.1689, 'grad_norm': 2.7549808025360107, 'learning_rate': 3.850243141893881e-05, 'epoch': 1.5} 15%|█▌ | 6198/41250 [14:58:45<84:54:38, 8.72s/it][2025-04-25 22:56:28,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:56:28,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.47 | bwd_microstep: 5879.23 | bwd_inner_microstep: 5708.61 | bwd_allreduce_microstep: 170.58 | step_microstep: 18.82 [2025-04-25 22:56:28,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.47 | bwd: 5879.25 | bwd_inner: 5708.61 | bwd_allreduce: 170.60 | step: 18.82 15%|█▌ | 6199/41250 [14:58:53<85:10:50, 8.75s/it] {'loss': 0.1928, 'grad_norm': 1.6946865320205688, 'learning_rate': 3.850183515493521e-05, 'epoch': 1.5} 15%|█▌ | 6199/41250 [14:58:53<85:10:50, 8.75s/it][2025-04-25 22:56:37,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:56:37,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.41 | bwd_microstep: 5721.65 | bwd_inner_microstep: 5648.63 | bwd_allreduce_microstep: 72.98 | step_microstep: 18.87 [2025-04-25 22:56:37,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.41 | bwd: 5721.66 | bwd_inner: 5648.63 | bwd_allreduce: 72.99 | step: 18.88 15%|█▌ | 6200/41250 [14:59:02<84:49:51, 8.71s/it] {'loss': 0.0957, 'grad_norm': 1.8498295545578003, 'learning_rate': 3.850123877687143e-05, 'epoch': 1.5} 15%|█▌ | 6200/41250 [14:59:02<84:49:51, 8.71s/it][2025-04-25 22:56:45,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-25 22:56:45,872] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2850.69 | bwd_microstep: 5717.00 | bwd_inner_microstep: 5704.12 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.38 [2025-04-25 22:56:45,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.69 | bwd: 5717.02 | bwd_inner: 5704.12 | bwd_allreduce: 12.86 | step: 19.38 15%|█▌ | 6201/41250 [14:59:11<84:38:48, 8.69s/it] {'loss': 0.1657, 'grad_norm': 2.120957612991333, 'learning_rate': 3.8500642284751156e-05, 'epoch': 1.5} 15%|█▌ | 6201/41250 [14:59:11<84:38:48, 8.69s/it][2025-04-25 22:56:54,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 22:56:54,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.69 | bwd_microstep: 5761.82 | bwd_inner_microstep: 5701.94 | bwd_allreduce_microstep: 59.83 | step_microstep: 18.82 [2025-04-25 22:56:54,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.69 | bwd: 5761.83 | bwd_inner: 5701.94 | bwd_allreduce: 59.85 | step: 18.82 15%|█▌ | 6202/41250 [14:59:19<84:40:46, 8.70s/it] {'loss': 0.0903, 'grad_norm': 1.5932804346084595, 'learning_rate': 3.850004567857806e-05, 'epoch': 1.5} 15%|█▌ | 6202/41250 [14:59:19<84:40:46, 8.70s/it][2025-04-25 22:57:03,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 22:57:03,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.98 | bwd_microstep: 5785.30 | bwd_inner_microstep: 5642.03 | bwd_allreduce_microstep: 143.22 | step_microstep: 18.69 [2025-04-25 22:57:03,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.98 | bwd: 5785.31 | bwd_inner: 5642.03 | bwd_allreduce: 143.24 | step: 18.70 15%|█▌ | 6203/41250 [14:59:28<84:40:30, 8.70s/it] {'loss': 0.1424, 'grad_norm': 1.3928691148757935, 'learning_rate': 3.849944895835582e-05, 'epoch': 1.5} 15%|█▌ | 6203/41250 [14:59:28<84:40:30, 
8.70s/it][2025-04-25 22:57:12,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:57:12,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.19 | bwd_microstep: 5769.52 | bwd_inner_microstep: 5757.01 | bwd_allreduce_microstep: 12.47 | step_microstep: 18.66 [2025-04-25 22:57:12,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.19 | bwd: 5769.53 | bwd_inner: 5757.01 | bwd_allreduce: 12.48 | step: 18.66 15%|█▌ | 6204/41250 [14:59:37<84:45:31, 8.71s/it] {'loss': 0.2484, 'grad_norm': 6.391931056976318, 'learning_rate': 3.849885212408811e-05, 'epoch': 1.5} 15%|█▌ | 6204/41250 [14:59:37<84:45:31, 8.71s/it][2025-04-25 22:57:20,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.97 | optimizer_step: 1.08 [2025-04-25 22:57:20,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.54 | bwd_microstep: 5716.48 | bwd_inner_microstep: 5703.69 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.70 [2025-04-25 22:57:20,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.54 | bwd: 5716.49 | bwd_inner: 5703.69 | bwd_allreduce: 12.76 | step: 18.70 15%|█▌ | 6205/41250 [14:59:45<84:35:45, 8.69s/it] {'loss': 0.1074, 'grad_norm': 1.4411994218826294, 'learning_rate': 3.849825517577863e-05, 'epoch': 1.5} 15%|█▌ | 6205/41250 [14:59:45<84:35:45, 8.69s/it][2025-04-25 22:57:29,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 22:57:29,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.29 | bwd_microstep: 5809.55 | bwd_inner_microstep: 5645.22 | bwd_allreduce_microstep: 164.27 | step_microstep: 18.98 [2025-04-25 22:57:29,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.29 | bwd: 5809.57 | bwd_inner: 5645.22 | 
bwd_allreduce: 164.29 | step: 18.98 15%|█▌ | 6206/41250 [14:59:54<84:40:33, 8.70s/it] {'loss': 0.1777, 'grad_norm': 2.9328393936157227, 'learning_rate': 3.849765811343103e-05, 'epoch': 1.5} 15%|█▌ | 6206/41250 [14:59:54<84:40:33, 8.70s/it][2025-04-25 22:57:38,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:57:38,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.31 | bwd_microstep: 5766.96 | bwd_inner_microstep: 5652.30 | bwd_allreduce_microstep: 114.61 | step_microstep: 18.79 [2025-04-25 22:57:38,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.31 | bwd: 5766.98 | bwd_inner: 5652.30 | bwd_allreduce: 114.63 | step: 18.79 15%|█▌ | 6207/41250 [15:00:03<84:36:24, 8.69s/it] {'loss': 0.0903, 'grad_norm': 1.5230047702789307, 'learning_rate': 3.8497060937049014e-05, 'epoch': 1.5} 15%|█▌ | 6207/41250 [15:00:03<84:36:24, 8.69s/it][2025-04-25 22:57:46,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 22:57:46,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.04 | bwd_microstep: 5767.73 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 119.23 | step_microstep: 18.69 [2025-04-25 22:57:46,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.04 | bwd: 5767.74 | bwd_inner: 5648.45 | bwd_allreduce: 119.25 | step: 18.69 15%|█▌ | 6208/41250 [15:00:12<84:35:02, 8.69s/it] {'loss': 0.0878, 'grad_norm': 1.7612193822860718, 'learning_rate': 3.849646364663626e-05, 'epoch': 1.5} 15%|█▌ | 6208/41250 [15:00:12<84:35:02, 8.69s/it][2025-04-25 22:57:55,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.95 | optimizer_step: 0.91 [2025-04-25 22:57:55,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.40 | 
bwd_microstep: 5738.79 | bwd_inner_microstep: 5687.37 | bwd_allreduce_microstep: 51.37 | step_microstep: 18.40 [2025-04-25 22:57:55,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.40 | bwd: 5738.80 | bwd_inner: 5687.37 | bwd_allreduce: 51.39 | step: 18.40 15%|█▌ | 6209/41250 [15:00:20<84:31:01, 8.68s/it] {'loss': 0.1453, 'grad_norm': 1.7372223138809204, 'learning_rate': 3.849586624219644e-05, 'epoch': 1.51} 15%|█▌ | 6209/41250 [15:00:20<84:31:01, 8.68s/it][2025-04-25 22:58:04,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 22:58:04,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.27 | bwd_microstep: 5894.88 | bwd_inner_microstep: 5649.45 | bwd_allreduce_microstep: 245.38 | step_microstep: 18.59 [2025-04-25 22:58:04,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.27 | bwd: 5894.89 | bwd_inner: 5649.45 | bwd_allreduce: 245.40 | step: 18.59 15%|█▌ | 6210/41250 [15:00:29<84:51:53, 8.72s/it] {'loss': 0.1915, 'grad_norm': 2.5882668495178223, 'learning_rate': 3.849526872373324e-05, 'epoch': 1.51} 15%|█▌ | 6210/41250 [15:00:29<84:51:53, 8.72s/it][2025-04-25 22:58:12,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-25 22:58:12,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.42 | bwd_microstep: 5747.32 | bwd_inner_microstep: 5691.38 | bwd_allreduce_microstep: 55.89 | step_microstep: 18.86 [2025-04-25 22:58:12,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.42 | bwd: 5747.33 | bwd_inner: 5691.38 | bwd_allreduce: 55.91 | step: 18.86 15%|█▌ | 6211/41250 [15:00:38<84:43:38, 8.71s/it] {'loss': 0.0652, 'grad_norm': 1.0084086656570435, 'learning_rate': 3.849467109125034e-05, 'epoch': 1.51} 15%|█▌ | 6211/41250 [15:00:38<84:43:38, 8.71s/it][2025-04-25 22:58:21,489] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 22:58:21,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.72 | bwd_microstep: 5709.47 | bwd_inner_microstep: 5636.97 | bwd_allreduce_microstep: 72.45 | step_microstep: 18.70 [2025-04-25 22:58:21,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.72 | bwd: 5709.48 | bwd_inner: 5636.97 | bwd_allreduce: 72.47 | step: 18.71 15%|█▌ | 6212/41250 [15:00:46<84:27:11, 8.68s/it] {'loss': 0.143, 'grad_norm': 4.994961261749268, 'learning_rate': 3.849407334475144e-05, 'epoch': 1.51} 15%|█▌ | 6212/41250 [15:00:46<84:27:11, 8.68s/it][2025-04-25 22:58:30,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 22:58:30,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.09 | bwd_microstep: 5710.26 | bwd_inner_microstep: 5697.43 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.69 [2025-04-25 22:58:30,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.09 | bwd: 5710.28 | bwd_inner: 5697.43 | bwd_allreduce: 12.80 | step: 18.69 15%|█▌ | 6213/41250 [15:00:55<84:19:58, 8.67s/it] {'loss': 0.0654, 'grad_norm': 1.656506896018982, 'learning_rate': 3.8493475484240216e-05, 'epoch': 1.51} 15%|█▌ | 6213/41250 [15:00:55<84:19:58, 8.67s/it][2025-04-25 22:58:38,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:58:38,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.15 | bwd_microstep: 5697.87 | bwd_inner_microstep: 5685.19 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.67 [2025-04-25 22:58:38,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.15 | bwd: 5697.89 | bwd_inner: 5685.19 | bwd_allreduce: 12.66 | step: 18.68 
15%|█▌ | 6214/41250 [15:01:04<84:12:37, 8.65s/it] {'loss': 0.2776, 'grad_norm': 2.6605417728424072, 'learning_rate': 3.8492877509720354e-05, 'epoch': 1.51} 15%|█▌ | 6214/41250 [15:01:04<84:12:37, 8.65s/it][2025-04-25 22:58:47,427] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:58:47,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.57 | bwd_microstep: 5754.30 | bwd_inner_microstep: 5680.99 | bwd_allreduce_microstep: 73.26 | step_microstep: 18.58 [2025-04-25 22:58:47,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.57 | bwd: 5754.32 | bwd_inner: 5680.99 | bwd_allreduce: 73.28 | step: 18.59 15%|█▌ | 6215/41250 [15:01:12<84:16:49, 8.66s/it] {'loss': 0.0833, 'grad_norm': 0.9651370644569397, 'learning_rate': 3.849227942119554e-05, 'epoch': 1.51} 15%|█▌ | 6215/41250 [15:01:12<84:16:49, 8.66s/it][2025-04-25 22:58:56,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:58:56,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.65 | bwd_microstep: 5786.31 | bwd_inner_microstep: 5649.31 | bwd_allreduce_microstep: 136.96 | step_microstep: 18.78 [2025-04-25 22:58:56,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.65 | bwd: 5786.33 | bwd_inner: 5649.31 | bwd_allreduce: 136.97 | step: 18.78 15%|█▌ | 6216/41250 [15:01:21<84:22:35, 8.67s/it] {'loss': 0.054, 'grad_norm': 1.4919657707214355, 'learning_rate': 3.849168121866946e-05, 'epoch': 1.51} 15%|█▌ | 6216/41250 [15:01:21<84:22:35, 8.67s/it][2025-04-25 22:59:04,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:59:04,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.39 | bwd_microstep: 5697.51 | bwd_inner_microstep: 
5684.73 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.72 [2025-04-25 22:59:04,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.39 | bwd: 5697.52 | bwd_inner: 5684.73 | bwd_allreduce: 12.75 | step: 18.72 15%|█▌ | 6217/41250 [15:01:30<84:13:41, 8.66s/it] {'loss': 0.038, 'grad_norm': 0.8252747654914856, 'learning_rate': 3.84910829021458e-05, 'epoch': 1.51} 15%|█▌ | 6217/41250 [15:01:30<84:13:41, 8.66s/it][2025-04-25 22:59:13,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 22:59:13,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.50 | bwd_microstep: 5737.08 | bwd_inner_microstep: 5700.95 | bwd_allreduce_microstep: 36.09 | step_microstep: 18.88 [2025-04-25 22:59:13,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.50 | bwd: 5737.09 | bwd_inner: 5700.95 | bwd_allreduce: 36.11 | step: 18.88 15%|█▌ | 6218/41250 [15:01:38<84:16:10, 8.66s/it] {'loss': 0.2056, 'grad_norm': 3.8313026428222656, 'learning_rate': 3.849048447162825e-05, 'epoch': 1.51} 15%|█▌ | 6218/41250 [15:01:38<84:16:10, 8.66s/it][2025-04-25 22:59:22,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 22:59:22,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.63 | bwd_microstep: 5757.20 | bwd_inner_microstep: 5691.55 | bwd_allreduce_microstep: 65.60 | step_microstep: 18.62 [2025-04-25 22:59:22,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.63 | bwd: 5757.21 | bwd_inner: 5691.55 | bwd_allreduce: 65.62 | step: 18.62 15%|█▌ | 6219/41250 [15:01:47<84:20:20, 8.67s/it] {'loss': 0.0492, 'grad_norm': 0.670257568359375, 'learning_rate': 3.84898859271205e-05, 'epoch': 1.51} 15%|█▌ | 6219/41250 [15:01:47<84:20:20, 8.67s/it][2025-04-25 22:59:30,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.50 | optimizer_gradients: 1.04 | optimizer_step: 0.95 [2025-04-25 22:59:30,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.08 | bwd_microstep: 5685.85 | bwd_inner_microstep: 5650.72 | bwd_allreduce_microstep: 35.09 | step_microstep: 19.19 [2025-04-25 22:59:30,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.08 | bwd: 5685.87 | bwd_inner: 5650.72 | bwd_allreduce: 35.11 | step: 19.19 15%|█▌ | 6220/41250 [15:01:56<84:07:44, 8.65s/it] {'loss': 0.1384, 'grad_norm': 2.2895395755767822, 'learning_rate': 3.848928726862624e-05, 'epoch': 1.51} 15%|█▌ | 6220/41250 [15:01:56<84:07:44, 8.65s/it][2025-04-25 22:59:39,299] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 22:59:39,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.27 | bwd_microstep: 5698.90 | bwd_inner_microstep: 5651.05 | bwd_allreduce_microstep: 47.80 | step_microstep: 18.54 [2025-04-25 22:59:39,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.27 | bwd: 5698.91 | bwd_inner: 5651.05 | bwd_allreduce: 47.82 | step: 18.54 15%|█▌ | 6221/41250 [15:02:04<84:00:47, 8.63s/it] {'loss': 0.0506, 'grad_norm': 1.690800428390503, 'learning_rate': 3.848868849614916e-05, 'epoch': 1.51} 15%|█▌ | 6221/41250 [15:02:04<84:00:47, 8.63s/it][2025-04-25 22:59:47,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 22:59:47,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.06 | bwd_microstep: 5730.94 | bwd_inner_microstep: 5689.88 | bwd_allreduce_microstep: 41.02 | step_microstep: 18.32 [2025-04-25 22:59:47,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.06 | bwd: 5730.96 | bwd_inner: 5689.88 | bwd_allreduce: 41.03 | step: 18.33 15%|█▌ | 6222/41250 [15:02:13<84:06:55, 8.64s/it] 
{'loss': 0.0728, 'grad_norm': 1.4212366342544556, 'learning_rate': 3.848808960969296e-05, 'epoch': 1.51} 15%|█▌ | 6222/41250 [15:02:13<84:06:55, 8.64s/it][2025-04-25 22:59:56,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 22:59:56,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.93 | bwd_microstep: 5765.17 | bwd_inner_microstep: 5665.15 | bwd_allreduce_microstep: 99.98 | step_microstep: 18.34 [2025-04-25 22:59:56,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.93 | bwd: 5765.18 | bwd_inner: 5665.15 | bwd_allreduce: 99.99 | step: 18.35 15%|█▌ | 6223/41250 [15:02:21<84:12:52, 8.66s/it] {'loss': 0.135, 'grad_norm': 4.277607440948486, 'learning_rate': 3.848749060926131e-05, 'epoch': 1.51} 15%|█▌ | 6223/41250 [15:02:21<84:12:52, 8.66s/it][2025-04-25 23:00:05,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 1.01 [2025-04-25 23:00:05,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.76 | bwd_microstep: 5760.56 | bwd_inner_microstep: 5657.28 | bwd_allreduce_microstep: 103.23 | step_microstep: 18.71 [2025-04-25 23:00:05,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.76 | bwd: 5760.57 | bwd_inner: 5657.28 | bwd_allreduce: 103.24 | step: 18.71 15%|█▌ | 6224/41250 [15:02:30<84:15:31, 8.66s/it] {'loss': 0.2751, 'grad_norm': 3.292712688446045, 'learning_rate': 3.8486891494857926e-05, 'epoch': 1.51} 15%|█▌ | 6224/41250 [15:02:30<84:15:31, 8.66s/it][2025-04-25 23:00:14,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:00:14,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.66 | bwd_microstep: 5881.22 | bwd_inner_microstep: 5706.19 | bwd_allreduce_microstep: 174.98 | 
step_microstep: 18.63 [2025-04-25 23:00:14,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.67 | bwd: 5881.24 | bwd_inner: 5706.19 | bwd_allreduce: 175.00 | step: 18.64 15%|█▌ | 6225/41250 [15:02:39<84:42:53, 8.71s/it] {'loss': 0.1274, 'grad_norm': 2.5201358795166016, 'learning_rate': 3.8486292266486495e-05, 'epoch': 1.51} 15%|█▌ | 6225/41250 [15:02:39<84:42:53, 8.71s/it][2025-04-25 23:00:22,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 23:00:22,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.80 | bwd_microstep: 5720.36 | bwd_inner_microstep: 5707.65 | bwd_allreduce_microstep: 12.66 | step_microstep: 19.03 [2025-04-25 23:00:22,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.80 | bwd: 5720.37 | bwd_inner: 5707.65 | bwd_allreduce: 12.68 | step: 19.03 15%|█▌ | 6226/41250 [15:02:48<84:33:34, 8.69s/it] {'loss': 0.0643, 'grad_norm': 1.8000659942626953, 'learning_rate': 3.8485692924150704e-05, 'epoch': 1.51} 15%|█▌ | 6226/41250 [15:02:48<84:33:34, 8.69s/it][2025-04-25 23:00:31,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 23:00:31,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.33 | bwd_microstep: 5708.65 | bwd_inner_microstep: 5695.54 | bwd_allreduce_microstep: 13.06 | step_microstep: 18.93 [2025-04-25 23:00:31,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.33 | bwd: 5708.66 | bwd_inner: 5695.54 | bwd_allreduce: 13.08 | step: 18.93 15%|█▌ | 6227/41250 [15:02:56<84:23:31, 8.67s/it] {'loss': 0.0698, 'grad_norm': 1.0044256448745728, 'learning_rate': 3.848509346785425e-05, 'epoch': 1.51} 15%|█▌ | 6227/41250 [15:02:56<84:23:31, 8.67s/it][2025-04-25 23:00:40,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | 
optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:00:40,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.33 | bwd_microstep: 5728.56 | bwd_inner_microstep: 5657.89 | bwd_allreduce_microstep: 70.63 | step_microstep: 18.66 [2025-04-25 23:00:40,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.33 | bwd: 5728.57 | bwd_inner: 5657.89 | bwd_allreduce: 70.64 | step: 18.67 15%|█▌ | 6228/41250 [15:03:05<84:17:34, 8.66s/it] {'loss': 0.023, 'grad_norm': 0.3697053790092468, 'learning_rate': 3.848449389760083e-05, 'epoch': 1.51} 15%|█▌ | 6228/41250 [15:03:05<84:17:34, 8.66s/it][2025-04-25 23:00:48,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:00:48,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.23 | bwd_microstep: 5740.21 | bwd_inner_microstep: 5667.74 | bwd_allreduce_microstep: 72.43 | step_microstep: 18.39 [2025-04-25 23:00:48,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.23 | bwd: 5740.23 | bwd_inner: 5667.74 | bwd_allreduce: 72.44 | step: 18.40 15%|█▌ | 6229/41250 [15:03:14<84:21:16, 8.67s/it] {'loss': 0.2478, 'grad_norm': 2.7276980876922607, 'learning_rate': 3.8483894213394145e-05, 'epoch': 1.51} 15%|█▌ | 6229/41250 [15:03:14<84:21:16, 8.67s/it][2025-04-25 23:00:57,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:00:57,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.34 | bwd_microstep: 5717.77 | bwd_inner_microstep: 5704.93 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.78 [2025-04-25 23:00:57,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.34 | bwd: 5717.78 | bwd_inner: 5704.94 | bwd_allreduce: 12.81 | step: 18.78 15%|█▌ | 6230/41250 [15:03:22<84:18:18, 8.67s/it] {'loss': 0.135, 'grad_norm': 
1.7075779438018799, 'learning_rate': 3.848329441523789e-05, 'epoch': 1.51} 15%|█▌ | 6230/41250 [15:03:22<84:18:18, 8.67s/it][2025-04-25 23:01:06,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-25 23:01:06,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.03 | bwd_microstep: 5716.24 | bwd_inner_microstep: 5702.90 | bwd_allreduce_microstep: 13.30 | step_microstep: 19.24 [2025-04-25 23:01:06,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.03 | bwd: 5716.26 | bwd_inner: 5702.90 | bwd_allreduce: 13.32 | step: 19.24 15%|█▌ | 6231/41250 [15:03:31<84:17:20, 8.67s/it] {'loss': 0.2411, 'grad_norm': 2.5513272285461426, 'learning_rate': 3.8482694503135755e-05, 'epoch': 1.51} 15%|█▌ | 6231/41250 [15:03:31<84:17:20, 8.67s/it][2025-04-25 23:01:14,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.11 | optimizer_step: 0.98 [2025-04-25 23:01:14,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.65 | bwd_microstep: 5765.22 | bwd_inner_microstep: 5669.51 | bwd_allreduce_microstep: 95.66 | step_microstep: 19.04 [2025-04-25 23:01:14,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.65 | bwd: 5765.24 | bwd_inner: 5669.51 | bwd_allreduce: 95.68 | step: 19.04 15%|█▌ | 6232/41250 [15:03:40<84:22:47, 8.67s/it] {'loss': 0.1934, 'grad_norm': 1.8881112337112427, 'learning_rate': 3.848209447709144e-05, 'epoch': 1.51} 15%|█▌ | 6232/41250 [15:03:40<84:22:47, 8.67s/it][2025-04-25 23:01:23,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:01:23,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.44 | bwd_microstep: 5765.45 | bwd_inner_microstep: 5664.18 | bwd_allreduce_microstep: 101.23 | step_microstep: 18.78 [2025-04-25 
23:01:23,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.44 | bwd: 5765.47 | bwd_inner: 5664.18 | bwd_allreduce: 101.24 | step: 18.78 15%|█▌ | 6233/41250 [15:03:48<84:25:25, 8.68s/it] {'loss': 0.0858, 'grad_norm': 2.4616172313690186, 'learning_rate': 3.8481494337108655e-05, 'epoch': 1.51} 15%|█▌ | 6233/41250 [15:03:48<84:25:25, 8.68s/it][2025-04-25 23:01:32,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:01:32,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.69 | bwd_microstep: 5745.52 | bwd_inner_microstep: 5704.00 | bwd_allreduce_microstep: 41.48 | step_microstep: 18.57 [2025-04-25 23:01:32,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.69 | bwd: 5745.54 | bwd_inner: 5704.00 | bwd_allreduce: 41.50 | step: 18.58 15%|█▌ | 6234/41250 [15:03:57<84:26:41, 8.68s/it] {'loss': 0.1643, 'grad_norm': 1.4057248830795288, 'learning_rate': 3.848089408319109e-05, 'epoch': 1.51} 15%|█▌ | 6234/41250 [15:03:57<84:26:41, 8.68s/it][2025-04-25 23:01:40,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.97 | optimizer_step: 1.03 [2025-04-25 23:01:40,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.23 | bwd_microstep: 5708.89 | bwd_inner_microstep: 5696.25 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.34 [2025-04-25 23:01:40,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.23 | bwd: 5708.91 | bwd_inner: 5696.25 | bwd_allreduce: 12.62 | step: 18.34 15%|█▌ | 6235/41250 [15:04:06<84:20:57, 8.67s/it] {'loss': 0.0873, 'grad_norm': 1.6314393281936646, 'learning_rate': 3.848029371534245e-05, 'epoch': 1.51} 15%|█▌ | 6235/41250 [15:04:06<84:20:57, 8.67s/it][2025-04-25 23:01:49,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 
0.90 [2025-04-25 23:01:49,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2931.98 | bwd_microstep: 5702.58 | bwd_inner_microstep: 5690.22 | bwd_allreduce_microstep: 12.32 | step_microstep: 18.56 [2025-04-25 23:01:49,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2931.98 | bwd: 5702.59 | bwd_inner: 5690.22 | bwd_allreduce: 12.34 | step: 18.56 15%|█▌ | 6236/41250 [15:04:14<84:29:08, 8.69s/it] {'loss': 0.0632, 'grad_norm': 1.1282817125320435, 'learning_rate': 3.847969323356643e-05, 'epoch': 1.51} 15%|█▌ | 6236/41250 [15:04:14<84:29:08, 8.69s/it][2025-04-25 23:01:58,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-25 23:01:58,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.74 | bwd_microstep: 5721.01 | bwd_inner_microstep: 5657.64 | bwd_allreduce_microstep: 63.32 | step_microstep: 18.42 [2025-04-25 23:01:58,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.74 | bwd: 5721.02 | bwd_inner: 5657.64 | bwd_allreduce: 63.34 | step: 18.42 15%|█▌ | 6237/41250 [15:04:23<84:19:51, 8.67s/it] {'loss': 0.1619, 'grad_norm': 2.893524169921875, 'learning_rate': 3.847909263786674e-05, 'epoch': 1.51} 15%|█▌ | 6237/41250 [15:04:23<84:19:51, 8.67s/it][2025-04-25 23:02:06,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-25 23:02:06,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.47 | bwd_microstep: 5774.43 | bwd_inner_microstep: 5672.92 | bwd_allreduce_microstep: 101.45 | step_microstep: 18.63 [2025-04-25 23:02:06,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.47 | bwd: 5774.45 | bwd_inner: 5672.92 | bwd_allreduce: 101.47 | step: 18.63 15%|█▌ | 6238/41250 [15:04:32<84:25:33, 8.68s/it] {'loss': 0.0903, 'grad_norm': 0.982167661190033, 'learning_rate': 
3.847849192824708e-05, 'epoch': 1.51} 15%|█▌ | 6238/41250 [15:04:32<84:25:33, 8.68s/it][2025-04-25 23:02:15,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.21 | optimizer_step: 1.04 [2025-04-25 23:02:15,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.00 | bwd_microstep: 5738.08 | bwd_inner_microstep: 5716.75 | bwd_allreduce_microstep: 21.28 | step_microstep: 19.80 [2025-04-25 23:02:15,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.00 | bwd: 5738.09 | bwd_inner: 5716.75 | bwd_allreduce: 21.30 | step: 19.80 15%|█▌ | 6239/41250 [15:04:40<84:25:22, 8.68s/it] {'loss': 0.0732, 'grad_norm': 1.1033720970153809, 'learning_rate': 3.847789110471115e-05, 'epoch': 1.51} 15%|█▌ | 6239/41250 [15:04:40<84:25:22, 8.68s/it][2025-04-25 23:02:24,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 23:02:24,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.52 | bwd_microstep: 5794.67 | bwd_inner_microstep: 5650.66 | bwd_allreduce_microstep: 143.96 | step_microstep: 18.69 [2025-04-25 23:02:24,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.52 | bwd: 5794.69 | bwd_inner: 5650.66 | bwd_allreduce: 143.98 | step: 18.69 15%|█▌ | 6240/41250 [15:04:49<84:29:44, 8.69s/it] {'loss': 0.3038, 'grad_norm': 2.864821434020996, 'learning_rate': 3.847729016726265e-05, 'epoch': 1.51} 15%|█▌ | 6240/41250 [15:04:49<84:29:44, 8.69s/it][2025-04-25 23:02:32,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:02:32,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.81 | bwd_microstep: 5725.25 | bwd_inner_microstep: 5664.82 | bwd_allreduce_microstep: 60.38 | step_microstep: 18.62 [2025-04-25 23:02:32,878] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.81 | bwd: 5725.26 | bwd_inner: 5664.82 | bwd_allreduce: 60.40 | step: 18.62 15%|█▌ | 6241/41250 [15:04:58<84:20:05, 8.67s/it] {'loss': 0.2516, 'grad_norm': 2.4901020526885986, 'learning_rate': 3.84766891159053e-05, 'epoch': 1.51} 15%|█▌ | 6241/41250 [15:04:58<84:20:05, 8.67s/it][2025-04-25 23:02:41,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:02:41,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.20 | bwd_microstep: 5763.71 | bwd_inner_microstep: 5694.77 | bwd_allreduce_microstep: 68.90 | step_microstep: 18.49 [2025-04-25 23:02:41,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.20 | bwd: 5763.73 | bwd_inner: 5694.77 | bwd_allreduce: 68.92 | step: 18.50 15%|█▌ | 6242/41250 [15:05:06<84:23:33, 8.68s/it] {'loss': 0.2199, 'grad_norm': 4.853008270263672, 'learning_rate': 3.847608795064279e-05, 'epoch': 1.51} 15%|█▌ | 6242/41250 [15:05:06<84:23:33, 8.68s/it][2025-04-25 23:02:50,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-25 23:02:50,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.30 | bwd_microstep: 5694.09 | bwd_inner_microstep: 5681.33 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.39 [2025-04-25 23:02:50,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.30 | bwd: 5694.10 | bwd_inner: 5681.33 | bwd_allreduce: 12.73 | step: 19.39 15%|█▌ | 6243/41250 [15:05:15<84:13:21, 8.66s/it] {'loss': 0.3164, 'grad_norm': 3.4501376152038574, 'learning_rate': 3.8475486671478826e-05, 'epoch': 1.51} 15%|█▌ | 6243/41250 [15:05:15<84:13:21, 8.66s/it][2025-04-25 23:02:58,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 
23:02:58,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.39 | bwd_microstep: 5749.95 | bwd_inner_microstep: 5713.54 | bwd_allreduce_microstep: 36.37 | step_microstep: 18.83 [2025-04-25 23:02:58,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.39 | bwd: 5749.96 | bwd_inner: 5713.54 | bwd_allreduce: 36.38 | step: 18.84 15%|█▌ | 6244/41250 [15:05:24<84:18:12, 8.67s/it] {'loss': 0.0168, 'grad_norm': 0.4772365391254425, 'learning_rate': 3.847488527841713e-05, 'epoch': 1.51} 15%|█▌ | 6244/41250 [15:05:24<84:18:12, 8.67s/it][2025-04-25 23:03:07,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 23:03:07,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.04 | bwd_microstep: 5750.75 | bwd_inner_microstep: 5698.88 | bwd_allreduce_microstep: 51.82 | step_microstep: 18.44 [2025-04-25 23:03:07,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.04 | bwd: 5750.76 | bwd_inner: 5698.88 | bwd_allreduce: 51.84 | step: 18.45 15%|█▌ | 6245/41250 [15:05:32<84:19:51, 8.67s/it] {'loss': 0.1212, 'grad_norm': 2.0631139278411865, 'learning_rate': 3.84742837714614e-05, 'epoch': 1.51} 15%|█▌ | 6245/41250 [15:05:32<84:19:51, 8.67s/it][2025-04-25 23:03:16,255] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.14 | optimizer_step: 1.10 [2025-04-25 23:03:16,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.06 | bwd_microstep: 5786.96 | bwd_inner_microstep: 5653.91 | bwd_allreduce_microstep: 132.99 | step_microstep: 19.23 [2025-04-25 23:03:16,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.06 | bwd: 5786.97 | bwd_inner: 5653.91 | bwd_allreduce: 133.01 | step: 19.23 15%|█▌ | 6246/41250 [15:05:41<84:23:59, 8.68s/it] {'loss': 0.0325, 'grad_norm': 0.4841156005859375, 'learning_rate': 3.847368215061534e-05, 
'epoch': 1.51} 15%|█▌ | 6246/41250 [15:05:41<84:23:59, 8.68s/it][2025-04-25 23:03:24,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.07 | optimizer_step: 0.95 [2025-04-25 23:03:24,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.38 | bwd_microstep: 5759.05 | bwd_inner_microstep: 5677.79 | bwd_allreduce_microstep: 81.22 | step_microstep: 19.51 [2025-04-25 23:03:24,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.38 | bwd: 5759.07 | bwd_inner: 5677.78 | bwd_allreduce: 81.24 | step: 19.51 15%|█▌ | 6247/41250 [15:05:50<84:26:48, 8.69s/it] {'loss': 0.1117, 'grad_norm': 1.7511976957321167, 'learning_rate': 3.847308041588266e-05, 'epoch': 1.51} 15%|█▌ | 6247/41250 [15:05:50<84:26:48, 8.69s/it][2025-04-25 23:03:33,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-25 23:03:33,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.26 | bwd_microstep: 5712.30 | bwd_inner_microstep: 5699.32 | bwd_allreduce_microstep: 12.93 | step_microstep: 19.38 [2025-04-25 23:03:33,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.26 | bwd: 5712.31 | bwd_inner: 5699.32 | bwd_allreduce: 12.95 | step: 19.38 15%|█▌ | 6248/41250 [15:05:58<84:19:54, 8.67s/it] {'loss': 0.0878, 'grad_norm': 3.8195557594299316, 'learning_rate': 3.847247856726708e-05, 'epoch': 1.51} 15%|█▌ | 6248/41250 [15:05:58<84:19:54, 8.67s/it][2025-04-25 23:03:42,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-25 23:03:42,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.85 | bwd_microstep: 5879.57 | bwd_inner_microstep: 5654.84 | bwd_allreduce_microstep: 224.68 | step_microstep: 18.47 [2025-04-25 23:03:42,391] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2825.85 | bwd: 5879.58 | bwd_inner: 5654.84 | bwd_allreduce: 224.70 | step: 18.47 15%|█▌ | 6249/41250 [15:06:07<84:39:53, 8.71s/it] {'loss': 0.2679, 'grad_norm': 1.8577169179916382, 'learning_rate': 3.84718766047723e-05, 'epoch': 1.51} 15%|█▌ | 6249/41250 [15:06:07<84:39:53, 8.71s/it][2025-04-25 23:03:51,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.06 | optimizer_step: 0.93 [2025-04-25 23:03:51,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.97 | bwd_microstep: 5746.03 | bwd_inner_microstep: 5707.50 | bwd_allreduce_microstep: 38.48 | step_microstep: 19.43 [2025-04-25 23:03:51,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.97 | bwd: 5746.04 | bwd_inner: 5707.50 | bwd_allreduce: 38.50 | step: 19.43 15%|█▌ | 6250/41250 [15:06:16<84:34:52, 8.70s/it] {'loss': 0.1257, 'grad_norm': 1.8648438453674316, 'learning_rate': 3.847127452840204e-05, 'epoch': 1.52} 15%|█▌ | 6250/41250 [15:06:16<84:34:52, 8.70s/it][2025-04-25 23:03:59,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:03:59,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.76 | bwd_microstep: 5758.48 | bwd_inner_microstep: 5649.54 | bwd_allreduce_microstep: 108.89 | step_microstep: 18.64 [2025-04-25 23:03:59,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.76 | bwd: 5758.49 | bwd_inner: 5649.54 | bwd_allreduce: 108.91 | step: 18.64 15%|█▌ | 6251/41250 [15:06:25<84:30:49, 8.69s/it] {'loss': 0.1157, 'grad_norm': 1.9997502565383911, 'learning_rate': 3.8470672338159995e-05, 'epoch': 1.52} 15%|█▌ | 6251/41250 [15:06:25<84:30:49, 8.69s/it][2025-04-25 23:04:08,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-25 23:04:08,442] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.36 | bwd_microstep: 5762.97 | bwd_inner_microstep: 5686.97 | bwd_allreduce_microstep: 75.96 | step_microstep: 19.28 [2025-04-25 23:04:08,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.36 | bwd: 5762.98 | bwd_inner: 5686.96 | bwd_allreduce: 75.98 | step: 19.28 15%|█▌ | 6252/41250 [15:06:33<84:30:56, 8.69s/it] {'loss': 0.1452, 'grad_norm': 4.1308794021606445, 'learning_rate': 3.847007003404989e-05, 'epoch': 1.52} 15%|█▌ | 6252/41250 [15:06:33<84:30:56, 8.69s/it][2025-04-25 23:04:17,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:04:17,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.41 | bwd_microstep: 5733.14 | bwd_inner_microstep: 5695.19 | bwd_allreduce_microstep: 37.91 | step_microstep: 18.81 [2025-04-25 23:04:17,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.41 | bwd: 5733.15 | bwd_inner: 5695.19 | bwd_allreduce: 37.92 | step: 18.81 15%|█▌ | 6253/41250 [15:06:42<84:28:27, 8.69s/it] {'loss': 0.1595, 'grad_norm': 1.733755350112915, 'learning_rate': 3.8469467616075446e-05, 'epoch': 1.52} 15%|█▌ | 6253/41250 [15:06:42<84:28:27, 8.69s/it][2025-04-25 23:04:26,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:04:26,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2947.34 | bwd_microstep: 5878.21 | bwd_inner_microstep: 5865.29 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.66 [2025-04-25 23:04:26,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2947.34 | bwd: 5878.22 | bwd_inner: 5865.29 | bwd_allreduce: 12.89 | step: 18.66 15%|█▌ | 6254/41250 [15:06:51<85:07:05, 8.76s/it] {'loss': 0.0648, 'grad_norm': 1.254811406135559, 'learning_rate': 3.846886508424036e-05, 'epoch': 1.52} 15%|█▌ | 
6254/41250 [15:06:51<85:07:05, 8.76s/it][2025-04-25 23:04:34,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 23:04:34,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.42 | bwd_microstep: 5695.31 | bwd_inner_microstep: 5653.32 | bwd_allreduce_microstep: 41.95 | step_microstep: 18.75 [2025-04-25 23:04:34,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.42 | bwd: 5695.33 | bwd_inner: 5653.31 | bwd_allreduce: 41.97 | step: 18.75 15%|█▌ | 6255/41250 [15:06:59<84:40:18, 8.71s/it] {'loss': 0.0175, 'grad_norm': 0.27086466550827026, 'learning_rate': 3.846826243854835e-05, 'epoch': 1.52} 15%|█▌ | 6255/41250 [15:06:59<84:40:18, 8.71s/it][2025-04-25 23:04:43,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-25 23:04:43,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.93 | bwd_microstep: 5674.04 | bwd_inner_microstep: 5654.57 | bwd_allreduce_microstep: 19.43 | step_microstep: 18.45 [2025-04-25 23:04:43,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.93 | bwd: 5674.05 | bwd_inner: 5654.57 | bwd_allreduce: 19.44 | step: 18.46 15%|█▌ | 6256/41250 [15:07:08<84:19:48, 8.68s/it] {'loss': 0.1084, 'grad_norm': 1.4538718461990356, 'learning_rate': 3.846765967900314e-05, 'epoch': 1.52} 15%|█▌ | 6256/41250 [15:07:08<84:19:48, 8.68s/it][2025-04-25 23:04:51,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 1.10 [2025-04-25 23:04:51,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.93 | bwd_microstep: 5670.94 | bwd_inner_microstep: 5640.80 | bwd_allreduce_microstep: 30.09 | step_microstep: 18.88 [2025-04-25 23:04:51,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.93 | bwd: 
5670.95 | bwd_inner: 5640.80 | bwd_allreduce: 30.11 | step: 18.88 15%|█▌ | 6257/41250 [15:07:17<84:04:40, 8.65s/it] {'loss': 0.1805, 'grad_norm': 1.9608901739120483, 'learning_rate': 3.846705680560844e-05, 'epoch': 1.52} 15%|█▌ | 6257/41250 [15:07:17<84:04:40, 8.65s/it][2025-04-25 23:05:00,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:05:00,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.91 | bwd_microstep: 5694.42 | bwd_inner_microstep: 5640.26 | bwd_allreduce_microstep: 54.11 | step_microstep: 18.47 [2025-04-25 23:05:00,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.91 | bwd: 5694.43 | bwd_inner: 5640.26 | bwd_allreduce: 54.13 | step: 18.47 15%|█▌ | 6258/41250 [15:07:25<83:57:03, 8.64s/it] {'loss': 0.0291, 'grad_norm': 0.5779341459274292, 'learning_rate': 3.8466453818367956e-05, 'epoch': 1.52} 15%|█▌ | 6258/41250 [15:07:25<83:57:03, 8.64s/it][2025-04-25 23:05:09,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:05:09,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.75 | bwd_microstep: 5692.12 | bwd_inner_microstep: 5679.34 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.57 [2025-04-25 23:05:09,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.75 | bwd: 5692.13 | bwd_inner: 5679.34 | bwd_allreduce: 12.75 | step: 18.58 15%|█▌ | 6259/41250 [15:07:34<83:54:33, 8.63s/it] {'loss': 0.3896, 'grad_norm': 5.2414045333862305, 'learning_rate': 3.8465850717285433e-05, 'epoch': 1.52} 15%|█▌ | 6259/41250 [15:07:34<83:54:33, 8.63s/it][2025-04-25 23:05:17,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.31 | optimizer_step: 1.05 [2025-04-25 23:05:17,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2830.95 | bwd_microstep: 5756.39 | bwd_inner_microstep: 5644.76 | bwd_allreduce_microstep: 111.58 | step_microstep: 19.98 [2025-04-25 23:05:17,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.95 | bwd: 5756.41 | bwd_inner: 5644.76 | bwd_allreduce: 111.61 | step: 19.98 15%|█▌ | 6260/41250 [15:07:43<84:01:37, 8.65s/it] {'loss': 0.0304, 'grad_norm': 0.6117140650749207, 'learning_rate': 3.846524750236456e-05, 'epoch': 1.52} 15%|█▌ | 6260/41250 [15:07:43<84:01:37, 8.65s/it][2025-04-25 23:05:26,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 23:05:26,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.14 | bwd_microstep: 5710.10 | bwd_inner_microstep: 5697.37 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.13 [2025-04-25 23:05:26,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.14 | bwd: 5710.11 | bwd_inner: 5697.37 | bwd_allreduce: 12.70 | step: 19.13 15%|█▌ | 6261/41250 [15:07:51<84:01:55, 8.65s/it] {'loss': 0.2084, 'grad_norm': 1.5681614875793457, 'learning_rate': 3.846464417360908e-05, 'epoch': 1.52} 15%|█▌ | 6261/41250 [15:07:51<84:01:55, 8.65s/it][2025-04-25 23:05:35,010] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 1.05 [2025-04-25 23:05:35,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.94 | bwd_microstep: 5700.61 | bwd_inner_microstep: 5687.77 | bwd_allreduce_microstep: 12.79 | step_microstep: 19.14 [2025-04-25 23:05:35,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.94 | bwd: 5700.62 | bwd_inner: 5687.77 | bwd_allreduce: 12.81 | step: 19.14 15%|█▌ | 6262/41250 [15:08:00<84:00:01, 8.64s/it] {'loss': 0.0819, 'grad_norm': 1.613232970237732, 'learning_rate': 3.846404073102269e-05, 'epoch': 1.52} 15%|█▌ | 6262/41250 [15:08:00<84:00:01, 
8.64s/it][2025-04-25 23:05:43,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:05:43,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.24 | bwd_microstep: 5694.75 | bwd_inner_microstep: 5653.33 | bwd_allreduce_microstep: 41.37 | step_microstep: 18.88 [2025-04-25 23:05:43,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.24 | bwd: 5694.77 | bwd_inner: 5653.33 | bwd_allreduce: 41.39 | step: 18.88 15%|█▌ | 6263/41250 [15:08:08<83:54:32, 8.63s/it] {'loss': 0.1228, 'grad_norm': 2.965669870376587, 'learning_rate': 3.846343717460912e-05, 'epoch': 1.52} 15%|█▌ | 6263/41250 [15:08:08<83:54:32, 8.63s/it][2025-04-25 23:05:52,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:05:52,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.02 | bwd_microstep: 5799.48 | bwd_inner_microstep: 5657.07 | bwd_allreduce_microstep: 142.36 | step_microstep: 18.67 [2025-04-25 23:05:52,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.02 | bwd: 5799.49 | bwd_inner: 5657.07 | bwd_allreduce: 142.38 | step: 18.67 15%|█▌ | 6264/41250 [15:08:17<84:07:55, 8.66s/it] {'loss': 0.2052, 'grad_norm': 5.254317760467529, 'learning_rate': 3.84628335043721e-05, 'epoch': 1.52} 15%|█▌ | 6264/41250 [15:08:17<84:07:55, 8.66s/it][2025-04-25 23:06:01,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:06:01,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.94 | bwd_microstep: 5725.86 | bwd_inner_microstep: 5696.17 | bwd_allreduce_microstep: 29.64 | step_microstep: 18.71 [2025-04-25 23:06:01,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.94 | bwd: 5725.88 | bwd_inner: 5696.17 | 
bwd_allreduce: 29.66 | step: 18.71 15%|█▌ | 6265/41250 [15:08:26<84:09:51, 8.66s/it] {'loss': 0.1322, 'grad_norm': 1.3543000221252441, 'learning_rate': 3.846222972031534e-05, 'epoch': 1.52} 15%|█▌ | 6265/41250 [15:08:26<84:09:51, 8.66s/it][2025-04-25 23:06:09,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 23:06:09,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.12 | bwd_microstep: 5718.12 | bwd_inner_microstep: 5705.21 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.83 [2025-04-25 23:06:09,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.12 | bwd: 5718.13 | bwd_inner: 5705.21 | bwd_allreduce: 12.88 | step: 18.84 15%|█▌ | 6266/41250 [15:08:34<84:09:53, 8.66s/it] {'loss': 0.0262, 'grad_norm': 0.6086506843566895, 'learning_rate': 3.8461625822442564e-05, 'epoch': 1.52} 15%|█▌ | 6266/41250 [15:08:34<84:09:53, 8.66s/it][2025-04-25 23:06:18,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-25 23:06:18,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.84 | bwd_microstep: 5871.24 | bwd_inner_microstep: 5720.74 | bwd_allreduce_microstep: 150.46 | step_microstep: 18.95 [2025-04-25 23:06:18,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.84 | bwd: 5871.26 | bwd_inner: 5720.73 | bwd_allreduce: 150.48 | step: 18.96 15%|█▌ | 6267/41250 [15:08:43<84:35:04, 8.70s/it] {'loss': 0.1659, 'grad_norm': 1.939268708229065, 'learning_rate': 3.8461021810757495e-05, 'epoch': 1.52} 15%|█▌ | 6267/41250 [15:08:43<84:35:04, 8.70s/it][2025-04-25 23:06:27,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:06:27,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.03 | 
bwd_microstep: 5709.04 | bwd_inner_microstep: 5695.99 | bwd_allreduce_microstep: 13.01 | step_microstep: 18.70 [2025-04-25 23:06:27,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.04 | bwd: 5709.06 | bwd_inner: 5695.99 | bwd_allreduce: 13.02 | step: 18.70 15%|█▌ | 6268/41250 [15:08:52<84:24:25, 8.69s/it] {'loss': 0.1431, 'grad_norm': 3.334550619125366, 'learning_rate': 3.846041768526385e-05, 'epoch': 1.52} 15%|█▌ | 6268/41250 [15:08:52<84:24:25, 8.69s/it][2025-04-25 23:06:35,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-25 23:06:35,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.54 | bwd_microstep: 5793.48 | bwd_inner_microstep: 5651.98 | bwd_allreduce_microstep: 141.46 | step_microstep: 18.49 [2025-04-25 23:06:35,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.54 | bwd: 5793.49 | bwd_inner: 5651.98 | bwd_allreduce: 141.48 | step: 18.49 15%|█▌ | 6269/41250 [15:09:01<84:27:55, 8.69s/it] {'loss': 0.0542, 'grad_norm': 1.8547084331512451, 'learning_rate': 3.845981344596538e-05, 'epoch': 1.52} 15%|█▌ | 6269/41250 [15:09:01<84:27:55, 8.69s/it][2025-04-25 23:06:44,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:06:44,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.24 | bwd_microstep: 5795.59 | bwd_inner_microstep: 5658.67 | bwd_allreduce_microstep: 136.88 | step_microstep: 18.67 [2025-04-25 23:06:44,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.24 | bwd: 5795.60 | bwd_inner: 5658.67 | bwd_allreduce: 136.90 | step: 18.67 15%|█▌ | 6270/41250 [15:09:09<84:32:58, 8.70s/it] {'loss': 0.4741, 'grad_norm': 5.472161293029785, 'learning_rate': 3.8459209092865776e-05, 'epoch': 1.52} 15%|█▌ | 6270/41250 [15:09:09<84:32:58, 8.70s/it][2025-04-25 23:06:53,178] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:06:53,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.88 | bwd_microstep: 5724.61 | bwd_inner_microstep: 5655.71 | bwd_allreduce_microstep: 68.86 | step_microstep: 18.45 [2025-04-25 23:06:53,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.88 | bwd: 5724.62 | bwd_inner: 5655.71 | bwd_allreduce: 68.88 | step: 18.45 15%|█▌ | 6271/41250 [15:09:18<84:21:09, 8.68s/it] {'loss': 0.0644, 'grad_norm': 1.1365259885787964, 'learning_rate': 3.845860462596878e-05, 'epoch': 1.52} 15%|█▌ | 6271/41250 [15:09:18<84:21:09, 8.68s/it][2025-04-25 23:07:01,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 23:07:01,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.45 | bwd_microstep: 5787.39 | bwd_inner_microstep: 5669.75 | bwd_allreduce_microstep: 117.60 | step_microstep: 18.83 [2025-04-25 23:07:01,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.45 | bwd: 5787.40 | bwd_inner: 5669.75 | bwd_allreduce: 117.61 | step: 18.83 15%|█▌ | 6272/41250 [15:09:27<84:24:40, 8.69s/it] {'loss': 0.0427, 'grad_norm': 0.9046356678009033, 'learning_rate': 3.845800004527812e-05, 'epoch': 1.52} 15%|█▌ | 6272/41250 [15:09:27<84:24:40, 8.69s/it][2025-04-25 23:07:10,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 23:07:10,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.53 | bwd_microstep: 5788.84 | bwd_inner_microstep: 5710.67 | bwd_allreduce_microstep: 78.12 | step_microstep: 18.15 [2025-04-25 23:07:10,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.53 | bwd: 5788.85 | bwd_inner: 5710.67 | bwd_allreduce: 78.14 | step: 
18.16 15%|█▌ | 6273/41250 [15:09:35<84:31:29, 8.70s/it] {'loss': 0.3342, 'grad_norm': 6.2184624671936035, 'learning_rate': 3.8457395350797525e-05, 'epoch': 1.52} 15%|█▌ | 6273/41250 [15:09:35<84:31:29, 8.70s/it][2025-04-25 23:07:19,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.22 | optimizer_step: 0.94 [2025-04-25 23:07:19,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.61 | bwd_microstep: 5782.16 | bwd_inner_microstep: 5659.26 | bwd_allreduce_microstep: 122.85 | step_microstep: 19.73 [2025-04-25 23:07:19,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.61 | bwd: 5782.18 | bwd_inner: 5659.26 | bwd_allreduce: 122.87 | step: 19.74 15%|█▌ | 6274/41250 [15:09:44<84:31:27, 8.70s/it] {'loss': 0.241, 'grad_norm': 2.485456705093384, 'learning_rate': 3.8456790542530715e-05, 'epoch': 1.52} 15%|█▌ | 6274/41250 [15:09:44<84:31:27, 8.70s/it][2025-04-25 23:07:27,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:07:27,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.14 | bwd_microstep: 5711.59 | bwd_inner_microstep: 5667.85 | bwd_allreduce_microstep: 43.70 | step_microstep: 19.07 [2025-04-25 23:07:27,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.14 | bwd: 5711.61 | bwd_inner: 5667.85 | bwd_allreduce: 43.72 | step: 19.07 15%|█▌ | 6275/41250 [15:09:53<84:20:21, 8.68s/it] {'loss': 0.4911, 'grad_norm': 3.73717999458313, 'learning_rate': 3.8456185620481414e-05, 'epoch': 1.52} 15%|█▌ | 6275/41250 [15:09:53<84:20:21, 8.68s/it][2025-04-25 23:07:36,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.07 [2025-04-25 23:07:36,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.80 | bwd_microstep: 5712.23 | 
bwd_inner_microstep: 5698.88 | bwd_allreduce_microstep: 13.29 | step_microstep: 19.29 [2025-04-25 23:07:36,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.80 | bwd: 5712.24 | bwd_inner: 5698.88 | bwd_allreduce: 13.32 | step: 19.29 15%|█▌ | 6276/41250 [15:10:01<84:14:53, 8.67s/it] {'loss': 0.0292, 'grad_norm': 0.5185178518295288, 'learning_rate': 3.845558058465336e-05, 'epoch': 1.52} 15%|█▌ | 6276/41250 [15:10:01<84:14:53, 8.67s/it][2025-04-25 23:07:45,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.03 | optimizer_step: 1.13 [2025-04-25 23:07:45,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.59 | bwd_microstep: 5720.73 | bwd_inner_microstep: 5699.66 | bwd_allreduce_microstep: 21.02 | step_microstep: 19.31 [2025-04-25 23:07:45,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.59 | bwd: 5720.75 | bwd_inner: 5699.66 | bwd_allreduce: 21.04 | step: 19.31 15%|█▌ | 6277/41250 [15:10:10<84:12:08, 8.67s/it] {'loss': 0.2878, 'grad_norm': 2.238090753555298, 'learning_rate': 3.84549754350503e-05, 'epoch': 1.52} 15%|█▌ | 6277/41250 [15:10:10<84:12:08, 8.67s/it][2025-04-25 23:07:53,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-25 23:07:53,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.02 | bwd_microstep: 5724.44 | bwd_inner_microstep: 5711.66 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.81 [2025-04-25 23:07:53,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.02 | bwd: 5724.45 | bwd_inner: 5711.66 | bwd_allreduce: 12.75 | step: 18.81 15%|█▌ | 6278/41250 [15:10:19<84:11:47, 8.67s/it] {'loss': 0.1883, 'grad_norm': 1.5364543199539185, 'learning_rate': 3.8454370171675926e-05, 'epoch': 1.52} 15%|█▌ | 6278/41250 [15:10:19<84:11:47, 8.67s/it][2025-04-25 23:08:02,633] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-25 23:08:02,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.35 | bwd_microstep: 5782.80 | bwd_inner_microstep: 5661.29 | bwd_allreduce_microstep: 121.46 | step_microstep: 18.60 [2025-04-25 23:08:02,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.35 | bwd: 5782.81 | bwd_inner: 5661.29 | bwd_allreduce: 121.48 | step: 18.60 15%|█▌ | 6279/41250 [15:10:27<84:19:42, 8.68s/it] {'loss': 0.1821, 'grad_norm': 1.4536638259887695, 'learning_rate': 3.8453764794534e-05, 'epoch': 1.52} 15%|█▌ | 6279/41250 [15:10:27<84:19:42, 8.68s/it][2025-04-25 23:08:11,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:08:11,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.40 | bwd_microstep: 5783.44 | bwd_inner_microstep: 5706.83 | bwd_allreduce_microstep: 76.56 | step_microstep: 18.77 [2025-04-25 23:08:11,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.40 | bwd: 5783.45 | bwd_inner: 5706.83 | bwd_allreduce: 76.58 | step: 18.78 15%|█▌ | 6280/41250 [15:10:36<84:27:44, 8.70s/it] {'loss': 0.1019, 'grad_norm': 1.0843712091445923, 'learning_rate': 3.845315930362824e-05, 'epoch': 1.52} 15%|█▌ | 6280/41250 [15:10:36<84:27:44, 8.70s/it][2025-04-25 23:08:20,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:08:20,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.89 | bwd_microstep: 5762.90 | bwd_inner_microstep: 5706.17 | bwd_allreduce_microstep: 56.67 | step_microstep: 18.72 [2025-04-25 23:08:20,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.89 | bwd: 5762.92 | bwd_inner: 5706.17 | bwd_allreduce: 56.69 | step: 18.72 15%|█▌ | 6281/41250 
[15:10:45<84:31:26, 8.70s/it] {'loss': 0.2218, 'grad_norm': 2.4362258911132812, 'learning_rate': 3.8452553698962376e-05, 'epoch': 1.52} 15%|█▌ | 6281/41250 [15:10:45<84:31:26, 8.70s/it][2025-04-25 23:08:28,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.24 | optimizer_step: 1.01 [2025-04-25 23:08:28,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.69 | bwd_microstep: 5757.33 | bwd_inner_microstep: 5696.67 | bwd_allreduce_microstep: 60.59 | step_microstep: 19.60 [2025-04-25 23:08:28,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.69 | bwd: 5757.34 | bwd_inner: 5696.67 | bwd_allreduce: 60.62 | step: 19.60 15%|█▌ | 6282/41250 [15:10:54<84:29:59, 8.70s/it] {'loss': 0.1528, 'grad_norm': 3.07206654548645, 'learning_rate': 3.845194798054015e-05, 'epoch': 1.52} 15%|█▌ | 6282/41250 [15:10:54<84:29:59, 8.70s/it][2025-04-25 23:08:37,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-25 23:08:37,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2945.54 | bwd_microstep: 5893.13 | bwd_inner_microstep: 5880.39 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.56 [2025-04-25 23:08:37,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2945.54 | bwd: 5893.14 | bwd_inner: 5880.39 | bwd_allreduce: 12.71 | step: 18.56 15%|█▌ | 6283/41250 [15:11:03<85:09:44, 8.77s/it] {'loss': 0.1108, 'grad_norm': 1.2784457206726074, 'learning_rate': 3.8451342148365306e-05, 'epoch': 1.52} 15%|█▌ | 6283/41250 [15:11:03<85:09:44, 8.77s/it][2025-04-25 23:08:46,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:08:46,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2901.26 | bwd_microstep: 5794.48 | bwd_inner_microstep: 5781.61 | 
bwd_allreduce_microstep: 12.83 | step_microstep: 18.74 [2025-04-25 23:08:46,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2901.26 | bwd: 5794.49 | bwd_inner: 5781.61 | bwd_allreduce: 12.85 | step: 18.74 15%|█▌ | 6284/41250 [15:11:11<85:11:28, 8.77s/it] {'loss': 0.1261, 'grad_norm': 2.3021726608276367, 'learning_rate': 3.845073620244156e-05, 'epoch': 1.52} 15%|█▌ | 6284/41250 [15:11:11<85:11:28, 8.77s/it][2025-04-25 23:08:55,250] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:08:55,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.27 | bwd_microstep: 5799.45 | bwd_inner_microstep: 5786.69 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.72 [2025-04-25 23:08:55,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.27 | bwd: 5799.47 | bwd_inner: 5786.69 | bwd_allreduce: 12.74 | step: 18.72 15%|█▌ | 6285/41250 [15:11:20<85:11:42, 8.77s/it] {'loss': 0.1171, 'grad_norm': 0.7819494009017944, 'learning_rate': 3.845013014277265e-05, 'epoch': 1.52} 15%|█▌ | 6285/41250 [15:11:20<85:11:42, 8.77s/it][2025-04-25 23:09:03,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:09:03,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.35 | bwd_microstep: 5760.92 | bwd_inner_microstep: 5703.89 | bwd_allreduce_microstep: 56.99 | step_microstep: 18.84 [2025-04-25 23:09:03,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.35 | bwd: 5760.93 | bwd_inner: 5703.89 | bwd_allreduce: 57.00 | step: 18.84 15%|█▌ | 6286/41250 [15:11:29<84:59:37, 8.75s/it] {'loss': 0.3017, 'grad_norm': 1.8850603103637695, 'learning_rate': 3.844952396936232e-05, 'epoch': 1.52} 15%|█▌ | 6286/41250 [15:11:29<84:59:37, 8.75s/it][2025-04-25 23:09:12,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.07 | optimizer_gradients: 1.00 | optimizer_step: 1.09 [2025-04-25 23:09:12,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.96 | bwd_microstep: 5783.36 | bwd_inner_microstep: 5650.46 | bwd_allreduce_microstep: 132.84 | step_microstep: 18.45 [2025-04-25 23:09:12,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.96 | bwd: 5783.38 | bwd_inner: 5650.46 | bwd_allreduce: 132.87 | step: 18.46 15%|█▌ | 6287/41250 [15:11:37<84:51:48, 8.74s/it] {'loss': 0.1462, 'grad_norm': 2.1206183433532715, 'learning_rate': 3.8448917682214305e-05, 'epoch': 1.52} 15%|█▌ | 6287/41250 [15:11:37<84:51:48, 8.74s/it][2025-04-25 23:09:21,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.09 | optimizer_step: 1.00 [2025-04-25 23:09:21,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.19 | bwd_microstep: 5710.32 | bwd_inner_microstep: 5646.34 | bwd_allreduce_microstep: 63.92 | step_microstep: 19.74 [2025-04-25 23:09:21,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.19 | bwd: 5710.34 | bwd_inner: 5646.34 | bwd_allreduce: 63.95 | step: 19.75 15%|█▌ | 6288/41250 [15:11:46<84:32:30, 8.71s/it] {'loss': 0.0446, 'grad_norm': 1.1971698999404907, 'learning_rate': 3.844831128133234e-05, 'epoch': 1.52} 15%|█▌ | 6288/41250 [15:11:46<84:32:30, 8.71s/it][2025-04-25 23:09:30,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:09:30,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.44 | bwd_microstep: 5841.59 | bwd_inner_microstep: 5689.36 | bwd_allreduce_microstep: 152.18 | step_microstep: 18.82 [2025-04-25 23:09:30,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.44 | bwd: 5841.60 | bwd_inner: 5689.36 | bwd_allreduce: 152.20 | step: 18.82 15%|█▌ | 6289/41250 [15:11:55<84:46:13, 8.73s/it] 
{'loss': 0.132, 'grad_norm': 1.562506079673767, 'learning_rate': 3.844770476672016e-05, 'epoch': 1.52} 15%|█▌ | 6289/41250 [15:11:55<84:46:13, 8.73s/it][2025-04-25 23:09:38,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-25 23:09:38,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.73 | bwd_microstep: 5761.40 | bwd_inner_microstep: 5704.78 | bwd_allreduce_microstep: 56.58 | step_microstep: 18.75 [2025-04-25 23:09:38,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.73 | bwd: 5761.42 | bwd_inner: 5704.78 | bwd_allreduce: 56.60 | step: 18.76 15%|█▌ | 6290/41250 [15:12:04<84:40:15, 8.72s/it] {'loss': 0.0371, 'grad_norm': 0.5937759280204773, 'learning_rate': 3.8447098138381515e-05, 'epoch': 1.52} 15%|█▌ | 6290/41250 [15:12:04<84:40:15, 8.72s/it][2025-04-25 23:09:47,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:09:47,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.34 | bwd_microstep: 5739.51 | bwd_inner_microstep: 5649.70 | bwd_allreduce_microstep: 89.77 | step_microstep: 18.39 [2025-04-25 23:09:47,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.35 | bwd: 5739.52 | bwd_inner: 5649.70 | bwd_allreduce: 89.78 | step: 18.39 15%|█▌ | 6291/41250 [15:12:12<84:31:11, 8.70s/it] {'loss': 0.1634, 'grad_norm': 3.1056480407714844, 'learning_rate': 3.844649139632014e-05, 'epoch': 1.53} 15%|█▌ | 6291/41250 [15:12:12<84:31:11, 8.70s/it][2025-04-25 23:09:56,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-25 23:09:56,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.24 | bwd_microstep: 5696.33 | bwd_inner_microstep: 5683.52 | bwd_allreduce_microstep: 12.76 | 
step_microstep: 18.58 [2025-04-25 23:09:56,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.24 | bwd: 5696.34 | bwd_inner: 5683.52 | bwd_allreduce: 12.78 | step: 18.58 15%|█▌ | 6292/41250 [15:12:21<84:18:49, 8.68s/it] {'loss': 0.2237, 'grad_norm': 2.809368848800659, 'learning_rate': 3.844588454053977e-05, 'epoch': 1.53} 15%|█▌ | 6292/41250 [15:12:21<84:18:49, 8.68s/it][2025-04-25 23:10:04,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:10:04,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.07 | bwd_microstep: 5747.83 | bwd_inner_microstep: 5684.43 | bwd_allreduce_microstep: 63.36 | step_microstep: 18.62 [2025-04-25 23:10:04,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.08 | bwd: 5747.84 | bwd_inner: 5684.43 | bwd_allreduce: 63.38 | step: 18.63 15%|█▌ | 6293/41250 [15:12:30<84:19:53, 8.68s/it] {'loss': 0.0795, 'grad_norm': 3.086463689804077, 'learning_rate': 3.844527757104415e-05, 'epoch': 1.53} 15%|█▌ | 6293/41250 [15:12:30<84:19:53, 8.68s/it][2025-04-25 23:10:13,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 23:10:13,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.33 | bwd_microstep: 5772.80 | bwd_inner_microstep: 5759.99 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.71 [2025-04-25 23:10:13,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.33 | bwd: 5772.82 | bwd_inner: 5759.99 | bwd_allreduce: 12.79 | step: 18.72 15%|█▌ | 6294/41250 [15:12:38<84:31:58, 8.71s/it] {'loss': 0.1138, 'grad_norm': 1.8313766717910767, 'learning_rate': 3.844467048783702e-05, 'epoch': 1.53} 15%|█▌ | 6294/41250 [15:12:38<84:31:58, 8.71s/it][2025-04-25 23:10:22,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | 
optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-25 23:10:22,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.27 | bwd_microstep: 5855.65 | bwd_inner_microstep: 5686.66 | bwd_allreduce_microstep: 168.95 | step_microstep: 18.91 [2025-04-25 23:10:22,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.27 | bwd: 5855.66 | bwd_inner: 5686.66 | bwd_allreduce: 168.96 | step: 18.92 15%|█▌ | 6295/41250 [15:12:47<84:46:45, 8.73s/it] {'loss': 0.5085, 'grad_norm': 3.3431484699249268, 'learning_rate': 3.8444063290922125e-05, 'epoch': 1.53} 15%|█▌ | 6295/41250 [15:12:47<84:46:45, 8.73s/it][2025-04-25 23:10:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 23:10:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.48 | bwd_microstep: 5787.64 | bwd_inner_microstep: 5643.12 | bwd_allreduce_microstep: 144.47 | step_microstep: 19.18 [2025-04-25 23:10:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.48 | bwd: 5787.66 | bwd_inner: 5643.12 | bwd_allreduce: 144.49 | step: 19.19 15%|█▌ | 6296/41250 [15:12:56<84:40:46, 8.72s/it] {'loss': 0.1139, 'grad_norm': 1.3788905143737793, 'learning_rate': 3.844345598030321e-05, 'epoch': 1.53} 15%|█▌ | 6296/41250 [15:12:56<84:40:46, 8.72s/it][2025-04-25 23:10:39,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 23:10:39,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.22 | bwd_microstep: 5774.52 | bwd_inner_microstep: 5649.96 | bwd_allreduce_microstep: 124.51 | step_microstep: 19.02 [2025-04-25 23:10:39,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.22 | bwd: 5774.53 | bwd_inner: 5649.96 | bwd_allreduce: 124.53 | step: 19.02 15%|█▌ | 6297/41250 [15:13:05<84:35:14, 8.71s/it] {'loss': 0.3057, 
'grad_norm': 3.749654531478882, 'learning_rate': 3.844284855598401e-05, 'epoch': 1.53} 15%|█▌ | 6297/41250 [15:13:05<84:35:14, 8.71s/it][2025-04-25 23:10:48,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:10:48,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.72 | bwd_microstep: 5706.40 | bwd_inner_microstep: 5693.58 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.70 [2025-04-25 23:10:48,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.72 | bwd: 5706.41 | bwd_inner: 5693.58 | bwd_allreduce: 12.79 | step: 18.71 15%|█▌ | 6298/41250 [15:13:13<84:23:58, 8.69s/it] {'loss': 0.092, 'grad_norm': 2.037689447402954, 'learning_rate': 3.844224101796829e-05, 'epoch': 1.53} 15%|█▌ | 6298/41250 [15:13:13<84:23:58, 8.69s/it][2025-04-25 23:10:56,992] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 23:10:56,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.98 | bwd_microstep: 5710.97 | bwd_inner_microstep: 5698.21 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.63 [2025-04-25 23:10:56,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.98 | bwd: 5710.99 | bwd_inner: 5698.21 | bwd_allreduce: 12.74 | step: 18.63 15%|█▌ | 6299/41250 [15:13:22<84:15:51, 8.68s/it] {'loss': 0.1082, 'grad_norm': 1.2878994941711426, 'learning_rate': 3.844163336625977e-05, 'epoch': 1.53} 15%|█▌ | 6299/41250 [15:13:22<84:15:51, 8.68s/it][2025-04-25 23:11:05,607] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 1.10 [2025-04-25 23:11:05,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.08 | bwd_microstep: 5705.51 | bwd_inner_microstep: 5641.67 | bwd_allreduce_microstep: 63.80 | step_microstep: 19.12 
[2025-04-25 23:11:05,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.08 | bwd: 5705.52 | bwd_inner: 5641.67 | bwd_allreduce: 63.81 | step: 19.12 15%|█▌ | 6300/41250 [15:13:30<84:04:25, 8.66s/it] {'loss': 0.0969, 'grad_norm': 0.801978349685669, 'learning_rate': 3.844102560086221e-05, 'epoch': 1.53} 15%|█▌ | 6300/41250 [15:13:30<84:04:25, 8.66s/it][2025-04-25 23:11:14,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-25 23:11:14,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.83 | bwd_microstep: 5687.78 | bwd_inner_microstep: 5647.16 | bwd_allreduce_microstep: 40.57 | step_microstep: 18.66 [2025-04-25 23:11:14,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.83 | bwd: 5687.79 | bwd_inner: 5647.16 | bwd_allreduce: 40.59 | step: 18.67 15%|█▌ | 6301/41250 [15:13:39<83:54:22, 8.64s/it] {'loss': 0.1081, 'grad_norm': 2.524066686630249, 'learning_rate': 3.844041772177935e-05, 'epoch': 1.53} 15%|█▌ | 6301/41250 [15:13:39<83:54:22, 8.64s/it][2025-04-25 23:11:22,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:11:22,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.86 | bwd_microstep: 5781.06 | bwd_inner_microstep: 5642.16 | bwd_allreduce_microstep: 138.87 | step_microstep: 18.27 [2025-04-25 23:11:22,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.86 | bwd: 5781.08 | bwd_inner: 5642.15 | bwd_allreduce: 138.88 | step: 18.27 15%|█▌ | 6302/41250 [15:13:48<84:01:15, 8.66s/it] {'loss': 0.2251, 'grad_norm': 1.921859622001648, 'learning_rate': 3.843980972901495e-05, 'epoch': 1.53} 15%|█▌ | 6302/41250 [15:13:48<84:01:15, 8.66s/it][2025-04-25 23:11:31,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.93 | 
optimizer_step: 0.89 [2025-04-25 23:11:31,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.72 | bwd_microstep: 5699.59 | bwd_inner_microstep: 5686.94 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.32 [2025-04-25 23:11:31,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.72 | bwd: 5699.60 | bwd_inner: 5686.94 | bwd_allreduce: 12.62 | step: 18.33 15%|█▌ | 6303/41250 [15:13:56<83:56:26, 8.65s/it] {'loss': 0.1492, 'grad_norm': 2.8894906044006348, 'learning_rate': 3.8439201622572746e-05, 'epoch': 1.53} 15%|█▌ | 6303/41250 [15:13:56<83:56:26, 8.65s/it][2025-04-25 23:11:40,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.96 | optimizer_step: 0.93 [2025-04-25 23:11:40,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.57 | bwd_microstep: 5699.80 | bwd_inner_microstep: 5687.19 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.65 [2025-04-25 23:11:40,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.57 | bwd: 5699.81 | bwd_inner: 5687.19 | bwd_allreduce: 12.59 | step: 18.65 15%|█▌ | 6304/41250 [15:14:05<83:52:43, 8.64s/it] {'loss': 0.1553, 'grad_norm': 2.8738698959350586, 'learning_rate': 3.843859340245649e-05, 'epoch': 1.53} 15%|█▌ | 6304/41250 [15:14:05<83:52:43, 8.64s/it][2025-04-25 23:11:48,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-25 23:11:48,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.31 | bwd_microstep: 5757.54 | bwd_inner_microstep: 5662.59 | bwd_allreduce_microstep: 94.91 | step_microstep: 18.84 [2025-04-25 23:11:48,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.31 | bwd: 5757.55 | bwd_inner: 5662.59 | bwd_allreduce: 94.92 | step: 18.85 15%|█▌ | 6305/41250 [15:14:14<83:59:20, 8.65s/it] {'loss': 0.1713, 'grad_norm': 2.73095703125, 
'learning_rate': 3.8437985068669936e-05, 'epoch': 1.53} 15%|█▌ | 6305/41250 [15:14:14<83:59:20, 8.65s/it][2025-04-25 23:11:57,467] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:11:57,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.71 | bwd_microstep: 5707.30 | bwd_inner_microstep: 5694.74 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.46 [2025-04-25 23:11:57,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.71 | bwd: 5707.31 | bwd_inner: 5694.74 | bwd_allreduce: 12.53 | step: 18.46 15%|█▌ | 6306/41250 [15:14:22<83:56:43, 8.65s/it] {'loss': 0.1615, 'grad_norm': 2.635965585708618, 'learning_rate': 3.843737662121682e-05, 'epoch': 1.53} 15%|█▌ | 6306/41250 [15:14:22<83:56:43, 8.65s/it][2025-04-25 23:12:06,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 23:12:06,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.18 | bwd_microstep: 5698.28 | bwd_inner_microstep: 5685.60 | bwd_allreduce_microstep: 12.64 | step_microstep: 17.92 [2025-04-25 23:12:06,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.18 | bwd: 5698.29 | bwd_inner: 5685.60 | bwd_allreduce: 12.66 | step: 17.92 15%|█▌ | 6307/41250 [15:14:31<83:52:13, 8.64s/it] {'loss': 0.1288, 'grad_norm': 1.0378483533859253, 'learning_rate': 3.8436768060100915e-05, 'epoch': 1.53} 15%|█▌ | 6307/41250 [15:14:31<83:52:13, 8.64s/it][2025-04-25 23:12:14,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:12:14,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.87 | bwd_microstep: 5708.06 | bwd_inner_microstep: 5695.59 | bwd_allreduce_microstep: 12.41 | step_microstep: 18.58 [2025-04-25 23:12:14,731] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.87 | bwd: 5708.07 | bwd_inner: 5695.59 | bwd_allreduce: 12.43 | step: 18.58 15%|█▌ | 6308/41250 [15:14:40<83:52:23, 8.64s/it] {'loss': 0.1725, 'grad_norm': 2.521916627883911, 'learning_rate': 3.8436159385325965e-05, 'epoch': 1.53} 15%|█▌ | 6308/41250 [15:14:40<83:52:23, 8.64s/it][2025-04-25 23:12:23,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.93 [2025-04-25 23:12:23,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.42 | bwd_microstep: 5711.23 | bwd_inner_microstep: 5698.58 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.79 [2025-04-25 23:12:23,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.42 | bwd: 5711.24 | bwd_inner: 5698.58 | bwd_allreduce: 12.62 | step: 18.79 15%|█▌ | 6309/41250 [15:14:48<83:54:17, 8.64s/it] {'loss': 0.0832, 'grad_norm': 2.386345386505127, 'learning_rate': 3.843555059689571e-05, 'epoch': 1.53} 15%|█▌ | 6309/41250 [15:14:48<83:54:17, 8.64s/it][2025-04-25 23:12:32,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 23:12:32,073] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.13 | bwd_microstep: 5773.88 | bwd_inner_microstep: 5662.25 | bwd_allreduce_microstep: 111.59 | step_microstep: 17.95 [2025-04-25 23:12:32,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.13 | bwd: 5773.90 | bwd_inner: 5662.25 | bwd_allreduce: 111.61 | step: 17.95 15%|█▌ | 6310/41250 [15:14:57<84:01:25, 8.66s/it] {'loss': 0.3166, 'grad_norm': 2.2623403072357178, 'learning_rate': 3.843494169481391e-05, 'epoch': 1.53} 15%|█▌ | 6310/41250 [15:14:57<84:01:25, 8.66s/it][2025-04-25 23:12:40,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.04 | optimizer_step: 0.95 [2025-04-25 
23:12:40,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.99 | bwd_microstep: 5700.01 | bwd_inner_microstep: 5658.97 | bwd_allreduce_microstep: 40.99 | step_microstep: 19.61 [2025-04-25 23:12:40,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.99 | bwd: 5700.03 | bwd_inner: 5658.97 | bwd_allreduce: 41.02 | step: 19.61 15%|█▌ | 6311/41250 [15:15:06<83:55:09, 8.65s/it] {'loss': 0.0654, 'grad_norm': 0.6734538674354553, 'learning_rate': 3.8434332679084325e-05, 'epoch': 1.53} 15%|█▌ | 6311/41250 [15:15:06<83:55:09, 8.65s/it][2025-04-25 23:12:49,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:12:49,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.29 | bwd_microstep: 5712.02 | bwd_inner_microstep: 5656.32 | bwd_allreduce_microstep: 55.66 | step_microstep: 18.92 [2025-04-25 23:12:49,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.29 | bwd: 5712.03 | bwd_inner: 5656.31 | bwd_allreduce: 55.68 | step: 18.92 15%|█▌ | 6312/41250 [15:15:14<83:50:50, 8.64s/it] {'loss': 0.103, 'grad_norm': 1.6983932256698608, 'learning_rate': 3.84337235497107e-05, 'epoch': 1.53} 15%|█▌ | 6312/41250 [15:15:14<83:50:50, 8.64s/it][2025-04-25 23:12:58,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:12:58,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.36 | bwd_microstep: 5753.14 | bwd_inner_microstep: 5715.48 | bwd_allreduce_microstep: 37.61 | step_microstep: 18.99 [2025-04-25 23:12:58,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.36 | bwd: 5753.15 | bwd_inner: 5715.48 | bwd_allreduce: 37.63 | step: 18.99 15%|█▌ | 6313/41250 [15:15:23<83:59:19, 8.65s/it] {'loss': 0.2319, 'grad_norm': 1.082553505897522, 'learning_rate': 3.84331143066968e-05, 
'epoch': 1.53} 15%|█▌ | 6313/41250 [15:15:23<83:59:19, 8.65s/it][2025-04-25 23:13:06,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.12 | optimizer_step: 1.03 [2025-04-25 23:13:06,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.16 | bwd_microstep: 5788.88 | bwd_inner_microstep: 5655.40 | bwd_allreduce_microstep: 133.42 | step_microstep: 19.63 [2025-04-25 23:13:06,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.16 | bwd: 5788.89 | bwd_inner: 5655.40 | bwd_allreduce: 133.45 | step: 19.63 15%|█▌ | 6314/41250 [15:15:32<84:07:55, 8.67s/it] {'loss': 0.0882, 'grad_norm': 2.3435633182525635, 'learning_rate': 3.8432504950046376e-05, 'epoch': 1.53} 15%|█▌ | 6314/41250 [15:15:32<84:07:55, 8.67s/it][2025-04-25 23:13:15,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 23:13:15,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.84 | bwd_microstep: 5796.04 | bwd_inner_microstep: 5666.66 | bwd_allreduce_microstep: 129.34 | step_microstep: 18.68 [2025-04-25 23:13:15,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.84 | bwd: 5796.06 | bwd_inner: 5666.66 | bwd_allreduce: 129.35 | step: 18.69 15%|█▌ | 6315/41250 [15:15:40<84:14:45, 8.68s/it] {'loss': 0.1337, 'grad_norm': 3.473900556564331, 'learning_rate': 3.843189547976318e-05, 'epoch': 1.53} 15%|█▌ | 6315/41250 [15:15:40<84:14:45, 8.68s/it][2025-04-25 23:13:24,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:13:24,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.99 | bwd_microstep: 5727.56 | bwd_inner_microstep: 5652.83 | bwd_allreduce_microstep: 74.69 | step_microstep: 18.08 [2025-04-25 23:13:24,075] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2839.99 | bwd: 5727.57 | bwd_inner: 5652.83 | bwd_allreduce: 74.70 | step: 18.09 15%|█▌ | 6316/41250 [15:15:49<84:09:47, 8.67s/it] {'loss': 0.0465, 'grad_norm': 0.4608171284198761, 'learning_rate': 3.843128589585098e-05, 'epoch': 1.53} 15%|█▌ | 6316/41250 [15:15:49<84:09:47, 8.67s/it][2025-04-25 23:13:32,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:13:32,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2906.47 | bwd_microstep: 5799.09 | bwd_inner_microstep: 5786.40 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.25 [2025-04-25 23:13:32,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2906.47 | bwd: 5799.10 | bwd_inner: 5786.40 | bwd_allreduce: 12.66 | step: 18.26 15%|█▌ | 6317/41250 [15:15:58<84:29:30, 8.71s/it] {'loss': 0.0361, 'grad_norm': 0.30753186345100403, 'learning_rate': 3.8430676198313526e-05, 'epoch': 1.53} 15%|█▌ | 6317/41250 [15:15:58<84:29:30, 8.71s/it][2025-04-25 23:13:41,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:13:41,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.25 | bwd_microstep: 5721.93 | bwd_inner_microstep: 5661.08 | bwd_allreduce_microstep: 60.81 | step_microstep: 18.25 [2025-04-25 23:13:41,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.25 | bwd: 5721.94 | bwd_inner: 5661.08 | bwd_allreduce: 60.82 | step: 18.25 15%|█▌ | 6318/41250 [15:16:06<84:18:13, 8.69s/it] {'loss': 0.2706, 'grad_norm': 2.6805758476257324, 'learning_rate': 3.843006638715457e-05, 'epoch': 1.53} 15%|█▌ | 6318/41250 [15:16:06<84:18:13, 8.69s/it][2025-04-25 23:13:50,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.93 [2025-04-25 23:13:50,225] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.95 | bwd_microstep: 5800.01 | bwd_inner_microstep: 5662.47 | bwd_allreduce_microstep: 137.49 | step_microstep: 18.43 [2025-04-25 23:13:50,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.95 | bwd: 5800.02 | bwd_inner: 5662.47 | bwd_allreduce: 137.51 | step: 18.43 15%|█▌ | 6319/41250 [15:16:15<84:23:36, 8.70s/it] {'loss': 0.2575, 'grad_norm': 2.4632229804992676, 'learning_rate': 3.842945646237789e-05, 'epoch': 1.53} 15%|█▌ | 6319/41250 [15:16:15<84:23:36, 8.70s/it][2025-04-25 23:13:58,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:13:58,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.67 | bwd_microstep: 5749.40 | bwd_inner_microstep: 5712.18 | bwd_allreduce_microstep: 37.17 | step_microstep: 18.77 [2025-04-25 23:13:58,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.67 | bwd: 5749.41 | bwd_inner: 5712.18 | bwd_allreduce: 37.19 | step: 18.78 15%|█▌ | 6320/41250 [15:16:24<84:21:32, 8.69s/it] {'loss': 0.1364, 'grad_norm': 1.5986143350601196, 'learning_rate': 3.8428846423987234e-05, 'epoch': 1.53} 15%|█▌ | 6320/41250 [15:16:24<84:21:32, 8.69s/it][2025-04-25 23:14:07,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:14:07,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.07 | bwd_microstep: 5705.43 | bwd_inner_microstep: 5692.72 | bwd_allreduce_microstep: 12.67 | step_microstep: 17.99 [2025-04-25 23:14:07,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.08 | bwd: 5705.44 | bwd_inner: 5692.72 | bwd_allreduce: 12.68 | step: 17.99 15%|█▌ | 6321/41250 [15:16:32<84:11:37, 8.68s/it] {'loss': 0.1623, 'grad_norm': 1.2871153354644775, 'learning_rate': 3.842823627198636e-05, 'epoch': 1.53} 15%|█▌ 
| 6321/41250 [15:16:32<84:11:37, 8.68s/it][2025-04-25 23:14:16,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:14:16,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.01 | bwd_microstep: 5783.14 | bwd_inner_microstep: 5652.10 | bwd_allreduce_microstep: 130.98 | step_microstep: 18.67 [2025-04-25 23:14:16,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.01 | bwd: 5783.15 | bwd_inner: 5652.10 | bwd_allreduce: 131.00 | step: 18.67 15%|█▌ | 6322/41250 [15:16:41<84:13:28, 8.68s/it] {'loss': 0.0614, 'grad_norm': 0.9615607261657715, 'learning_rate': 3.8427626006379035e-05, 'epoch': 1.53} 15%|█▌ | 6322/41250 [15:16:41<84:13:28, 8.68s/it][2025-04-25 23:14:25,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:14:25,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2933.58 | bwd_microstep: 5872.52 | bwd_inner_microstep: 5859.71 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.34 [2025-04-25 23:14:25,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2933.59 | bwd: 5872.53 | bwd_inner: 5859.71 | bwd_allreduce: 12.77 | step: 18.35 15%|█▌ | 6323/41250 [15:16:50<84:49:54, 8.74s/it] {'loss': 0.201, 'grad_norm': 1.7695400714874268, 'learning_rate': 3.842701562716903e-05, 'epoch': 1.53} 15%|█▌ | 6323/41250 [15:16:50<84:49:54, 8.74s/it][2025-04-25 23:14:33,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:14:33,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.83 | bwd_microstep: 5773.81 | bwd_inner_microstep: 5691.69 | bwd_allreduce_microstep: 82.07 | step_microstep: 18.44 [2025-04-25 23:14:33,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.83 | 
bwd: 5773.82 | bwd_inner: 5691.69 | bwd_allreduce: 82.09 | step: 18.44 15%|█▌ | 6324/41250 [15:16:59<84:42:14, 8.73s/it] {'loss': 0.0291, 'grad_norm': 0.25101396441459656, 'learning_rate': 3.842640513436008e-05, 'epoch': 1.53} 15%|█▌ | 6324/41250 [15:16:59<84:42:14, 8.73s/it][2025-04-25 23:14:42,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 1.12 [2025-04-25 23:14:42,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.52 | bwd_microstep: 5759.21 | bwd_inner_microstep: 5703.85 | bwd_allreduce_microstep: 55.31 | step_microstep: 18.56 [2025-04-25 23:14:42,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.52 | bwd: 5759.23 | bwd_inner: 5703.86 | bwd_allreduce: 55.33 | step: 18.57 15%|█▌ | 6325/41250 [15:17:07<84:36:47, 8.72s/it] {'loss': 0.2915, 'grad_norm': 5.663130760192871, 'learning_rate': 3.842579452795598e-05, 'epoch': 1.53} 15%|█▌ | 6325/41250 [15:17:07<84:36:47, 8.72s/it][2025-04-25 23:14:51,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.13 | optimizer_step: 0.98 [2025-04-25 23:14:51,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.28 | bwd_microstep: 5765.89 | bwd_inner_microstep: 5688.90 | bwd_allreduce_microstep: 76.94 | step_microstep: 19.30 [2025-04-25 23:14:51,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.28 | bwd: 5765.90 | bwd_inner: 5688.90 | bwd_allreduce: 76.96 | step: 19.30 15%|█▌ | 6326/41250 [15:17:16<84:32:58, 8.72s/it] {'loss': 0.0792, 'grad_norm': 1.271106243133545, 'learning_rate': 3.8425183807960476e-05, 'epoch': 1.53} 15%|█▌ | 6326/41250 [15:17:16<84:32:58, 8.72s/it][2025-04-25 23:14:59,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:14:59,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2836.38 | bwd_microstep: 5708.99 | bwd_inner_microstep: 5654.68 | bwd_allreduce_microstep: 54.27 | step_microstep: 18.37 [2025-04-25 23:14:59,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.38 | bwd: 5709.00 | bwd_inner: 5654.68 | bwd_allreduce: 54.28 | step: 18.37 15%|█▌ | 6327/41250 [15:17:25<84:18:02, 8.69s/it] {'loss': 0.1701, 'grad_norm': 1.4566214084625244, 'learning_rate': 3.842457297437734e-05, 'epoch': 1.53} 15%|█▌ | 6327/41250 [15:17:25<84:18:02, 8.69s/it][2025-04-25 23:15:08,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:15:08,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.66 | bwd_microstep: 5756.12 | bwd_inner_microstep: 5650.90 | bwd_allreduce_microstep: 105.17 | step_microstep: 18.95 [2025-04-25 23:15:08,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.66 | bwd: 5756.13 | bwd_inner: 5650.90 | bwd_allreduce: 105.19 | step: 18.94 15%|█▌ | 6328/41250 [15:17:33<84:14:00, 8.68s/it] {'loss': 0.1314, 'grad_norm': 3.1762537956237793, 'learning_rate': 3.8423962027210337e-05, 'epoch': 1.53} 15%|█▌ | 6328/41250 [15:17:33<84:14:00, 8.68s/it][2025-04-25 23:15:17,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:15:17,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.61 | bwd_microstep: 5737.97 | bwd_inner_microstep: 5696.85 | bwd_allreduce_microstep: 41.07 | step_microstep: 18.60 [2025-04-25 23:15:17,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.61 | bwd: 5737.98 | bwd_inner: 5696.85 | bwd_allreduce: 41.09 | step: 18.60 15%|█▌ | 6329/41250 [15:17:42<84:11:15, 8.68s/it] {'loss': 0.2322, 'grad_norm': 2.1339259147644043, 'learning_rate': 3.842335096646323e-05, 'epoch': 1.53} 15%|█▌ | 6329/41250 [15:17:42<84:11:15, 
8.68s/it][2025-04-25 23:15:25,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:15:25,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.86 | bwd_microstep: 5689.08 | bwd_inner_microstep: 5660.13 | bwd_allreduce_microstep: 28.91 | step_microstep: 18.75 [2025-04-25 23:15:25,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.86 | bwd: 5689.09 | bwd_inner: 5660.13 | bwd_allreduce: 28.92 | step: 18.75 15%|█▌ | 6330/41250 [15:17:51<83:58:25, 8.66s/it] {'loss': 0.0783, 'grad_norm': 3.986788034439087, 'learning_rate': 3.8422739792139785e-05, 'epoch': 1.53} 15%|█▌ | 6330/41250 [15:17:51<83:58:25, 8.66s/it][2025-04-25 23:15:34,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 23:15:34,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.55 | bwd_microstep: 5794.57 | bwd_inner_microstep: 5643.07 | bwd_allreduce_microstep: 151.45 | step_microstep: 19.40 [2025-04-25 23:15:34,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.55 | bwd: 5794.58 | bwd_inner: 5643.07 | bwd_allreduce: 151.47 | step: 19.40 15%|█▌ | 6331/41250 [15:17:59<84:06:05, 8.67s/it] {'loss': 0.0875, 'grad_norm': 1.0577313899993896, 'learning_rate': 3.842212850424377e-05, 'epoch': 1.53} 15%|█▌ | 6331/41250 [15:17:59<84:06:05, 8.67s/it][2025-04-25 23:15:43,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:15:43,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2928.76 | bwd_microstep: 5876.07 | bwd_inner_microstep: 5863.42 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.35 [2025-04-25 23:15:43,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2928.76 | bwd: 5876.08 | bwd_inner: 5863.42 
| bwd_allreduce: 12.62 | step: 18.36 15%|█▌ | 6332/41250 [15:18:08<84:44:22, 8.74s/it] {'loss': 0.2582, 'grad_norm': 3.1905019283294678, 'learning_rate': 3.842151710277897e-05, 'epoch': 1.54} 15%|█▌ | 6332/41250 [15:18:08<84:44:22, 8.74s/it][2025-04-25 23:15:52,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 23:15:52,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.31 | bwd_microstep: 5765.23 | bwd_inner_microstep: 5652.09 | bwd_allreduce_microstep: 113.09 | step_microstep: 18.55 [2025-04-25 23:15:52,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.31 | bwd: 5765.24 | bwd_inner: 5652.09 | bwd_allreduce: 113.11 | step: 18.55 15%|█▌ | 6333/41250 [15:18:17<84:32:56, 8.72s/it] {'loss': 0.0565, 'grad_norm': 0.8991254568099976, 'learning_rate': 3.8420905587749125e-05, 'epoch': 1.54} 15%|█▌ | 6333/41250 [15:18:17<84:32:56, 8.72s/it][2025-04-25 23:16:00,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:16:00,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2879.81 | bwd_microstep: 5776.25 | bwd_inner_microstep: 5763.38 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.61 [2025-04-25 23:16:00,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2879.81 | bwd: 5776.27 | bwd_inner: 5763.38 | bwd_allreduce: 12.85 | step: 18.62 15%|█▌ | 6334/41250 [15:18:26<84:36:41, 8.72s/it] {'loss': 0.0647, 'grad_norm': 1.2684556245803833, 'learning_rate': 3.842029395915803e-05, 'epoch': 1.54} 15%|█▌ | 6334/41250 [15:18:26<84:36:41, 8.72s/it][2025-04-25 23:16:09,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:16:09,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.45 | 
bwd_microstep: 5687.51 | bwd_inner_microstep: 5658.03 | bwd_allreduce_microstep: 29.44 | step_microstep: 18.57 [2025-04-25 23:16:09,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.45 | bwd: 5687.52 | bwd_inner: 5658.03 | bwd_allreduce: 29.46 | step: 18.57 15%|█▌ | 6335/41250 [15:18:34<84:14:30, 8.69s/it] {'loss': 0.0908, 'grad_norm': 1.1029443740844727, 'learning_rate': 3.8419682217009446e-05, 'epoch': 1.54} 15%|█▌ | 6335/41250 [15:18:34<84:14:30, 8.69s/it][2025-04-25 23:16:18,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.10 | optimizer_step: 1.14 [2025-04-25 23:16:18,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.59 | bwd_microstep: 5711.88 | bwd_inner_microstep: 5628.47 | bwd_allreduce_microstep: 83.36 | step_microstep: 19.57 [2025-04-25 23:16:18,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.59 | bwd: 5711.90 | bwd_inner: 5628.47 | bwd_allreduce: 83.38 | step: 19.57 15%|█▌ | 6336/41250 [15:18:43<84:02:06, 8.66s/it] {'loss': 0.2148, 'grad_norm': 1.357204794883728, 'learning_rate': 3.841907036130714e-05, 'epoch': 1.54} 15%|█▌ | 6336/41250 [15:18:43<84:02:06, 8.66s/it][2025-04-25 23:16:26,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.98 [2025-04-25 23:16:26,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.52 | bwd_microstep: 5759.27 | bwd_inner_microstep: 5634.53 | bwd_allreduce_microstep: 124.70 | step_microstep: 18.60 [2025-04-25 23:16:26,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.52 | bwd: 5759.28 | bwd_inner: 5634.53 | bwd_allreduce: 124.72 | step: 18.60 15%|█▌ | 6337/41250 [15:18:52<84:02:10, 8.67s/it] {'loss': 0.1328, 'grad_norm': 1.3781911134719849, 'learning_rate': 3.841845839205489e-05, 'epoch': 1.54} 15%|█▌ | 6337/41250 [15:18:52<84:02:10, 8.67s/it][2025-04-25 23:16:35,307] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:16:35,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.19 | bwd_microstep: 5694.72 | bwd_inner_microstep: 5681.67 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.59 [2025-04-25 23:16:35,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.19 | bwd: 5694.73 | bwd_inner: 5681.67 | bwd_allreduce: 13.02 | step: 18.59 15%|█▌ | 6338/41250 [15:19:00<83:54:20, 8.65s/it] {'loss': 0.0837, 'grad_norm': 1.908326506614685, 'learning_rate': 3.841784630925647e-05, 'epoch': 1.54} 15%|█▌ | 6338/41250 [15:19:00<83:54:20, 8.65s/it][2025-04-25 23:16:43,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:16:43,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.38 | bwd_microstep: 5761.72 | bwd_inner_microstep: 5675.46 | bwd_allreduce_microstep: 86.22 | step_microstep: 18.31 [2025-04-25 23:16:43,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.38 | bwd: 5761.74 | bwd_inner: 5675.46 | bwd_allreduce: 86.24 | step: 18.32 15%|█▌ | 6339/41250 [15:19:09<83:59:33, 8.66s/it] {'loss': 0.0444, 'grad_norm': 0.5092731714248657, 'learning_rate': 3.841723411291564e-05, 'epoch': 1.54} 15%|█▌ | 6339/41250 [15:19:09<83:59:33, 8.66s/it][2025-04-25 23:16:52,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 23:16:52,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.24 | bwd_microstep: 5743.35 | bwd_inner_microstep: 5688.46 | bwd_allreduce_microstep: 54.85 | step_microstep: 18.42 [2025-04-25 23:16:52,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.24 | bwd: 5743.37 | bwd_inner: 5688.46 | bwd_allreduce: 54.87 | step: 18.42 
15%|█▌ | 6340/41250 [15:19:17<84:00:13, 8.66s/it] {'loss': 0.1315, 'grad_norm': 2.471027135848999, 'learning_rate': 3.841662180303619e-05, 'epoch': 1.54} 15%|█▌ | 6340/41250 [15:19:17<84:00:13, 8.66s/it][2025-04-25 23:17:01,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:17:01,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2873.21 | bwd_microstep: 5771.30 | bwd_inner_microstep: 5758.65 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.16 [2025-04-25 23:17:01,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2873.21 | bwd: 5771.31 | bwd_inner: 5758.65 | bwd_allreduce: 12.62 | step: 18.16 15%|█▌ | 6341/41250 [15:19:26<84:11:25, 8.68s/it] {'loss': 0.1065, 'grad_norm': 1.2944872379302979, 'learning_rate': 3.841600937962189e-05, 'epoch': 1.54} 15%|█▌ | 6341/41250 [15:19:26<84:11:25, 8.68s/it][2025-04-25 23:17:10,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:17:10,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.66 | bwd_microstep: 5750.12 | bwd_inner_microstep: 5703.29 | bwd_allreduce_microstep: 46.79 | step_microstep: 18.16 [2025-04-25 23:17:10,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.66 | bwd: 5750.13 | bwd_inner: 5703.29 | bwd_allreduce: 46.80 | step: 18.16 15%|█▌ | 6342/41250 [15:19:35<84:11:34, 8.68s/it] {'loss': 0.1626, 'grad_norm': 2.2632155418395996, 'learning_rate': 3.8415396842676515e-05, 'epoch': 1.54} 15%|█▌ | 6342/41250 [15:19:35<84:11:34, 8.68s/it][2025-04-25 23:17:18,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.06 | optimizer_step: 0.91 [2025-04-25 23:17:18,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.66 | bwd_microstep: 5782.21 | bwd_inner_microstep: 
5654.48 | bwd_allreduce_microstep: 127.67 | step_microstep: 18.95 [2025-04-25 23:17:18,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.66 | bwd: 5782.23 | bwd_inner: 5654.48 | bwd_allreduce: 127.70 | step: 18.95 15%|█▌ | 6343/41250 [15:19:44<84:13:53, 8.69s/it] {'loss': 0.254, 'grad_norm': 2.0606119632720947, 'learning_rate': 3.841478419220384e-05, 'epoch': 1.54} 15%|█▌ | 6343/41250 [15:19:44<84:13:53, 8.69s/it][2025-04-25 23:17:27,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.11 | optimizer_step: 1.01 [2025-04-25 23:17:27,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5743.23 | bwd_inner_microstep: 5691.47 | bwd_allreduce_microstep: 51.70 | step_microstep: 19.28 [2025-04-25 23:17:27,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.45 | bwd: 5743.24 | bwd_inner: 5691.47 | bwd_allreduce: 51.73 | step: 19.29 15%|█▌ | 6344/41250 [15:19:52<84:12:25, 8.68s/it] {'loss': 0.1123, 'grad_norm': 0.8823277950286865, 'learning_rate': 3.8414171428207636e-05, 'epoch': 1.54} 15%|█▌ | 6344/41250 [15:19:52<84:12:25, 8.68s/it][2025-04-25 23:17:36,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:17:36,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.92 | bwd_microstep: 5706.23 | bwd_inner_microstep: 5694.03 | bwd_allreduce_microstep: 12.16 | step_microstep: 17.87 [2025-04-25 23:17:36,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.92 | bwd: 5706.24 | bwd_inner: 5694.03 | bwd_allreduce: 12.17 | step: 17.87 15%|█▌ | 6345/41250 [15:20:01<84:05:15, 8.67s/it] {'loss': 0.1126, 'grad_norm': 1.5557836294174194, 'learning_rate': 3.8413558550691694e-05, 'epoch': 1.54} 15%|█▌ | 6345/41250 [15:20:01<84:05:15, 8.67s/it][2025-04-25 23:17:44,741] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-25 23:17:44,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.48 | bwd_microstep: 5716.99 | bwd_inner_microstep: 5704.15 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.11 [2025-04-25 23:17:44,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.48 | bwd: 5717.00 | bwd_inner: 5704.15 | bwd_allreduce: 12.81 | step: 19.12 15%|█▌ | 6346/41250 [15:20:10<84:01:50, 8.67s/it] {'loss': 0.0362, 'grad_norm': 1.717598557472229, 'learning_rate': 3.8412945559659784e-05, 'epoch': 1.54} 15%|█▌ | 6346/41250 [15:20:10<84:01:50, 8.67s/it][2025-04-25 23:17:53,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:17:53,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.81 | bwd_microstep: 5775.67 | bwd_inner_microstep: 5654.91 | bwd_allreduce_microstep: 120.72 | step_microstep: 18.79 [2025-04-25 23:17:53,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.81 | bwd: 5775.69 | bwd_inner: 5654.91 | bwd_allreduce: 120.74 | step: 18.80 15%|█▌ | 6347/41250 [15:20:18<84:06:20, 8.67s/it] {'loss': 0.0876, 'grad_norm': 2.974748134613037, 'learning_rate': 3.841233245511569e-05, 'epoch': 1.54} 15%|█▌ | 6347/41250 [15:20:18<84:06:20, 8.67s/it][2025-04-25 23:18:02,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 23:18:02,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.08 | bwd_microstep: 5717.45 | bwd_inner_microstep: 5704.68 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.77 [2025-04-25 23:18:02,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.08 | bwd: 5717.46 | bwd_inner: 5704.68 | bwd_allreduce: 12.73 | step: 18.78 15%|█▌ | 6348/41250 [15:20:27<84:04:09, 
8.67s/it] {'loss': 0.1476, 'grad_norm': 1.43306303024292, 'learning_rate': 3.841171923706318e-05, 'epoch': 1.54} 15%|█▌ | 6348/41250 [15:20:27<84:04:09, 8.67s/it][2025-04-25 23:18:10,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.24 | optimizer_step: 0.90 [2025-04-25 23:18:10,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.60 | bwd_microstep: 5767.99 | bwd_inner_microstep: 5689.92 | bwd_allreduce_microstep: 78.02 | step_microstep: 19.21 [2025-04-25 23:18:10,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.60 | bwd: 5768.01 | bwd_inner: 5689.92 | bwd_allreduce: 78.04 | step: 19.22 15%|█▌ | 6349/41250 [15:20:36<84:08:41, 8.68s/it] {'loss': 0.1022, 'grad_norm': 1.2463676929473877, 'learning_rate': 3.8411105905506047e-05, 'epoch': 1.54} 15%|█▌ | 6349/41250 [15:20:36<84:08:41, 8.68s/it][2025-04-25 23:18:19,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.05 | optimizer_step: 0.96 [2025-04-25 23:18:19,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.64 | bwd_microstep: 5770.30 | bwd_inner_microstep: 5692.81 | bwd_allreduce_microstep: 77.45 | step_microstep: 19.10 [2025-04-25 23:18:19,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.64 | bwd: 5770.32 | bwd_inner: 5692.81 | bwd_allreduce: 77.47 | step: 19.10 15%|█▌ | 6350/41250 [15:20:44<84:14:05, 8.69s/it] {'loss': 0.0798, 'grad_norm': 1.6691595315933228, 'learning_rate': 3.841049246044807e-05, 'epoch': 1.54} 15%|█▌ | 6350/41250 [15:20:44<84:14:05, 8.69s/it][2025-04-25 23:18:28,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:18:28,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.42 | bwd_microstep: 5782.10 | bwd_inner_microstep: 5769.03 | bwd_allreduce_microstep: 13.03 | 
step_microstep: 18.76 [2025-04-25 23:18:28,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.42 | bwd: 5782.11 | bwd_inner: 5769.03 | bwd_allreduce: 13.05 | step: 18.76 15%|█▌ | 6351/41250 [15:20:53<84:25:04, 8.71s/it] {'loss': 0.2382, 'grad_norm': 4.337391376495361, 'learning_rate': 3.840987890189302e-05, 'epoch': 1.54} 15%|█▌ | 6351/41250 [15:20:53<84:25:04, 8.71s/it][2025-04-25 23:18:36,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.23 | optimizer_step: 0.90 [2025-04-25 23:18:36,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.69 | bwd_microstep: 5713.82 | bwd_inner_microstep: 5657.60 | bwd_allreduce_microstep: 56.18 | step_microstep: 19.38 [2025-04-25 23:18:36,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.70 | bwd: 5713.84 | bwd_inner: 5657.59 | bwd_allreduce: 56.20 | step: 19.38 15%|█▌ | 6352/41250 [15:21:02<84:10:39, 8.68s/it] {'loss': 0.1941, 'grad_norm': 1.3374817371368408, 'learning_rate': 3.84092652298447e-05, 'epoch': 1.54} 15%|█▌ | 6352/41250 [15:21:02<84:10:39, 8.68s/it][2025-04-25 23:18:45,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 23:18:45,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.22 | bwd_microstep: 5784.94 | bwd_inner_microstep: 5654.59 | bwd_allreduce_microstep: 130.30 | step_microstep: 18.67 [2025-04-25 23:18:45,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.22 | bwd: 5784.96 | bwd_inner: 5654.59 | bwd_allreduce: 130.32 | step: 18.67 15%|█▌ | 6353/41250 [15:21:10<84:13:00, 8.69s/it] {'loss': 0.3126, 'grad_norm': 3.1010241508483887, 'learning_rate': 3.8408651444306867e-05, 'epoch': 1.54} 15%|█▌ | 6353/41250 [15:21:10<84:13:00, 8.69s/it][2025-04-25 23:18:54,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-25 23:18:54,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.06 | bwd_microstep: 5723.67 | bwd_inner_microstep: 5665.67 | bwd_allreduce_microstep: 57.95 | step_microstep: 18.93 [2025-04-25 23:18:54,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.06 | bwd: 5723.69 | bwd_inner: 5665.67 | bwd_allreduce: 57.97 | step: 18.94 15%|█▌ | 6354/41250 [15:21:19<84:03:54, 8.67s/it] {'loss': 0.2766, 'grad_norm': 5.51011848449707, 'learning_rate': 3.8408037545283325e-05, 'epoch': 1.54} 15%|█▌ | 6354/41250 [15:21:19<84:03:54, 8.67s/it][2025-04-25 23:19:02,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 23:19:02,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.83 | bwd_microstep: 5717.71 | bwd_inner_microstep: 5662.41 | bwd_allreduce_microstep: 55.24 | step_microstep: 18.80 [2025-04-25 23:19:02,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.83 | bwd: 5717.73 | bwd_inner: 5662.41 | bwd_allreduce: 55.26 | step: 18.80 15%|█▌ | 6355/41250 [15:21:28<83:58:13, 8.66s/it] {'loss': 0.2275, 'grad_norm': 2.243175745010376, 'learning_rate': 3.840742353277785e-05, 'epoch': 1.54} 15%|█▌ | 6355/41250 [15:21:28<83:58:13, 8.66s/it][2025-04-25 23:19:11,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.04 | optimizer_step: 1.00 [2025-04-25 23:19:11,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.57 | bwd_microstep: 5786.17 | bwd_inner_microstep: 5697.98 | bwd_allreduce_microstep: 88.15 | step_microstep: 19.08 [2025-04-25 23:19:11,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.57 | bwd: 5786.19 | bwd_inner: 5697.98 | bwd_allreduce: 88.17 | step: 19.08 15%|█▌ | 6356/41250 [15:21:36<84:07:58, 8.68s/it] {'loss': 0.2598, 'grad_norm': 
1.9220561981201172, 'learning_rate': 3.840680940679423e-05, 'epoch': 1.54} 15%|█▌ | 6356/41250 [15:21:36<84:07:58, 8.68s/it][2025-04-25 23:19:20,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:19:20,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.28 | bwd_microstep: 5778.75 | bwd_inner_microstep: 5688.29 | bwd_allreduce_microstep: 90.41 | step_microstep: 17.86 [2025-04-25 23:19:20,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.28 | bwd: 5778.76 | bwd_inner: 5688.29 | bwd_allreduce: 90.43 | step: 17.86 15%|█▌ | 6357/41250 [15:21:45<84:13:15, 8.69s/it] {'loss': 0.1712, 'grad_norm': 1.4553481340408325, 'learning_rate': 3.8406195167336256e-05, 'epoch': 1.54} 15%|█▌ | 6357/41250 [15:21:45<84:13:15, 8.69s/it][2025-04-25 23:19:29,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:19:29,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.48 | bwd_microstep: 5778.08 | bwd_inner_microstep: 5765.25 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.21 [2025-04-25 23:19:29,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.48 | bwd: 5778.09 | bwd_inner: 5765.25 | bwd_allreduce: 12.80 | step: 18.21 15%|█▌ | 6358/41250 [15:21:54<84:23:42, 8.71s/it] {'loss': 0.2759, 'grad_norm': 2.408078908920288, 'learning_rate': 3.840558081440771e-05, 'epoch': 1.54} 15%|█▌ | 6358/41250 [15:21:54<84:23:42, 8.71s/it][2025-04-25 23:19:37,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:19:37,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.37 | bwd_microstep: 5723.69 | bwd_inner_microstep: 5711.06 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.61 [2025-04-25 
23:19:37,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.37 | bwd: 5723.71 | bwd_inner: 5711.06 | bwd_allreduce: 12.60 | step: 18.61 15%|█▌ | 6359/41250 [15:22:03<84:15:37, 8.69s/it] {'loss': 0.0579, 'grad_norm': 0.73814857006073, 'learning_rate': 3.8404966348012366e-05, 'epoch': 1.54} 15%|█▌ | 6359/41250 [15:22:03<84:15:37, 8.69s/it][2025-04-25 23:19:46,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 23:19:46,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.52 | bwd_microstep: 5758.01 | bwd_inner_microstep: 5706.72 | bwd_allreduce_microstep: 51.25 | step_microstep: 18.87 [2025-04-25 23:19:46,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.52 | bwd: 5758.02 | bwd_inner: 5706.72 | bwd_allreduce: 51.27 | step: 18.88 15%|█▌ | 6360/41250 [15:22:11<84:17:59, 8.70s/it] {'loss': 0.0492, 'grad_norm': 0.7702252268791199, 'learning_rate': 3.8404351768154034e-05, 'epoch': 1.54} 15%|█▌ | 6360/41250 [15:22:11<84:17:59, 8.70s/it][2025-04-25 23:19:55,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:19:55,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.36 | bwd_microstep: 5796.31 | bwd_inner_microstep: 5783.29 | bwd_allreduce_microstep: 12.98 | step_microstep: 18.64 [2025-04-25 23:19:55,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.36 | bwd: 5796.33 | bwd_inner: 5783.29 | bwd_allreduce: 13.00 | step: 18.64 15%|█▌ | 6361/41250 [15:22:20<84:30:38, 8.72s/it] {'loss': 0.1161, 'grad_norm': 1.7041698694229126, 'learning_rate': 3.840373707483649e-05, 'epoch': 1.54} 15%|█▌ | 6361/41250 [15:22:20<84:30:38, 8.72s/it][2025-04-25 23:20:03,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.05 | optimizer_step: 
1.06 [2025-04-25 23:20:03,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.06 | bwd_microstep: 5700.47 | bwd_inner_microstep: 5659.46 | bwd_allreduce_microstep: 40.96 | step_microstep: 19.17 [2025-04-25 23:20:03,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.06 | bwd: 5700.48 | bwd_inner: 5659.46 | bwd_allreduce: 40.98 | step: 19.17 15%|█▌ | 6362/41250 [15:22:29<84:13:36, 8.69s/it] {'loss': 0.3049, 'grad_norm': 2.535534143447876, 'learning_rate': 3.8403122268063524e-05, 'epoch': 1.54} 15%|█▌ | 6362/41250 [15:22:29<84:13:36, 8.69s/it][2025-04-25 23:20:12,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:20:12,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.14 | bwd_microstep: 5894.64 | bwd_inner_microstep: 5664.33 | bwd_allreduce_microstep: 230.27 | step_microstep: 18.31 [2025-04-25 23:20:12,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.14 | bwd: 5894.65 | bwd_inner: 5664.33 | bwd_allreduce: 230.28 | step: 18.31 15%|█▌ | 6363/41250 [15:22:37<84:36:21, 8.73s/it] {'loss': 0.0668, 'grad_norm': 1.7101491689682007, 'learning_rate': 3.8402507347838934e-05, 'epoch': 1.54} 15%|█▌ | 6363/41250 [15:22:37<84:36:21, 8.73s/it][2025-04-25 23:20:21,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:20:21,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.08 | bwd_microstep: 5816.21 | bwd_inner_microstep: 5666.75 | bwd_allreduce_microstep: 149.41 | step_microstep: 18.63 [2025-04-25 23:20:21,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.08 | bwd: 5816.22 | bwd_inner: 5666.75 | bwd_allreduce: 149.43 | step: 18.64 15%|█▌ | 6364/41250 [15:22:46<84:36:05, 8.73s/it] {'loss': 0.1606, 'grad_norm': 1.178436040878296, 'learning_rate': 
3.840189231416651e-05, 'epoch': 1.54} 15%|█▌ | 6364/41250 [15:22:46<84:36:05, 8.73s/it][2025-04-25 23:20:30,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-25 23:20:30,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.23 | bwd_microstep: 5875.29 | bwd_inner_microstep: 5650.10 | bwd_allreduce_microstep: 225.15 | step_microstep: 17.74 [2025-04-25 23:20:30,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.23 | bwd: 5875.31 | bwd_inner: 5650.10 | bwd_allreduce: 225.17 | step: 17.74 15%|█▌ | 6365/41250 [15:22:55<84:45:38, 8.75s/it] {'loss': 0.0824, 'grad_norm': 0.861057460308075, 'learning_rate': 3.8401277167050034e-05, 'epoch': 1.54} 15%|█▌ | 6365/41250 [15:22:55<84:45:38, 8.75s/it][2025-04-25 23:20:38,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:20:38,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.21 | bwd_microstep: 5693.66 | bwd_inner_microstep: 5680.96 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.18 [2025-04-25 23:20:38,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.21 | bwd: 5693.67 | bwd_inner: 5680.95 | bwd_allreduce: 12.67 | step: 18.18 15%|█▌ | 6366/41250 [15:23:04<84:23:56, 8.71s/it] {'loss': 0.1645, 'grad_norm': 1.7715907096862793, 'learning_rate': 3.8400661906493305e-05, 'epoch': 1.54} 15%|█▌ | 6366/41250 [15:23:04<84:23:56, 8.71s/it][2025-04-25 23:20:47,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 1.14 [2025-04-25 23:20:47,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.74 | bwd_microstep: 5731.28 | bwd_inner_microstep: 5718.20 | bwd_allreduce_microstep: 13.03 | step_microstep: 19.12 [2025-04-25 23:20:47,446] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.74 | bwd: 5731.30 | bwd_inner: 5718.20 | bwd_allreduce: 13.05 | step: 19.13 15%|█▌ | 6367/41250 [15:23:12<84:17:38, 8.70s/it] {'loss': 0.1697, 'grad_norm': 1.6265007257461548, 'learning_rate': 3.8400046532500114e-05, 'epoch': 1.54} 15%|█▌ | 6367/41250 [15:23:12<84:17:38, 8.70s/it][2025-04-25 23:20:56,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.06 | optimizer_step: 0.92 [2025-04-25 23:20:56,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.10 | bwd_microstep: 5706.01 | bwd_inner_microstep: 5692.93 | bwd_allreduce_microstep: 13.02 | step_microstep: 18.88 [2025-04-25 23:20:56,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.10 | bwd: 5706.02 | bwd_inner: 5692.93 | bwd_allreduce: 13.05 | step: 18.88 15%|█▌ | 6368/41250 [15:23:21<84:06:22, 8.68s/it] {'loss': 0.0884, 'grad_norm': 1.305525302886963, 'learning_rate': 3.839943104507425e-05, 'epoch': 1.54} 15%|█▌ | 6368/41250 [15:23:21<84:06:22, 8.68s/it][2025-04-25 23:21:04,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.87 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-25 23:21:04,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.91 | bwd_microstep: 5705.48 | bwd_inner_microstep: 5692.68 | bwd_allreduce_microstep: 12.75 | step_microstep: 17.69 [2025-04-25 23:21:04,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.91 | bwd: 5705.49 | bwd_inner: 5692.68 | bwd_allreduce: 12.77 | step: 17.69 15%|█▌ | 6369/41250 [15:23:30<84:00:00, 8.67s/it] {'loss': 0.0706, 'grad_norm': 1.3762401342391968, 'learning_rate': 3.8398815444219514e-05, 'epoch': 1.54} 15%|█▌ | 6369/41250 [15:23:30<84:00:00, 8.67s/it][2025-04-25 23:21:13,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 
23:21:13,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.56 | bwd_microstep: 5755.75 | bwd_inner_microstep: 5674.62 | bwd_allreduce_microstep: 81.08 | step_microstep: 18.72 [2025-04-25 23:21:13,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.56 | bwd: 5755.76 | bwd_inner: 5674.62 | bwd_allreduce: 81.10 | step: 18.72 15%|█▌ | 6370/41250 [15:23:38<84:03:29, 8.68s/it] {'loss': 0.1052, 'grad_norm': 2.3331708908081055, 'learning_rate': 3.839819972993971e-05, 'epoch': 1.54} 15%|█▌ | 6370/41250 [15:23:38<84:03:29, 8.68s/it][2025-04-25 23:21:22,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:21:22,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.08 | bwd_microstep: 5705.40 | bwd_inner_microstep: 5692.59 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.63 [2025-04-25 23:21:22,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.08 | bwd: 5705.42 | bwd_inner: 5692.59 | bwd_allreduce: 12.78 | step: 18.63 15%|█▌ | 6371/41250 [15:23:47<83:57:34, 8.67s/it] {'loss': 0.0572, 'grad_norm': 0.6941977739334106, 'learning_rate': 3.839758390223861e-05, 'epoch': 1.54} 15%|█▌ | 6371/41250 [15:23:47<83:57:34, 8.67s/it][2025-04-25 23:21:30,672] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 23:21:30,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.07 | bwd_microstep: 5689.53 | bwd_inner_microstep: 5677.05 | bwd_allreduce_microstep: 12.44 | step_microstep: 18.94 [2025-04-25 23:21:30,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.07 | bwd: 5689.55 | bwd_inner: 5677.05 | bwd_allreduce: 12.46 | step: 18.95 15%|█▌ | 6372/41250 [15:23:55<83:48:30, 8.65s/it] {'loss': 0.3634, 'grad_norm': 2.819687604904175, 'learning_rate': 3.839696796112003e-05, 
'epoch': 1.54} 15%|█▌ | 6372/41250 [15:23:55<83:48:30, 8.65s/it][2025-04-25 23:21:39,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:21:39,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.26 | bwd_microstep: 5704.36 | bwd_inner_microstep: 5691.67 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.63 [2025-04-25 23:21:39,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.26 | bwd: 5704.37 | bwd_inner: 5691.67 | bwd_allreduce: 12.66 | step: 18.64 15%|█▌ | 6373/41250 [15:24:04<83:48:37, 8.65s/it] {'loss': 0.0596, 'grad_norm': 0.6737018823623657, 'learning_rate': 3.839635190658776e-05, 'epoch': 1.54} 15%|█▌ | 6373/41250 [15:24:04<83:48:37, 8.65s/it][2025-04-25 23:21:47,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:21:47,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.32 | bwd_microstep: 5709.91 | bwd_inner_microstep: 5638.14 | bwd_allreduce_microstep: 71.73 | step_microstep: 18.20 [2025-04-25 23:21:47,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.32 | bwd: 5709.92 | bwd_inner: 5638.14 | bwd_allreduce: 71.74 | step: 18.20 15%|█▌ | 6374/41250 [15:24:13<83:44:00, 8.64s/it] {'loss': 0.1767, 'grad_norm': 1.642554521560669, 'learning_rate': 3.83957357386456e-05, 'epoch': 1.55} 15%|█▌ | 6374/41250 [15:24:13<83:44:00, 8.64s/it][2025-04-25 23:21:56,715] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:21:56,716] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.08 | bwd_microstep: 5793.26 | bwd_inner_microstep: 5780.67 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.09 [2025-04-25 23:21:56,716] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2891.08 | bwd: 5793.27 | bwd_inner: 5780.67 | bwd_allreduce: 12.56 | step: 18.09 15%|█▌ | 6375/41250 [15:24:22<84:05:04, 8.68s/it] {'loss': 0.1929, 'grad_norm': 1.835411787033081, 'learning_rate': 3.839511945729735e-05, 'epoch': 1.55} 15%|█▌ | 6375/41250 [15:24:22<84:05:04, 8.68s/it][2025-04-25 23:22:05,375] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.08 | optimizer_step: 0.95 [2025-04-25 23:22:05,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.62 | bwd_microstep: 5727.60 | bwd_inner_microstep: 5688.94 | bwd_allreduce_microstep: 38.61 | step_microstep: 19.31 [2025-04-25 23:22:05,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.62 | bwd: 5727.62 | bwd_inner: 5688.94 | bwd_allreduce: 38.63 | step: 19.31 15%|█▌ | 6376/41250 [15:24:30<84:01:47, 8.67s/it] {'loss': 0.0313, 'grad_norm': 0.7381097674369812, 'learning_rate': 3.8394503062546805e-05, 'epoch': 1.55} 15%|█▌ | 6376/41250 [15:24:30<84:01:47, 8.67s/it][2025-04-25 23:22:13,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.10 | optimizer_step: 0.90 [2025-04-25 23:22:13,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.12 | bwd_microstep: 5668.39 | bwd_inner_microstep: 5643.00 | bwd_allreduce_microstep: 25.35 | step_microstep: 19.38 [2025-04-25 23:22:13,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.12 | bwd: 5668.41 | bwd_inner: 5643.00 | bwd_allreduce: 25.36 | step: 19.38 15%|█▌ | 6377/41250 [15:24:39<83:44:40, 8.65s/it] {'loss': 0.0175, 'grad_norm': 0.26265788078308105, 'learning_rate': 3.839388655439776e-05, 'epoch': 1.55} 15%|█▌ | 6377/41250 [15:24:39<83:44:40, 8.65s/it][2025-04-25 23:22:22,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:22:22,597] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2824.62 | bwd_microstep: 5736.45 | bwd_inner_microstep: 5650.21 | bwd_allreduce_microstep: 86.19 | step_microstep: 18.85 [2025-04-25 23:22:22,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.62 | bwd: 5736.46 | bwd_inner: 5650.21 | bwd_allreduce: 86.21 | step: 18.85 15%|█▌ | 6378/41250 [15:24:47<83:44:19, 8.64s/it] {'loss': 0.0327, 'grad_norm': 0.5730663537979126, 'learning_rate': 3.839326993285403e-05, 'epoch': 1.55} 15%|█▌ | 6378/41250 [15:24:47<83:44:19, 8.64s/it][2025-04-25 23:22:31,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:22:31,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.38 | bwd_microstep: 5689.82 | bwd_inner_microstep: 5644.32 | bwd_allreduce_microstep: 45.46 | step_microstep: 18.64 [2025-04-25 23:22:31,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.38 | bwd: 5689.84 | bwd_inner: 5644.32 | bwd_allreduce: 45.48 | step: 18.64 15%|█▌ | 6379/41250 [15:24:56<83:35:59, 8.63s/it] {'loss': 0.0192, 'grad_norm': 0.424753874540329, 'learning_rate': 3.8392653197919406e-05, 'epoch': 1.55} 15%|█▌ | 6379/41250 [15:24:56<83:35:59, 8.63s/it][2025-04-25 23:22:39,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:22:39,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.60 | bwd_microstep: 5693.46 | bwd_inner_microstep: 5680.56 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.99 [2025-04-25 23:22:39,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.60 | bwd: 5693.48 | bwd_inner: 5680.56 | bwd_allreduce: 12.87 | step: 18.99 15%|█▌ | 6380/41250 [15:25:05<83:33:16, 8.63s/it] {'loss': 0.1309, 'grad_norm': 1.7281415462493896, 'learning_rate': 3.83920363495977e-05, 'epoch': 1.55} 15%|█▌ | 6380/41250 
[15:25:05<83:33:16, 8.63s/it][2025-04-25 23:22:48,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-25 23:22:48,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.44 | bwd_microstep: 5760.06 | bwd_inner_microstep: 5646.36 | bwd_allreduce_microstep: 113.65 | step_microstep: 19.34 [2025-04-25 23:22:48,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.44 | bwd: 5760.07 | bwd_inner: 5646.35 | bwd_allreduce: 113.67 | step: 19.34 15%|█▌ | 6381/41250 [15:25:13<83:41:24, 8.64s/it] {'loss': 0.2319, 'grad_norm': 1.3648968935012817, 'learning_rate': 3.839141938789269e-05, 'epoch': 1.55} 15%|█▌ | 6381/41250 [15:25:13<83:41:24, 8.64s/it][2025-04-25 23:22:57,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-25 23:22:57,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.45 | bwd_microstep: 5718.41 | bwd_inner_microstep: 5701.95 | bwd_allreduce_microstep: 16.42 | step_microstep: 18.41 [2025-04-25 23:22:57,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.45 | bwd: 5718.42 | bwd_inner: 5701.95 | bwd_allreduce: 16.44 | step: 18.41 15%|█▌ | 6382/41250 [15:25:22<83:44:09, 8.65s/it] {'loss': 0.0652, 'grad_norm': 3.1402828693389893, 'learning_rate': 3.8390802312808214e-05, 'epoch': 1.55} 15%|█▌ | 6382/41250 [15:25:22<83:44:09, 8.65s/it][2025-04-25 23:23:05,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.97 | optimizer_step: 1.07 [2025-04-25 23:23:05,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.86 | bwd_microstep: 5872.18 | bwd_inner_microstep: 5652.74 | bwd_allreduce_microstep: 219.39 | step_microstep: 18.38 [2025-04-25 23:23:05,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.86 | bwd: 5872.19 
| bwd_inner: 5652.74 | bwd_allreduce: 219.41 | step: 18.38 15%|█▌ | 6383/41250 [15:25:31<84:08:08, 8.69s/it] {'loss': 0.11, 'grad_norm': 1.6980472803115845, 'learning_rate': 3.839018512434805e-05, 'epoch': 1.55} 15%|█▌ | 6383/41250 [15:25:31<84:08:08, 8.69s/it][2025-04-25 23:23:14,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:23:14,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.13 | bwd_microstep: 5672.43 | bwd_inner_microstep: 5642.88 | bwd_allreduce_microstep: 29.51 | step_microstep: 18.26 [2025-04-25 23:23:14,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.13 | bwd: 5672.44 | bwd_inner: 5642.88 | bwd_allreduce: 29.52 | step: 18.26 15%|█▌ | 6384/41250 [15:25:39<83:48:53, 8.65s/it] {'loss': 0.4074, 'grad_norm': 2.9902312755584717, 'learning_rate': 3.838956782251601e-05, 'epoch': 1.55} 15%|█▌ | 6384/41250 [15:25:39<83:48:53, 8.65s/it][2025-04-25 23:23:23,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 23:23:23,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.38 | bwd_microstep: 5717.78 | bwd_inner_microstep: 5705.26 | bwd_allreduce_microstep: 12.47 | step_microstep: 18.17 [2025-04-25 23:23:23,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.38 | bwd: 5717.79 | bwd_inner: 5705.26 | bwd_allreduce: 12.49 | step: 18.17 15%|█▌ | 6385/41250 [15:25:48<83:50:08, 8.66s/it] {'loss': 0.1655, 'grad_norm': 2.1123604774475098, 'learning_rate': 3.838895040731591e-05, 'epoch': 1.55} 15%|█▌ | 6385/41250 [15:25:48<83:50:08, 8.66s/it][2025-04-25 23:23:31,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-25 23:23:31,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2892.72 | bwd_microstep: 5782.55 | bwd_inner_microstep: 5769.85 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.82 [2025-04-25 23:23:31,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.72 | bwd: 5782.56 | bwd_inner: 5769.86 | bwd_allreduce: 12.66 | step: 18.82 15%|█▌ | 6386/41250 [15:25:57<84:08:29, 8.69s/it] {'loss': 0.0635, 'grad_norm': 1.067183256149292, 'learning_rate': 3.838833287875155e-05, 'epoch': 1.55} 15%|█▌ | 6386/41250 [15:25:57<84:08:29, 8.69s/it][2025-04-25 23:23:40,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:23:40,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.78 | bwd_microstep: 5746.72 | bwd_inner_microstep: 5688.17 | bwd_allreduce_microstep: 58.51 | step_microstep: 18.71 [2025-04-25 23:23:40,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.78 | bwd: 5746.74 | bwd_inner: 5688.17 | bwd_allreduce: 58.53 | step: 18.71 15%|█▌ | 6387/41250 [15:26:05<84:05:54, 8.68s/it] {'loss': 0.0713, 'grad_norm': 0.9125928282737732, 'learning_rate': 3.838771523682672e-05, 'epoch': 1.55} 15%|█▌ | 6387/41250 [15:26:05<84:05:54, 8.68s/it][2025-04-25 23:23:49,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:23:49,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.45 | bwd_microstep: 5753.00 | bwd_inner_microstep: 5657.39 | bwd_allreduce_microstep: 95.57 | step_microstep: 18.11 [2025-04-25 23:23:49,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.45 | bwd: 5753.01 | bwd_inner: 5657.39 | bwd_allreduce: 95.59 | step: 18.11 15%|█▌ | 6388/41250 [15:26:14<84:04:34, 8.68s/it] {'loss': 0.4079, 'grad_norm': 3.556107759475708, 'learning_rate': 3.838709748154525e-05, 'epoch': 1.55} 15%|█▌ | 6388/41250 [15:26:14<84:04:34, 8.68s/it][2025-04-25 
23:23:57,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 23:23:57,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.54 | bwd_microstep: 5695.32 | bwd_inner_microstep: 5652.87 | bwd_allreduce_microstep: 42.40 | step_microstep: 18.33 [2025-04-25 23:23:57,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.54 | bwd: 5695.33 | bwd_inner: 5652.87 | bwd_allreduce: 42.42 | step: 18.33 15%|█▌ | 6389/41250 [15:26:23<83:53:30, 8.66s/it] {'loss': 0.2551, 'grad_norm': 2.5806314945220947, 'learning_rate': 3.838647961291094e-05, 'epoch': 1.55} 15%|█▌ | 6389/41250 [15:26:23<83:53:30, 8.66s/it][2025-04-25 23:24:06,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 23:24:06,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.23 | bwd_microstep: 5748.05 | bwd_inner_microstep: 5679.93 | bwd_allreduce_microstep: 68.08 | step_microstep: 18.91 [2025-04-25 23:24:06,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.23 | bwd: 5748.06 | bwd_inner: 5679.93 | bwd_allreduce: 68.09 | step: 18.91 15%|█▌ | 6390/41250 [15:26:31<83:55:24, 8.67s/it] {'loss': 0.2002, 'grad_norm': 3.760608434677124, 'learning_rate': 3.83858616309276e-05, 'epoch': 1.55} 15%|█▌ | 6390/41250 [15:26:31<83:55:24, 8.67s/it][2025-04-25 23:24:15,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:24:15,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.29 | bwd_microstep: 5771.72 | bwd_inner_microstep: 5654.99 | bwd_allreduce_microstep: 116.69 | step_microstep: 18.00 [2025-04-25 23:24:15,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.29 | bwd: 5771.73 | bwd_inner: 5654.99 | bwd_allreduce: 116.70 
| step: 18.00 15%|█▌ | 6391/41250 [15:26:40<83:59:02, 8.67s/it] {'loss': 0.065, 'grad_norm': 1.1326466798782349, 'learning_rate': 3.838524353559904e-05, 'epoch': 1.55} 15%|█▌ | 6391/41250 [15:26:40<83:59:02, 8.67s/it][2025-04-25 23:24:23,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:24:23,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.64 | bwd_microstep: 5755.84 | bwd_inner_microstep: 5687.74 | bwd_allreduce_microstep: 68.06 | step_microstep: 18.24 [2025-04-25 23:24:23,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.64 | bwd: 5755.85 | bwd_inner: 5687.74 | bwd_allreduce: 68.07 | step: 18.24 15%|█▌ | 6392/41250 [15:26:49<84:00:52, 8.68s/it] {'loss': 0.1636, 'grad_norm': 2.2691152095794678, 'learning_rate': 3.838462532692907e-05, 'epoch': 1.55} 15%|█▌ | 6392/41250 [15:26:49<84:00:52, 8.68s/it][2025-04-25 23:24:32,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.94 [2025-04-25 23:24:32,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.16 | bwd_microstep: 5719.56 | bwd_inner_microstep: 5652.44 | bwd_allreduce_microstep: 67.07 | step_microstep: 18.75 [2025-04-25 23:24:32,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.16 | bwd: 5719.58 | bwd_inner: 5652.44 | bwd_allreduce: 67.09 | step: 18.75 15%|█▌ | 6393/41250 [15:26:57<83:53:19, 8.66s/it] {'loss': 0.1342, 'grad_norm': 2.1804003715515137, 'learning_rate': 3.83840070049215e-05, 'epoch': 1.55} 15%|█▌ | 6393/41250 [15:26:57<83:53:19, 8.66s/it][2025-04-25 23:24:41,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:24:41,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.42 | bwd_microstep: 5728.26 | 
bwd_inner_microstep: 5715.50 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.57 [2025-04-25 23:24:41,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.42 | bwd: 5728.27 | bwd_inner: 5715.50 | bwd_allreduce: 12.73 | step: 18.57 16%|█▌ | 6394/41250 [15:27:06<83:54:37, 8.67s/it] {'loss': 0.3134, 'grad_norm': 4.783049583435059, 'learning_rate': 3.8383388569580146e-05, 'epoch': 1.55} 16%|█▌ | 6394/41250 [15:27:06<83:54:37, 8.67s/it][2025-04-25 23:24:49,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:24:49,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.54 | bwd_microstep: 5755.91 | bwd_inner_microstep: 5703.41 | bwd_allreduce_microstep: 52.45 | step_microstep: 18.76 [2025-04-25 23:24:49,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.54 | bwd: 5755.92 | bwd_inner: 5703.41 | bwd_allreduce: 52.47 | step: 18.76 16%|█▌ | 6395/41250 [15:27:15<83:58:06, 8.67s/it] {'loss': 0.1556, 'grad_norm': 1.6841742992401123, 'learning_rate': 3.838277002090881e-05, 'epoch': 1.55} 16%|█▌ | 6395/41250 [15:27:15<83:58:06, 8.67s/it][2025-04-25 23:24:58,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:24:58,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.92 | bwd_microstep: 5779.28 | bwd_inner_microstep: 5705.72 | bwd_allreduce_microstep: 73.52 | step_microstep: 18.72 [2025-04-25 23:24:58,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.92 | bwd: 5779.30 | bwd_inner: 5705.72 | bwd_allreduce: 73.53 | step: 18.72 16%|█▌ | 6396/41250 [15:27:23<84:05:28, 8.69s/it] {'loss': 0.3169, 'grad_norm': 2.5497405529022217, 'learning_rate': 3.838215135891132e-05, 'epoch': 1.55} 16%|█▌ | 6396/41250 [15:27:23<84:05:28, 8.69s/it][2025-04-25 23:25:07,365] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-25 23:25:07,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.90 | bwd_microstep: 5795.02 | bwd_inner_microstep: 5662.00 | bwd_allreduce_microstep: 132.97 | step_microstep: 18.99 [2025-04-25 23:25:07,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.90 | bwd: 5795.04 | bwd_inner: 5662.00 | bwd_allreduce: 132.99 | step: 19.00 16%|█▌ | 6397/41250 [15:27:32<84:09:21, 8.69s/it] {'loss': 0.2818, 'grad_norm': 2.053745985031128, 'learning_rate': 3.8381532583591477e-05, 'epoch': 1.55} 16%|█▌ | 6397/41250 [15:27:32<84:09:21, 8.69s/it][2025-04-25 23:25:16,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 23:25:16,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2897.46 | bwd_microstep: 5798.03 | bwd_inner_microstep: 5785.25 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.02 [2025-04-25 23:25:16,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2897.46 | bwd: 5798.05 | bwd_inner: 5785.25 | bwd_allreduce: 12.75 | step: 19.02 16%|█▌ | 6398/41250 [15:27:41<84:25:25, 8.72s/it] {'loss': 0.1884, 'grad_norm': 3.421018600463867, 'learning_rate': 3.838091369495311e-05, 'epoch': 1.55} 16%|█▌ | 6398/41250 [15:27:41<84:25:25, 8.72s/it][2025-04-25 23:25:24,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.07 | optimizer_step: 0.90 [2025-04-25 23:25:24,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.36 | bwd_microstep: 5806.56 | bwd_inner_microstep: 5665.60 | bwd_allreduce_microstep: 140.91 | step_microstep: 19.13 [2025-04-25 23:25:24,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.36 | bwd: 5806.57 | bwd_inner: 5665.60 | bwd_allreduce: 140.93 | step: 19.13 
16%|█▌ | 6399/41250 [15:27:50<84:26:40, 8.72s/it] {'loss': 0.0457, 'grad_norm': 1.0786139965057373, 'learning_rate': 3.838029469300001e-05, 'epoch': 1.55} 16%|█▌ | 6399/41250 [15:27:50<84:26:40, 8.72s/it][2025-04-25 23:25:33,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:25:33,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.22 | bwd_microstep: 5802.34 | bwd_inner_microstep: 5699.37 | bwd_allreduce_microstep: 102.92 | step_microstep: 18.81 [2025-04-25 23:25:33,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.22 | bwd: 5802.35 | bwd_inner: 5699.37 | bwd_allreduce: 102.94 | step: 18.82 16%|█▌ | 6400/41250 [15:27:58<84:28:40, 8.73s/it] {'loss': 0.1274, 'grad_norm': 1.5270748138427734, 'learning_rate': 3.837967557773602e-05, 'epoch': 1.55} 16%|█▌ | 6400/41250 [15:27:58<84:28:40, 8.73s/it][2025-04-25 23:25:42,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 23:25:42,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.66 | bwd_microstep: 5756.24 | bwd_inner_microstep: 5719.67 | bwd_allreduce_microstep: 36.53 | step_microstep: 18.82 [2025-04-25 23:25:42,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.66 | bwd: 5756.26 | bwd_inner: 5719.67 | bwd_allreduce: 36.55 | step: 18.83 16%|█▌ | 6401/41250 [15:28:07<84:23:47, 8.72s/it] {'loss': 0.0159, 'grad_norm': 0.19412730634212494, 'learning_rate': 3.837905634916494e-05, 'epoch': 1.55} 16%|█▌ | 6401/41250 [15:28:07<84:23:47, 8.72s/it][2025-04-25 23:25:51,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:25:51,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.34 | bwd_microstep: 5782.74 | 
bwd_inner_microstep: 5661.89 | bwd_allreduce_microstep: 120.80 | step_microstep: 18.46 [2025-04-25 23:25:51,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.34 | bwd: 5782.75 | bwd_inner: 5661.89 | bwd_allreduce: 120.81 | step: 18.47 16%|█▌ | 6402/41250 [15:28:16<84:19:20, 8.71s/it] {'loss': 0.1682, 'grad_norm': 2.125520706176758, 'learning_rate': 3.83784370072906e-05, 'epoch': 1.55} 16%|█▌ | 6402/41250 [15:28:16<84:19:20, 8.71s/it][2025-04-25 23:25:59,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 23:25:59,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.61 | bwd_microstep: 5715.46 | bwd_inner_microstep: 5663.75 | bwd_allreduce_microstep: 51.66 | step_microstep: 18.47 [2025-04-25 23:25:59,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.61 | bwd: 5715.47 | bwd_inner: 5663.75 | bwd_allreduce: 51.68 | step: 18.47 16%|█▌ | 6403/41250 [15:28:24<84:06:48, 8.69s/it] {'loss': 0.0487, 'grad_norm': 0.8214364647865295, 'learning_rate': 3.83778175521168e-05, 'epoch': 1.55} 16%|█▌ | 6403/41250 [15:28:24<84:06:48, 8.69s/it][2025-04-25 23:26:08,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:26:08,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.76 | bwd_microstep: 5770.15 | bwd_inner_microstep: 5693.03 | bwd_allreduce_microstep: 77.08 | step_microstep: 18.80 [2025-04-25 23:26:08,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.76 | bwd: 5770.16 | bwd_inner: 5693.03 | bwd_allreduce: 77.09 | step: 18.81 16%|█▌ | 6404/41250 [15:28:33<84:08:39, 8.69s/it] {'loss': 0.315, 'grad_norm': 3.033923625946045, 'learning_rate': 3.837719798364738e-05, 'epoch': 1.55} 16%|█▌ | 6404/41250 [15:28:33<84:08:39, 8.69s/it][2025-04-25 23:26:17,028] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 23:26:17,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.89 | bwd_microstep: 5741.38 | bwd_inner_microstep: 5714.57 | bwd_allreduce_microstep: 26.77 | step_microstep: 18.78 [2025-04-25 23:26:17,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.89 | bwd: 5741.40 | bwd_inner: 5714.57 | bwd_allreduce: 26.79 | step: 18.79 16%|█▌ | 6405/41250 [15:28:42<84:06:23, 8.69s/it] {'loss': 0.1567, 'grad_norm': 2.174812078475952, 'learning_rate': 3.837657830188614e-05, 'epoch': 1.55} 16%|█▌ | 6405/41250 [15:28:42<84:06:23, 8.69s/it][2025-04-25 23:26:25,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:26:25,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.96 | bwd_microstep: 5781.74 | bwd_inner_microstep: 5663.08 | bwd_allreduce_microstep: 118.61 | step_microstep: 18.30 [2025-04-25 23:26:25,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.96 | bwd: 5781.75 | bwd_inner: 5663.08 | bwd_allreduce: 118.63 | step: 18.30 16%|█▌ | 6406/41250 [15:28:51<84:08:14, 8.69s/it] {'loss': 0.0588, 'grad_norm': 1.5345295667648315, 'learning_rate': 3.8375958506836914e-05, 'epoch': 1.55} 16%|█▌ | 6406/41250 [15:28:51<84:08:14, 8.69s/it][2025-04-25 23:26:34,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:26:34,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.08 | bwd_microstep: 5794.13 | bwd_inner_microstep: 5781.39 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.77 [2025-04-25 23:26:34,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.08 | bwd: 5794.14 | bwd_inner: 5781.39 | bwd_allreduce: 12.71 | step: 18.77 16%|█▌ | 6407/41250 
[15:28:59<84:20:50, 8.71s/it] {'loss': 0.1584, 'grad_norm': 2.236931324005127, 'learning_rate': 3.8375338598503515e-05, 'epoch': 1.55} 16%|█▌ | 6407/41250 [15:28:59<84:20:50, 8.71s/it][2025-04-25 23:26:43,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-25 23:26:43,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.46 | bwd_microstep: 5704.35 | bwd_inner_microstep: 5651.49 | bwd_allreduce_microstep: 52.81 | step_microstep: 19.10 [2025-04-25 23:26:43,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.46 | bwd: 5704.36 | bwd_inner: 5651.49 | bwd_allreduce: 52.83 | step: 19.10 16%|█▌ | 6408/41250 [15:29:08<84:03:11, 8.68s/it] {'loss': 0.2041, 'grad_norm': 3.319100856781006, 'learning_rate': 3.837471857688978e-05, 'epoch': 1.55} 16%|█▌ | 6408/41250 [15:29:08<84:03:11, 8.68s/it][2025-04-25 23:26:51,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:26:51,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.18 | bwd_microstep: 5849.85 | bwd_inner_microstep: 5685.90 | bwd_allreduce_microstep: 163.91 | step_microstep: 18.44 [2025-04-25 23:26:51,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.18 | bwd: 5849.86 | bwd_inner: 5685.90 | bwd_allreduce: 163.92 | step: 18.44 16%|█▌ | 6409/41250 [15:29:17<84:20:16, 8.71s/it] {'loss': 0.233, 'grad_norm': 6.563120365142822, 'learning_rate': 3.83740984419995e-05, 'epoch': 1.55} 16%|█▌ | 6409/41250 [15:29:17<84:20:16, 8.71s/it][2025-04-25 23:27:00,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:27:00,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.96 | bwd_microstep: 5697.32 | bwd_inner_microstep: 5657.23 | 
bwd_allreduce_microstep: 40.05 | step_microstep: 18.55 [2025-04-25 23:27:00,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.96 | bwd: 5697.33 | bwd_inner: 5657.23 | bwd_allreduce: 40.06 | step: 18.56 16%|█▌ | 6410/41250 [15:29:25<84:01:13, 8.68s/it] {'loss': 0.203, 'grad_norm': 1.9401817321777344, 'learning_rate': 3.837347819383653e-05, 'epoch': 1.55} 16%|█▌ | 6410/41250 [15:29:25<84:01:13, 8.68s/it][2025-04-25 23:27:09,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:27:09,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.91 | bwd_microstep: 5702.13 | bwd_inner_microstep: 5645.22 | bwd_allreduce_microstep: 56.86 | step_microstep: 18.70 [2025-04-25 23:27:09,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.91 | bwd: 5702.14 | bwd_inner: 5645.22 | bwd_allreduce: 56.88 | step: 18.71 16%|█▌ | 6411/41250 [15:29:34<83:48:54, 8.66s/it] {'loss': 0.2174, 'grad_norm': 2.8672077655792236, 'learning_rate': 3.837285783240468e-05, 'epoch': 1.55} 16%|█▌ | 6411/41250 [15:29:34<83:48:54, 8.66s/it][2025-04-25 23:27:17,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 23:27:17,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.92 | bwd_microstep: 5779.21 | bwd_inner_microstep: 5702.67 | bwd_allreduce_microstep: 76.48 | step_microstep: 19.13 [2025-04-25 23:27:17,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.92 | bwd: 5779.22 | bwd_inner: 5702.67 | bwd_allreduce: 76.51 | step: 19.13 16%|█▌ | 6412/41250 [15:29:43<83:57:07, 8.68s/it] {'loss': 0.1219, 'grad_norm': 1.4335182905197144, 'learning_rate': 3.8372237357707766e-05, 'epoch': 1.55} 16%|█▌ | 6412/41250 [15:29:43<83:57:07, 8.68s/it][2025-04-25 23:27:26,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-25 23:27:26,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.05 | bwd_microstep: 5716.72 | bwd_inner_microstep: 5703.95 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.62 [2025-04-25 23:27:26,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.05 | bwd: 5716.74 | bwd_inner: 5703.95 | bwd_allreduce: 12.74 | step: 18.62 16%|█▌ | 6413/41250 [15:29:51<83:53:10, 8.67s/it] {'loss': 0.1453, 'grad_norm': 5.373349189758301, 'learning_rate': 3.8371616769749635e-05, 'epoch': 1.55} 16%|█▌ | 6413/41250 [15:29:51<83:53:10, 8.67s/it][2025-04-25 23:27:35,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 23:27:35,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.62 | bwd_microstep: 5692.00 | bwd_inner_microstep: 5642.57 | bwd_allreduce_microstep: 49.38 | step_microstep: 19.10 [2025-04-25 23:27:35,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.62 | bwd: 5692.01 | bwd_inner: 5642.57 | bwd_allreduce: 49.40 | step: 19.10 16%|█▌ | 6414/41250 [15:30:00<83:43:19, 8.65s/it] {'loss': 0.2311, 'grad_norm': 1.9397656917572021, 'learning_rate': 3.837099606853409e-05, 'epoch': 1.55} 16%|█▌ | 6414/41250 [15:30:00<83:43:19, 8.65s/it][2025-04-25 23:27:43,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:27:43,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.68 | bwd_microstep: 5766.17 | bwd_inner_microstep: 5633.11 | bwd_allreduce_microstep: 133.02 | step_microstep: 18.39 [2025-04-25 23:27:43,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.68 | bwd: 5766.19 | bwd_inner: 5633.11 | bwd_allreduce: 133.04 | step: 18.39 16%|█▌ | 6415/41250 [15:30:09<83:46:20, 8.66s/it] 
{'loss': 0.1318, 'grad_norm': 1.7532000541687012, 'learning_rate': 3.837037525406497e-05, 'epoch': 1.56} 16%|█▌ | 6415/41250 [15:30:09<83:46:20, 8.66s/it][2025-04-25 23:27:52,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.08 | optimizer_step: 1.04 [2025-04-25 23:27:52,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.07 | bwd_microstep: 5780.38 | bwd_inner_microstep: 5648.56 | bwd_allreduce_microstep: 131.77 | step_microstep: 19.61 [2025-04-25 23:27:52,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.07 | bwd: 5780.40 | bwd_inner: 5648.56 | bwd_allreduce: 131.79 | step: 19.61 16%|█▌ | 6416/41250 [15:30:17<83:51:55, 8.67s/it] {'loss': 0.1273, 'grad_norm': 1.3529399633407593, 'learning_rate': 3.83697543263461e-05, 'epoch': 1.56} 16%|█▌ | 6416/41250 [15:30:17<83:51:55, 8.67s/it][2025-04-25 23:28:01,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:28:01,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.36 | bwd_microstep: 5848.20 | bwd_inner_microstep: 5649.55 | bwd_allreduce_microstep: 198.61 | step_microstep: 18.43 [2025-04-25 23:28:01,219] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.36 | bwd: 5848.22 | bwd_inner: 5649.55 | bwd_allreduce: 198.62 | step: 18.43 16%|█▌ | 6417/41250 [15:30:26<84:09:49, 8.70s/it] {'loss': 0.1191, 'grad_norm': 1.3118690252304077, 'learning_rate': 3.836913328538131e-05, 'epoch': 1.56} 16%|█▌ | 6417/41250 [15:30:26<84:09:49, 8.70s/it][2025-04-25 23:28:10,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 23:28:10,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2930.45 | bwd_microstep: 5876.01 | bwd_inner_microstep: 5863.00 | bwd_allreduce_microstep: 12.97 | 
step_microstep: 19.00 [2025-04-25 23:28:10,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2930.45 | bwd: 5876.03 | bwd_inner: 5863.00 | bwd_allreduce: 12.99 | step: 19.00 16%|█▌ | 6418/41250 [15:30:35<84:43:12, 8.76s/it] {'loss': 0.2123, 'grad_norm': 2.53794002532959, 'learning_rate': 3.836851213117443e-05, 'epoch': 1.56} 16%|█▌ | 6418/41250 [15:30:35<84:43:12, 8.76s/it][2025-04-25 23:28:18,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:28:18,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.04 | bwd_microstep: 5749.58 | bwd_inner_microstep: 5643.69 | bwd_allreduce_microstep: 105.84 | step_microstep: 18.43 [2025-04-25 23:28:18,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.04 | bwd: 5749.59 | bwd_inner: 5643.69 | bwd_allreduce: 105.86 | step: 18.44 16%|█▌ | 6419/41250 [15:30:44<84:26:18, 8.73s/it] {'loss': 0.1109, 'grad_norm': 1.634657859802246, 'learning_rate': 3.836789086372928e-05, 'epoch': 1.56} 16%|█▌ | 6419/41250 [15:30:44<84:26:18, 8.73s/it][2025-04-25 23:28:27,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 1.13 [2025-04-25 23:28:27,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.87 | bwd_microstep: 5704.54 | bwd_inner_microstep: 5691.94 | bwd_allreduce_microstep: 12.54 | step_microstep: 19.39 [2025-04-25 23:28:27,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.87 | bwd: 5704.55 | bwd_inner: 5691.94 | bwd_allreduce: 12.56 | step: 19.39 16%|█▌ | 6420/41250 [15:30:52<84:09:56, 8.70s/it] {'loss': 0.1211, 'grad_norm': 1.7086992263793945, 'learning_rate': 3.83672694830497e-05, 'epoch': 1.56} 16%|█▌ | 6420/41250 [15:30:52<84:09:56, 8.70s/it][2025-04-25 23:28:36,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | 
optimizer_gradients: 1.19 | optimizer_step: 0.96 [2025-04-25 23:28:36,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.51 | bwd_microstep: 5690.70 | bwd_inner_microstep: 5677.35 | bwd_allreduce_microstep: 13.29 | step_microstep: 19.57 [2025-04-25 23:28:36,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.51 | bwd: 5690.71 | bwd_inner: 5677.35 | bwd_allreduce: 13.32 | step: 19.57 16%|█▌ | 6421/41250 [15:31:01<83:55:00, 8.67s/it] {'loss': 0.1649, 'grad_norm': 2.66927170753479, 'learning_rate': 3.836664798913951e-05, 'epoch': 1.56} 16%|█▌ | 6421/41250 [15:31:01<83:55:00, 8.67s/it][2025-04-25 23:28:44,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:28:44,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2812.14 | bwd_microstep: 5767.76 | bwd_inner_microstep: 5623.46 | bwd_allreduce_microstep: 144.26 | step_microstep: 18.58 [2025-04-25 23:28:44,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2812.14 | bwd: 5767.78 | bwd_inner: 5623.46 | bwd_allreduce: 144.28 | step: 18.59 16%|█▌ | 6422/41250 [15:31:10<83:53:11, 8.67s/it] {'loss': 0.1314, 'grad_norm': 1.4072176218032837, 'learning_rate': 3.836602638200255e-05, 'epoch': 1.56} 16%|█▌ | 6422/41250 [15:31:10<83:53:11, 8.67s/it][2025-04-25 23:28:53,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:28:53,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.59 | bwd_microstep: 5661.53 | bwd_inner_microstep: 5641.37 | bwd_allreduce_microstep: 20.11 | step_microstep: 18.51 [2025-04-25 23:28:53,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.59 | bwd: 5661.54 | bwd_inner: 5641.37 | bwd_allreduce: 20.13 | step: 18.51 16%|█▌ | 6423/41250 [15:31:18<83:35:28, 8.64s/it] {'loss': 0.0668, 'grad_norm': 
0.6427146792411804, 'learning_rate': 3.836540466164265e-05, 'epoch': 1.56} 16%|█▌ | 6423/41250 [15:31:18<83:35:28, 8.64s/it][2025-04-25 23:29:01,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:29:01,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.62 | bwd_microstep: 5744.76 | bwd_inner_microstep: 5631.26 | bwd_allreduce_microstep: 113.46 | step_microstep: 18.60 [2025-04-25 23:29:01,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.62 | bwd: 5744.78 | bwd_inner: 5631.26 | bwd_allreduce: 113.47 | step: 18.60 16%|█▌ | 6424/41250 [15:31:27<83:35:29, 8.64s/it] {'loss': 0.1406, 'grad_norm': 1.5662806034088135, 'learning_rate': 3.836478282806364e-05, 'epoch': 1.56} 16%|█▌ | 6424/41250 [15:31:27<83:35:29, 8.64s/it][2025-04-25 23:29:10,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:29:10,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.62 | bwd_microstep: 5681.47 | bwd_inner_microstep: 5666.66 | bwd_allreduce_microstep: 14.76 | step_microstep: 18.16 [2025-04-25 23:29:10,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.62 | bwd: 5681.49 | bwd_inner: 5666.66 | bwd_allreduce: 14.78 | step: 18.16 16%|█▌ | 6425/41250 [15:31:35<83:29:07, 8.63s/it] {'loss': 0.11, 'grad_norm': 5.398685455322266, 'learning_rate': 3.836416088126936e-05, 'epoch': 1.56} 16%|█▌ | 6425/41250 [15:31:35<83:29:07, 8.63s/it][2025-04-25 23:29:19,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 1.07 [2025-04-25 23:29:19,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.29 | bwd_microstep: 5758.65 | bwd_inner_microstep: 5644.10 | bwd_allreduce_microstep: 114.50 | step_microstep: 19.42 [2025-04-25 
23:29:19,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.29 | bwd: 5758.67 | bwd_inner: 5644.10 | bwd_allreduce: 114.52 | step: 19.43 16%|█▌ | 6426/41250 [15:31:44<83:34:45, 8.64s/it] {'loss': 0.2367, 'grad_norm': 1.8415865898132324, 'learning_rate': 3.836353882126364e-05, 'epoch': 1.56} 16%|█▌ | 6426/41250 [15:31:44<83:34:45, 8.64s/it][2025-04-25 23:29:27,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 23:29:27,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.09 | bwd_microstep: 5767.44 | bwd_inner_microstep: 5679.52 | bwd_allreduce_microstep: 87.87 | step_microstep: 19.06 [2025-04-25 23:29:27,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.09 | bwd: 5767.45 | bwd_inner: 5679.52 | bwd_allreduce: 87.88 | step: 19.06 16%|█▌ | 6427/41250 [15:31:53<83:44:29, 8.66s/it] {'loss': 0.261, 'grad_norm': 1.4756821393966675, 'learning_rate': 3.8362916648050314e-05, 'epoch': 1.56} 16%|█▌ | 6427/41250 [15:31:53<83:44:29, 8.66s/it][2025-04-25 23:29:36,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:29:36,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.09 | bwd_microstep: 5746.80 | bwd_inner_microstep: 5687.01 | bwd_allreduce_microstep: 59.74 | step_microstep: 18.69 [2025-04-25 23:29:36,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.09 | bwd: 5746.82 | bwd_inner: 5687.01 | bwd_allreduce: 59.76 | step: 18.69 16%|█▌ | 6428/41250 [15:32:01<83:47:54, 8.66s/it] {'loss': 0.0899, 'grad_norm': 1.1788970232009888, 'learning_rate': 3.836229436163322e-05, 'epoch': 1.56} 16%|█▌ | 6428/41250 [15:32:01<83:47:54, 8.66s/it][2025-04-25 23:29:45,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 
0.89 [2025-04-25 23:29:45,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2813.10 | bwd_microstep: 5773.40 | bwd_inner_microstep: 5630.85 | bwd_allreduce_microstep: 142.50 | step_microstep: 18.75 [2025-04-25 23:29:45,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2813.10 | bwd: 5773.41 | bwd_inner: 5630.85 | bwd_allreduce: 142.52 | step: 18.75 16%|█▌ | 6429/41250 [15:32:10<83:49:16, 8.67s/it] {'loss': 0.2916, 'grad_norm': 3.1990723609924316, 'learning_rate': 3.8361671962016196e-05, 'epoch': 1.56} 16%|█▌ | 6429/41250 [15:32:10<83:49:16, 8.67s/it][2025-04-25 23:29:53,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 23:29:53,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.20 | bwd_microstep: 5713.60 | bwd_inner_microstep: 5642.43 | bwd_allreduce_microstep: 71.13 | step_microstep: 18.59 [2025-04-25 23:29:53,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.20 | bwd: 5713.61 | bwd_inner: 5642.43 | bwd_allreduce: 71.14 | step: 18.59 16%|█▌ | 6430/41250 [15:32:19<83:40:45, 8.65s/it] {'loss': 0.0947, 'grad_norm': 1.03264319896698, 'learning_rate': 3.836104944920307e-05, 'epoch': 1.56} 16%|█▌ | 6430/41250 [15:32:19<83:40:45, 8.65s/it][2025-04-25 23:30:02,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 23:30:02,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.66 | bwd_microstep: 5689.65 | bwd_inner_microstep: 5654.18 | bwd_allreduce_microstep: 35.43 | step_microstep: 18.79 [2025-04-25 23:30:02,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.66 | bwd: 5689.66 | bwd_inner: 5654.18 | bwd_allreduce: 35.44 | step: 18.79 16%|█▌ | 6431/41250 [15:32:27<83:30:46, 8.63s/it] {'loss': 0.0405, 'grad_norm': 0.9460753202438354, 'learning_rate': 
3.836042682319769e-05, 'epoch': 1.56} 16%|█▌ | 6431/41250 [15:32:27<83:30:46, 8.63s/it][2025-04-25 23:30:11,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.96 | optimizer_step: 1.08 [2025-04-25 23:30:11,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.24 | bwd_microstep: 5735.84 | bwd_inner_microstep: 5697.55 | bwd_allreduce_microstep: 38.25 | step_microstep: 18.75 [2025-04-25 23:30:11,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.24 | bwd: 5735.85 | bwd_inner: 5697.55 | bwd_allreduce: 38.26 | step: 18.75 16%|█▌ | 6432/41250 [15:32:36<83:36:32, 8.64s/it] {'loss': 0.0543, 'grad_norm': 0.9420209527015686, 'learning_rate': 3.8359804084003885e-05, 'epoch': 1.56} 16%|█▌ | 6432/41250 [15:32:36<83:36:32, 8.64s/it][2025-04-25 23:30:19,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.93 [2025-04-25 23:30:19,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.24 | bwd_microstep: 5704.40 | bwd_inner_microstep: 5659.74 | bwd_allreduce_microstep: 44.61 | step_microstep: 18.68 [2025-04-25 23:30:19,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.24 | bwd: 5704.42 | bwd_inner: 5659.74 | bwd_allreduce: 44.63 | step: 18.68 16%|█▌ | 6433/41250 [15:32:45<83:32:47, 8.64s/it] {'loss': 0.1441, 'grad_norm': 3.5703020095825195, 'learning_rate': 3.8359181231625495e-05, 'epoch': 1.56} 16%|█▌ | 6433/41250 [15:32:45<83:32:47, 8.64s/it][2025-04-25 23:30:28,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 23:30:28,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.02 | bwd_microstep: 5759.50 | bwd_inner_microstep: 5653.43 | bwd_allreduce_microstep: 106.02 | step_microstep: 18.80 [2025-04-25 23:30:28,383] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.02 | bwd: 5759.52 | bwd_inner: 5653.43 | bwd_allreduce: 106.04 | step: 18.81 16%|█▌ | 6434/41250 [15:32:53<83:37:58, 8.65s/it] {'loss': 0.1265, 'grad_norm': 1.7166829109191895, 'learning_rate': 3.835855826606637e-05, 'epoch': 1.56} 16%|█▌ | 6434/41250 [15:32:53<83:37:58, 8.65s/it][2025-04-25 23:30:37,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 23:30:37,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.88 | bwd_microstep: 5776.58 | bwd_inner_microstep: 5651.97 | bwd_allreduce_microstep: 124.56 | step_microstep: 18.73 [2025-04-25 23:30:37,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.88 | bwd: 5776.59 | bwd_inner: 5651.97 | bwd_allreduce: 124.58 | step: 18.74 16%|█▌ | 6435/41250 [15:33:02<83:44:09, 8.66s/it] {'loss': 0.2233, 'grad_norm': 3.1787025928497314, 'learning_rate': 3.8357935187330345e-05, 'epoch': 1.56} 16%|█▌ | 6435/41250 [15:33:02<83:44:09, 8.66s/it][2025-04-25 23:30:45,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:30:45,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.20 | bwd_microstep: 5729.70 | bwd_inner_microstep: 5652.98 | bwd_allreduce_microstep: 76.67 | step_microstep: 19.03 [2025-04-25 23:30:45,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.20 | bwd: 5729.71 | bwd_inner: 5652.98 | bwd_allreduce: 76.69 | step: 19.03 16%|█▌ | 6436/41250 [15:33:11<83:40:32, 8.65s/it] {'loss': 0.1675, 'grad_norm': 4.309459209442139, 'learning_rate': 3.8357311995421264e-05, 'epoch': 1.56} 16%|█▌ | 6436/41250 [15:33:11<83:40:32, 8.65s/it][2025-04-25 23:30:54,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.04 | optimizer_step: 1.10 [2025-04-25 
23:30:54,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.81 | bwd_microstep: 5747.20 | bwd_inner_microstep: 5693.06 | bwd_allreduce_microstep: 54.09 | step_microstep: 19.52 [2025-04-25 23:30:54,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.81 | bwd: 5747.21 | bwd_inner: 5693.06 | bwd_allreduce: 54.11 | step: 19.52 16%|█▌ | 6437/41250 [15:33:19<83:46:03, 8.66s/it] {'loss': 0.0444, 'grad_norm': 0.6888256669044495, 'learning_rate': 3.835668869034296e-05, 'epoch': 1.56} 16%|█▌ | 6437/41250 [15:33:19<83:46:03, 8.66s/it][2025-04-25 23:31:03,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-25 23:31:03,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.61 | bwd_microstep: 5745.70 | bwd_inner_microstep: 5701.38 | bwd_allreduce_microstep: 44.27 | step_microstep: 19.43 [2025-04-25 23:31:03,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.61 | bwd: 5745.72 | bwd_inner: 5701.38 | bwd_allreduce: 44.29 | step: 19.43 16%|█▌ | 6438/41250 [15:33:28<83:48:54, 8.67s/it] {'loss': 0.178, 'grad_norm': 2.5696096420288086, 'learning_rate': 3.835606527209928e-05, 'epoch': 1.56} 16%|█▌ | 6438/41250 [15:33:28<83:48:54, 8.67s/it][2025-04-25 23:31:11,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:31:11,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.42 | bwd_microstep: 5771.21 | bwd_inner_microstep: 5693.33 | bwd_allreduce_microstep: 77.83 | step_microstep: 18.82 [2025-04-25 23:31:11,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.42 | bwd: 5771.23 | bwd_inner: 5693.33 | bwd_allreduce: 77.85 | step: 18.83 16%|█▌ | 6439/41250 [15:33:37<83:55:04, 8.68s/it] {'loss': 0.2511, 'grad_norm': 2.5942022800445557, 'learning_rate': 3.835544174069407e-05, 
'epoch': 1.56} 16%|█▌ | 6439/41250 [15:33:37<83:55:04, 8.68s/it][2025-04-25 23:31:20,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 23:31:20,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.96 | bwd_microstep: 5766.99 | bwd_inner_microstep: 5651.95 | bwd_allreduce_microstep: 115.00 | step_microstep: 18.88 [2025-04-25 23:31:20,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.96 | bwd: 5767.00 | bwd_inner: 5651.95 | bwd_allreduce: 115.01 | step: 18.88 16%|█▌ | 6440/41250 [15:33:45<83:54:44, 8.68s/it] {'loss': 0.1952, 'grad_norm': 1.965164065361023, 'learning_rate': 3.835481809613117e-05, 'epoch': 1.56} 16%|█▌ | 6440/41250 [15:33:45<83:54:44, 8.68s/it][2025-04-25 23:31:29,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:31:29,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.62 | bwd_microstep: 5722.90 | bwd_inner_microstep: 5657.49 | bwd_allreduce_microstep: 65.36 | step_microstep: 18.59 [2025-04-25 23:31:29,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.62 | bwd: 5722.91 | bwd_inner: 5657.49 | bwd_allreduce: 65.38 | step: 18.59 16%|█▌ | 6441/41250 [15:33:54<83:47:05, 8.67s/it] {'loss': 0.2532, 'grad_norm': 4.52247953414917, 'learning_rate': 3.835419433841443e-05, 'epoch': 1.56} 16%|█▌ | 6441/41250 [15:33:54<83:47:05, 8.67s/it][2025-04-25 23:31:37,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 23:31:37,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.71 | bwd_microstep: 5715.19 | bwd_inner_microstep: 5702.31 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.36 [2025-04-25 23:31:37,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2855.71 | bwd: 5715.20 | bwd_inner: 5702.31 | bwd_allreduce: 12.85 | step: 19.36 16%|█▌ | 6442/41250 [15:34:03<83:45:44, 8.66s/it] {'loss': 0.0461, 'grad_norm': 1.430730938911438, 'learning_rate': 3.835357046754768e-05, 'epoch': 1.56} 16%|█▌ | 6442/41250 [15:34:03<83:45:44, 8.66s/it][2025-04-25 23:31:46,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.12 | optimizer_step: 1.01 [2025-04-25 23:31:46,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.90 | bwd_microstep: 5713.66 | bwd_inner_microstep: 5700.24 | bwd_allreduce_microstep: 13.38 | step_microstep: 19.33 [2025-04-25 23:31:46,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.90 | bwd: 5713.67 | bwd_inner: 5700.23 | bwd_allreduce: 13.40 | step: 19.34 16%|█▌ | 6443/41250 [15:34:11<83:43:25, 8.66s/it] {'loss': 0.1694, 'grad_norm': 1.183205008506775, 'learning_rate': 3.83529464835348e-05, 'epoch': 1.56} 16%|█▌ | 6443/41250 [15:34:11<83:43:25, 8.66s/it][2025-04-25 23:31:55,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-25 23:31:55,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.22 | bwd_microstep: 5785.62 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 137.12 | step_microstep: 18.91 [2025-04-25 23:31:55,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.22 | bwd: 5785.63 | bwd_inner: 5648.45 | bwd_allreduce: 137.14 | step: 18.91 16%|█▌ | 6444/41250 [15:34:20<83:49:07, 8.67s/it] {'loss': 0.0645, 'grad_norm': 1.8186874389648438, 'learning_rate': 3.83523223863796e-05, 'epoch': 1.56} 16%|█▌ | 6444/41250 [15:34:20<83:49:07, 8.67s/it][2025-04-25 23:32:03,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:32:03,719] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2829.82 | bwd_microstep: 5717.07 | bwd_inner_microstep: 5652.53 | bwd_allreduce_microstep: 64.49 | step_microstep: 18.82 [2025-04-25 23:32:03,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.82 | bwd: 5717.08 | bwd_inner: 5652.53 | bwd_allreduce: 64.51 | step: 18.82 16%|█▌ | 6445/41250 [15:34:29<83:42:11, 8.66s/it] {'loss': 0.4479, 'grad_norm': 9.998440742492676, 'learning_rate': 3.835169817608594e-05, 'epoch': 1.56} 16%|█▌ | 6445/41250 [15:34:29<83:42:11, 8.66s/it][2025-04-25 23:32:12,341] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 23:32:12,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.47 | bwd_microstep: 5705.22 | bwd_inner_microstep: 5637.20 | bwd_allreduce_microstep: 67.97 | step_microstep: 19.04 [2025-04-25 23:32:12,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.47 | bwd: 5705.23 | bwd_inner: 5637.20 | bwd_allreduce: 67.98 | step: 19.04 16%|█▌ | 6446/41250 [15:34:37<83:36:02, 8.65s/it] {'loss': 0.0455, 'grad_norm': 0.6092248558998108, 'learning_rate': 3.835107385265768e-05, 'epoch': 1.56} 16%|█▌ | 6446/41250 [15:34:37<83:36:02, 8.65s/it][2025-04-25 23:32:21,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:32:21,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.35 | bwd_microstep: 5785.98 | bwd_inner_microstep: 5648.71 | bwd_allreduce_microstep: 137.22 | step_microstep: 18.60 [2025-04-25 23:32:21,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.35 | bwd: 5786.00 | bwd_inner: 5648.71 | bwd_allreduce: 137.24 | step: 18.60 16%|█▌ | 6447/41250 [15:34:46<83:45:00, 8.66s/it] {'loss': 0.1225, 'grad_norm': 4.48621129989624, 'learning_rate': 3.835044941609865e-05, 'epoch': 1.56} 16%|█▌ | 6447/41250 [15:34:46<83:45:00, 
8.66s/it][2025-04-25 23:32:29,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 23:32:29,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.60 | bwd_microstep: 5783.24 | bwd_inner_microstep: 5656.99 | bwd_allreduce_microstep: 126.20 | step_microstep: 18.81 [2025-04-25 23:32:29,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.60 | bwd: 5783.25 | bwd_inner: 5656.99 | bwd_allreduce: 126.22 | step: 18.82 16%|█▌ | 6448/41250 [15:34:55<83:50:26, 8.67s/it] {'loss': 0.2236, 'grad_norm': 1.4648945331573486, 'learning_rate': 3.834982486641272e-05, 'epoch': 1.56} 16%|█▌ | 6448/41250 [15:34:55<83:50:26, 8.67s/it][2025-04-25 23:32:38,424] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-25 23:32:38,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.78 | bwd_microstep: 5776.37 | bwd_inner_microstep: 5654.36 | bwd_allreduce_microstep: 121.96 | step_microstep: 18.81 [2025-04-25 23:32:38,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.78 | bwd: 5776.39 | bwd_inner: 5654.36 | bwd_allreduce: 121.99 | step: 18.81 16%|█▌ | 6449/41250 [15:35:03<83:52:53, 8.68s/it] {'loss': 0.0827, 'grad_norm': 1.7769064903259277, 'learning_rate': 3.834920020360372e-05, 'epoch': 1.56} 16%|█▌ | 6449/41250 [15:35:03<83:52:53, 8.68s/it][2025-04-25 23:32:47,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:32:47,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.53 | bwd_microstep: 5722.32 | bwd_inner_microstep: 5636.82 | bwd_allreduce_microstep: 85.45 | step_microstep: 18.55 [2025-04-25 23:32:47,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.53 | bwd: 5722.34 | bwd_inner: 
5636.82 | bwd_allreduce: 85.47 | step: 18.56 16%|█▌ | 6450/41250 [15:35:12<83:45:57, 8.67s/it] {'loss': 0.0232, 'grad_norm': 0.284763365983963, 'learning_rate': 3.834857542767551e-05, 'epoch': 1.56} 16%|█▌ | 6450/41250 [15:35:12<83:45:57, 8.67s/it][2025-04-25 23:32:56,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:32:56,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2950.58 | bwd_microstep: 5894.92 | bwd_inner_microstep: 5882.15 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.75 [2025-04-25 23:32:56,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2950.58 | bwd: 5894.94 | bwd_inner: 5882.15 | bwd_allreduce: 12.74 | step: 18.75 16%|█▌ | 6451/41250 [15:35:21<84:33:29, 8.75s/it] {'loss': 0.0583, 'grad_norm': 0.7552205920219421, 'learning_rate': 3.834795053863195e-05, 'epoch': 1.56} 16%|█▌ | 6451/41250 [15:35:21<84:33:29, 8.75s/it][2025-04-25 23:33:04,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.04 | optimizer_step: 0.99 [2025-04-25 23:33:04,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.26 | bwd_microstep: 5704.33 | bwd_inner_microstep: 5690.89 | bwd_allreduce_microstep: 13.39 | step_microstep: 19.40 [2025-04-25 23:33:04,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.26 | bwd: 5704.34 | bwd_inner: 5690.89 | bwd_allreduce: 13.41 | step: 19.40 16%|█▌ | 6452/41250 [15:35:29<84:14:31, 8.72s/it] {'loss': 0.1264, 'grad_norm': 1.262980341911316, 'learning_rate': 3.834732553647687e-05, 'epoch': 1.56} 16%|█▌ | 6452/41250 [15:35:29<84:14:31, 8.72s/it][2025-04-25 23:33:13,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 23:33:13,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.89 | 
bwd_microstep: 5776.61 | bwd_inner_microstep: 5650.55 | bwd_allreduce_microstep: 126.01 | step_microstep: 18.63 [2025-04-25 23:33:13,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.89 | bwd: 5776.62 | bwd_inner: 5650.55 | bwd_allreduce: 126.03 | step: 18.64 16%|█▌ | 6453/41250 [15:35:38<84:09:32, 8.71s/it] {'loss': 0.0655, 'grad_norm': 1.6514068841934204, 'learning_rate': 3.8346700421214147e-05, 'epoch': 1.56} 16%|█▌ | 6453/41250 [15:35:38<84:09:32, 8.71s/it][2025-04-25 23:33:22,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:33:22,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.35 | bwd_microstep: 5778.54 | bwd_inner_microstep: 5649.09 | bwd_allreduce_microstep: 129.40 | step_microstep: 18.51 [2025-04-25 23:33:22,017] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.36 | bwd: 5778.55 | bwd_inner: 5649.09 | bwd_allreduce: 129.42 | step: 18.52 16%|█▌ | 6454/41250 [15:35:47<84:05:59, 8.70s/it] {'loss': 0.0932, 'grad_norm': 1.6076817512512207, 'learning_rate': 3.8346075192847623e-05, 'epoch': 1.56} 16%|█▌ | 6454/41250 [15:35:47<84:05:59, 8.70s/it][2025-04-25 23:33:30,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.00 [2025-04-25 23:33:30,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.20 | bwd_microstep: 5762.92 | bwd_inner_microstep: 5706.69 | bwd_allreduce_microstep: 56.18 | step_microstep: 18.39 [2025-04-25 23:33:30,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.20 | bwd: 5762.93 | bwd_inner: 5706.69 | bwd_allreduce: 56.20 | step: 18.39 16%|█▌ | 6455/41250 [15:35:56<84:04:45, 8.70s/it] {'loss': 0.0602, 'grad_norm': 1.73088538646698, 'learning_rate': 3.834544985138115e-05, 'epoch': 1.56} 16%|█▌ | 6455/41250 [15:35:56<84:04:45, 8.70s/it][2025-04-25 23:33:39,382] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:33:39,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.40 | bwd_microstep: 5764.01 | bwd_inner_microstep: 5638.21 | bwd_allreduce_microstep: 125.75 | step_microstep: 18.43 [2025-04-25 23:33:39,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.40 | bwd: 5764.02 | bwd_inner: 5638.21 | bwd_allreduce: 125.77 | step: 18.44 16%|█▌ | 6456/41250 [15:36:04<83:59:42, 8.69s/it] {'loss': 0.2443, 'grad_norm': 1.6093748807907104, 'learning_rate': 3.834482439681859e-05, 'epoch': 1.57} 16%|█▌ | 6456/41250 [15:36:04<83:59:42, 8.69s/it][2025-04-25 23:33:48,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:33:48,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.79 | bwd_microstep: 5754.16 | bwd_inner_microstep: 5652.23 | bwd_allreduce_microstep: 101.88 | step_microstep: 18.42 [2025-04-25 23:33:48,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.79 | bwd: 5754.17 | bwd_inner: 5652.23 | bwd_allreduce: 101.90 | step: 18.42 16%|█▌ | 6457/41250 [15:36:13<83:54:52, 8.68s/it] {'loss': 0.1026, 'grad_norm': 2.2397894859313965, 'learning_rate': 3.83441988291638e-05, 'epoch': 1.57} 16%|█▌ | 6457/41250 [15:36:13<83:54:52, 8.68s/it][2025-04-25 23:33:56,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:33:56,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.07 | bwd_microstep: 5688.06 | bwd_inner_microstep: 5647.60 | bwd_allreduce_microstep: 40.42 | step_microstep: 18.70 [2025-04-25 23:33:56,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.07 | bwd: 5688.07 | bwd_inner: 5647.60 | bwd_allreduce: 40.43 | step: 
18.70 16%|█▌ | 6458/41250 [15:36:21<83:39:59, 8.66s/it] {'loss': 0.1071, 'grad_norm': 2.352433204650879, 'learning_rate': 3.8343573148420626e-05, 'epoch': 1.57} 16%|█▌ | 6458/41250 [15:36:21<83:39:59, 8.66s/it][2025-04-25 23:34:05,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:34:05,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.73 | bwd_microstep: 5683.51 | bwd_inner_microstep: 5659.88 | bwd_allreduce_microstep: 23.58 | step_microstep: 18.87 [2025-04-25 23:34:05,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.73 | bwd: 5683.52 | bwd_inner: 5659.88 | bwd_allreduce: 23.60 | step: 18.87 16%|█▌ | 6459/41250 [15:36:30<83:29:04, 8.64s/it] {'loss': 0.2929, 'grad_norm': 3.325901508331299, 'learning_rate': 3.8342947354592925e-05, 'epoch': 1.57} 16%|█▌ | 6459/41250 [15:36:30<83:29:04, 8.64s/it][2025-04-25 23:34:13,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 23:34:13,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.85 | bwd_microstep: 5691.87 | bwd_inner_microstep: 5637.72 | bwd_allreduce_microstep: 54.11 | step_microstep: 19.39 [2025-04-25 23:34:13,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.85 | bwd: 5691.89 | bwd_inner: 5637.72 | bwd_allreduce: 54.12 | step: 19.40 16%|█▌ | 6460/41250 [15:36:39<83:23:10, 8.63s/it] {'loss': 0.0246, 'grad_norm': 0.45593973994255066, 'learning_rate': 3.834232144768457e-05, 'epoch': 1.57} 16%|█▌ | 6460/41250 [15:36:39<83:23:10, 8.63s/it][2025-04-25 23:34:22,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:34:22,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.10 | bwd_microstep: 5893.65 | 
bwd_inner_microstep: 5644.40 | bwd_allreduce_microstep: 249.20 | step_microstep: 18.80 [2025-04-25 23:34:22,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.10 | bwd: 5893.66 | bwd_inner: 5644.40 | bwd_allreduce: 249.22 | step: 18.81 16%|█▌ | 6461/41250 [15:36:47<83:52:52, 8.68s/it] {'loss': 0.1552, 'grad_norm': 2.607750654220581, 'learning_rate': 3.834169542769941e-05, 'epoch': 1.57} 16%|█▌ | 6461/41250 [15:36:47<83:52:52, 8.68s/it][2025-04-25 23:34:31,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-25 23:34:31,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.27 | bwd_microstep: 5874.11 | bwd_inner_microstep: 5665.60 | bwd_allreduce_microstep: 208.46 | step_microstep: 19.21 [2025-04-25 23:34:31,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.27 | bwd: 5874.12 | bwd_inner: 5665.60 | bwd_allreduce: 208.48 | step: 19.21 16%|█▌ | 6462/41250 [15:36:56<84:14:06, 8.72s/it] {'loss': 0.0658, 'grad_norm': 3.7407913208007812, 'learning_rate': 3.834106929464131e-05, 'epoch': 1.57} 16%|█▌ | 6462/41250 [15:36:56<84:14:06, 8.72s/it][2025-04-25 23:34:40,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.18 | optimizer_step: 0.90 [2025-04-25 23:34:40,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.78 | bwd_microstep: 5698.91 | bwd_inner_microstep: 5636.33 | bwd_allreduce_microstep: 62.54 | step_microstep: 19.27 [2025-04-25 23:34:40,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.78 | bwd: 5698.93 | bwd_inner: 5636.33 | bwd_allreduce: 62.55 | step: 19.27 16%|█▌ | 6463/41250 [15:37:05<83:54:15, 8.68s/it] {'loss': 0.2012, 'grad_norm': 1.6177374124526978, 'learning_rate': 3.8340443048514124e-05, 'epoch': 1.57} 16%|█▌ | 6463/41250 [15:37:05<83:54:15, 8.68s/it][2025-04-25 23:34:48,763] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-25 23:34:48,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.09 | bwd_microstep: 5755.62 | bwd_inner_microstep: 5742.81 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.93 [2025-04-25 23:34:48,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.09 | bwd: 5755.63 | bwd_inner: 5742.81 | bwd_allreduce: 12.78 | step: 18.93 16%|█▌ | 6464/41250 [15:37:14<83:59:23, 8.69s/it] {'loss': 0.1112, 'grad_norm': 1.4033679962158203, 'learning_rate': 3.8339816689321706e-05, 'epoch': 1.57} 16%|█▌ | 6464/41250 [15:37:14<83:59:23, 8.69s/it][2025-04-25 23:34:57,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.32 | optimizer_step: 1.08 [2025-04-25 23:34:57,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.74 | bwd_microstep: 5771.31 | bwd_inner_microstep: 5758.00 | bwd_allreduce_microstep: 13.24 | step_microstep: 20.38 [2025-04-25 23:34:57,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.74 | bwd: 5771.32 | bwd_inner: 5758.00 | bwd_allreduce: 13.27 | step: 20.39 16%|█▌ | 6465/41250 [15:37:22<84:07:52, 8.71s/it] {'loss': 0.122, 'grad_norm': 1.4551782608032227, 'learning_rate': 3.833919021706793e-05, 'epoch': 1.57} 16%|█▌ | 6465/41250 [15:37:22<84:07:52, 8.71s/it][2025-04-25 23:35:06,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-25 23:35:06,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.67 | bwd_microstep: 5741.00 | bwd_inner_microstep: 5642.15 | bwd_allreduce_microstep: 98.79 | step_microstep: 18.67 [2025-04-25 23:35:06,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.67 | bwd: 5741.01 | bwd_inner: 5642.15 | bwd_allreduce: 98.81 | step: 18.67 16%|█▌ 
| 6466/41250 [15:37:31<83:57:54, 8.69s/it] {'loss': 0.0835, 'grad_norm': 2.646437644958496, 'learning_rate': 3.8338563631756655e-05, 'epoch': 1.57} 16%|█▌ | 6466/41250 [15:37:31<83:57:54, 8.69s/it][2025-04-25 23:35:14,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:35:14,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.19 | bwd_microstep: 5702.31 | bwd_inner_microstep: 5643.69 | bwd_allreduce_microstep: 58.58 | step_microstep: 18.59 [2025-04-25 23:35:14,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.19 | bwd: 5702.33 | bwd_inner: 5643.69 | bwd_allreduce: 58.59 | step: 18.59 16%|█▌ | 6467/41250 [15:37:40<83:43:03, 8.66s/it] {'loss': 0.2738, 'grad_norm': 12.091011047363281, 'learning_rate': 3.833793693339175e-05, 'epoch': 1.57} 16%|█▌ | 6467/41250 [15:37:40<83:43:03, 8.66s/it][2025-04-25 23:35:23,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.05 | optimizer_step: 1.23 [2025-04-25 23:35:23,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.70 | bwd_microstep: 5681.78 | bwd_inner_microstep: 5669.00 | bwd_allreduce_microstep: 12.73 | step_microstep: 19.57 [2025-04-25 23:35:23,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.70 | bwd: 5681.79 | bwd_inner: 5669.00 | bwd_allreduce: 12.75 | step: 19.57 16%|█▌ | 6468/41250 [15:37:48<83:34:44, 8.65s/it] {'loss': 0.205, 'grad_norm': 1.6103428602218628, 'learning_rate': 3.833731012197706e-05, 'epoch': 1.57} 16%|█▌ | 6468/41250 [15:37:48<83:34:44, 8.65s/it][2025-04-25 23:35:32,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 23:35:32,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.20 | bwd_microstep: 5895.27 | bwd_inner_microstep: 5669.31 | 
bwd_allreduce_microstep: 225.92 | step_microstep: 18.46 [2025-04-25 23:35:32,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.20 | bwd: 5895.28 | bwd_inner: 5669.31 | bwd_allreduce: 225.94 | step: 18.46 16%|█▌ | 6469/41250 [15:37:57<84:05:01, 8.70s/it] {'loss': 0.2557, 'grad_norm': 2.159867286682129, 'learning_rate': 3.833668319751646e-05, 'epoch': 1.57} 16%|█▌ | 6469/41250 [15:37:57<84:05:01, 8.70s/it][2025-04-25 23:35:40,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-25 23:35:40,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.53 | bwd_microstep: 5725.42 | bwd_inner_microstep: 5692.39 | bwd_allreduce_microstep: 32.97 | step_microstep: 19.06 [2025-04-25 23:35:40,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.53 | bwd: 5725.43 | bwd_inner: 5692.39 | bwd_allreduce: 32.99 | step: 19.06 16%|█▌ | 6470/41250 [15:38:06<83:57:27, 8.69s/it] {'loss': 0.1963, 'grad_norm': 1.671687126159668, 'learning_rate': 3.8336056160013814e-05, 'epoch': 1.57} 16%|█▌ | 6470/41250 [15:38:06<83:57:27, 8.69s/it][2025-04-25 23:35:49,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.04 | optimizer_step: 0.99 [2025-04-25 23:35:49,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.46 | bwd_microstep: 5707.37 | bwd_inner_microstep: 5694.41 | bwd_allreduce_microstep: 12.91 | step_microstep: 19.53 [2025-04-25 23:35:49,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.46 | bwd: 5707.38 | bwd_inner: 5694.41 | bwd_allreduce: 12.93 | step: 19.53 16%|█▌ | 6471/41250 [15:38:14<83:48:58, 8.68s/it] {'loss': 0.0773, 'grad_norm': 2.1984591484069824, 'learning_rate': 3.833542900947299e-05, 'epoch': 1.57} 16%|█▌ | 6471/41250 [15:38:14<83:48:58, 8.68s/it][2025-04-25 23:35:58,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-25 23:35:58,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.64 | bwd_microstep: 5718.72 | bwd_inner_microstep: 5706.07 | bwd_allreduce_microstep: 12.60 | step_microstep: 19.18 [2025-04-25 23:35:58,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.64 | bwd: 5718.73 | bwd_inner: 5706.07 | bwd_allreduce: 12.62 | step: 19.19 16%|█▌ | 6472/41250 [15:38:23<83:45:28, 8.67s/it] {'loss': 0.1777, 'grad_norm': 1.7932586669921875, 'learning_rate': 3.833480174589785e-05, 'epoch': 1.57} 16%|█▌ | 6472/41250 [15:38:23<83:45:28, 8.67s/it][2025-04-25 23:36:06,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:36:06,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.84 | bwd_microstep: 5694.28 | bwd_inner_microstep: 5681.43 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.23 [2025-04-25 23:36:06,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.84 | bwd: 5694.29 | bwd_inner: 5681.43 | bwd_allreduce: 12.82 | step: 19.23 16%|█▌ | 6473/41250 [15:38:32<83:37:05, 8.66s/it] {'loss': 0.0565, 'grad_norm': 1.3387833833694458, 'learning_rate': 3.833417436929227e-05, 'epoch': 1.57} 16%|█▌ | 6473/41250 [15:38:32<83:37:05, 8.66s/it][2025-04-25 23:36:15,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-25 23:36:15,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.88 | bwd_microstep: 5723.49 | bwd_inner_microstep: 5710.52 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.72 [2025-04-25 23:36:15,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.88 | bwd: 5723.50 | bwd_inner: 5710.52 | bwd_allreduce: 12.94 | step: 18.72 16%|█▌ | 6474/41250 [15:38:40<83:37:12, 8.66s/it] 
{'loss': 0.0275, 'grad_norm': 0.5260657668113708, 'learning_rate': 3.833354687966011e-05, 'epoch': 1.57} 16%|█▌ | 6474/41250 [15:38:40<83:37:12, 8.66s/it][2025-04-25 23:36:24,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:36:24,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.40 | bwd_microstep: 5768.94 | bwd_inner_microstep: 5686.32 | bwd_allreduce_microstep: 82.57 | step_microstep: 18.34 [2025-04-25 23:36:24,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.40 | bwd: 5768.96 | bwd_inner: 5686.32 | bwd_allreduce: 82.58 | step: 18.34 16%|█▌ | 6475/41250 [15:38:49<83:44:03, 8.67s/it] {'loss': 0.1014, 'grad_norm': 1.5252244472503662, 'learning_rate': 3.833291927700523e-05, 'epoch': 1.57} 16%|█▌ | 6475/41250 [15:38:49<83:44:03, 8.67s/it][2025-04-25 23:36:32,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.25 | optimizer_step: 1.00 [2025-04-25 23:36:32,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.08 | bwd_microstep: 5758.71 | bwd_inner_microstep: 5696.27 | bwd_allreduce_microstep: 62.39 | step_microstep: 19.57 [2025-04-25 23:36:32,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.08 | bwd: 5758.72 | bwd_inner: 5696.27 | bwd_allreduce: 62.41 | step: 19.57 16%|█▌ | 6476/41250 [15:38:58<83:47:49, 8.68s/it] {'loss': 0.2198, 'grad_norm': 2.4450085163116455, 'learning_rate': 3.833229156133152e-05, 'epoch': 1.57} 16%|█▌ | 6476/41250 [15:38:58<83:47:49, 8.68s/it][2025-04-25 23:36:41,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-25 23:36:41,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.56 | bwd_microstep: 5776.19 | bwd_inner_microstep: 5690.29 | bwd_allreduce_microstep: 85.85 | 
step_microstep: 19.10 [2025-04-25 23:36:41,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.56 | bwd: 5776.21 | bwd_inner: 5690.30 | bwd_allreduce: 85.87 | step: 19.10 16%|█▌ | 6477/41250 [15:39:06<83:54:12, 8.69s/it] {'loss': 0.1481, 'grad_norm': 2.0350778102874756, 'learning_rate': 3.833166373264283e-05, 'epoch': 1.57} 16%|█▌ | 6477/41250 [15:39:06<83:54:12, 8.69s/it][2025-04-25 23:36:50,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-25 23:36:50,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.54 | bwd_microstep: 5779.47 | bwd_inner_microstep: 5654.03 | bwd_allreduce_microstep: 125.40 | step_microstep: 18.33 [2025-04-25 23:36:50,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.54 | bwd: 5779.49 | bwd_inner: 5654.03 | bwd_allreduce: 125.42 | step: 18.33 16%|█▌ | 6478/41250 [15:39:15<83:55:21, 8.69s/it] {'loss': 0.1684, 'grad_norm': 1.4832831621170044, 'learning_rate': 3.833103579094305e-05, 'epoch': 1.57} 16%|█▌ | 6478/41250 [15:39:15<83:55:21, 8.69s/it][2025-04-25 23:36:58,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:36:58,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.73 | bwd_microstep: 5762.12 | bwd_inner_microstep: 5663.67 | bwd_allreduce_microstep: 98.40 | step_microstep: 18.44 [2025-04-25 23:36:58,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.73 | bwd: 5762.13 | bwd_inner: 5663.67 | bwd_allreduce: 98.42 | step: 18.44 16%|█▌ | 6479/41250 [15:39:24<83:54:35, 8.69s/it] {'loss': 0.1062, 'grad_norm': 1.6123765707015991, 'learning_rate': 3.833040773623603e-05, 'epoch': 1.57} 16%|█▌ | 6479/41250 [15:39:24<83:54:35, 8.69s/it][2025-04-25 23:37:07,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | 
optimizer_gradients: 0.97 | optimizer_step: 1.02 [2025-04-25 23:37:07,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.17 | bwd_microstep: 5716.14 | bwd_inner_microstep: 5652.24 | bwd_allreduce_microstep: 63.85 | step_microstep: 18.54 [2025-04-25 23:37:07,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.17 | bwd: 5716.15 | bwd_inner: 5652.24 | bwd_allreduce: 63.87 | step: 18.54 16%|█▌ | 6480/41250 [15:39:32<83:43:48, 8.67s/it] {'loss': 0.1381, 'grad_norm': 2.6612236499786377, 'learning_rate': 3.832977956852566e-05, 'epoch': 1.57} 16%|█▌ | 6480/41250 [15:39:32<83:43:48, 8.67s/it][2025-04-25 23:37:16,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-25 23:37:16,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.83 | bwd_microstep: 5749.82 | bwd_inner_microstep: 5654.08 | bwd_allreduce_microstep: 95.70 | step_microstep: 19.03 [2025-04-25 23:37:16,227] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.83 | bwd: 5749.84 | bwd_inner: 5654.08 | bwd_allreduce: 95.72 | step: 19.03 16%|█▌ | 6481/41250 [15:39:41<83:44:53, 8.67s/it] {'loss': 0.1399, 'grad_norm': 3.7715234756469727, 'learning_rate': 3.83291512878158e-05, 'epoch': 1.57} 16%|█▌ | 6481/41250 [15:39:41<83:44:53, 8.67s/it][2025-04-25 23:37:24,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:37:24,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.52 | bwd_microstep: 5768.73 | bwd_inner_microstep: 5701.06 | bwd_allreduce_microstep: 67.63 | step_microstep: 18.15 [2025-04-25 23:37:24,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.52 | bwd: 5768.75 | bwd_inner: 5701.06 | bwd_allreduce: 67.65 | step: 18.15 16%|█▌ | 6482/41250 [15:39:50<83:49:40, 8.68s/it] {'loss': 0.086, 'grad_norm': 
1.1478928327560425, 'learning_rate': 3.8328522894110324e-05, 'epoch': 1.57} 16%|█▌ | 6482/41250 [15:39:50<83:49:40, 8.68s/it][2025-04-25 23:37:33,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:37:33,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.02 | bwd_microstep: 5705.39 | bwd_inner_microstep: 5690.07 | bwd_allreduce_microstep: 15.27 | step_microstep: 18.57 [2025-04-25 23:37:33,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.02 | bwd: 5705.40 | bwd_inner: 5690.07 | bwd_allreduce: 15.29 | step: 18.57 16%|█▌ | 6483/41250 [15:39:58<83:41:48, 8.67s/it] {'loss': 0.0887, 'grad_norm': 1.0749666690826416, 'learning_rate': 3.832789438741311e-05, 'epoch': 1.57} 16%|█▌ | 6483/41250 [15:39:58<83:41:48, 8.67s/it][2025-04-25 23:37:42,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:37:42,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.45 | bwd_microstep: 5782.07 | bwd_inner_microstep: 5658.53 | bwd_allreduce_microstep: 123.50 | step_microstep: 18.86 [2025-04-25 23:37:42,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.45 | bwd: 5782.08 | bwd_inner: 5658.53 | bwd_allreduce: 123.51 | step: 18.86 16%|█▌ | 6484/41250 [15:40:07<83:46:31, 8.67s/it] {'loss': 0.1848, 'grad_norm': 1.7725355625152588, 'learning_rate': 3.832726576772803e-05, 'epoch': 1.57} 16%|█▌ | 6484/41250 [15:40:07<83:46:31, 8.67s/it][2025-04-25 23:37:50,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 23:37:50,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.49 | bwd_microstep: 5783.10 | bwd_inner_microstep: 5654.13 | bwd_allreduce_microstep: 128.92 | step_microstep: 18.87 [2025-04-25 
23:37:50,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.49 | bwd: 5783.11 | bwd_inner: 5654.13 | bwd_allreduce: 128.94 | step: 18.87 16%|█▌ | 6485/41250 [15:40:16<83:50:29, 8.68s/it] {'loss': 0.1435, 'grad_norm': 2.2905261516571045, 'learning_rate': 3.8326637035058966e-05, 'epoch': 1.57} 16%|█▌ | 6485/41250 [15:40:16<83:50:29, 8.68s/it][2025-04-25 23:37:59,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:37:59,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.21 | bwd_microstep: 5773.72 | bwd_inner_microstep: 5653.70 | bwd_allreduce_microstep: 119.98 | step_microstep: 18.37 [2025-04-25 23:37:59,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.21 | bwd: 5773.73 | bwd_inner: 5653.70 | bwd_allreduce: 120.00 | step: 18.38 16%|█▌ | 6486/41250 [15:40:24<83:51:17, 8.68s/it] {'loss': 0.0302, 'grad_norm': 0.3211205303668976, 'learning_rate': 3.832600818940979e-05, 'epoch': 1.57} 16%|█▌ | 6486/41250 [15:40:24<83:51:17, 8.68s/it][2025-04-25 23:38:08,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:38:08,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.96 | bwd_microstep: 5714.16 | bwd_inner_microstep: 5663.80 | bwd_allreduce_microstep: 50.31 | step_microstep: 18.49 [2025-04-25 23:38:08,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.96 | bwd: 5714.17 | bwd_inner: 5663.80 | bwd_allreduce: 50.33 | step: 18.49 16%|█▌ | 6487/41250 [15:40:33<83:42:18, 8.67s/it] {'loss': 0.2254, 'grad_norm': 3.394179105758667, 'learning_rate': 3.832537923078437e-05, 'epoch': 1.57} 16%|█▌ | 6487/41250 [15:40:33<83:42:18, 8.67s/it][2025-04-25 23:38:16,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 
0.99 [2025-04-25 23:38:16,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.58 | bwd_microstep: 5751.90 | bwd_inner_microstep: 5687.73 | bwd_allreduce_microstep: 64.12 | step_microstep: 19.01 [2025-04-25 23:38:16,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.58 | bwd: 5751.91 | bwd_inner: 5687.73 | bwd_allreduce: 64.14 | step: 19.02 16%|█▌ | 6488/41250 [15:40:42<83:46:09, 8.68s/it] {'loss': 0.1076, 'grad_norm': 1.2411681413650513, 'learning_rate': 3.83247501591866e-05, 'epoch': 1.57} 16%|█▌ | 6488/41250 [15:40:42<83:46:09, 8.68s/it][2025-04-25 23:38:25,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 23:38:25,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.94 | bwd_microstep: 5701.74 | bwd_inner_microstep: 5688.87 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.02 [2025-04-25 23:38:25,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.94 | bwd: 5701.75 | bwd_inner: 5688.87 | bwd_allreduce: 12.85 | step: 19.02 16%|█▌ | 6489/41250 [15:40:50<83:40:32, 8.67s/it] {'loss': 0.4069, 'grad_norm': 2.3714382648468018, 'learning_rate': 3.832412097462035e-05, 'epoch': 1.57} 16%|█▌ | 6489/41250 [15:40:50<83:40:32, 8.67s/it][2025-04-25 23:38:34,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:38:34,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.26 | bwd_microstep: 5773.48 | bwd_inner_microstep: 5658.02 | bwd_allreduce_microstep: 115.41 | step_microstep: 18.62 [2025-04-25 23:38:34,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.26 | bwd: 5773.50 | bwd_inner: 5658.02 | bwd_allreduce: 115.43 | step: 18.62 16%|█▌ | 6490/41250 [15:40:59<83:45:15, 8.67s/it] {'loss': 0.0814, 'grad_norm': 0.654781699180603, 'learning_rate': 
3.83234916770895e-05, 'epoch': 1.57} 16%|█▌ | 6490/41250 [15:40:59<83:45:15, 8.67s/it][2025-04-25 23:38:42,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-25 23:38:42,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.50 | bwd_microstep: 5709.82 | bwd_inner_microstep: 5696.96 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.09 [2025-04-25 23:38:42,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.50 | bwd: 5709.83 | bwd_inner: 5696.96 | bwd_allreduce: 12.83 | step: 19.09 16%|█▌ | 6491/41250 [15:41:08<83:39:11, 8.66s/it] {'loss': 0.1783, 'grad_norm': 2.9367690086364746, 'learning_rate': 3.8322862266597916e-05, 'epoch': 1.57} 16%|█▌ | 6491/41250 [15:41:08<83:39:11, 8.66s/it][2025-04-25 23:38:51,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:38:51,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.64 | bwd_microstep: 5707.32 | bwd_inner_microstep: 5645.89 | bwd_allreduce_microstep: 61.38 | step_microstep: 18.42 [2025-04-25 23:38:51,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.64 | bwd: 5707.33 | bwd_inner: 5645.89 | bwd_allreduce: 61.40 | step: 18.42 16%|█▌ | 6492/41250 [15:41:16<83:31:10, 8.65s/it] {'loss': 0.0761, 'grad_norm': 1.3143948316574097, 'learning_rate': 3.83222327431495e-05, 'epoch': 1.57} 16%|█▌ | 6492/41250 [15:41:16<83:31:10, 8.65s/it][2025-04-25 23:39:00,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 23:39:00,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.89 | bwd_microstep: 5718.57 | bwd_inner_microstep: 5704.81 | bwd_allreduce_microstep: 13.71 | step_microstep: 19.06 [2025-04-25 23:39:00,226] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.89 | bwd: 5718.58 | bwd_inner: 5704.81 | bwd_allreduce: 13.73 | step: 19.06 16%|█▌ | 6493/41250 [15:41:25<83:33:21, 8.65s/it] {'loss': 0.0698, 'grad_norm': 0.9367244243621826, 'learning_rate': 3.832160310674812e-05, 'epoch': 1.57} 16%|█▌ | 6493/41250 [15:41:25<83:33:21, 8.65s/it][2025-04-25 23:39:08,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:39:08,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.99 | bwd_microstep: 5723.39 | bwd_inner_microstep: 5640.31 | bwd_allreduce_microstep: 83.03 | step_microstep: 18.40 [2025-04-25 23:39:08,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.99 | bwd: 5723.40 | bwd_inner: 5640.31 | bwd_allreduce: 83.05 | step: 18.41 16%|█▌ | 6494/41250 [15:41:34<83:28:30, 8.65s/it] {'loss': 0.0219, 'grad_norm': 0.7791122794151306, 'learning_rate': 3.832097335739766e-05, 'epoch': 1.57} 16%|█▌ | 6494/41250 [15:41:34<83:28:30, 8.65s/it][2025-04-25 23:39:17,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:39:17,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.25 | bwd_microstep: 5687.97 | bwd_inner_microstep: 5645.94 | bwd_allreduce_microstep: 41.99 | step_microstep: 18.60 [2025-04-25 23:39:17,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.25 | bwd: 5687.99 | bwd_inner: 5645.94 | bwd_allreduce: 42.00 | step: 18.60 16%|█▌ | 6495/41250 [15:41:42<83:19:42, 8.63s/it] {'loss': 0.085, 'grad_norm': 1.9460126161575317, 'learning_rate': 3.8320343495102004e-05, 'epoch': 1.57} 16%|█▌ | 6495/41250 [15:41:42<83:19:42, 8.63s/it][2025-04-25 23:39:26,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 
23:39:26,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.58 | bwd_microstep: 5726.50 | bwd_inner_microstep: 5707.46 | bwd_allreduce_microstep: 19.00 | step_microstep: 18.21 [2025-04-25 23:39:26,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.58 | bwd: 5726.52 | bwd_inner: 5707.46 | bwd_allreduce: 19.02 | step: 18.21 16%|█▌ | 6496/41250 [15:41:51<83:25:17, 8.64s/it] {'loss': 0.2199, 'grad_norm': 1.9210036993026733, 'learning_rate': 3.8319713519865035e-05, 'epoch': 1.57} 16%|█▌ | 6496/41250 [15:41:51<83:25:17, 8.64s/it][2025-04-25 23:39:34,738] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.04 | optimizer_step: 0.95 [2025-04-25 23:39:34,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.35 | bwd_microstep: 5699.48 | bwd_inner_microstep: 5686.69 | bwd_allreduce_microstep: 12.74 | step_microstep: 19.17 [2025-04-25 23:39:34,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.35 | bwd: 5699.50 | bwd_inner: 5686.69 | bwd_allreduce: 12.76 | step: 19.17 16%|█▌ | 6497/41250 [15:42:00<83:22:28, 8.64s/it] {'loss': 0.0556, 'grad_norm': 0.9591057896614075, 'learning_rate': 3.831908343169064e-05, 'epoch': 1.58} 16%|█▌ | 6497/41250 [15:42:00<83:22:28, 8.64s/it][2025-04-25 23:39:43,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:39:43,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.62 | bwd_microstep: 5728.51 | bwd_inner_microstep: 5701.22 | bwd_allreduce_microstep: 27.25 | step_microstep: 18.29 [2025-04-25 23:39:43,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.62 | bwd: 5728.52 | bwd_inner: 5701.22 | bwd_allreduce: 27.27 | step: 18.29 16%|█▌ | 6498/41250 [15:42:08<83:26:42, 8.64s/it] {'loss': 0.3252, 'grad_norm': 3.0559399127960205, 'learning_rate': 3.831845323058269e-05, 
'epoch': 1.58} 16%|█▌ | 6498/41250 [15:42:08<83:26:42, 8.64s/it][2025-04-25 23:39:52,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-25 23:39:52,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.93 | bwd_microstep: 5718.92 | bwd_inner_microstep: 5644.94 | bwd_allreduce_microstep: 73.93 | step_microstep: 18.59 [2025-04-25 23:39:52,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.93 | bwd: 5718.93 | bwd_inner: 5644.94 | bwd_allreduce: 73.95 | step: 18.59 16%|█▌ | 6499/41250 [15:42:17<83:23:20, 8.64s/it] {'loss': 0.1894, 'grad_norm': 3.4009459018707275, 'learning_rate': 3.831782291654507e-05, 'epoch': 1.58} 16%|█▌ | 6499/41250 [15:42:17<83:23:20, 8.64s/it][2025-04-25 23:40:00,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 1.03 [2025-04-25 23:40:00,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.07 | bwd_microstep: 5746.86 | bwd_inner_microstep: 5647.74 | bwd_allreduce_microstep: 99.08 | step_microstep: 18.51 [2025-04-25 23:40:00,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.08 | bwd: 5746.88 | bwd_inner: 5647.74 | bwd_allreduce: 99.10 | step: 18.51 16%|█▌ | 6500/41250 [15:42:26<83:25:34, 8.64s/it] {'loss': 0.0542, 'grad_norm': 0.9626426696777344, 'learning_rate': 3.8317192489581694e-05, 'epoch': 1.58} 16%|█▌ | 6500/41250 [15:42:26<83:25:34, 8.64s/it][2025-04-25 23:40:09,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:40:09,431] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.04 | bwd_microstep: 5783.63 | bwd_inner_microstep: 5770.83 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.66 [2025-04-25 23:40:09,431] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2884.04 | bwd: 5783.65 | bwd_inner: 5770.83 | bwd_allreduce: 12.78 | step: 18.67 16%|█▌ | 6501/41250 [15:42:34<83:44:07, 8.67s/it] {'loss': 0.1084, 'grad_norm': 1.349880576133728, 'learning_rate': 3.831656194969642e-05, 'epoch': 1.58} 16%|█▌ | 6501/41250 [15:42:34<83:44:07, 8.67s/it][2025-04-25 23:40:18,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:40:18,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.13 | bwd_microstep: 5714.01 | bwd_inner_microstep: 5695.76 | bwd_allreduce_microstep: 18.21 | step_microstep: 18.70 [2025-04-25 23:40:18,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.13 | bwd: 5714.02 | bwd_inner: 5695.75 | bwd_allreduce: 18.23 | step: 18.70 16%|█▌ | 6502/41250 [15:42:43<83:38:45, 8.67s/it] {'loss': 0.1613, 'grad_norm': 2.8333473205566406, 'learning_rate': 3.831593129689315e-05, 'epoch': 1.58} 16%|█▌ | 6502/41250 [15:42:43<83:38:45, 8.67s/it][2025-04-25 23:40:26,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:40:26,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.06 | bwd_microstep: 5756.28 | bwd_inner_microstep: 5648.53 | bwd_allreduce_microstep: 107.71 | step_microstep: 18.41 [2025-04-25 23:40:26,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.06 | bwd: 5756.29 | bwd_inner: 5648.53 | bwd_allreduce: 107.72 | step: 18.41 16%|█▌ | 6503/41250 [15:42:52<83:39:20, 8.67s/it] {'loss': 0.167, 'grad_norm': 3.102912425994873, 'learning_rate': 3.831530053117576e-05, 'epoch': 1.58} 16%|█▌ | 6503/41250 [15:42:52<83:39:20, 8.67s/it][2025-04-25 23:40:35,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:40:35,368] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2843.89 | bwd_microstep: 5697.24 | bwd_inner_microstep: 5684.66 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.68 [2025-04-25 23:40:35,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.89 | bwd: 5697.25 | bwd_inner: 5684.66 | bwd_allreduce: 12.54 | step: 18.68 16%|█▌ | 6504/41250 [15:43:00<83:31:51, 8.65s/it] {'loss': 0.2158, 'grad_norm': 4.602999210357666, 'learning_rate': 3.831466965254814e-05, 'epoch': 1.58} 16%|█▌ | 6504/41250 [15:43:00<83:31:51, 8.65s/it][2025-04-25 23:40:43,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-25 23:40:43,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.53 | bwd_microstep: 5699.35 | bwd_inner_microstep: 5643.42 | bwd_allreduce_microstep: 55.88 | step_microstep: 18.74 [2025-04-25 23:40:43,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.53 | bwd: 5699.36 | bwd_inner: 5643.42 | bwd_allreduce: 55.90 | step: 18.74 16%|█▌ | 6505/41250 [15:43:09<83:23:20, 8.64s/it] {'loss': 0.1612, 'grad_norm': 4.4208574295043945, 'learning_rate': 3.8314038661014195e-05, 'epoch': 1.58} 16%|█▌ | 6505/41250 [15:43:09<83:23:20, 8.64s/it][2025-04-25 23:40:52,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-25 23:40:52,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.75 | bwd_microstep: 5714.42 | bwd_inner_microstep: 5648.59 | bwd_allreduce_microstep: 65.78 | step_microstep: 18.79 [2025-04-25 23:40:52,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.75 | bwd: 5714.43 | bwd_inner: 5648.59 | bwd_allreduce: 65.79 | step: 18.79 16%|█▌ | 6506/41250 [15:43:17<83:20:44, 8.64s/it] {'loss': 0.0589, 'grad_norm': 1.122819423675537, 'learning_rate': 3.8313407556577805e-05, 'epoch': 1.58} 16%|█▌ | 6506/41250 
[15:43:17<83:20:44, 8.64s/it][2025-04-25 23:41:01,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-25 23:41:01,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.01 | bwd_microstep: 5710.21 | bwd_inner_microstep: 5697.65 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.67 [2025-04-25 23:41:01,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.01 | bwd: 5710.23 | bwd_inner: 5697.65 | bwd_allreduce: 12.53 | step: 18.68 16%|█▌ | 6507/41250 [15:43:26<83:23:54, 8.64s/it] {'loss': 0.1608, 'grad_norm': 2.291126251220703, 'learning_rate': 3.831277633924285e-05, 'epoch': 1.58} 16%|█▌ | 6507/41250 [15:43:26<83:23:54, 8.64s/it][2025-04-25 23:41:09,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-25 23:41:09,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.22 | bwd_microstep: 5681.72 | bwd_inner_microstep: 5657.59 | bwd_allreduce_microstep: 24.08 | step_microstep: 18.96 [2025-04-25 23:41:09,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.22 | bwd: 5681.73 | bwd_inner: 5657.59 | bwd_allreduce: 24.10 | step: 18.96 16%|█▌ | 6508/41250 [15:43:35<83:19:59, 8.64s/it] {'loss': 0.2586, 'grad_norm': 2.165029764175415, 'learning_rate': 3.8312145009013235e-05, 'epoch': 1.58} 16%|█▌ | 6508/41250 [15:43:35<83:19:59, 8.64s/it][2025-04-25 23:41:18,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 1.06 [2025-04-25 23:41:18,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.38 | bwd_microstep: 5768.30 | bwd_inner_microstep: 5642.04 | bwd_allreduce_microstep: 126.22 | step_microstep: 18.73 [2025-04-25 23:41:18,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.38 | bwd: 5768.32 | 
bwd_inner: 5642.04 | bwd_allreduce: 126.24 | step: 18.73 16%|█▌ | 6509/41250 [15:43:43<83:27:44, 8.65s/it] {'loss': 0.0179, 'grad_norm': 0.3823154866695404, 'learning_rate': 3.831151356589284e-05, 'epoch': 1.58} 16%|█▌ | 6509/41250 [15:43:43<83:27:44, 8.65s/it][2025-04-25 23:41:27,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-25 23:41:27,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.48 | bwd_microstep: 5711.51 | bwd_inner_microstep: 5698.87 | bwd_allreduce_microstep: 12.59 | step_microstep: 19.13 [2025-04-25 23:41:27,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.48 | bwd: 5711.53 | bwd_inner: 5698.87 | bwd_allreduce: 12.61 | step: 19.13 16%|█▌ | 6510/41250 [15:43:52<83:28:39, 8.65s/it] {'loss': 0.0894, 'grad_norm': 1.1152276992797852, 'learning_rate': 3.8310882009885576e-05, 'epoch': 1.58} 16%|█▌ | 6510/41250 [15:43:52<83:28:39, 8.65s/it][2025-04-25 23:41:35,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 23:41:35,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.26 | bwd_microstep: 5753.34 | bwd_inner_microstep: 5686.67 | bwd_allreduce_microstep: 66.63 | step_microstep: 18.73 [2025-04-25 23:41:35,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.26 | bwd: 5753.35 | bwd_inner: 5686.67 | bwd_allreduce: 66.65 | step: 18.73 16%|█▌ | 6511/41250 [15:44:01<83:36:07, 8.66s/it] {'loss': 0.0643, 'grad_norm': 1.0973607301712036, 'learning_rate': 3.831025034099532e-05, 'epoch': 1.58} 16%|█▌ | 6511/41250 [15:44:01<83:36:07, 8.66s/it][2025-04-25 23:41:44,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-25 23:41:44,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2845.77 | bwd_microstep: 5768.01 | bwd_inner_microstep: 5675.29 | bwd_allreduce_microstep: 92.67 | step_microstep: 18.52 [2025-04-25 23:41:44,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.77 | bwd: 5768.02 | bwd_inner: 5675.29 | bwd_allreduce: 92.69 | step: 18.52 16%|█▌ | 6512/41250 [15:44:09<83:41:55, 8.67s/it] {'loss': 0.2379, 'grad_norm': 3.8625757694244385, 'learning_rate': 3.830961855922598e-05, 'epoch': 1.58} 16%|█▌ | 6512/41250 [15:44:09<83:41:55, 8.67s/it][2025-04-25 23:41:53,409] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:41:53,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.16 | bwd_microstep: 5876.91 | bwd_inner_microstep: 5692.95 | bwd_allreduce_microstep: 183.92 | step_microstep: 18.90 [2025-04-25 23:41:53,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.16 | bwd: 5876.93 | bwd_inner: 5692.95 | bwd_allreduce: 183.94 | step: 18.90 16%|█▌ | 6513/41250 [15:44:18<84:04:24, 8.71s/it] {'loss': 0.0645, 'grad_norm': 0.6381708979606628, 'learning_rate': 3.830898666458144e-05, 'epoch': 1.58} 16%|█▌ | 6513/41250 [15:44:18<84:04:24, 8.71s/it][2025-04-25 23:42:02,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:42:02,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.36 | bwd_microstep: 5837.39 | bwd_inner_microstep: 5694.24 | bwd_allreduce_microstep: 143.09 | step_microstep: 18.59 [2025-04-25 23:42:02,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.37 | bwd: 5837.41 | bwd_inner: 5694.24 | bwd_allreduce: 143.11 | step: 18.59 16%|█▌ | 6514/41250 [15:44:27<84:14:33, 8.73s/it] {'loss': 0.2241, 'grad_norm': 2.73996639251709, 'learning_rate': 3.83083546570656e-05, 'epoch': 1.58} 16%|█▌ | 6514/41250 [15:44:27<84:14:33, 8.73s/it][2025-04-25 
23:42:10,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.21 | optimizer_step: 1.01 [2025-04-25 23:42:10,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.61 | bwd_microstep: 5724.38 | bwd_inner_microstep: 5641.31 | bwd_allreduce_microstep: 83.02 | step_microstep: 19.51 [2025-04-25 23:42:10,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.61 | bwd: 5724.39 | bwd_inner: 5641.31 | bwd_allreduce: 83.04 | step: 19.52 16%|█▌ | 6515/41250 [15:44:36<83:56:04, 8.70s/it] {'loss': 0.1946, 'grad_norm': 1.723901391029358, 'learning_rate': 3.8307722536682345e-05, 'epoch': 1.58} 16%|█▌ | 6515/41250 [15:44:36<83:56:04, 8.70s/it][2025-04-25 23:42:19,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.61 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 23:42:19,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.86 | bwd_microstep: 5700.49 | bwd_inner_microstep: 5687.73 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.14 [2025-04-25 23:42:19,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.86 | bwd: 5700.51 | bwd_inner: 5687.73 | bwd_allreduce: 12.74 | step: 19.15 16%|█▌ | 6516/41250 [15:44:44<83:44:23, 8.68s/it] {'loss': 0.0461, 'grad_norm': 1.9139963388442993, 'learning_rate': 3.8307090303435585e-05, 'epoch': 1.58} 16%|█▌ | 6516/41250 [15:44:44<83:44:23, 8.68s/it][2025-04-25 23:42:28,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:42:28,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.93 | bwd_microstep: 5762.07 | bwd_inner_microstep: 5679.62 | bwd_allreduce_microstep: 82.40 | step_microstep: 18.76 [2025-04-25 23:42:28,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.93 | bwd: 5762.08 | bwd_inner: 5679.62 | bwd_allreduce: 82.42 
| step: 18.77 16%|█▌ | 6517/41250 [15:44:53<83:48:03, 8.69s/it] {'loss': 0.0885, 'grad_norm': 1.8288445472717285, 'learning_rate': 3.830645795732921e-05, 'epoch': 1.58} 16%|█▌ | 6517/41250 [15:44:53<83:48:03, 8.69s/it][2025-04-25 23:42:36,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:42:36,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.89 | bwd_microstep: 5714.87 | bwd_inner_microstep: 5660.66 | bwd_allreduce_microstep: 54.16 | step_microstep: 18.54 [2025-04-25 23:42:36,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.89 | bwd: 5714.88 | bwd_inner: 5660.66 | bwd_allreduce: 54.18 | step: 18.54 16%|█▌ | 6518/41250 [15:45:02<83:37:28, 8.67s/it] {'loss': 0.2626, 'grad_norm': 1.6486464738845825, 'learning_rate': 3.830582549836713e-05, 'epoch': 1.58} 16%|█▌ | 6518/41250 [15:45:02<83:37:28, 8.67s/it][2025-04-25 23:42:45,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:42:45,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.39 | bwd_microstep: 5705.96 | bwd_inner_microstep: 5657.49 | bwd_allreduce_microstep: 48.44 | step_microstep: 18.68 [2025-04-25 23:42:45,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.39 | bwd: 5705.98 | bwd_inner: 5657.49 | bwd_allreduce: 48.45 | step: 18.68 16%|█▌ | 6519/41250 [15:45:10<83:29:21, 8.65s/it] {'loss': 0.1256, 'grad_norm': 2.7215867042541504, 'learning_rate': 3.830519292655324e-05, 'epoch': 1.58} 16%|█▌ | 6519/41250 [15:45:10<83:29:21, 8.65s/it][2025-04-25 23:42:54,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:42:54,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.56 | bwd_microstep: 5737.30 | 
bwd_inner_microstep: 5686.22 | bwd_allreduce_microstep: 51.04 | step_microstep: 18.59 [2025-04-25 23:42:54,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.56 | bwd: 5737.32 | bwd_inner: 5686.22 | bwd_allreduce: 51.05 | step: 18.59 16%|█▌ | 6520/41250 [15:45:19<83:32:21, 8.66s/it] {'loss': 0.0225, 'grad_norm': 0.46173083782196045, 'learning_rate': 3.830456024189143e-05, 'epoch': 1.58} 16%|█▌ | 6520/41250 [15:45:19<83:32:21, 8.66s/it][2025-04-25 23:43:02,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-25 23:43:02,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.14 | bwd_microstep: 5704.05 | bwd_inner_microstep: 5662.31 | bwd_allreduce_microstep: 41.70 | step_microstep: 18.84 [2025-04-25 23:43:02,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.14 | bwd: 5704.06 | bwd_inner: 5662.31 | bwd_allreduce: 41.71 | step: 18.84 16%|█▌ | 6521/41250 [15:45:28<83:25:01, 8.65s/it] {'loss': 0.2116, 'grad_norm': 3.196582078933716, 'learning_rate': 3.830392744438561e-05, 'epoch': 1.58} 16%|█▌ | 6521/41250 [15:45:28<83:25:01, 8.65s/it][2025-04-25 23:43:11,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-25 23:43:11,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.39 | bwd_microstep: 5881.58 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 233.08 | step_microstep: 18.35 [2025-04-25 23:43:11,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.39 | bwd: 5881.59 | bwd_inner: 5648.45 | bwd_allreduce: 233.10 | step: 18.35 16%|█▌ | 6522/41250 [15:45:36<83:53:37, 8.70s/it] {'loss': 0.1185, 'grad_norm': 1.7298883199691772, 'learning_rate': 3.830329453403967e-05, 'epoch': 1.58} 16%|█▌ | 6522/41250 [15:45:36<83:53:37, 8.70s/it][2025-04-25 23:43:20,149] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:43:20,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.75 | bwd_microstep: 5720.46 | bwd_inner_microstep: 5708.06 | bwd_allreduce_microstep: 12.36 | step_microstep: 18.52 [2025-04-25 23:43:20,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.75 | bwd: 5720.47 | bwd_inner: 5708.06 | bwd_allreduce: 12.37 | step: 18.52 16%|█▌ | 6523/41250 [15:45:45<83:46:50, 8.69s/it] {'loss': 0.1046, 'grad_norm': 2.0734565258026123, 'learning_rate': 3.8302661510857526e-05, 'epoch': 1.58} 16%|█▌ | 6523/41250 [15:45:45<83:46:50, 8.69s/it][2025-04-25 23:43:28,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 1.05 [2025-04-25 23:43:28,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.76 | bwd_microstep: 5710.25 | bwd_inner_microstep: 5652.23 | bwd_allreduce_microstep: 57.98 | step_microstep: 18.48 [2025-04-25 23:43:28,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.76 | bwd: 5710.27 | bwd_inner: 5652.23 | bwd_allreduce: 58.00 | step: 18.48 16%|█▌ | 6524/41250 [15:45:54<83:37:56, 8.67s/it] {'loss': 0.064, 'grad_norm': 0.587409257888794, 'learning_rate': 3.830202837484307e-05, 'epoch': 1.58} 16%|█▌ | 6524/41250 [15:45:54<83:37:56, 8.67s/it][2025-04-25 23:43:37,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:43:37,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.73 | bwd_microstep: 5856.83 | bwd_inner_microstep: 5691.57 | bwd_allreduce_microstep: 165.22 | step_microstep: 18.49 [2025-04-25 23:43:37,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.73 | bwd: 5856.84 | bwd_inner: 5691.57 | bwd_allreduce: 165.23 | step: 18.49 16%|█▌ 
| 6525/41250 [15:46:02<84:00:24, 8.71s/it] {'loss': 0.0351, 'grad_norm': 0.4824138581752777, 'learning_rate': 3.8301395126000215e-05, 'epoch': 1.58} 16%|█▌ | 6525/41250 [15:46:02<84:00:24, 8.71s/it][2025-04-25 23:43:46,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-25 23:43:46,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.16 | bwd_microstep: 5703.23 | bwd_inner_microstep: 5664.12 | bwd_allreduce_microstep: 39.07 | step_microstep: 18.93 [2025-04-25 23:43:46,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.16 | bwd: 5703.25 | bwd_inner: 5664.11 | bwd_allreduce: 39.09 | step: 18.93 16%|█▌ | 6526/41250 [15:46:11<83:46:51, 8.69s/it] {'loss': 0.097, 'grad_norm': 1.4090828895568848, 'learning_rate': 3.830076176433285e-05, 'epoch': 1.58} 16%|█▌ | 6526/41250 [15:46:11<83:46:51, 8.69s/it][2025-04-25 23:43:54,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:43:54,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.38 | bwd_microstep: 5786.67 | bwd_inner_microstep: 5657.71 | bwd_allreduce_microstep: 128.91 | step_microstep: 18.54 [2025-04-25 23:43:54,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.38 | bwd: 5786.68 | bwd_inner: 5657.71 | bwd_allreduce: 128.93 | step: 18.54 16%|█▌ | 6527/41250 [15:46:20<83:50:33, 8.69s/it] {'loss': 0.2529, 'grad_norm': 1.7448354959487915, 'learning_rate': 3.8300128289844896e-05, 'epoch': 1.58} 16%|█▌ | 6527/41250 [15:46:20<83:50:33, 8.69s/it][2025-04-25 23:44:03,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-25 23:44:03,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.15 | bwd_microstep: 5807.11 | bwd_inner_microstep: 
5657.59 | bwd_allreduce_microstep: 149.47 | step_microstep: 19.05 [2025-04-25 23:44:03,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.15 | bwd: 5807.12 | bwd_inner: 5657.59 | bwd_allreduce: 149.49 | step: 19.05 16%|█▌ | 6528/41250 [15:46:28<83:55:59, 8.70s/it] {'loss': 0.3842, 'grad_norm': 4.340869426727295, 'learning_rate': 3.8299494702540245e-05, 'epoch': 1.58} 16%|█▌ | 6528/41250 [15:46:28<83:55:59, 8.70s/it][2025-04-25 23:44:12,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:44:12,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.26 | bwd_microstep: 5763.45 | bwd_inner_microstep: 5668.77 | bwd_allreduce_microstep: 94.63 | step_microstep: 18.38 [2025-04-25 23:44:12,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.26 | bwd: 5763.46 | bwd_inner: 5668.77 | bwd_allreduce: 94.65 | step: 18.38 16%|█▌ | 6529/41250 [15:46:37<83:52:25, 8.70s/it] {'loss': 0.2721, 'grad_norm': 5.689908504486084, 'learning_rate': 3.829886100242281e-05, 'epoch': 1.58} 16%|█▌ | 6529/41250 [15:46:37<83:52:25, 8.70s/it][2025-04-25 23:44:21,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 23:44:21,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.14 | bwd_microstep: 5780.17 | bwd_inner_microstep: 5646.68 | bwd_allreduce_microstep: 133.45 | step_microstep: 19.35 [2025-04-25 23:44:21,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.14 | bwd: 5780.19 | bwd_inner: 5646.68 | bwd_allreduce: 133.47 | step: 19.35 16%|█▌ | 6530/41250 [15:46:46<83:53:22, 8.70s/it] {'loss': 0.1831, 'grad_norm': 1.9161490201950073, 'learning_rate': 3.82982271894965e-05, 'epoch': 1.58} 16%|█▌ | 6530/41250 [15:46:46<83:53:22, 8.70s/it][2025-04-25 23:44:29,724] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:44:29,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.99 | bwd_microstep: 5749.22 | bwd_inner_microstep: 5714.73 | bwd_allreduce_microstep: 34.45 | step_microstep: 18.96 [2025-04-25 23:44:29,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.99 | bwd: 5749.23 | bwd_inner: 5714.73 | bwd_allreduce: 34.46 | step: 18.96 16%|█▌ | 6531/41250 [15:46:55<83:51:51, 8.70s/it] {'loss': 0.0857, 'grad_norm': 0.8916834592819214, 'learning_rate': 3.829759326376522e-05, 'epoch': 1.58} 16%|█▌ | 6531/41250 [15:46:55<83:51:51, 8.70s/it][2025-04-25 23:44:38,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 23:44:38,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3071.81 | bwd_microstep: 5662.93 | bwd_inner_microstep: 5650.16 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.92 [2025-04-25 23:44:38,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3071.81 | bwd: 5662.94 | bwd_inner: 5650.16 | bwd_allreduce: 12.74 | step: 18.93 16%|█▌ | 6532/41250 [15:47:03<84:12:52, 8.73s/it] {'loss': 0.1262, 'grad_norm': 1.078928828239441, 'learning_rate': 3.829695922523287e-05, 'epoch': 1.58} 16%|█▌ | 6532/41250 [15:47:03<84:12:52, 8.73s/it][2025-04-25 23:44:47,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-25 23:44:47,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.50 | bwd_microstep: 5784.78 | bwd_inner_microstep: 5650.31 | bwd_allreduce_microstep: 134.42 | step_microstep: 18.79 [2025-04-25 23:44:47,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.50 | bwd: 5784.79 | bwd_inner: 5650.31 | bwd_allreduce: 134.44 | step: 18.79 16%|█▌ | 6533/41250 [15:47:12<84:06:45, 
8.72s/it] {'loss': 0.1797, 'grad_norm': 7.468048095703125, 'learning_rate': 3.8296325073903363e-05, 'epoch': 1.58} 16%|█▌ | 6533/41250 [15:47:12<84:06:45, 8.72s/it][2025-04-25 23:44:55,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-25 23:44:55,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.70 | bwd_microstep: 5698.72 | bwd_inner_microstep: 5653.95 | bwd_allreduce_microstep: 44.73 | step_microstep: 19.32 [2025-04-25 23:44:55,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.70 | bwd: 5698.73 | bwd_inner: 5653.95 | bwd_allreduce: 44.74 | step: 19.33 16%|█▌ | 6534/41250 [15:47:21<83:47:24, 8.69s/it] {'loss': 0.2227, 'grad_norm': 4.034696578979492, 'learning_rate': 3.8295690809780614e-05, 'epoch': 1.58} 16%|█▌ | 6534/41250 [15:47:21<83:47:24, 8.69s/it][2025-04-25 23:45:04,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 23:45:04,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.38 | bwd_microstep: 5790.44 | bwd_inner_microstep: 5638.16 | bwd_allreduce_microstep: 152.23 | step_microstep: 18.90 [2025-04-25 23:45:04,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.38 | bwd: 5790.45 | bwd_inner: 5638.16 | bwd_allreduce: 152.26 | step: 18.91 16%|█▌ | 6535/41250 [15:47:29<83:47:58, 8.69s/it] {'loss': 0.1507, 'grad_norm': 4.256382942199707, 'learning_rate': 3.829505643286853e-05, 'epoch': 1.58} 16%|█▌ | 6535/41250 [15:47:29<83:47:58, 8.69s/it][2025-04-25 23:45:13,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.98 [2025-04-25 23:45:13,341] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.13 | bwd_microstep: 5850.51 | bwd_inner_microstep: 5698.56 | bwd_allreduce_microstep: 
151.91 | step_microstep: 18.44 [2025-04-25 23:45:13,341] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.13 | bwd: 5850.52 | bwd_inner: 5698.56 | bwd_allreduce: 151.92 | step: 18.44 16%|█▌ | 6536/41250 [15:47:38<84:06:03, 8.72s/it] {'loss': 0.0792, 'grad_norm': 1.281614899635315, 'learning_rate': 3.8294421943171025e-05, 'epoch': 1.58} 16%|█▌ | 6536/41250 [15:47:38<84:06:03, 8.72s/it][2025-04-25 23:45:22,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-25 23:45:22,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.17 | bwd_microstep: 5791.92 | bwd_inner_microstep: 5779.21 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.59 [2025-04-25 23:45:22,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.17 | bwd: 5791.93 | bwd_inner: 5779.21 | bwd_allreduce: 12.68 | step: 18.59 16%|█▌ | 6537/41250 [15:47:47<84:13:53, 8.74s/it] {'loss': 0.1283, 'grad_norm': 2.138418674468994, 'learning_rate': 3.829378734069201e-05, 'epoch': 1.58} 16%|█▌ | 6537/41250 [15:47:47<84:13:53, 8.74s/it][2025-04-25 23:45:30,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 1.09 [2025-04-25 23:45:30,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.67 | bwd_microstep: 5729.47 | bwd_inner_microstep: 5688.49 | bwd_allreduce_microstep: 40.93 | step_microstep: 18.65 [2025-04-25 23:45:30,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.67 | bwd: 5729.48 | bwd_inner: 5688.49 | bwd_allreduce: 40.95 | step: 18.65 16%|█▌ | 6538/41250 [15:47:56<84:00:03, 8.71s/it] {'loss': 0.0313, 'grad_norm': 0.6459226608276367, 'learning_rate': 3.829315262543538e-05, 'epoch': 1.58} 16%|█▌ | 6538/41250 [15:47:56<84:00:03, 8.71s/it][2025-04-25 23:45:39,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | 
optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-25 23:45:39,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.15 | bwd_microstep: 5696.72 | bwd_inner_microstep: 5683.84 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.91 [2025-04-25 23:45:39,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.15 | bwd: 5696.74 | bwd_inner: 5683.84 | bwd_allreduce: 12.85 | step: 18.91 16%|█▌ | 6539/41250 [15:48:04<83:46:41, 8.69s/it] {'loss': 0.049, 'grad_norm': 1.0101432800292969, 'learning_rate': 3.8292517797405076e-05, 'epoch': 1.59} 16%|█▌ | 6539/41250 [15:48:04<83:46:41, 8.69s/it][2025-04-25 23:45:48,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:45:48,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.23 | bwd_microstep: 5735.60 | bwd_inner_microstep: 5675.08 | bwd_allreduce_microstep: 60.46 | step_microstep: 18.97 [2025-04-25 23:45:48,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.23 | bwd: 5735.61 | bwd_inner: 5675.08 | bwd_allreduce: 60.48 | step: 18.98 16%|█▌ | 6540/41250 [15:48:13<83:41:47, 8.68s/it] {'loss': 0.0742, 'grad_norm': 1.6190894842147827, 'learning_rate': 3.8291882856604985e-05, 'epoch': 1.59} 16%|█▌ | 6540/41250 [15:48:13<83:41:47, 8.68s/it][2025-04-25 23:45:56,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 1.08 [2025-04-25 23:45:56,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.78 | bwd_microstep: 5669.60 | bwd_inner_microstep: 5656.25 | bwd_allreduce_microstep: 13.31 | step_microstep: 19.12 [2025-04-25 23:45:56,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.78 | bwd: 5669.62 | bwd_inner: 5656.25 | bwd_allreduce: 13.33 | step: 19.13 16%|█▌ | 6541/41250 [15:48:21<83:27:45, 8.66s/it] {'loss': 0.0871, 'grad_norm': 
1.7056450843811035, 'learning_rate': 3.8291247803039044e-05, 'epoch': 1.59} 16%|█▌ | 6541/41250 [15:48:21<83:27:45, 8.66s/it][2025-04-25 23:46:05,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:46:05,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.60 | bwd_microstep: 5697.19 | bwd_inner_microstep: 5648.58 | bwd_allreduce_microstep: 48.57 | step_microstep: 18.89 [2025-04-25 23:46:05,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.60 | bwd: 5697.21 | bwd_inner: 5648.58 | bwd_allreduce: 48.58 | step: 18.89 16%|█▌ | 6542/41250 [15:48:30<83:19:39, 8.64s/it] {'loss': 0.2091, 'grad_norm': 4.754953861236572, 'learning_rate': 3.829061263671115e-05, 'epoch': 1.59} 16%|█▌ | 6542/41250 [15:48:30<83:19:39, 8.64s/it][2025-04-25 23:46:13,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:46:13,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.31 | bwd_microstep: 5781.16 | bwd_inner_microstep: 5640.35 | bwd_allreduce_microstep: 140.76 | step_microstep: 18.64 [2025-04-25 23:46:13,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.31 | bwd: 5781.17 | bwd_inner: 5640.35 | bwd_allreduce: 140.78 | step: 18.64 16%|█▌ | 6543/41250 [15:48:39<83:27:08, 8.66s/it] {'loss': 0.0452, 'grad_norm': 1.1318522691726685, 'learning_rate': 3.8289977357625237e-05, 'epoch': 1.59} 16%|█▌ | 6543/41250 [15:48:39<83:27:08, 8.66s/it][2025-04-25 23:46:22,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 23:46:22,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.06 | bwd_microstep: 5694.75 | bwd_inner_microstep: 5681.74 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.75 [2025-04-25 
23:46:22,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.06 | bwd: 5694.76 | bwd_inner: 5681.74 | bwd_allreduce: 12.98 | step: 18.75 16%|█▌ | 6544/41250 [15:48:47<83:21:42, 8.65s/it] {'loss': 0.0163, 'grad_norm': 0.42357197403907776, 'learning_rate': 3.82893419657852e-05, 'epoch': 1.59} 16%|█▌ | 6544/41250 [15:48:47<83:21:42, 8.65s/it][2025-04-25 23:46:31,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-25 23:46:31,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.99 | bwd_microstep: 5750.90 | bwd_inner_microstep: 5683.60 | bwd_allreduce_microstep: 67.25 | step_microstep: 18.80 [2025-04-25 23:46:31,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.99 | bwd: 5750.91 | bwd_inner: 5683.60 | bwd_allreduce: 67.27 | step: 18.80 16%|█▌ | 6545/41250 [15:48:56<83:27:43, 8.66s/it] {'loss': 0.0201, 'grad_norm': 1.034962773323059, 'learning_rate': 3.828870646119498e-05, 'epoch': 1.59} 16%|█▌ | 6545/41250 [15:48:56<83:27:43, 8.66s/it][2025-04-25 23:46:39,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-25 23:46:39,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.38 | bwd_microstep: 5789.08 | bwd_inner_microstep: 5652.01 | bwd_allreduce_microstep: 137.03 | step_microstep: 18.81 [2025-04-25 23:46:39,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.38 | bwd: 5789.10 | bwd_inner: 5652.01 | bwd_allreduce: 137.04 | step: 18.81 16%|█▌ | 6546/41250 [15:49:05<83:34:26, 8.67s/it] {'loss': 0.0705, 'grad_norm': 2.1382832527160645, 'learning_rate': 3.828807084385847e-05, 'epoch': 1.59} 16%|█▌ | 6546/41250 [15:49:05<83:34:26, 8.67s/it][2025-04-25 23:46:48,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 
0.89 [2025-04-25 23:46:48,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.31 | bwd_microstep: 5707.21 | bwd_inner_microstep: 5636.34 | bwd_allreduce_microstep: 70.83 | step_microstep: 18.61 [2025-04-25 23:46:48,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.31 | bwd: 5707.23 | bwd_inner: 5636.34 | bwd_allreduce: 70.84 | step: 18.61 16%|█▌ | 6547/41250 [15:49:13<83:23:51, 8.65s/it] {'loss': 0.1133, 'grad_norm': 1.585102915763855, 'learning_rate': 3.828743511377961e-05, 'epoch': 1.59} 16%|█▌ | 6547/41250 [15:49:13<83:23:51, 8.65s/it][2025-04-25 23:46:57,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:46:57,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.59 | bwd_microstep: 5774.42 | bwd_inner_microstep: 5761.64 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.24 [2025-04-25 23:46:57,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.59 | bwd: 5774.43 | bwd_inner: 5761.64 | bwd_allreduce: 12.75 | step: 18.25 16%|█▌ | 6548/41250 [15:49:22<83:40:14, 8.68s/it] {'loss': 0.2863, 'grad_norm': 5.650994777679443, 'learning_rate': 3.82867992709623e-05, 'epoch': 1.59} 16%|█▌ | 6548/41250 [15:49:22<83:40:14, 8.68s/it][2025-04-25 23:47:05,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:47:05,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.84 | bwd_microstep: 5766.00 | bwd_inner_microstep: 5635.21 | bwd_allreduce_microstep: 130.73 | step_microstep: 18.28 [2025-04-25 23:47:05,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.84 | bwd: 5766.01 | bwd_inner: 5635.21 | bwd_allreduce: 130.75 | step: 18.29 16%|█▌ | 6549/41250 [15:49:31<83:38:51, 8.68s/it] {'loss': 0.2535, 'grad_norm': 2.5328171253204346, 'learning_rate': 
3.828616331541048e-05, 'epoch': 1.59} 16%|█▌ | 6549/41250 [15:49:31<83:38:51, 8.68s/it][2025-04-25 23:47:14,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-25 23:47:14,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.28 | bwd_microstep: 5735.48 | bwd_inner_microstep: 5687.58 | bwd_allreduce_microstep: 47.84 | step_microstep: 18.94 [2025-04-25 23:47:14,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.28 | bwd: 5735.49 | bwd_inner: 5687.58 | bwd_allreduce: 47.86 | step: 18.94 16%|█▌ | 6550/41250 [15:49:39<83:36:11, 8.67s/it] {'loss': 0.0789, 'grad_norm': 1.0851484537124634, 'learning_rate': 3.828552724712805e-05, 'epoch': 1.59} 16%|█▌ | 6550/41250 [15:49:39<83:36:11, 8.67s/it][2025-04-25 23:47:23,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:47:23,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.59 | bwd_microstep: 5752.24 | bwd_inner_microstep: 5676.59 | bwd_allreduce_microstep: 75.61 | step_microstep: 18.51 [2025-04-25 23:47:23,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.59 | bwd: 5752.26 | bwd_inner: 5676.59 | bwd_allreduce: 75.63 | step: 18.51 16%|█▌ | 6551/41250 [15:49:48<83:36:33, 8.67s/it] {'loss': 0.2741, 'grad_norm': 2.1918087005615234, 'learning_rate': 3.828489106611894e-05, 'epoch': 1.59} 16%|█▌ | 6551/41250 [15:49:48<83:36:33, 8.67s/it][2025-04-25 23:47:31,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-25 23:47:31,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.56 | bwd_microstep: 5694.62 | bwd_inner_microstep: 5639.82 | bwd_allreduce_microstep: 54.75 | step_microstep: 19.02 [2025-04-25 23:47:31,939] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.56 | bwd: 5694.64 | bwd_inner: 5639.82 | bwd_allreduce: 54.78 | step: 19.02 16%|█▌ | 6552/41250 [15:49:57<83:24:28, 8.65s/it] {'loss': 0.227, 'grad_norm': 3.108659267425537, 'learning_rate': 3.828425477238708e-05, 'epoch': 1.59} 16%|█▌ | 6552/41250 [15:49:57<83:24:28, 8.65s/it][2025-04-25 23:47:40,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:47:40,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.14 | bwd_microstep: 5803.19 | bwd_inner_microstep: 5632.35 | bwd_allreduce_microstep: 170.79 | step_microstep: 18.88 [2025-04-25 23:47:40,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.14 | bwd: 5803.20 | bwd_inner: 5632.35 | bwd_allreduce: 170.81 | step: 18.89 16%|█▌ | 6553/41250 [15:50:05<83:35:28, 8.67s/it] {'loss': 0.2323, 'grad_norm': 1.826289176940918, 'learning_rate': 3.828361836593638e-05, 'epoch': 1.59} 16%|█▌ | 6553/41250 [15:50:05<83:35:28, 8.67s/it][2025-04-25 23:47:49,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:47:49,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.55 | bwd_microstep: 5983.89 | bwd_inner_microstep: 5697.85 | bwd_allreduce_microstep: 285.99 | step_microstep: 18.51 [2025-04-25 23:47:49,578] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.55 | bwd: 5983.90 | bwd_inner: 5697.85 | bwd_allreduce: 286.01 | step: 18.51 16%|█▌ | 6554/41250 [15:50:14<84:18:14, 8.75s/it] {'loss': 0.0564, 'grad_norm': 1.1337298154830933, 'learning_rate': 3.8282981846770774e-05, 'epoch': 1.59} 16%|█▌ | 6554/41250 [15:50:14<84:18:14, 8.75s/it][2025-04-25 23:47:58,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-25 
23:47:58,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.70 | bwd_microstep: 5711.93 | bwd_inner_microstep: 5698.94 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.46 [2025-04-25 23:47:58,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.70 | bwd: 5711.95 | bwd_inner: 5698.94 | bwd_allreduce: 12.97 | step: 18.47 16%|█▌ | 6555/41250 [15:50:23<83:58:58, 8.71s/it] {'loss': 0.1982, 'grad_norm': 2.0654265880584717, 'learning_rate': 3.8282345214894184e-05, 'epoch': 1.59} 16%|█▌ | 6555/41250 [15:50:23<83:58:58, 8.71s/it][2025-04-25 23:48:06,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-25 23:48:06,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.51 | bwd_microstep: 5754.60 | bwd_inner_microstep: 5649.58 | bwd_allreduce_microstep: 104.98 | step_microstep: 18.62 [2025-04-25 23:48:06,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.51 | bwd: 5754.61 | bwd_inner: 5649.58 | bwd_allreduce: 104.99 | step: 18.62 16%|█▌ | 6556/41250 [15:50:32<83:50:27, 8.70s/it] {'loss': 0.1654, 'grad_norm': 2.716905117034912, 'learning_rate': 3.828170847031053e-05, 'epoch': 1.59} 16%|█▌ | 6556/41250 [15:50:32<83:50:27, 8.70s/it][2025-04-25 23:48:15,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:48:15,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.23 | bwd_microstep: 5788.03 | bwd_inner_microstep: 5645.85 | bwd_allreduce_microstep: 142.14 | step_microstep: 18.51 [2025-04-25 23:48:15,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.23 | bwd: 5788.05 | bwd_inner: 5645.85 | bwd_allreduce: 142.16 | step: 18.51 16%|█▌ | 6557/41250 [15:50:40<83:48:51, 8.70s/it] {'loss': 0.0681, 'grad_norm': 1.3288376331329346, 'learning_rate': 3.828107161302373e-05, 
'epoch': 1.59} 16%|█▌ | 6557/41250 [15:50:40<83:48:51, 8.70s/it][2025-04-25 23:48:24,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 23:48:24,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.86 | bwd_microstep: 5905.83 | bwd_inner_microstep: 5644.32 | bwd_allreduce_microstep: 261.45 | step_microstep: 18.87 [2025-04-25 23:48:24,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.86 | bwd: 5905.84 | bwd_inner: 5644.32 | bwd_allreduce: 261.47 | step: 18.87 16%|█▌ | 6558/41250 [15:50:49<84:09:22, 8.73s/it] {'loss': 0.2793, 'grad_norm': 5.197048187255859, 'learning_rate': 3.828043464303773e-05, 'epoch': 1.59} 16%|█▌ | 6558/41250 [15:50:49<84:09:22, 8.73s/it][2025-04-25 23:48:33,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:48:33,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.22 | bwd_microstep: 5799.75 | bwd_inner_microstep: 5786.86 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.76 [2025-04-25 23:48:33,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.22 | bwd: 5799.76 | bwd_inner: 5786.86 | bwd_allreduce: 12.86 | step: 18.76 16%|█▌ | 6559/41250 [15:50:58<84:17:04, 8.75s/it] {'loss': 0.2098, 'grad_norm': 4.600319862365723, 'learning_rate': 3.827979756035644e-05, 'epoch': 1.59} 16%|█▌ | 6559/41250 [15:50:58<84:17:04, 8.75s/it][2025-04-25 23:48:41,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-25 23:48:41,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.14 | bwd_microstep: 5787.70 | bwd_inner_microstep: 5647.40 | bwd_allreduce_microstep: 140.26 | step_microstep: 19.21 [2025-04-25 23:48:41,873] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2831.14 | bwd: 5787.72 | bwd_inner: 5647.40 | bwd_allreduce: 140.28 | step: 19.22 16%|█▌ | 6560/41250 [15:51:07<84:10:00, 8.73s/it] {'loss': 0.0828, 'grad_norm': 1.8443490266799927, 'learning_rate': 3.827916036498379e-05, 'epoch': 1.59} 16%|█▌ | 6560/41250 [15:51:07<84:10:00, 8.73s/it][2025-04-25 23:48:50,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:48:50,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.29 | bwd_microstep: 5770.82 | bwd_inner_microstep: 5704.54 | bwd_allreduce_microstep: 66.24 | step_microstep: 18.61 [2025-04-25 23:48:50,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.29 | bwd: 5770.83 | bwd_inner: 5704.54 | bwd_allreduce: 66.26 | step: 18.61 16%|█▌ | 6561/41250 [15:51:15<84:03:58, 8.72s/it] {'loss': 0.0836, 'grad_norm': 1.4390652179718018, 'learning_rate': 3.827852305692372e-05, 'epoch': 1.59} 16%|█▌ | 6561/41250 [15:51:15<84:03:58, 8.72s/it][2025-04-25 23:48:59,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:48:59,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.23 | bwd_microstep: 5764.72 | bwd_inner_microstep: 5702.24 | bwd_allreduce_microstep: 62.44 | step_microstep: 18.63 [2025-04-25 23:48:59,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.23 | bwd: 5764.73 | bwd_inner: 5702.24 | bwd_allreduce: 62.45 | step: 18.63 16%|█▌ | 6562/41250 [15:51:24<83:58:49, 8.72s/it] {'loss': 0.0844, 'grad_norm': 2.518033027648926, 'learning_rate': 3.827788563618015e-05, 'epoch': 1.59} 16%|█▌ | 6562/41250 [15:51:24<83:58:49, 8.72s/it][2025-04-25 23:49:07,982] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:49:07,983] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2843.57 | bwd_microstep: 5773.43 | bwd_inner_microstep: 5664.11 | bwd_allreduce_microstep: 109.28 | step_microstep: 18.61 [2025-04-25 23:49:07,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.57 | bwd: 5773.44 | bwd_inner: 5664.11 | bwd_allreduce: 109.29 | step: 18.61 16%|█▌ | 6563/41250 [15:51:33<83:58:05, 8.71s/it] {'loss': 0.2604, 'grad_norm': 2.2788193225860596, 'learning_rate': 3.8277248102757e-05, 'epoch': 1.59} 16%|█▌ | 6563/41250 [15:51:33<83:58:05, 8.71s/it][2025-04-25 23:49:16,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:49:16,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.22 | bwd_microstep: 5705.49 | bwd_inner_microstep: 5692.77 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.65 [2025-04-25 23:49:16,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.22 | bwd: 5705.51 | bwd_inner: 5692.77 | bwd_allreduce: 12.69 | step: 18.65 16%|█▌ | 6564/41250 [15:51:41<83:46:00, 8.69s/it] {'loss': 0.2708, 'grad_norm': 3.13101863861084, 'learning_rate': 3.827661045665822e-05, 'epoch': 1.59} 16%|█▌ | 6564/41250 [15:51:41<83:46:00, 8.69s/it][2025-04-25 23:49:25,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.88 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:49:25,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.21 | bwd_microstep: 5702.22 | bwd_inner_microstep: 5687.57 | bwd_allreduce_microstep: 14.60 | step_microstep: 17.96 [2025-04-25 23:49:25,265] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.21 | bwd: 5702.23 | bwd_inner: 5687.57 | bwd_allreduce: 14.62 | step: 17.96 16%|█▌ | 6565/41250 [15:51:50<83:36:15, 8.68s/it] {'loss': 0.0625, 'grad_norm': 0.9626396298408508, 'learning_rate': 3.8275972697887726e-05, 'epoch': 1.59} 16%|█▌ | 6565/41250 
[15:51:50<83:36:15, 8.68s/it][2025-04-25 23:49:33,948] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-25 23:49:33,949] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.38 | bwd_microstep: 5727.65 | bwd_inner_microstep: 5715.01 | bwd_allreduce_microstep: 12.58 | step_microstep: 19.07 [2025-04-25 23:49:33,949] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.38 | bwd: 5727.66 | bwd_inner: 5715.01 | bwd_allreduce: 12.60 | step: 19.07 16%|█▌ | 6566/41250 [15:51:59<83:37:02, 8.68s/it] {'loss': 0.1463, 'grad_norm': 2.2108211517333984, 'learning_rate': 3.8275334826449466e-05, 'epoch': 1.59} 16%|█▌ | 6566/41250 [15:51:59<83:37:02, 8.68s/it][2025-04-25 23:49:42,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.18 | optimizer_step: 0.91 [2025-04-25 23:49:42,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.27 | bwd_microstep: 5767.04 | bwd_inner_microstep: 5666.13 | bwd_allreduce_microstep: 100.87 | step_microstep: 18.70 [2025-04-25 23:49:42,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.28 | bwd: 5767.06 | bwd_inner: 5666.13 | bwd_allreduce: 100.89 | step: 18.70 16%|█▌ | 6567/41250 [15:52:07<83:40:34, 8.69s/it] {'loss': 0.1738, 'grad_norm': 3.0890302658081055, 'learning_rate': 3.827469684234735e-05, 'epoch': 1.59} 16%|█▌ | 6567/41250 [15:52:07<83:40:34, 8.69s/it][2025-04-25 23:49:51,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-25 23:49:51,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.65 | bwd_microstep: 5766.64 | bwd_inner_microstep: 5692.78 | bwd_allreduce_microstep: 73.81 | step_microstep: 18.82 [2025-04-25 23:49:51,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.65 | bwd: 5766.65 
| bwd_inner: 5692.78 | bwd_allreduce: 73.83 | step: 18.83 16%|█▌ | 6568/41250 [15:52:16<83:42:59, 8.69s/it] {'loss': 0.1414, 'grad_norm': 2.6571404933929443, 'learning_rate': 3.827405874558532e-05, 'epoch': 1.59} 16%|█▌ | 6568/41250 [15:52:16<83:42:59, 8.69s/it][2025-04-25 23:50:00,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-25 23:50:00,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.04 | bwd_microstep: 5756.02 | bwd_inner_microstep: 5703.90 | bwd_allreduce_microstep: 52.07 | step_microstep: 19.20 [2025-04-25 23:50:00,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.04 | bwd: 5756.03 | bwd_inner: 5703.90 | bwd_allreduce: 52.09 | step: 19.20 16%|█▌ | 6569/41250 [15:52:25<83:43:40, 8.69s/it] {'loss': 0.1292, 'grad_norm': 1.5773566961288452, 'learning_rate': 3.827342053616732e-05, 'epoch': 1.59} 16%|█▌ | 6569/41250 [15:52:25<83:43:40, 8.69s/it][2025-04-25 23:50:08,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:50:08,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.23 | bwd_microstep: 5761.98 | bwd_inner_microstep: 5701.09 | bwd_allreduce_microstep: 60.84 | step_microstep: 18.62 [2025-04-25 23:50:08,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.23 | bwd: 5761.99 | bwd_inner: 5701.09 | bwd_allreduce: 60.86 | step: 18.63 16%|█▌ | 6570/41250 [15:52:34<83:44:45, 8.69s/it] {'loss': 0.1368, 'grad_norm': 2.438987970352173, 'learning_rate': 3.8272782214097276e-05, 'epoch': 1.59} 16%|█▌ | 6570/41250 [15:52:34<83:44:45, 8.69s/it][2025-04-25 23:50:17,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:50:17,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2860.89 | bwd_microstep: 5718.47 | bwd_inner_microstep: 5705.88 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.51 [2025-04-25 23:50:17,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.89 | bwd: 5718.49 | bwd_inner: 5705.88 | bwd_allreduce: 12.57 | step: 18.51 16%|█▌ | 6571/41250 [15:52:42<83:39:15, 8.68s/it] {'loss': 0.256, 'grad_norm': 3.4783871173858643, 'learning_rate': 3.827214377937912e-05, 'epoch': 1.59} 16%|█▌ | 6571/41250 [15:52:42<83:39:15, 8.68s/it][2025-04-25 23:50:26,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.28 | optimizer_step: 1.04 [2025-04-25 23:50:26,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.48 | bwd_microstep: 5727.44 | bwd_inner_microstep: 5689.74 | bwd_allreduce_microstep: 37.65 | step_microstep: 20.04 [2025-04-25 23:50:26,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.49 | bwd: 5727.45 | bwd_inner: 5689.74 | bwd_allreduce: 37.67 | step: 20.05 16%|█▌ | 6572/41250 [15:52:51<83:34:46, 8.68s/it] {'loss': 0.0624, 'grad_norm': 1.5454639196395874, 'learning_rate': 3.827150523201679e-05, 'epoch': 1.59} 16%|█▌ | 6572/41250 [15:52:51<83:34:46, 8.68s/it][2025-04-25 23:50:34,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-25 23:50:34,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.47 | bwd_microstep: 5774.35 | bwd_inner_microstep: 5649.37 | bwd_allreduce_microstep: 124.93 | step_microstep: 18.45 [2025-04-25 23:50:34,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.47 | bwd: 5774.36 | bwd_inner: 5649.37 | bwd_allreduce: 124.95 | step: 18.46 16%|█▌ | 6573/41250 [15:53:00<83:35:25, 8.68s/it] {'loss': 0.2406, 'grad_norm': 1.9462337493896484, 'learning_rate': 3.827086657201423e-05, 'epoch': 1.59} 16%|█▌ | 6573/41250 [15:53:00<83:35:25, 8.68s/it][2025-04-25 
23:50:43,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:50:43,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.92 | bwd_microstep: 5702.98 | bwd_inner_microstep: 5654.50 | bwd_allreduce_microstep: 48.44 | step_microstep: 18.66 [2025-04-25 23:50:43,356] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.92 | bwd: 5703.00 | bwd_inner: 5654.50 | bwd_allreduce: 48.45 | step: 18.66 16%|█▌ | 6574/41250 [15:53:08<83:23:39, 8.66s/it] {'loss': 0.0206, 'grad_norm': 0.19028429687023163, 'learning_rate': 3.8270227799375376e-05, 'epoch': 1.59} 16%|█▌ | 6574/41250 [15:53:08<83:23:39, 8.66s/it][2025-04-25 23:50:52,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:50:52,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.15 | bwd_microstep: 5756.25 | bwd_inner_microstep: 5696.80 | bwd_allreduce_microstep: 59.41 | step_microstep: 18.86 [2025-04-25 23:50:52,042] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.16 | bwd: 5756.26 | bwd_inner: 5696.79 | bwd_allreduce: 59.42 | step: 18.87 16%|█▌ | 6575/41250 [15:53:17<83:28:24, 8.67s/it] {'loss': 0.1963, 'grad_norm': 1.8097047805786133, 'learning_rate': 3.826958891410415e-05, 'epoch': 1.59} 16%|█▌ | 6575/41250 [15:53:17<83:28:24, 8.67s/it][2025-04-25 23:51:00,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.61 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:51:00,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.27 | bwd_microstep: 5795.17 | bwd_inner_microstep: 5659.23 | bwd_allreduce_microstep: 135.90 | step_microstep: 19.24 [2025-04-25 23:51:00,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.27 | bwd: 5795.19 | bwd_inner: 5659.23 | bwd_allreduce: 
135.92 | step: 19.24 16%|█▌ | 6576/41250 [15:53:26<83:35:17, 8.68s/it] {'loss': 0.0654, 'grad_norm': 0.6161969304084778, 'learning_rate': 3.82689499162045e-05, 'epoch': 1.59} 16%|█▌ | 6576/41250 [15:53:26<83:35:17, 8.68s/it][2025-04-25 23:51:09,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:51:09,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.35 | bwd_microstep: 5764.19 | bwd_inner_microstep: 5676.84 | bwd_allreduce_microstep: 87.31 | step_microstep: 18.83 [2025-04-25 23:51:09,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.35 | bwd: 5764.20 | bwd_inner: 5676.84 | bwd_allreduce: 87.32 | step: 18.84 16%|█▌ | 6577/41250 [15:53:34<83:37:32, 8.68s/it] {'loss': 0.1671, 'grad_norm': 2.5845906734466553, 'learning_rate': 3.826831080568038e-05, 'epoch': 1.59} 16%|█▌ | 6577/41250 [15:53:34<83:37:32, 8.68s/it][2025-04-25 23:51:18,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-25 23:51:18,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.25 | bwd_microstep: 5704.10 | bwd_inner_microstep: 5691.03 | bwd_allreduce_microstep: 13.01 | step_microstep: 19.05 [2025-04-25 23:51:18,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.25 | bwd: 5704.12 | bwd_inner: 5691.03 | bwd_allreduce: 13.04 | step: 19.05 16%|█▌ | 6578/41250 [15:53:43<83:28:36, 8.67s/it] {'loss': 0.3543, 'grad_norm': 2.6254212856292725, 'learning_rate': 3.826767158253571e-05, 'epoch': 1.59} 16%|█▌ | 6578/41250 [15:53:43<83:28:36, 8.67s/it][2025-04-25 23:51:26,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:51:26,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.60 | bwd_microstep: 5739.42 | 
bwd_inner_microstep: 5679.37 | bwd_allreduce_microstep: 60.01 | step_microstep: 18.58 [2025-04-25 23:51:26,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.60 | bwd: 5739.44 | bwd_inner: 5679.37 | bwd_allreduce: 60.03 | step: 18.58 16%|█▌ | 6579/41250 [15:53:52<83:27:48, 8.67s/it] {'loss': 0.145, 'grad_norm': 2.0294926166534424, 'learning_rate': 3.8267032246774435e-05, 'epoch': 1.59} 16%|█▌ | 6579/41250 [15:53:52<83:27:48, 8.67s/it][2025-04-25 23:51:35,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.96 [2025-04-25 23:51:35,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.05 | bwd_microstep: 5701.54 | bwd_inner_microstep: 5688.61 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.63 [2025-04-25 23:51:35,362] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.05 | bwd: 5701.56 | bwd_inner: 5688.61 | bwd_allreduce: 12.90 | step: 18.63 16%|█▌ | 6580/41250 [15:54:00<83:20:22, 8.65s/it] {'loss': 0.0351, 'grad_norm': 0.4046112596988678, 'learning_rate': 3.8266392798400505e-05, 'epoch': 1.6} 16%|█▌ | 6580/41250 [15:54:00<83:20:22, 8.65s/it][2025-04-25 23:51:44,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.13 | optimizer_step: 1.03 [2025-04-25 23:51:44,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.85 | bwd_microstep: 5762.46 | bwd_inner_microstep: 5707.29 | bwd_allreduce_microstep: 55.11 | step_microstep: 19.68 [2025-04-25 23:51:44,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.85 | bwd: 5762.47 | bwd_inner: 5707.29 | bwd_allreduce: 55.14 | step: 19.68 16%|█▌ | 6581/41250 [15:54:09<83:28:32, 8.67s/it] {'loss': 0.147, 'grad_norm': 3.097174882888794, 'learning_rate': 3.8265753237417847e-05, 'epoch': 1.6} 16%|█▌ | 6581/41250 [15:54:09<83:28:32, 8.67s/it][2025-04-25 23:51:52,697] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:51:52,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.71 | bwd_microstep: 5704.65 | bwd_inner_microstep: 5691.82 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.63 [2025-04-25 23:51:52,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.71 | bwd: 5704.66 | bwd_inner: 5691.82 | bwd_allreduce: 12.80 | step: 18.63 16%|█▌ | 6582/41250 [15:54:18<83:22:31, 8.66s/it] {'loss': 0.2185, 'grad_norm': 4.507137775421143, 'learning_rate': 3.8265113563830425e-05, 'epoch': 1.6} 16%|█▌ | 6582/41250 [15:54:18<83:22:31, 8.66s/it][2025-04-25 23:52:01,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-25 23:52:01,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.20 | bwd_microstep: 5706.40 | bwd_inner_microstep: 5693.66 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.66 [2025-04-25 23:52:01,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.20 | bwd: 5706.42 | bwd_inner: 5693.66 | bwd_allreduce: 12.72 | step: 18.66 16%|█▌ | 6583/41250 [15:54:26<83:20:47, 8.66s/it] {'loss': 0.1875, 'grad_norm': 1.3512026071548462, 'learning_rate': 3.8264473777642164e-05, 'epoch': 1.6} 16%|█▌ | 6583/41250 [15:54:26<83:20:47, 8.66s/it][2025-04-25 23:52:09,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:52:09,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.33 | bwd_microstep: 5693.17 | bwd_inner_microstep: 5680.27 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.73 [2025-04-25 23:52:09,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.33 | bwd: 5693.18 | bwd_inner: 5680.27 | bwd_allreduce: 12.88 | step: 18.73 16%|█▌ | 6584/41250 
[15:54:35<83:16:47, 8.65s/it] {'loss': 0.281, 'grad_norm': 1.4678126573562622, 'learning_rate': 3.826383387885701e-05, 'epoch': 1.6} 16%|█▌ | 6584/41250 [15:54:35<83:16:47, 8.65s/it][2025-04-25 23:52:18,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:52:18,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.54 | bwd_microstep: 5762.17 | bwd_inner_microstep: 5749.30 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.68 [2025-04-25 23:52:18,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.54 | bwd: 5762.18 | bwd_inner: 5749.30 | bwd_allreduce: 12.84 | step: 18.68 16%|█▌ | 6585/41250 [15:54:44<83:29:44, 8.67s/it] {'loss': 0.1111, 'grad_norm': 0.9641860723495483, 'learning_rate': 3.826319386747892e-05, 'epoch': 1.6} 16%|█▌ | 6585/41250 [15:54:44<83:29:44, 8.67s/it][2025-04-25 23:52:27,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:52:27,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.57 | bwd_microstep: 5816.93 | bwd_inner_microstep: 5766.33 | bwd_allreduce_microstep: 50.56 | step_microstep: 18.57 [2025-04-25 23:52:27,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.57 | bwd: 5816.94 | bwd_inner: 5766.33 | bwd_allreduce: 50.57 | step: 18.58 16%|█▌ | 6586/41250 [15:54:52<83:49:14, 8.71s/it] {'loss': 0.2458, 'grad_norm': 2.581698417663574, 'learning_rate': 3.826255374351183e-05, 'epoch': 1.6} 16%|█▌ | 6586/41250 [15:54:52<83:49:14, 8.71s/it][2025-04-25 23:52:36,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:52:36,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.15 | bwd_microstep: 5758.16 | bwd_inner_microstep: 5634.39 | 
bwd_allreduce_microstep: 123.72 | step_microstep: 18.32 [2025-04-25 23:52:36,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.15 | bwd: 5758.17 | bwd_inner: 5634.39 | bwd_allreduce: 123.74 | step: 18.32 16%|█▌ | 6587/41250 [15:55:01<83:44:26, 8.70s/it] {'loss': 0.1675, 'grad_norm': 1.7877860069274902, 'learning_rate': 3.826191350695969e-05, 'epoch': 1.6} 16%|█▌ | 6587/41250 [15:55:01<83:44:26, 8.70s/it][2025-04-25 23:52:44,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-25 23:52:44,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.11 | bwd_microstep: 5702.81 | bwd_inner_microstep: 5640.63 | bwd_allreduce_microstep: 62.13 | step_microstep: 18.33 [2025-04-25 23:52:44,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.11 | bwd: 5702.82 | bwd_inner: 5640.63 | bwd_allreduce: 62.15 | step: 18.34 16%|█▌ | 6588/41250 [15:55:10<83:31:02, 8.67s/it] {'loss': 0.1753, 'grad_norm': 3.073418378829956, 'learning_rate': 3.826127315782644e-05, 'epoch': 1.6} 16%|█▌ | 6588/41250 [15:55:10<83:31:02, 8.67s/it][2025-04-25 23:52:53,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-25 23:52:53,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.84 | bwd_microstep: 5692.37 | bwd_inner_microstep: 5679.16 | bwd_allreduce_microstep: 13.16 | step_microstep: 19.19 [2025-04-25 23:52:53,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.84 | bwd: 5692.39 | bwd_inner: 5679.16 | bwd_allreduce: 13.18 | step: 19.20 16%|█▌ | 6589/41250 [15:55:18<83:21:38, 8.66s/it] {'loss': 0.2381, 'grad_norm': 1.9331878423690796, 'learning_rate': 3.8260632696116035e-05, 'epoch': 1.6} 16%|█▌ | 6589/41250 [15:55:18<83:21:38, 8.66s/it][2025-04-25 23:53:02,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 23:53:02,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.36 | bwd_microstep: 5693.63 | bwd_inner_microstep: 5643.85 | bwd_allreduce_microstep: 49.74 | step_microstep: 18.57 [2025-04-25 23:53:02,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.36 | bwd: 5693.65 | bwd_inner: 5643.85 | bwd_allreduce: 49.76 | step: 18.58 16%|█▌ | 6590/41250 [15:55:27<83:10:50, 8.64s/it] {'loss': 0.3688, 'grad_norm': 1.923591136932373, 'learning_rate': 3.825999212183242e-05, 'epoch': 1.6} 16%|█▌ | 6590/41250 [15:55:27<83:10:50, 8.64s/it][2025-04-25 23:53:10,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-25 23:53:10,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.41 | bwd_microstep: 5834.66 | bwd_inner_microstep: 5674.90 | bwd_allreduce_microstep: 159.72 | step_microstep: 18.52 [2025-04-25 23:53:10,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.41 | bwd: 5834.67 | bwd_inner: 5674.90 | bwd_allreduce: 159.74 | step: 18.53 16%|█▌ | 6591/41250 [15:55:36<83:32:10, 8.68s/it] {'loss': 0.1473, 'grad_norm': 2.0853333473205566, 'learning_rate': 3.825935143497955e-05, 'epoch': 1.6} 16%|█▌ | 6591/41250 [15:55:36<83:32:10, 8.68s/it][2025-04-25 23:53:19,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:53:19,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.02 | bwd_microstep: 5692.64 | bwd_inner_microstep: 5679.79 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.52 [2025-04-25 23:53:19,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.02 | bwd: 5692.65 | bwd_inner: 5679.79 | bwd_allreduce: 12.82 | step: 18.53 16%|█▌ | 6592/41250 [15:55:44<83:22:14, 8.66s/it] 
{'loss': 0.1262, 'grad_norm': 5.295734882354736, 'learning_rate': 3.825871063556137e-05, 'epoch': 1.6} 16%|█▌ | 6592/41250 [15:55:44<83:22:14, 8.66s/it][2025-04-25 23:53:28,054] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:53:28,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.26 | bwd_microstep: 5743.86 | bwd_inner_microstep: 5682.91 | bwd_allreduce_microstep: 60.90 | step_microstep: 18.19 [2025-04-25 23:53:28,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.26 | bwd: 5743.87 | bwd_inner: 5682.91 | bwd_allreduce: 60.92 | step: 18.20 16%|█▌ | 6593/41250 [15:55:53<83:23:36, 8.66s/it] {'loss': 0.1636, 'grad_norm': 4.0525102615356445, 'learning_rate': 3.8258069723581826e-05, 'epoch': 1.6} 16%|█▌ | 6593/41250 [15:55:53<83:23:36, 8.66s/it][2025-04-25 23:53:36,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:53:36,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.95 | bwd_microstep: 5856.79 | bwd_inner_microstep: 5691.23 | bwd_allreduce_microstep: 165.52 | step_microstep: 18.52 [2025-04-25 23:53:36,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.95 | bwd: 5856.80 | bwd_inner: 5691.23 | bwd_allreduce: 165.53 | step: 18.53 16%|█▌ | 6594/41250 [15:56:02<83:45:53, 8.70s/it] {'loss': 0.1695, 'grad_norm': 2.162536859512329, 'learning_rate': 3.8257428699044874e-05, 'epoch': 1.6} 16%|█▌ | 6594/41250 [15:56:02<83:45:53, 8.70s/it][2025-04-25 23:53:45,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:53:45,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.23 | bwd_microstep: 6027.77 | bwd_inner_microstep: 5653.52 | bwd_allreduce_microstep: 374.21 | 
step_microstep: 18.36 [2025-04-25 23:53:45,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.23 | bwd: 6027.79 | bwd_inner: 5653.52 | bwd_allreduce: 374.23 | step: 18.37 16%|█▌ | 6595/41250 [15:56:11<84:28:21, 8.78s/it] {'loss': 0.0871, 'grad_norm': 1.2454789876937866, 'learning_rate': 3.825678756195446e-05, 'epoch': 1.6} 16%|█▌ | 6595/41250 [15:56:11<84:28:21, 8.78s/it][2025-04-25 23:53:54,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:53:54,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.63 | bwd_microstep: 5770.10 | bwd_inner_microstep: 5694.42 | bwd_allreduce_microstep: 75.64 | step_microstep: 18.50 [2025-04-25 23:53:54,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.63 | bwd: 5770.12 | bwd_inner: 5694.42 | bwd_allreduce: 75.66 | step: 18.50 16%|█▌ | 6596/41250 [15:56:19<84:16:17, 8.75s/it] {'loss': 0.0625, 'grad_norm': 1.8815714120864868, 'learning_rate': 3.8256146312314546e-05, 'epoch': 1.6} 16%|█▌ | 6596/41250 [15:56:19<84:16:17, 8.75s/it][2025-04-25 23:54:03,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:54:03,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.94 | bwd_microstep: 5756.85 | bwd_inner_microstep: 5650.04 | bwd_allreduce_microstep: 106.76 | step_microstep: 18.59 [2025-04-25 23:54:03,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.94 | bwd: 5756.86 | bwd_inner: 5650.04 | bwd_allreduce: 106.78 | step: 18.59 16%|█▌ | 6597/41250 [15:56:28<84:01:41, 8.73s/it] {'loss': 0.099, 'grad_norm': 0.8210248947143555, 'learning_rate': 3.8255504950129086e-05, 'epoch': 1.6} 16%|█▌ | 6597/41250 [15:56:28<84:01:41, 8.73s/it][2025-04-25 23:54:11,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | 
optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-25 23:54:11,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.66 | bwd_microstep: 5701.31 | bwd_inner_microstep: 5642.20 | bwd_allreduce_microstep: 59.06 | step_microstep: 19.13 [2025-04-25 23:54:11,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.66 | bwd: 5701.32 | bwd_inner: 5642.20 | bwd_allreduce: 59.08 | step: 19.14 16%|█▌ | 6598/41250 [15:56:37<83:42:39, 8.70s/it] {'loss': 0.0666, 'grad_norm': 0.892745852470398, 'learning_rate': 3.825486347540202e-05, 'epoch': 1.6} 16%|█▌ | 6598/41250 [15:56:37<83:42:39, 8.70s/it][2025-04-25 23:54:20,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-25 23:54:20,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.99 | bwd_microstep: 5736.15 | bwd_inner_microstep: 5691.26 | bwd_allreduce_microstep: 44.83 | step_microstep: 18.88 [2025-04-25 23:54:20,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.99 | bwd: 5736.16 | bwd_inner: 5691.26 | bwd_allreduce: 44.86 | step: 18.89 16%|█▌ | 6599/41250 [15:56:45<83:39:17, 8.69s/it] {'loss': 0.1165, 'grad_norm': 0.9396278858184814, 'learning_rate': 3.8254221888137316e-05, 'epoch': 1.6} 16%|█▌ | 6599/41250 [15:56:45<83:39:17, 8.69s/it][2025-04-25 23:54:29,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-25 23:54:29,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.18 | bwd_microstep: 5757.62 | bwd_inner_microstep: 5659.01 | bwd_allreduce_microstep: 98.57 | step_microstep: 18.24 [2025-04-25 23:54:29,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.18 | bwd: 5757.64 | bwd_inner: 5659.01 | bwd_allreduce: 98.58 | step: 18.24 16%|█▌ | 6600/41250 [15:56:54<83:36:32, 8.69s/it] {'loss': 0.1736, 'grad_norm': 
1.546128273010254, 'learning_rate': 3.825358018833893e-05, 'epoch': 1.6} 16%|█▌ | 6600/41250 [15:56:54<83:36:32, 8.69s/it][2025-04-25 23:54:37,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:54:37,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.13 | bwd_microstep: 5798.72 | bwd_inner_microstep: 5647.06 | bwd_allreduce_microstep: 151.61 | step_microstep: 18.25 [2025-04-25 23:54:37,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.13 | bwd: 5798.73 | bwd_inner: 5647.06 | bwd_allreduce: 151.63 | step: 18.25 16%|█▌ | 6601/41250 [15:57:03<83:39:00, 8.69s/it] {'loss': 0.2296, 'grad_norm': 1.0812448263168335, 'learning_rate': 3.8252938376010805e-05, 'epoch': 1.6} 16%|█▌ | 6601/41250 [15:57:03<83:39:00, 8.69s/it][2025-04-25 23:54:46,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-25 23:54:46,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.91 | bwd_microstep: 5718.93 | bwd_inner_microstep: 5644.88 | bwd_allreduce_microstep: 74.00 | step_microstep: 18.77 [2025-04-25 23:54:46,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.91 | bwd: 5718.94 | bwd_inner: 5644.88 | bwd_allreduce: 74.01 | step: 18.77 16%|█▌ | 6602/41250 [15:57:11<83:27:32, 8.67s/it] {'loss': 0.082, 'grad_norm': 1.0212043523788452, 'learning_rate': 3.82522964511569e-05, 'epoch': 1.6} 16%|█▌ | 6602/41250 [15:57:11<83:27:32, 8.67s/it][2025-04-25 23:54:55,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:54:55,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.05 | bwd_microstep: 5754.92 | bwd_inner_microstep: 5688.96 | bwd_allreduce_microstep: 65.91 | step_microstep: 18.56 [2025-04-25 
23:54:55,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.05 | bwd: 5754.93 | bwd_inner: 5688.96 | bwd_allreduce: 65.93 | step: 18.56 16%|█▌ | 6603/41250 [15:57:20<83:30:06, 8.68s/it] {'loss': 0.1154, 'grad_norm': 1.3526275157928467, 'learning_rate': 3.825165441378119e-05, 'epoch': 1.6} 16%|█▌ | 6603/41250 [15:57:20<83:30:06, 8.68s/it][2025-04-25 23:55:03,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:55:03,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.83 | bwd_microstep: 5765.66 | bwd_inner_microstep: 5704.14 | bwd_allreduce_microstep: 61.47 | step_microstep: 18.64 [2025-04-25 23:55:03,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.83 | bwd: 5765.67 | bwd_inner: 5704.14 | bwd_allreduce: 61.49 | step: 18.64 16%|█▌ | 6604/41250 [15:57:29<83:33:38, 8.68s/it] {'loss': 0.2558, 'grad_norm': 2.007232666015625, 'learning_rate': 3.8251012263887615e-05, 'epoch': 1.6} 16%|█▌ | 6604/41250 [15:57:29<83:33:38, 8.68s/it][2025-04-25 23:55:12,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-25 23:55:12,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.70 | bwd_microstep: 5770.92 | bwd_inner_microstep: 5758.24 | bwd_allreduce_microstep: 12.63 | step_microstep: 19.32 [2025-04-25 23:55:12,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.70 | bwd: 5770.93 | bwd_inner: 5758.24 | bwd_allreduce: 12.65 | step: 19.32 16%|█▌ | 6605/41250 [15:57:37<83:43:33, 8.70s/it] {'loss': 0.1697, 'grad_norm': 2.060821294784546, 'learning_rate': 3.825037000148013e-05, 'epoch': 1.6} 16%|█▌ | 6605/41250 [15:57:37<83:43:33, 8.70s/it][2025-04-25 23:55:21,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.94 | optimizer_step: 1.08 
[2025-04-25 23:55:21,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.53 | bwd_microstep: 5788.45 | bwd_inner_microstep: 5657.48 | bwd_allreduce_microstep: 130.93 | step_microstep: 18.53 [2025-04-25 23:55:21,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.53 | bwd: 5788.47 | bwd_inner: 5657.47 | bwd_allreduce: 130.95 | step: 18.54 16%|█▌ | 6606/41250 [15:57:46<83:43:36, 8.70s/it] {'loss': 0.272, 'grad_norm': 3.228623628616333, 'learning_rate': 3.824972762656272e-05, 'epoch': 1.6} 16%|█▌ | 6606/41250 [15:57:46<83:43:36, 8.70s/it][2025-04-25 23:55:29,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-25 23:55:29,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.78 | bwd_microstep: 5717.85 | bwd_inner_microstep: 5658.79 | bwd_allreduce_microstep: 59.01 | step_microstep: 18.71 [2025-04-25 23:55:29,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.78 | bwd: 5717.87 | bwd_inner: 5658.79 | bwd_allreduce: 59.03 | step: 18.72 16%|█▌ | 6607/41250 [15:57:55<83:33:05, 8.68s/it] {'loss': 0.2256, 'grad_norm': 3.5151307582855225, 'learning_rate': 3.824908513913932e-05, 'epoch': 1.6} 16%|█▌ | 6607/41250 [15:57:55<83:33:05, 8.68s/it][2025-04-25 23:55:38,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:55:38,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.67 | bwd_microstep: 5706.02 | bwd_inner_microstep: 5658.90 | bwd_allreduce_microstep: 47.08 | step_microstep: 18.72 [2025-04-25 23:55:38,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.67 | bwd: 5706.03 | bwd_inner: 5658.90 | bwd_allreduce: 47.09 | step: 18.72 16%|█▌ | 6608/41250 [15:58:03<83:21:46, 8.66s/it] {'loss': 0.0883, 'grad_norm': 1.3068981170654297, 'learning_rate': 
3.82484425392139e-05, 'epoch': 1.6} 16%|█▌ | 6608/41250 [15:58:03<83:21:46, 8.66s/it][2025-04-25 23:55:47,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:55:47,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.98 | bwd_microstep: 5726.02 | bwd_inner_microstep: 5713.19 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.53 [2025-04-25 23:55:47,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.98 | bwd: 5726.03 | bwd_inner: 5713.19 | bwd_allreduce: 12.80 | step: 18.53 16%|█▌ | 6609/41250 [15:58:12<83:23:28, 8.67s/it] {'loss': 0.0501, 'grad_norm': 0.8206552863121033, 'learning_rate': 3.824779982679042e-05, 'epoch': 1.6} 16%|█▌ | 6609/41250 [15:58:12<83:23:28, 8.67s/it][2025-04-25 23:55:55,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-25 23:55:55,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.53 | bwd_microstep: 5766.93 | bwd_inner_microstep: 5697.28 | bwd_allreduce_microstep: 69.61 | step_microstep: 18.59 [2025-04-25 23:55:55,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.53 | bwd: 5766.95 | bwd_inner: 5697.28 | bwd_allreduce: 69.62 | step: 18.59 16%|█▌ | 6610/41250 [15:58:21<83:30:08, 8.68s/it] {'loss': 0.0943, 'grad_norm': 1.5548899173736572, 'learning_rate': 3.824715700187285e-05, 'epoch': 1.6} 16%|█▌ | 6610/41250 [15:58:21<83:30:08, 8.68s/it][2025-04-25 23:56:04,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 23:56:04,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.18 | bwd_microstep: 5778.00 | bwd_inner_microstep: 5655.35 | bwd_allreduce_microstep: 122.61 | step_microstep: 18.40 [2025-04-25 23:56:04,634] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.18 | bwd: 5778.01 | bwd_inner: 5655.35 | bwd_allreduce: 122.63 | step: 18.41 16%|█▌ | 6611/41250 [15:58:29<83:33:04, 8.68s/it] {'loss': 0.0386, 'grad_norm': 0.6692743301391602, 'learning_rate': 3.8246514064465135e-05, 'epoch': 1.6} 16%|█▌ | 6611/41250 [15:58:29<83:33:04, 8.68s/it][2025-04-25 23:56:13,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.24 [2025-04-25 23:56:13,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.99 | bwd_microstep: 5751.88 | bwd_inner_microstep: 5666.68 | bwd_allreduce_microstep: 85.15 | step_microstep: 19.68 [2025-04-25 23:56:13,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.99 | bwd: 5751.89 | bwd_inner: 5666.68 | bwd_allreduce: 85.17 | step: 19.69 16%|█▌ | 6612/41250 [15:58:38<83:32:45, 8.68s/it] {'loss': 0.1906, 'grad_norm': 1.8633825778961182, 'learning_rate': 3.824587101457126e-05, 'epoch': 1.6} 16%|█▌ | 6612/41250 [15:58:38<83:32:45, 8.68s/it][2025-04-25 23:56:22,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-25 23:56:22,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.04 | bwd_microstep: 5774.00 | bwd_inner_microstep: 5640.58 | bwd_allreduce_microstep: 133.38 | step_microstep: 18.52 [2025-04-25 23:56:22,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.04 | bwd: 5774.01 | bwd_inner: 5640.58 | bwd_allreduce: 133.39 | step: 18.52 16%|█▌ | 6613/41250 [15:58:47<83:33:08, 8.68s/it] {'loss': 0.1665, 'grad_norm': 2.949537754058838, 'learning_rate': 3.824522785219517e-05, 'epoch': 1.6} 16%|█▌ | 6613/41250 [15:58:47<83:33:08, 8.68s/it][2025-04-25 23:56:30,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-25 
23:56:30,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.26 | bwd_microstep: 5679.27 | bwd_inner_microstep: 5654.70 | bwd_allreduce_microstep: 24.53 | step_microstep: 18.66 [2025-04-25 23:56:30,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.26 | bwd: 5679.29 | bwd_inner: 5654.70 | bwd_allreduce: 24.55 | step: 18.66 16%|█▌ | 6614/41250 [15:58:55<83:20:25, 8.66s/it] {'loss': 0.1091, 'grad_norm': 2.105473756790161, 'learning_rate': 3.824458457734085e-05, 'epoch': 1.6} 16%|█▌ | 6614/41250 [15:58:55<83:20:25, 8.66s/it][2025-04-25 23:56:39,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-25 23:56:39,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.37 | bwd_microstep: 5709.77 | bwd_inner_microstep: 5697.03 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.90 [2025-04-25 23:56:39,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.37 | bwd: 5709.79 | bwd_inner: 5697.03 | bwd_allreduce: 12.71 | step: 18.90 16%|█▌ | 6615/41250 [15:59:04<83:19:04, 8.66s/it] {'loss': 0.0461, 'grad_norm': 0.5629950165748596, 'learning_rate': 3.824394119001224e-05, 'epoch': 1.6} 16%|█▌ | 6615/41250 [15:59:04<83:19:04, 8.66s/it][2025-04-25 23:56:47,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 1.14 [2025-04-25 23:56:47,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.12 | bwd_microstep: 5704.85 | bwd_inner_microstep: 5691.93 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.34 [2025-04-25 23:56:47,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.12 | bwd: 5704.87 | bwd_inner: 5691.93 | bwd_allreduce: 12.89 | step: 19.34 16%|█▌ | 6616/41250 [15:59:13<83:15:32, 8.65s/it] {'loss': 0.0399, 'grad_norm': 0.5498137474060059, 'learning_rate': 3.8243297690213334e-05, 
'epoch': 1.6} 16%|█▌ | 6616/41250 [15:59:13<83:15:32, 8.65s/it][2025-04-25 23:56:56,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-25 23:56:56,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.46 | bwd_microstep: 5737.15 | bwd_inner_microstep: 5686.42 | bwd_allreduce_microstep: 50.69 | step_microstep: 18.51 [2025-04-25 23:56:56,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.46 | bwd: 5737.17 | bwd_inner: 5686.42 | bwd_allreduce: 50.71 | step: 18.51 16%|█▌ | 6617/41250 [15:59:21<83:21:05, 8.66s/it] {'loss': 0.151, 'grad_norm': 2.2439751625061035, 'learning_rate': 3.824265407794808e-05, 'epoch': 1.6} 16%|█▌ | 6617/41250 [15:59:21<83:21:05, 8.66s/it][2025-04-25 23:57:05,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:57:05,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.60 | bwd_microstep: 5707.81 | bwd_inner_microstep: 5694.99 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.73 [2025-04-25 23:57:05,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.60 | bwd: 5707.83 | bwd_inner: 5694.99 | bwd_allreduce: 12.79 | step: 18.74 16%|█▌ | 6618/41250 [15:59:30<83:18:23, 8.66s/it] {'loss': 0.169, 'grad_norm': 1.5543197393417358, 'learning_rate': 3.824201035322045e-05, 'epoch': 1.6} 16%|█▌ | 6618/41250 [15:59:30<83:18:23, 8.66s/it][2025-04-25 23:57:13,914] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:57:13,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.94 | bwd_microstep: 5726.88 | bwd_inner_microstep: 5709.67 | bwd_allreduce_microstep: 17.17 | step_microstep: 18.44 [2025-04-25 23:57:13,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd: 2860.94 | bwd: 5726.89 | bwd_inner: 5709.67 | bwd_allreduce: 17.19 | step: 18.45 16%|█▌ | 6619/41250 [15:59:39<83:19:31, 8.66s/it] {'loss': 0.0888, 'grad_norm': 1.7850067615509033, 'learning_rate': 3.8241366516034416e-05, 'epoch': 1.6} 16%|█▌ | 6619/41250 [15:59:39<83:19:31, 8.66s/it][2025-04-25 23:57:22,601] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 1.03 [2025-04-25 23:57:22,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.92 | bwd_microstep: 5749.49 | bwd_inner_microstep: 5700.06 | bwd_allreduce_microstep: 49.37 | step_microstep: 18.52 [2025-04-25 23:57:22,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.92 | bwd: 5749.50 | bwd_inner: 5700.06 | bwd_allreduce: 49.39 | step: 18.52 16%|█▌ | 6620/41250 [15:59:47<83:23:58, 8.67s/it] {'loss': 0.1896, 'grad_norm': 2.591186285018921, 'learning_rate': 3.8240722566393945e-05, 'epoch': 1.6} 16%|█▌ | 6620/41250 [15:59:47<83:23:58, 8.67s/it][2025-04-25 23:57:31,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-25 23:57:31,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.11 | bwd_microstep: 5718.51 | bwd_inner_microstep: 5692.68 | bwd_allreduce_microstep: 25.79 | step_microstep: 18.90 [2025-04-25 23:57:31,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.11 | bwd: 5718.53 | bwd_inner: 5692.68 | bwd_allreduce: 25.81 | step: 18.91 16%|█▌ | 6621/41250 [15:59:56<83:21:55, 8.67s/it] {'loss': 0.0413, 'grad_norm': 1.162328839302063, 'learning_rate': 3.824007850430301e-05, 'epoch': 1.61} 16%|█▌ | 6621/41250 [15:59:56<83:21:55, 8.67s/it][2025-04-25 23:57:39,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.07 | optimizer_step: 0.94 [2025-04-25 23:57:39,951] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2848.70 | bwd_microstep: 5756.22 | bwd_inner_microstep: 5682.73 | bwd_allreduce_microstep: 73.44 | step_microstep: 19.28 [2025-04-25 23:57:39,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.70 | bwd: 5756.23 | bwd_inner: 5682.73 | bwd_allreduce: 73.46 | step: 19.28 16%|█▌ | 6622/41250 [16:00:05<83:25:36, 8.67s/it] {'loss': 0.0296, 'grad_norm': 0.37405115365982056, 'learning_rate': 3.823943432976558e-05, 'epoch': 1.61} 16%|█▌ | 6622/41250 [16:00:05<83:25:36, 8.67s/it][2025-04-25 23:57:48,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.04 | optimizer_step: 0.96 [2025-04-25 23:57:48,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.30 | bwd_microstep: 5785.63 | bwd_inner_microstep: 5642.32 | bwd_allreduce_microstep: 143.26 | step_microstep: 18.69 [2025-04-25 23:57:48,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.30 | bwd: 5785.65 | bwd_inner: 5642.32 | bwd_allreduce: 143.28 | step: 18.69 16%|█▌ | 6623/41250 [16:00:13<83:28:23, 8.68s/it] {'loss': 0.1484, 'grad_norm': 1.347522258758545, 'learning_rate': 3.823879004278562e-05, 'epoch': 1.61} 16%|█▌ | 6623/41250 [16:00:13<83:28:23, 8.68s/it][2025-04-25 23:57:57,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-25 23:57:57,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.94 | bwd_microstep: 5781.06 | bwd_inner_microstep: 5768.13 | bwd_allreduce_microstep: 12.89 | step_microstep: 18.89 [2025-04-25 23:57:57,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.94 | bwd: 5781.08 | bwd_inner: 5768.13 | bwd_allreduce: 12.90 | step: 18.89 16%|█▌ | 6624/41250 [16:00:22<83:41:12, 8.70s/it] {'loss': 0.0666, 'grad_norm': 0.7825517058372498, 'learning_rate': 3.8238145643367106e-05, 'epoch': 1.61} 16%|█▌ | 6624/41250 [16:00:22<83:41:12, 
8.70s/it][2025-04-25 23:58:05,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.13 | optimizer_step: 1.03 [2025-04-25 23:58:05,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.43 | bwd_microstep: 5675.09 | bwd_inner_microstep: 5639.77 | bwd_allreduce_microstep: 35.26 | step_microstep: 20.00 [2025-04-25 23:58:05,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.43 | bwd: 5675.11 | bwd_inner: 5639.77 | bwd_allreduce: 35.29 | step: 20.00 16%|█▌ | 6625/41250 [16:00:31<83:21:28, 8.67s/it] {'loss': 0.114, 'grad_norm': 1.417270302772522, 'learning_rate': 3.823750113151402e-05, 'epoch': 1.61} 16%|█▌ | 6625/41250 [16:00:31<83:21:28, 8.67s/it][2025-04-25 23:58:14,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-25 23:58:14,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.79 | bwd_microstep: 5692.75 | bwd_inner_microstep: 5679.76 | bwd_allreduce_microstep: 12.94 | step_microstep: 18.79 [2025-04-25 23:58:14,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.79 | bwd: 5692.76 | bwd_inner: 5679.76 | bwd_allreduce: 12.96 | step: 18.80 16%|█▌ | 6626/41250 [16:00:39<83:13:59, 8.65s/it] {'loss': 0.113, 'grad_norm': 1.920913815498352, 'learning_rate': 3.823685650723032e-05, 'epoch': 1.61} 16%|█▌ | 6626/41250 [16:00:39<83:13:59, 8.65s/it][2025-04-25 23:58:23,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:58:23,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.14 | bwd_microstep: 5749.32 | bwd_inner_microstep: 5642.38 | bwd_allreduce_microstep: 106.89 | step_microstep: 18.88 [2025-04-25 23:58:23,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.14 | bwd: 5749.33 | bwd_inner: 5642.38 | 
bwd_allreduce: 106.91 | step: 18.88 16%|█▌ | 6627/41250 [16:00:48<83:13:45, 8.65s/it] {'loss': 0.2446, 'grad_norm': 2.614367961883545, 'learning_rate': 3.823621177051998e-05, 'epoch': 1.61} 16%|█▌ | 6627/41250 [16:00:48<83:13:45, 8.65s/it][2025-04-25 23:58:31,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:58:31,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.96 | bwd_microstep: 5709.34 | bwd_inner_microstep: 5696.63 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.73 [2025-04-25 23:58:31,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.96 | bwd: 5709.36 | bwd_inner: 5696.63 | bwd_allreduce: 12.68 | step: 18.73 16%|█▌ | 6628/41250 [16:00:57<83:11:29, 8.65s/it] {'loss': 0.2962, 'grad_norm': 3.5909764766693115, 'learning_rate': 3.8235566921386994e-05, 'epoch': 1.61} 16%|█▌ | 6628/41250 [16:00:57<83:11:29, 8.65s/it][2025-04-25 23:58:40,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-25 23:58:40,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.43 | bwd_microstep: 5745.47 | bwd_inner_microstep: 5673.27 | bwd_allreduce_microstep: 72.16 | step_microstep: 18.72 [2025-04-25 23:58:40,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.43 | bwd: 5745.49 | bwd_inner: 5673.27 | bwd_allreduce: 72.18 | step: 18.72 16%|█▌ | 6629/41250 [16:01:05<83:14:06, 8.66s/it] {'loss': 0.2906, 'grad_norm': 2.3339309692382812, 'learning_rate': 3.823492195983532e-05, 'epoch': 1.61} 16%|█▌ | 6629/41250 [16:01:05<83:14:06, 8.66s/it][2025-04-25 23:58:49,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-25 23:58:49,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.17 | 
bwd_microstep: 5754.51 | bwd_inner_microstep: 5679.58 | bwd_allreduce_microstep: 74.87 | step_microstep: 18.83 [2025-04-25 23:58:49,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.17 | bwd: 5754.52 | bwd_inner: 5679.58 | bwd_allreduce: 74.89 | step: 18.83 16%|█▌ | 6630/41250 [16:01:14<83:18:49, 8.66s/it] {'loss': 0.1354, 'grad_norm': 1.310814380645752, 'learning_rate': 3.8234276885868935e-05, 'epoch': 1.61} 16%|█▌ | 6630/41250 [16:01:14<83:18:49, 8.66s/it][2025-04-25 23:58:57,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-25 23:58:57,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.96 | bwd_microstep: 5687.07 | bwd_inner_microstep: 5639.20 | bwd_allreduce_microstep: 47.82 | step_microstep: 19.00 [2025-04-25 23:58:57,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.96 | bwd: 5687.08 | bwd_inner: 5639.20 | bwd_allreduce: 47.84 | step: 19.00 16%|█▌ | 6631/41250 [16:01:23<83:07:12, 8.64s/it] {'loss': 0.2843, 'grad_norm': 2.245800733566284, 'learning_rate': 3.823363169949182e-05, 'epoch': 1.61} 16%|█▌ | 6631/41250 [16:01:23<83:07:12, 8.64s/it][2025-04-25 23:59:06,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.09 | optimizer_step: 1.23 [2025-04-25 23:59:06,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.30 | bwd_microstep: 5766.48 | bwd_inner_microstep: 5674.93 | bwd_allreduce_microstep: 91.50 | step_microstep: 20.04 [2025-04-25 23:59:06,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.30 | bwd: 5766.50 | bwd_inner: 5674.93 | bwd_allreduce: 91.53 | step: 20.04 16%|█▌ | 6632/41250 [16:01:31<83:16:01, 8.66s/it] {'loss': 0.4524, 'grad_norm': 4.557259559631348, 'learning_rate': 3.8232986400707956e-05, 'epoch': 1.61} 16%|█▌ | 6632/41250 [16:01:31<83:16:01, 8.66s/it][2025-04-25 23:59:15,217] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-25 23:59:15,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.54 | bwd_microstep: 5763.04 | bwd_inner_microstep: 5648.48 | bwd_allreduce_microstep: 114.51 | step_microstep: 19.17 [2025-04-25 23:59:15,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.54 | bwd: 5763.06 | bwd_inner: 5648.48 | bwd_allreduce: 114.53 | step: 19.17 16%|█▌ | 6633/41250 [16:01:40<83:18:30, 8.66s/it] {'loss': 0.2114, 'grad_norm': 3.5634493827819824, 'learning_rate': 3.823234098952132e-05, 'epoch': 1.61} 16%|█▌ | 6633/41250 [16:01:40<83:18:30, 8.66s/it][2025-04-25 23:59:23,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-25 23:59:23,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.95 | bwd_microstep: 5745.46 | bwd_inner_microstep: 5688.92 | bwd_allreduce_microstep: 56.50 | step_microstep: 18.97 [2025-04-25 23:59:23,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.95 | bwd: 5745.48 | bwd_inner: 5688.92 | bwd_allreduce: 56.51 | step: 18.98 16%|█▌ | 6634/41250 [16:01:49<83:20:39, 8.67s/it] {'loss': 0.0479, 'grad_norm': 1.5787333250045776, 'learning_rate': 3.823169546593588e-05, 'epoch': 1.61} 16%|█▌ | 6634/41250 [16:01:49<83:20:39, 8.67s/it][2025-04-25 23:59:32,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-25 23:59:32,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.91 | bwd_microstep: 5729.81 | bwd_inner_microstep: 5680.01 | bwd_allreduce_microstep: 49.76 | step_microstep: 19.02 [2025-04-25 23:59:32,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.91 | bwd: 5729.83 | bwd_inner: 5680.01 | bwd_allreduce: 49.77 | step: 19.03 
16%|█▌ | 6635/41250 [16:01:57<83:18:35, 8.66s/it] {'loss': 0.0936, 'grad_norm': 1.040139079093933, 'learning_rate': 3.8231049829955626e-05, 'epoch': 1.61} 16%|█▌ | 6635/41250 [16:01:57<83:18:35, 8.66s/it][2025-04-25 23:59:41,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-25 23:59:41,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.08 | bwd_microstep: 5735.09 | bwd_inner_microstep: 5708.73 | bwd_allreduce_microstep: 26.32 | step_microstep: 18.92 [2025-04-25 23:59:41,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.08 | bwd: 5735.11 | bwd_inner: 5708.73 | bwd_allreduce: 26.34 | step: 18.92 16%|█▌ | 6636/41250 [16:02:06<83:18:57, 8.67s/it] {'loss': 0.0611, 'grad_norm': 1.1488533020019531, 'learning_rate': 3.8230404081584536e-05, 'epoch': 1.61} 16%|█▌ | 6636/41250 [16:02:06<83:18:57, 8.67s/it][2025-04-25 23:59:49,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.23 | optimizer_step: 0.93 [2025-04-25 23:59:49,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.63 | bwd_microstep: 5746.17 | bwd_inner_microstep: 5647.26 | bwd_allreduce_microstep: 98.87 | step_microstep: 19.17 [2025-04-25 23:59:49,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.63 | bwd: 5746.19 | bwd_inner: 5647.26 | bwd_allreduce: 98.88 | step: 19.18 16%|█▌ | 6637/41250 [16:02:15<83:16:34, 8.66s/it] {'loss': 0.0626, 'grad_norm': 1.0964891910552979, 'learning_rate': 3.8229758220826595e-05, 'epoch': 1.61} 16%|█▌ | 6637/41250 [16:02:15<83:16:34, 8.66s/it][2025-04-25 23:59:58,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-25 23:59:58,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.09 | bwd_microstep: 5711.94 | bwd_inner_microstep: 
5651.80 | bwd_allreduce_microstep: 60.09 | step_microstep: 19.16 [2025-04-25 23:59:58,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.10 | bwd: 5711.96 | bwd_inner: 5651.80 | bwd_allreduce: 60.11 | step: 19.17 16%|█▌ | 6638/41250 [16:02:23<83:09:26, 8.65s/it] {'loss': 0.0906, 'grad_norm': 1.7225006818771362, 'learning_rate': 3.822911224768577e-05, 'epoch': 1.61} 16%|█▌ | 6638/41250 [16:02:23<83:09:26, 8.65s/it][2025-04-26 00:00:07,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:00:07,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.87 | bwd_microstep: 5698.33 | bwd_inner_microstep: 5642.84 | bwd_allreduce_microstep: 55.44 | step_microstep: 18.78 [2025-04-26 00:00:07,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.87 | bwd: 5698.35 | bwd_inner: 5642.84 | bwd_allreduce: 55.46 | step: 18.78 16%|█▌ | 6639/41250 [16:02:32<83:01:34, 8.64s/it] {'loss': 0.1057, 'grad_norm': 3.07773494720459, 'learning_rate': 3.8228466162166066e-05, 'epoch': 1.61} 16%|█▌ | 6639/41250 [16:02:32<83:01:34, 8.64s/it][2025-04-26 00:00:15,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.05 | optimizer_step: 1.05 [2025-04-26 00:00:15,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.70 | bwd_microstep: 5775.32 | bwd_inner_microstep: 5651.61 | bwd_allreduce_microstep: 123.67 | step_microstep: 19.52 [2025-04-26 00:00:15,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.70 | bwd: 5775.34 | bwd_inner: 5651.61 | bwd_allreduce: 123.69 | step: 19.52 16%|█▌ | 6640/41250 [16:02:41<83:09:36, 8.65s/it] {'loss': 0.0728, 'grad_norm': 1.3548004627227783, 'learning_rate': 3.822781996427144e-05, 'epoch': 1.61} 16%|█▌ | 6640/41250 [16:02:41<83:09:36, 8.65s/it][2025-04-26 00:00:24,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:00:24,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.63 | bwd_microstep: 5705.02 | bwd_inner_microstep: 5652.65 | bwd_allreduce_microstep: 52.32 | step_microstep: 18.57 [2025-04-26 00:00:24,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.63 | bwd: 5705.03 | bwd_inner: 5652.65 | bwd_allreduce: 52.34 | step: 18.57 16%|█▌ | 6641/41250 [16:02:49<83:03:43, 8.64s/it] {'loss': 0.1057, 'grad_norm': 1.8960678577423096, 'learning_rate': 3.82271736540059e-05, 'epoch': 1.61} 16%|█▌ | 6641/41250 [16:02:49<83:03:43, 8.64s/it][2025-04-26 00:00:33,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-26 00:00:33,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.69 | bwd_microstep: 6058.05 | bwd_inner_microstep: 5650.10 | bwd_allreduce_microstep: 407.90 | step_microstep: 19.05 [2025-04-26 00:00:33,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.69 | bwd: 6058.06 | bwd_inner: 5650.10 | bwd_allreduce: 407.92 | step: 19.05 16%|█▌ | 6642/41250 [16:02:58<83:59:58, 8.74s/it] {'loss': 0.2397, 'grad_norm': 5.403155326843262, 'learning_rate': 3.822652723137341e-05, 'epoch': 1.61} 16%|█▌ | 6642/41250 [16:02:58<83:59:58, 8.74s/it][2025-04-26 00:00:41,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.03 | optimizer_step: 0.96 [2025-04-26 00:00:41,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.40 | bwd_microstep: 5701.73 | bwd_inner_microstep: 5688.88 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.90 [2025-04-26 00:00:41,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.40 | bwd: 5701.74 | bwd_inner: 5688.88 | bwd_allreduce: 12.82 | step: 18.90 16%|█▌ | 6643/41250 [16:03:07<83:40:39, 8.70s/it] 
{'loss': 0.2323, 'grad_norm': 2.1131227016448975, 'learning_rate': 3.8225880696377975e-05, 'epoch': 1.61} 16%|█▌ | 6643/41250 [16:03:07<83:40:39, 8.70s/it][2025-04-26 00:00:50,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-26 00:00:50,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.88 | bwd_microstep: 5715.59 | bwd_inner_microstep: 5702.82 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.96 [2025-04-26 00:00:50,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.88 | bwd: 5715.60 | bwd_inner: 5702.82 | bwd_allreduce: 12.74 | step: 18.97 16%|█▌ | 6644/41250 [16:03:15<83:30:53, 8.69s/it] {'loss': 0.1371, 'grad_norm': 1.9538136720657349, 'learning_rate': 3.8225234049023564e-05, 'epoch': 1.61} 16%|█▌ | 6644/41250 [16:03:15<83:30:53, 8.69s/it][2025-04-26 00:00:59,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:00:59,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.53 | bwd_microstep: 5765.42 | bwd_inner_microstep: 5696.30 | bwd_allreduce_microstep: 69.07 | step_microstep: 18.95 [2025-04-26 00:00:59,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.53 | bwd: 5765.43 | bwd_inner: 5696.30 | bwd_allreduce: 69.09 | step: 18.95 16%|█▌ | 6645/41250 [16:03:24<83:31:46, 8.69s/it] {'loss': 0.0621, 'grad_norm': 2.696687936782837, 'learning_rate': 3.822458728931417e-05, 'epoch': 1.61} 16%|█▌ | 6645/41250 [16:03:24<83:31:46, 8.69s/it][2025-04-26 00:01:08,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 1.09 [2025-04-26 00:01:08,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.92 | bwd_microstep: 5779.10 | bwd_inner_microstep: 5657.86 | bwd_allreduce_microstep: 121.19 | 
step_microstep: 19.30 [2025-04-26 00:01:08,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.92 | bwd: 5779.11 | bwd_inner: 5657.86 | bwd_allreduce: 121.21 | step: 19.31 16%|█▌ | 6646/41250 [16:03:33<83:32:24, 8.69s/it] {'loss': 0.1482, 'grad_norm': 2.1756088733673096, 'learning_rate': 3.8223940417253784e-05, 'epoch': 1.61} 16%|█▌ | 6646/41250 [16:03:33<83:32:24, 8.69s/it][2025-04-26 00:01:16,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:01:16,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.48 | bwd_microstep: 5763.73 | bwd_inner_microstep: 5711.65 | bwd_allreduce_microstep: 52.03 | step_microstep: 18.88 [2025-04-26 00:01:16,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.48 | bwd: 5763.74 | bwd_inner: 5711.65 | bwd_allreduce: 52.05 | step: 18.89 16%|█▌ | 6647/41250 [16:03:42<83:33:13, 8.69s/it] {'loss': 0.047, 'grad_norm': 0.9030144810676575, 'learning_rate': 3.8223293432846384e-05, 'epoch': 1.61} 16%|█▌ | 6647/41250 [16:03:42<83:33:13, 8.69s/it][2025-04-26 00:01:25,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:01:25,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.37 | bwd_microstep: 5871.55 | bwd_inner_microstep: 5653.17 | bwd_allreduce_microstep: 218.33 | step_microstep: 18.43 [2025-04-26 00:01:25,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.37 | bwd: 5871.57 | bwd_inner: 5653.17 | bwd_allreduce: 218.35 | step: 18.43 16%|█▌ | 6648/41250 [16:03:50<83:48:47, 8.72s/it] {'loss': 0.1084, 'grad_norm': 1.4922209978103638, 'learning_rate': 3.8222646336095966e-05, 'epoch': 1.61} 16%|█▌ | 6648/41250 [16:03:50<83:48:47, 8.72s/it][2025-04-26 00:01:34,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | 
optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 00:01:34,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2890.44 | bwd_microstep: 5796.01 | bwd_inner_microstep: 5783.32 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.57 [2025-04-26 00:01:34,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2890.44 | bwd: 5796.02 | bwd_inner: 5783.32 | bwd_allreduce: 12.66 | step: 18.58 16%|█▌ | 6649/41250 [16:03:59<83:57:52, 8.74s/it] {'loss': 0.211, 'grad_norm': 2.120043992996216, 'learning_rate': 3.8221999127006524e-05, 'epoch': 1.61} 16%|█▌ | 6649/41250 [16:03:59<83:57:52, 8.74s/it][2025-04-26 00:01:42,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 00:01:42,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.43 | bwd_microstep: 5714.51 | bwd_inner_microstep: 5701.80 | bwd_allreduce_microstep: 12.67 | step_microstep: 19.06 [2025-04-26 00:01:42,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.43 | bwd: 5714.52 | bwd_inner: 5701.80 | bwd_allreduce: 12.69 | step: 19.06 16%|█▌ | 6650/41250 [16:04:08<83:43:38, 8.71s/it] {'loss': 0.183, 'grad_norm': 5.59145450592041, 'learning_rate': 3.8221351805582034e-05, 'epoch': 1.61} 16%|█▌ | 6650/41250 [16:04:08<83:43:38, 8.71s/it][2025-04-26 00:01:51,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 00:01:51,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.04 | bwd_microstep: 5717.78 | bwd_inner_microstep: 5657.55 | bwd_allreduce_microstep: 60.18 | step_microstep: 18.43 [2025-04-26 00:01:51,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.04 | bwd: 5717.79 | bwd_inner: 5657.55 | bwd_allreduce: 60.20 | step: 18.43 16%|█▌ | 6651/41250 [16:04:16<83:29:39, 8.69s/it] {'loss': 0.2461, 'grad_norm': 
3.6803245544433594, 'learning_rate': 3.8220704371826494e-05, 'epoch': 1.61} 16%|█▌ | 6651/41250 [16:04:16<83:29:39, 8.69s/it][2025-04-26 00:02:00,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.07 | optimizer_step: 1.02 [2025-04-26 00:02:00,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.32 | bwd_microstep: 5712.91 | bwd_inner_microstep: 5666.00 | bwd_allreduce_microstep: 46.86 | step_microstep: 19.54 [2025-04-26 00:02:00,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.32 | bwd: 5712.93 | bwd_inner: 5666.00 | bwd_allreduce: 46.88 | step: 19.54 16%|█▌ | 6652/41250 [16:04:25<83:19:56, 8.67s/it] {'loss': 0.0812, 'grad_norm': 1.0214706659317017, 'learning_rate': 3.82200568257439e-05, 'epoch': 1.61} 16%|█▌ | 6652/41250 [16:04:25<83:19:56, 8.67s/it][2025-04-26 00:02:08,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 00:02:08,961] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.84 | bwd_microstep: 5788.10 | bwd_inner_microstep: 5775.36 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.76 [2025-04-26 00:02:08,961] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.84 | bwd: 5788.12 | bwd_inner: 5775.36 | bwd_allreduce: 12.72 | step: 18.76 16%|█▌ | 6653/41250 [16:04:34<83:35:37, 8.70s/it] {'loss': 0.0549, 'grad_norm': 1.0663636922836304, 'learning_rate': 3.821940916733824e-05, 'epoch': 1.61} 16%|█▌ | 6653/41250 [16:04:34<83:35:37, 8.70s/it][2025-04-26 00:02:17,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 00:02:17,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.31 | bwd_microstep: 5722.62 | bwd_inner_microstep: 5709.50 | bwd_allreduce_microstep: 13.08 | step_microstep: 19.04 [2025-04-26 
00:02:17,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.31 | bwd: 5722.63 | bwd_inner: 5709.50 | bwd_allreduce: 13.09 | step: 19.04 16%|█▌ | 6654/41250 [16:04:42<83:31:03, 8.69s/it] {'loss': 0.0668, 'grad_norm': 0.9680225253105164, 'learning_rate': 3.82187613966135e-05, 'epoch': 1.61} 16%|█▌ | 6654/41250 [16:04:42<83:31:03, 8.69s/it][2025-04-26 00:02:26,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:02:26,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.58 | bwd_microstep: 5780.79 | bwd_inner_microstep: 5693.44 | bwd_allreduce_microstep: 87.30 | step_microstep: 18.43 [2025-04-26 00:02:26,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.58 | bwd: 5780.80 | bwd_inner: 5693.44 | bwd_allreduce: 87.31 | step: 18.43 16%|█▌ | 6655/41250 [16:04:51<83:35:54, 8.70s/it] {'loss': 0.1359, 'grad_norm': 3.1550045013427734, 'learning_rate': 3.821811351357368e-05, 'epoch': 1.61} 16%|█▌ | 6655/41250 [16:04:51<83:35:54, 8.70s/it][2025-04-26 00:02:34,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.28 | optimizer_step: 1.05 [2025-04-26 00:02:34,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.64 | bwd_microstep: 5717.14 | bwd_inner_microstep: 5649.09 | bwd_allreduce_microstep: 67.99 | step_microstep: 20.08 [2025-04-26 00:02:34,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.64 | bwd: 5717.15 | bwd_inner: 5649.09 | bwd_allreduce: 68.02 | step: 20.08 16%|█▌ | 6656/41250 [16:05:00<83:24:40, 8.68s/it] {'loss': 0.2111, 'grad_norm': 1.4830327033996582, 'learning_rate': 3.821746551822278e-05, 'epoch': 1.61} 16%|█▌ | 6656/41250 [16:05:00<83:24:40, 8.68s/it][2025-04-26 00:02:43,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.02 | optimizer_step: 0.91 
[2025-04-26 00:02:43,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.49 | bwd_microstep: 5711.83 | bwd_inner_microstep: 5690.96 | bwd_allreduce_microstep: 20.82 | step_microstep: 19.06 [2025-04-26 00:02:43,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.49 | bwd: 5711.84 | bwd_inner: 5690.96 | bwd_allreduce: 20.84 | step: 19.07 16%|█▌ | 6657/41250 [16:05:08<83:17:29, 8.67s/it] {'loss': 0.0254, 'grad_norm': 0.436768114566803, 'learning_rate': 3.821681741056479e-05, 'epoch': 1.61} 16%|█▌ | 6657/41250 [16:05:08<83:17:29, 8.67s/it][2025-04-26 00:02:52,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 00:02:52,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.05 | bwd_microstep: 5690.08 | bwd_inner_microstep: 5663.23 | bwd_allreduce_microstep: 26.78 | step_microstep: 19.08 [2025-04-26 00:02:52,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.05 | bwd: 5690.10 | bwd_inner: 5663.23 | bwd_allreduce: 26.81 | step: 19.09 16%|█▌ | 6658/41250 [16:05:17<83:06:12, 8.65s/it] {'loss': 0.3518, 'grad_norm': 2.0826621055603027, 'learning_rate': 3.8216169190603694e-05, 'epoch': 1.61} 16%|█▌ | 6658/41250 [16:05:17<83:06:12, 8.65s/it][2025-04-26 00:03:00,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:03:00,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.34 | bwd_microstep: 5693.79 | bwd_inner_microstep: 5681.01 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.60 [2025-04-26 00:03:00,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.35 | bwd: 5693.80 | bwd_inner: 5681.01 | bwd_allreduce: 12.75 | step: 18.61 16%|█▌ | 6659/41250 [16:05:26<83:01:07, 8.64s/it] {'loss': 0.0643, 'grad_norm': 1.272721529006958, 'learning_rate': 
3.82155208583435e-05, 'epoch': 1.61} 16%|█▌ | 6659/41250 [16:05:26<83:01:07, 8.64s/it][2025-04-26 00:03:09,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-26 00:03:09,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.40 | bwd_microstep: 5763.97 | bwd_inner_microstep: 5685.93 | bwd_allreduce_microstep: 77.99 | step_microstep: 18.72 [2025-04-26 00:03:09,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.40 | bwd: 5763.99 | bwd_inner: 5685.93 | bwd_allreduce: 78.01 | step: 18.72 16%|█▌ | 6660/41250 [16:05:34<83:09:38, 8.66s/it] {'loss': 0.1464, 'grad_norm': 2.5112664699554443, 'learning_rate': 3.821487241378821e-05, 'epoch': 1.61} 16%|█▌ | 6660/41250 [16:05:34<83:09:38, 8.66s/it][2025-04-26 00:03:18,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-26 00:03:18,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.02 | bwd_microstep: 5712.93 | bwd_inner_microstep: 5653.27 | bwd_allreduce_microstep: 59.58 | step_microstep: 18.70 [2025-04-26 00:03:18,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.02 | bwd: 5712.95 | bwd_inner: 5653.27 | bwd_allreduce: 59.63 | step: 18.71 16%|█▌ | 6661/41250 [16:05:43<83:04:14, 8.65s/it] {'loss': 0.2279, 'grad_norm': 2.236743688583374, 'learning_rate': 3.82142238569418e-05, 'epoch': 1.61} 16%|█▌ | 6661/41250 [16:05:43<83:04:14, 8.65s/it][2025-04-26 00:03:26,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.29 | optimizer_step: 1.06 [2025-04-26 00:03:26,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.86 | bwd_microstep: 5765.44 | bwd_inner_microstep: 5684.43 | bwd_allreduce_microstep: 80.95 | step_microstep: 20.08 [2025-04-26 00:03:26,857] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.86 | bwd: 5765.46 | bwd_inner: 5684.43 | bwd_allreduce: 80.98 | step: 20.08 16%|█▌ | 6662/41250 [16:05:52<83:12:10, 8.66s/it] {'loss': 0.1332, 'grad_norm': 1.0822944641113281, 'learning_rate': 3.821357518780829e-05, 'epoch': 1.62} 16%|█▌ | 6662/41250 [16:05:52<83:12:10, 8.66s/it][2025-04-26 00:03:35,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-26 00:03:35,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.20 | bwd_microstep: 5710.67 | bwd_inner_microstep: 5697.91 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.89 [2025-04-26 00:03:35,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.20 | bwd: 5710.68 | bwd_inner: 5697.91 | bwd_allreduce: 12.73 | step: 18.89 16%|█▌ | 6663/41250 [16:06:00<83:09:06, 8.65s/it] {'loss': 0.0828, 'grad_norm': 2.2827467918395996, 'learning_rate': 3.8212926406391676e-05, 'epoch': 1.62} 16%|█▌ | 6663/41250 [16:06:00<83:09:06, 8.65s/it][2025-04-26 00:03:44,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:03:44,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.32 | bwd_microstep: 5694.18 | bwd_inner_microstep: 5681.52 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.71 [2025-04-26 00:03:44,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.32 | bwd: 5694.20 | bwd_inner: 5681.52 | bwd_allreduce: 12.64 | step: 18.71 16%|█▌ | 6664/41250 [16:06:09<83:02:06, 8.64s/it] {'loss': 0.1088, 'grad_norm': 1.8587123155593872, 'learning_rate': 3.8212277512695955e-05, 'epoch': 1.62} 16%|█▌ | 6664/41250 [16:06:09<83:02:06, 8.64s/it][2025-04-26 00:03:52,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-26 
00:03:52,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.11 | bwd_microstep: 5748.24 | bwd_inner_microstep: 5697.71 | bwd_allreduce_microstep: 50.49 | step_microstep: 18.48 [2025-04-26 00:03:52,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.11 | bwd: 5748.26 | bwd_inner: 5697.71 | bwd_allreduce: 50.50 | step: 18.48 16%|█▌ | 6665/41250 [16:06:18<83:08:39, 8.65s/it] {'loss': 0.0974, 'grad_norm': 1.7141846418380737, 'learning_rate': 3.8211628506725116e-05, 'epoch': 1.62} 16%|█▌ | 6665/41250 [16:06:18<83:08:39, 8.65s/it][2025-04-26 00:04:01,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:04:01,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.00 | bwd_microstep: 5863.10 | bwd_inner_microstep: 5692.31 | bwd_allreduce_microstep: 170.75 | step_microstep: 18.83 [2025-04-26 00:04:01,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.00 | bwd: 5863.12 | bwd_inner: 5692.31 | bwd_allreduce: 170.76 | step: 18.83 16%|█▌ | 6666/41250 [16:06:26<83:32:24, 8.70s/it] {'loss': 0.0784, 'grad_norm': 0.9671879410743713, 'learning_rate': 3.821097938848318e-05, 'epoch': 1.62} 16%|█▌ | 6666/41250 [16:06:26<83:32:24, 8.70s/it][2025-04-26 00:04:10,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 00:04:10,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.70 | bwd_microstep: 5690.50 | bwd_inner_microstep: 5644.62 | bwd_allreduce_microstep: 45.83 | step_microstep: 19.35 [2025-04-26 00:04:10,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.70 | bwd: 5690.51 | bwd_inner: 5644.62 | bwd_allreduce: 45.85 | step: 19.35 16%|█▌ | 6667/41250 [16:06:35<83:16:33, 8.67s/it] {'loss': 0.2221, 'grad_norm': 3.1682846546173096, 'learning_rate': 3.821033015797412e-05, 
'epoch': 1.62} 16%|█▌ | 6667/41250 [16:06:35<83:16:33, 8.67s/it][2025-04-26 00:04:18,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 00:04:18,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.00 | bwd_microstep: 5762.82 | bwd_inner_microstep: 5749.99 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.97 [2025-04-26 00:04:18,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.00 | bwd: 5762.83 | bwd_inner: 5749.99 | bwd_allreduce: 12.80 | step: 18.97 16%|█▌ | 6668/41250 [16:06:44<83:25:36, 8.68s/it] {'loss': 0.0717, 'grad_norm': 0.8185484409332275, 'learning_rate': 3.820968081520197e-05, 'epoch': 1.62} 16%|█▌ | 6668/41250 [16:06:44<83:25:36, 8.68s/it][2025-04-26 00:04:27,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 00:04:27,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.85 | bwd_microstep: 5699.42 | bwd_inner_microstep: 5686.39 | bwd_allreduce_microstep: 12.98 | step_microstep: 19.28 [2025-04-26 00:04:27,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.85 | bwd: 5699.43 | bwd_inner: 5686.39 | bwd_allreduce: 13.00 | step: 19.28 16%|█▌ | 6669/41250 [16:06:52<83:16:16, 8.67s/it] {'loss': 0.039, 'grad_norm': 0.6870862245559692, 'learning_rate': 3.820903136017072e-05, 'epoch': 1.62} 16%|█▌ | 6669/41250 [16:06:52<83:16:16, 8.67s/it][2025-04-26 00:04:36,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-26 00:04:36,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.16 | bwd_microstep: 5726.35 | bwd_inner_microstep: 5699.33 | bwd_allreduce_microstep: 26.97 | step_microstep: 19.49 [2025-04-26 00:04:36,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2847.16 | bwd: 5726.37 | bwd_inner: 5699.33 | bwd_allreduce: 26.99 | step: 19.49 16%|█▌ | 6670/41250 [16:07:01<83:14:27, 8.67s/it] {'loss': 0.1448, 'grad_norm': 2.4065937995910645, 'learning_rate': 3.8208381792884374e-05, 'epoch': 1.62} 16%|█▌ | 6670/41250 [16:07:01<83:14:27, 8.67s/it][2025-04-26 00:04:44,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 00:04:44,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.14 | bwd_microstep: 5718.96 | bwd_inner_microstep: 5698.69 | bwd_allreduce_microstep: 20.23 | step_microstep: 18.57 [2025-04-26 00:04:44,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.14 | bwd: 5718.97 | bwd_inner: 5698.68 | bwd_allreduce: 20.25 | step: 18.57 16%|█▌ | 6671/41250 [16:07:10<83:12:41, 8.66s/it] {'loss': 0.1321, 'grad_norm': 2.709787368774414, 'learning_rate': 3.8207732113346924e-05, 'epoch': 1.62} 16%|█▌ | 6671/41250 [16:07:10<83:12:41, 8.66s/it][2025-04-26 00:04:53,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:04:53,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.36 | bwd_microstep: 5785.90 | bwd_inner_microstep: 5773.12 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.90 [2025-04-26 00:04:53,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.36 | bwd: 5785.92 | bwd_inner: 5773.12 | bwd_allreduce: 12.75 | step: 18.91 16%|█▌ | 6672/41250 [16:07:18<83:29:18, 8.69s/it] {'loss': 0.1476, 'grad_norm': 2.473151683807373, 'learning_rate': 3.8207082321562404e-05, 'epoch': 1.62} 16%|█▌ | 6672/41250 [16:07:18<83:29:18, 8.69s/it][2025-04-26 00:05:02,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:05:02,225] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2822.40 | bwd_microstep: 5694.74 | bwd_inner_microstep: 5642.30 | bwd_allreduce_microstep: 52.39 | step_microstep: 18.83 [2025-04-26 00:05:02,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.40 | bwd: 5694.75 | bwd_inner: 5642.30 | bwd_allreduce: 52.41 | step: 18.83 16%|█▌ | 6673/41250 [16:07:27<83:13:05, 8.66s/it] {'loss': 0.2794, 'grad_norm': 2.278064250946045, 'learning_rate': 3.820643241753478e-05, 'epoch': 1.62} 16%|█▌ | 6673/41250 [16:07:27<83:13:05, 8.66s/it][2025-04-26 00:05:10,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-26 00:05:10,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.34 | bwd_microstep: 5723.47 | bwd_inner_microstep: 5679.05 | bwd_allreduce_microstep: 44.36 | step_microstep: 18.47 [2025-04-26 00:05:10,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.34 | bwd: 5723.48 | bwd_inner: 5679.05 | bwd_allreduce: 44.38 | step: 18.47 16%|█▌ | 6674/41250 [16:07:36<83:11:48, 8.66s/it] {'loss': 0.3527, 'grad_norm': 2.875328302383423, 'learning_rate': 3.82057824012681e-05, 'epoch': 1.62} 16%|█▌ | 6674/41250 [16:07:36<83:11:48, 8.66s/it][2025-04-26 00:05:19,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-26 00:05:19,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.69 | bwd_microstep: 5749.48 | bwd_inner_microstep: 5690.32 | bwd_allreduce_microstep: 59.11 | step_microstep: 18.86 [2025-04-26 00:05:19,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.69 | bwd: 5749.49 | bwd_inner: 5690.32 | bwd_allreduce: 59.13 | step: 18.86 16%|█▌ | 6675/41250 [16:07:44<83:14:06, 8.67s/it] {'loss': 0.2012, 'grad_norm': 2.0642318725585938, 'learning_rate': 3.8205132272766346e-05, 'epoch': 1.62} 16%|█▌ | 6675/41250 
[16:07:44<83:14:06, 8.67s/it][2025-04-26 00:05:28,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.29 | optimizer_step: 1.06 [2025-04-26 00:05:28,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.73 | bwd_microstep: 5748.60 | bwd_inner_microstep: 5690.85 | bwd_allreduce_microstep: 57.69 | step_microstep: 19.93 [2025-04-26 00:05:28,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.73 | bwd: 5748.62 | bwd_inner: 5690.85 | bwd_allreduce: 57.71 | step: 19.93 16%|█▌ | 6676/41250 [16:07:53<83:17:30, 8.67s/it] {'loss': 0.0301, 'grad_norm': 1.0690184831619263, 'learning_rate': 3.820448203203354e-05, 'epoch': 1.62} 16%|█▌ | 6676/41250 [16:07:53<83:17:30, 8.67s/it][2025-04-26 00:05:36,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-26 00:05:36,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.57 | bwd_microstep: 5753.46 | bwd_inner_microstep: 5645.63 | bwd_allreduce_microstep: 107.78 | step_microstep: 18.71 [2025-04-26 00:05:36,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.57 | bwd: 5753.47 | bwd_inner: 5645.63 | bwd_allreduce: 107.80 | step: 18.71 16%|█▌ | 6677/41250 [16:08:02<83:14:41, 8.67s/it] {'loss': 0.1418, 'grad_norm': 1.0023994445800781, 'learning_rate': 3.820383167907367e-05, 'epoch': 1.62} 16%|█▌ | 6677/41250 [16:08:02<83:14:41, 8.67s/it][2025-04-26 00:05:45,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 1.02 [2025-04-26 00:05:45,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.14 | bwd_microstep: 5754.81 | bwd_inner_microstep: 5651.80 | bwd_allreduce_microstep: 102.96 | step_microstep: 19.21 [2025-04-26 00:05:45,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.14 | bwd: 5754.82 
| bwd_inner: 5651.79 | bwd_allreduce: 102.98 | step: 19.21 16%|█▌ | 6678/41250 [16:08:10<83:13:48, 8.67s/it] {'loss': 0.2184, 'grad_norm': 1.9411851167678833, 'learning_rate': 3.820318121389076e-05, 'epoch': 1.62} 16%|█▌ | 6678/41250 [16:08:10<83:13:48, 8.67s/it][2025-04-26 00:05:54,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-26 00:05:54,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.23 | bwd_microstep: 5695.51 | bwd_inner_microstep: 5650.23 | bwd_allreduce_microstep: 45.23 | step_microstep: 19.24 [2025-04-26 00:05:54,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.22 | bwd: 5695.52 | bwd_inner: 5650.23 | bwd_allreduce: 45.24 | step: 19.24 16%|█▌ | 6679/41250 [16:08:19<83:03:23, 8.65s/it] {'loss': 0.1, 'grad_norm': 2.002990484237671, 'learning_rate': 3.820253063648883e-05, 'epoch': 1.62} 16%|█▌ | 6679/41250 [16:08:19<83:03:23, 8.65s/it][2025-04-26 00:06:02,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:06:02,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.54 | bwd_microstep: 5691.09 | bwd_inner_microstep: 5651.88 | bwd_allreduce_microstep: 39.16 | step_microstep: 18.24 [2025-04-26 00:06:02,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.54 | bwd: 5691.11 | bwd_inner: 5651.88 | bwd_allreduce: 39.18 | step: 18.24 16%|█▌ | 6680/41250 [16:08:28<82:54:45, 8.63s/it] {'loss': 0.0791, 'grad_norm': 2.7767210006713867, 'learning_rate': 3.820187994687187e-05, 'epoch': 1.62} 16%|█▌ | 6680/41250 [16:08:28<82:54:45, 8.63s/it][2025-04-26 00:06:11,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:06:11,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2824.02 | bwd_microstep: 5749.94 | bwd_inner_microstep: 5653.19 | bwd_allreduce_microstep: 96.70 | step_microstep: 18.66 [2025-04-26 00:06:11,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.02 | bwd: 5749.95 | bwd_inner: 5653.19 | bwd_allreduce: 96.72 | step: 18.66 16%|█▌ | 6681/41250 [16:08:36<82:58:12, 8.64s/it] {'loss': 0.1806, 'grad_norm': 1.783096194267273, 'learning_rate': 3.8201229145043905e-05, 'epoch': 1.62} 16%|█▌ | 6681/41250 [16:08:36<82:58:12, 8.64s/it][2025-04-26 00:06:20,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 00:06:20,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.08 | bwd_microstep: 5697.04 | bwd_inner_microstep: 5640.82 | bwd_allreduce_microstep: 56.18 | step_microstep: 18.87 [2025-04-26 00:06:20,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.08 | bwd: 5697.06 | bwd_inner: 5640.82 | bwd_allreduce: 56.20 | step: 18.87 16%|█▌ | 6682/41250 [16:08:45<82:52:27, 8.63s/it] {'loss': 0.0895, 'grad_norm': 1.1127570867538452, 'learning_rate': 3.820057823100894e-05, 'epoch': 1.62} 16%|█▌ | 6682/41250 [16:08:45<82:52:27, 8.63s/it][2025-04-26 00:06:28,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:06:28,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.79 | bwd_microstep: 5706.97 | bwd_inner_microstep: 5693.69 | bwd_allreduce_microstep: 13.23 | step_microstep: 18.40 [2025-04-26 00:06:28,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.79 | bwd: 5706.99 | bwd_inner: 5693.69 | bwd_allreduce: 13.25 | step: 18.40 16%|█▌ | 6683/41250 [16:08:54<82:54:30, 8.63s/it] {'loss': 0.2184, 'grad_norm': 3.4070687294006348, 'learning_rate': 3.819992720477098e-05, 'epoch': 1.62} 16%|█▌ | 6683/41250 [16:08:54<82:54:30, 8.63s/it][2025-04-26 
00:06:37,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.22 | optimizer_step: 1.02 [2025-04-26 00:06:37,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.43 | bwd_microstep: 5791.34 | bwd_inner_microstep: 5651.86 | bwd_allreduce_microstep: 139.44 | step_microstep: 19.64 [2025-04-26 00:06:37,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.43 | bwd: 5791.35 | bwd_inner: 5651.86 | bwd_allreduce: 139.46 | step: 19.64 16%|█▌ | 6684/41250 [16:09:02<83:06:40, 8.66s/it] {'loss': 0.1062, 'grad_norm': 2.2886712551116943, 'learning_rate': 3.819927606633406e-05, 'epoch': 1.62} 16%|█▌ | 6684/41250 [16:09:02<83:06:40, 8.66s/it][2025-04-26 00:06:46,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:06:46,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.65 | bwd_microstep: 5714.88 | bwd_inner_microstep: 5678.55 | bwd_allreduce_microstep: 36.28 | step_microstep: 18.50 [2025-04-26 00:06:46,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.65 | bwd: 5714.89 | bwd_inner: 5678.55 | bwd_allreduce: 36.30 | step: 18.50 16%|█▌ | 6685/41250 [16:09:11<83:03:43, 8.65s/it] {'loss': 0.1198, 'grad_norm': 2.6281850337982178, 'learning_rate': 3.8198624815702184e-05, 'epoch': 1.62} 16%|█▌ | 6685/41250 [16:09:11<83:03:43, 8.65s/it][2025-04-26 00:06:54,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:06:54,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.07 | bwd_microstep: 5775.80 | bwd_inner_microstep: 5653.70 | bwd_allreduce_microstep: 122.06 | step_microstep: 18.58 [2025-04-26 00:06:54,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.07 | bwd: 5775.82 | bwd_inner: 5653.70 | bwd_allreduce: 
122.07 | step: 18.58 16%|█▌ | 6686/41250 [16:09:20<83:09:37, 8.66s/it] {'loss': 0.1559, 'grad_norm': 1.4559088945388794, 'learning_rate': 3.819797345287936e-05, 'epoch': 1.62} 16%|█▌ | 6686/41250 [16:09:20<83:09:37, 8.66s/it][2025-04-26 00:07:03,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:07:03,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.17 | bwd_microstep: 5770.62 | bwd_inner_microstep: 5649.32 | bwd_allreduce_microstep: 121.25 | step_microstep: 18.45 [2025-04-26 00:07:03,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.17 | bwd: 5770.63 | bwd_inner: 5649.32 | bwd_allreduce: 121.27 | step: 18.47 16%|█▌ | 6687/41250 [16:09:28<83:12:54, 8.67s/it] {'loss': 0.123, 'grad_norm': 1.9347221851348877, 'learning_rate': 3.8197321977869605e-05, 'epoch': 1.62} 16%|█▌ | 6687/41250 [16:09:28<83:12:54, 8.67s/it][2025-04-26 00:07:12,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:07:12,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.18 | bwd_microstep: 5784.86 | bwd_inner_microstep: 5650.39 | bwd_allreduce_microstep: 134.42 | step_microstep: 18.66 [2025-04-26 00:07:12,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.18 | bwd: 5784.87 | bwd_inner: 5650.39 | bwd_allreduce: 134.44 | step: 18.67 16%|█▌ | 6688/41250 [16:09:37<83:17:46, 8.68s/it] {'loss': 0.0919, 'grad_norm': 1.4027963876724243, 'learning_rate': 3.819667039067695e-05, 'epoch': 1.62} 16%|█▌ | 6688/41250 [16:09:37<83:17:46, 8.68s/it][2025-04-26 00:07:20,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:07:20,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.99 | bwd_microstep: 5722.18 
| bwd_inner_microstep: 5657.84 | bwd_allreduce_microstep: 64.29 | step_microstep: 18.48 [2025-04-26 00:07:20,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.99 | bwd: 5722.19 | bwd_inner: 5657.84 | bwd_allreduce: 64.31 | step: 18.48 16%|█▌ | 6689/41250 [16:09:46<83:11:35, 8.67s/it] {'loss': 0.0608, 'grad_norm': 1.2794920206069946, 'learning_rate': 3.819601869130539e-05, 'epoch': 1.62} 16%|█▌ | 6689/41250 [16:09:46<83:11:35, 8.67s/it][2025-04-26 00:07:29,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 00:07:29,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.50 | bwd_microstep: 5706.78 | bwd_inner_microstep: 5693.88 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.81 [2025-04-26 00:07:29,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.50 | bwd: 5706.79 | bwd_inner: 5693.87 | bwd_allreduce: 12.87 | step: 18.81 16%|█▌ | 6690/41250 [16:09:54<83:08:21, 8.66s/it] {'loss': 0.113, 'grad_norm': 1.7271478176116943, 'learning_rate': 3.8195366879758964e-05, 'epoch': 1.62} 16%|█▌ | 6690/41250 [16:09:54<83:08:21, 8.66s/it][2025-04-26 00:07:38,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 00:07:38,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.82 | bwd_microstep: 5794.57 | bwd_inner_microstep: 5656.99 | bwd_allreduce_microstep: 137.53 | step_microstep: 19.26 [2025-04-26 00:07:38,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.82 | bwd: 5794.58 | bwd_inner: 5656.99 | bwd_allreduce: 137.55 | step: 19.26 16%|█▌ | 6691/41250 [16:10:03<83:17:06, 8.68s/it] {'loss': 0.1879, 'grad_norm': 2.428492546081543, 'learning_rate': 3.8194714956041677e-05, 'epoch': 1.62} 16%|█▌ | 6691/41250 [16:10:03<83:17:06, 8.68s/it][2025-04-26 00:07:46,797] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.13 [2025-04-26 00:07:46,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5766.45 | bwd_inner_microstep: 5710.14 | bwd_allreduce_microstep: 56.25 | step_microstep: 19.63 [2025-04-26 00:07:46,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5766.46 | bwd_inner: 5710.14 | bwd_allreduce: 56.27 | step: 19.63 16%|█▌ | 6692/41250 [16:10:12<83:23:01, 8.69s/it] {'loss': 0.0944, 'grad_norm': 0.9694430828094482, 'learning_rate': 3.819406292015755e-05, 'epoch': 1.62} 16%|█▌ | 6692/41250 [16:10:12<83:23:01, 8.69s/it][2025-04-26 00:07:55,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-26 00:07:55,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.91 | bwd_microstep: 5727.83 | bwd_inner_microstep: 5642.56 | bwd_allreduce_microstep: 85.22 | step_microstep: 19.10 [2025-04-26 00:07:55,437] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.91 | bwd: 5727.84 | bwd_inner: 5642.56 | bwd_allreduce: 85.24 | step: 19.11 16%|█▌ | 6693/41250 [16:10:20<83:14:04, 8.67s/it] {'loss': 0.0585, 'grad_norm': 1.0424073934555054, 'learning_rate': 3.81934107721106e-05, 'epoch': 1.62} 16%|█▌ | 6693/41250 [16:10:20<83:14:04, 8.67s/it][2025-04-26 00:08:04,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 1.01 [2025-04-26 00:08:04,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.54 | bwd_microstep: 5791.41 | bwd_inner_microstep: 5680.82 | bwd_allreduce_microstep: 110.54 | step_microstep: 19.47 [2025-04-26 00:08:04,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.54 | bwd: 5791.43 | bwd_inner: 5680.82 | bwd_allreduce: 110.56 | step: 19.47 16%|█▌ 
| 6694/41250 [16:10:29<83:22:49, 8.69s/it] {'loss': 0.2232, 'grad_norm': 1.920749306678772, 'learning_rate': 3.819275851190486e-05, 'epoch': 1.62} 16%|█▌ | 6694/41250 [16:10:29<83:22:49, 8.69s/it][2025-04-26 00:08:12,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:08:12,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.99 | bwd_microstep: 5776.77 | bwd_inner_microstep: 5764.45 | bwd_allreduce_microstep: 12.27 | step_microstep: 18.97 [2025-04-26 00:08:12,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.99 | bwd: 5776.78 | bwd_inner: 5764.45 | bwd_allreduce: 12.29 | step: 18.97 16%|█▌ | 6695/41250 [16:10:38<83:32:29, 8.70s/it] {'loss': 0.1603, 'grad_norm': 4.035044193267822, 'learning_rate': 3.8192106139544334e-05, 'epoch': 1.62} 16%|█▌ | 6695/41250 [16:10:38<83:32:29, 8.70s/it][2025-04-26 00:08:21,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:08:21,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.15 | bwd_microstep: 5884.75 | bwd_inner_microstep: 5712.36 | bwd_allreduce_microstep: 172.34 | step_microstep: 18.72 [2025-04-26 00:08:21,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.15 | bwd: 5884.77 | bwd_inner: 5712.36 | bwd_allreduce: 172.36 | step: 18.73 16%|█▌ | 6696/41250 [16:10:47<83:53:29, 8.74s/it] {'loss': 0.1031, 'grad_norm': 2.414093017578125, 'learning_rate': 3.819145365503305e-05, 'epoch': 1.62} 16%|█▌ | 6696/41250 [16:10:47<83:53:29, 8.74s/it][2025-04-26 00:08:30,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:08:30,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.67 | bwd_microstep: 5770.37 | bwd_inner_microstep: 5713.15 
| bwd_allreduce_microstep: 57.17 | step_microstep: 18.43 [2025-04-26 00:08:30,430] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.67 | bwd: 5770.38 | bwd_inner: 5713.15 | bwd_allreduce: 57.19 | step: 18.43 16%|█▌ | 6697/41250 [16:10:55<83:46:37, 8.73s/it] {'loss': 0.137, 'grad_norm': 2.451051950454712, 'learning_rate': 3.819080105837504e-05, 'epoch': 1.62} 16%|█▌ | 6697/41250 [16:10:55<83:46:37, 8.73s/it][2025-04-26 00:08:39,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:08:39,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.29 | bwd_microstep: 5779.94 | bwd_inner_microstep: 5692.52 | bwd_allreduce_microstep: 87.36 | step_microstep: 18.76 [2025-04-26 00:08:39,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.29 | bwd: 5779.95 | bwd_inner: 5692.52 | bwd_allreduce: 87.38 | step: 18.77 16%|█▌ | 6698/41250 [16:11:04<83:43:14, 8.72s/it] {'loss': 0.0619, 'grad_norm': 0.9682704210281372, 'learning_rate': 3.8190148349574316e-05, 'epoch': 1.62} 16%|█▌ | 6698/41250 [16:11:04<83:43:14, 8.72s/it][2025-04-26 00:08:47,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 00:08:47,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.05 | bwd_microstep: 5712.91 | bwd_inner_microstep: 5700.00 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.28 [2025-04-26 00:08:47,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.05 | bwd: 5712.93 | bwd_inner: 5700.00 | bwd_allreduce: 12.89 | step: 19.28 16%|█▌ | 6699/41250 [16:11:13<83:29:32, 8.70s/it] {'loss': 0.3197, 'grad_norm': 3.4583005905151367, 'learning_rate': 3.8189495528634906e-05, 'epoch': 1.62} 16%|█▌ | 6699/41250 [16:11:13<83:29:32, 8.70s/it][2025-04-26 00:08:56,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 00:08:56,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.98 | bwd_microstep: 5707.80 | bwd_inner_microstep: 5695.13 | bwd_allreduce_microstep: 12.62 | step_microstep: 19.09 [2025-04-26 00:08:56,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.98 | bwd: 5707.81 | bwd_inner: 5695.13 | bwd_allreduce: 12.64 | step: 19.09 16%|█▌ | 6700/41250 [16:11:21<83:18:40, 8.68s/it] {'loss': 0.2007, 'grad_norm': 2.6519787311553955, 'learning_rate': 3.8188842595560836e-05, 'epoch': 1.62} 16%|█▌ | 6700/41250 [16:11:21<83:18:40, 8.68s/it][2025-04-26 00:09:05,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:09:05,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.90 | bwd_microstep: 5820.48 | bwd_inner_microstep: 5645.71 | bwd_allreduce_microstep: 174.73 | step_microstep: 19.09 [2025-04-26 00:09:05,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.90 | bwd: 5820.49 | bwd_inner: 5645.71 | bwd_allreduce: 174.74 | step: 19.10 16%|█▌ | 6701/41250 [16:11:30<83:26:15, 8.69s/it] {'loss': 0.1298, 'grad_norm': 1.53311026096344, 'learning_rate': 3.818818955035612e-05, 'epoch': 1.62} 16%|█▌ | 6701/41250 [16:11:30<83:26:15, 8.69s/it][2025-04-26 00:09:14,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.05 [2025-04-26 00:09:14,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.10 | bwd_microstep: 5939.92 | bwd_inner_microstep: 5646.50 | bwd_allreduce_microstep: 293.38 | step_microstep: 18.86 [2025-04-26 00:09:14,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.10 | bwd: 5939.94 | bwd_inner: 5646.50 | bwd_allreduce: 293.40 | step: 18.86 16%|█▌ | 6702/41250 [16:11:39<83:53:34, 8.74s/it] 
{'loss': 0.4589, 'grad_norm': 4.553058624267578, 'learning_rate': 3.81875363930248e-05, 'epoch': 1.62} 16%|█▌ | 6702/41250 [16:11:39<83:53:34, 8.74s/it][2025-04-26 00:09:22,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:09:22,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.06 | bwd_microstep: 5881.95 | bwd_inner_microstep: 5677.22 | bwd_allreduce_microstep: 204.69 | step_microstep: 19.03 [2025-04-26 00:09:22,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.06 | bwd: 5881.97 | bwd_inner: 5677.22 | bwd_allreduce: 204.71 | step: 19.04 16%|█▌ | 6703/41250 [16:11:48<84:06:20, 8.76s/it] {'loss': 0.1448, 'grad_norm': 2.1102609634399414, 'learning_rate': 3.81868831235709e-05, 'epoch': 1.62} 16%|█▌ | 6703/41250 [16:11:48<84:06:20, 8.76s/it][2025-04-26 00:09:31,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.27 | optimizer_step: 1.06 [2025-04-26 00:09:31,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.46 | bwd_microstep: 5699.50 | bwd_inner_microstep: 5685.56 | bwd_allreduce_microstep: 13.87 | step_microstep: 20.21 [2025-04-26 00:09:31,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.46 | bwd: 5699.51 | bwd_inner: 5685.56 | bwd_allreduce: 13.90 | step: 20.21 16%|█▋ | 6704/41250 [16:11:56<83:45:15, 8.73s/it] {'loss': 0.1677, 'grad_norm': 1.7323369979858398, 'learning_rate': 3.818622974199843e-05, 'epoch': 1.63} 16%|█▋ | 6704/41250 [16:11:56<83:45:15, 8.73s/it][2025-04-26 00:09:40,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.18 | optimizer_step: 0.90 [2025-04-26 00:09:40,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.78 | bwd_microstep: 5714.11 | bwd_inner_microstep: 5701.06 | bwd_allreduce_microstep: 13.01 | 
step_microstep: 19.02 [2025-04-26 00:09:40,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.78 | bwd: 5714.12 | bwd_inner: 5701.06 | bwd_allreduce: 13.02 | step: 19.02 16%|█▋ | 6705/41250 [16:12:05<83:31:35, 8.70s/it] {'loss': 0.1884, 'grad_norm': 1.6175658702850342, 'learning_rate': 3.818557624831144e-05, 'epoch': 1.63} 16%|█▋ | 6705/41250 [16:12:05<83:31:35, 8.70s/it][2025-04-26 00:09:48,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.93 [2025-04-26 00:09:48,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.93 | bwd_microstep: 5740.39 | bwd_inner_microstep: 5692.64 | bwd_allreduce_microstep: 47.71 | step_microstep: 19.22 [2025-04-26 00:09:48,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.93 | bwd: 5740.41 | bwd_inner: 5692.64 | bwd_allreduce: 47.73 | step: 19.23 16%|█▋ | 6706/41250 [16:12:14<83:26:13, 8.70s/it] {'loss': 0.2182, 'grad_norm': 5.697456359863281, 'learning_rate': 3.818492264251394e-05, 'epoch': 1.63} 16%|█▋ | 6706/41250 [16:12:14<83:26:13, 8.70s/it][2025-04-26 00:09:57,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 00:09:57,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.70 | bwd_microstep: 5780.26 | bwd_inner_microstep: 5649.29 | bwd_allreduce_microstep: 130.93 | step_microstep: 18.75 [2025-04-26 00:09:57,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.70 | bwd: 5780.28 | bwd_inner: 5649.29 | bwd_allreduce: 130.95 | step: 18.75 16%|█▋ | 6707/41250 [16:12:22<83:25:10, 8.69s/it] {'loss': 0.2779, 'grad_norm': 2.599959135055542, 'learning_rate': 3.818426892460998e-05, 'epoch': 1.63} 16%|█▋ | 6707/41250 [16:12:22<83:25:10, 8.69s/it][2025-04-26 00:10:06,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 00:10:06,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.63 | bwd_microstep: 5708.81 | bwd_inner_microstep: 5695.68 | bwd_allreduce_microstep: 13.09 | step_microstep: 18.90 [2025-04-26 00:10:06,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.63 | bwd: 5708.83 | bwd_inner: 5695.68 | bwd_allreduce: 13.11 | step: 18.91 16%|█▋ | 6708/41250 [16:12:31<83:16:00, 8.68s/it] {'loss': 0.0257, 'grad_norm': 0.4857819378376007, 'learning_rate': 3.818361509460358e-05, 'epoch': 1.63} 16%|█▋ | 6708/41250 [16:12:31<83:16:00, 8.68s/it][2025-04-26 00:10:14,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-26 00:10:14,727] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.31 | bwd_microstep: 5686.02 | bwd_inner_microstep: 5673.55 | bwd_allreduce_microstep: 12.43 | step_microstep: 18.74 [2025-04-26 00:10:14,727] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.31 | bwd: 5686.03 | bwd_inner: 5673.55 | bwd_allreduce: 12.44 | step: 18.74 16%|█▋ | 6709/41250 [16:12:40<83:04:28, 8.66s/it] {'loss': 0.1989, 'grad_norm': 1.7417594194412231, 'learning_rate': 3.818296115249876e-05, 'epoch': 1.63} 16%|█▋ | 6709/41250 [16:12:40<83:04:28, 8.66s/it][2025-04-26 00:10:23,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 00:10:23,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.42 | bwd_microstep: 5785.14 | bwd_inner_microstep: 5772.47 | bwd_allreduce_microstep: 12.62 | step_microstep: 19.19 [2025-04-26 00:10:23,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.42 | bwd: 5785.15 | bwd_inner: 5772.47 | bwd_allreduce: 12.64 | step: 19.19 16%|█▋ | 6710/41250 [16:12:48<83:21:00, 8.69s/it] {'loss': 0.2491, 'grad_norm': 
2.4291179180145264, 'learning_rate': 3.8182307098299564e-05, 'epoch': 1.63} 16%|█▋ | 6710/41250 [16:12:48<83:21:00, 8.69s/it][2025-04-26 00:10:32,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 00:10:32,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2873.95 | bwd_microstep: 5761.13 | bwd_inner_microstep: 5748.24 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.93 [2025-04-26 00:10:32,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2873.95 | bwd: 5761.15 | bwd_inner: 5748.24 | bwd_allreduce: 12.87 | step: 18.94 16%|█▋ | 6711/41250 [16:12:57<83:27:55, 8.70s/it] {'loss': 0.1349, 'grad_norm': 1.7617063522338867, 'learning_rate': 3.818165293201002e-05, 'epoch': 1.63} 16%|█▋ | 6711/41250 [16:12:57<83:27:55, 8.70s/it][2025-04-26 00:10:40,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-26 00:10:40,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.32 | bwd_microstep: 5776.49 | bwd_inner_microstep: 5639.46 | bwd_allreduce_microstep: 136.99 | step_microstep: 19.11 [2025-04-26 00:10:40,890] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.32 | bwd: 5776.50 | bwd_inner: 5639.46 | bwd_allreduce: 137.01 | step: 19.11 16%|█▋ | 6712/41250 [16:13:06<83:24:18, 8.69s/it] {'loss': 0.2194, 'grad_norm': 2.362750291824341, 'learning_rate': 3.818099865363417e-05, 'epoch': 1.63} 16%|█▋ | 6712/41250 [16:13:06<83:24:18, 8.69s/it][2025-04-26 00:10:49,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:10:49,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.58 | bwd_microstep: 5786.45 | bwd_inner_microstep: 5654.43 | bwd_allreduce_microstep: 131.97 | step_microstep: 18.64 [2025-04-26 
00:10:49,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.58 | bwd: 5786.47 | bwd_inner: 5654.43 | bwd_allreduce: 131.99 | step: 18.64 16%|█▋ | 6713/41250 [16:13:14<83:23:55, 8.69s/it] {'loss': 0.3346, 'grad_norm': 1.7827624082565308, 'learning_rate': 3.818034426317603e-05, 'epoch': 1.63} 16%|█▋ | 6713/41250 [16:13:14<83:23:55, 8.69s/it][2025-04-26 00:10:58,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:10:58,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.53 | bwd_microstep: 5738.63 | bwd_inner_microstep: 5677.69 | bwd_allreduce_microstep: 60.90 | step_microstep: 18.83 [2025-04-26 00:10:58,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.53 | bwd: 5738.64 | bwd_inner: 5677.69 | bwd_allreduce: 60.91 | step: 18.83 16%|█▋ | 6714/41250 [16:13:23<83:18:54, 8.68s/it] {'loss': 0.0987, 'grad_norm': 1.2950730323791504, 'learning_rate': 3.8179689760639645e-05, 'epoch': 1.63} 16%|█▋ | 6714/41250 [16:13:23<83:18:54, 8.68s/it][2025-04-26 00:11:06,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:11:06,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.62 | bwd_microstep: 5678.53 | bwd_inner_microstep: 5635.48 | bwd_allreduce_microstep: 43.00 | step_microstep: 18.61 [2025-04-26 00:11:06,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.62 | bwd: 5678.54 | bwd_inner: 5635.48 | bwd_allreduce: 43.01 | step: 18.62 16%|█▋ | 6715/41250 [16:13:32<83:01:14, 8.65s/it] {'loss': 0.0202, 'grad_norm': 0.23108172416687012, 'learning_rate': 3.8179035146029047e-05, 'epoch': 1.63} 16%|█▋ | 6715/41250 [16:13:32<83:01:14, 8.65s/it][2025-04-26 00:11:15,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.09 | optimizer_step: 
1.08 [2025-04-26 00:11:15,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.96 | bwd_microstep: 5741.28 | bwd_inner_microstep: 5690.91 | bwd_allreduce_microstep: 50.32 | step_microstep: 19.34 [2025-04-26 00:11:15,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.96 | bwd: 5741.30 | bwd_inner: 5690.91 | bwd_allreduce: 50.35 | step: 19.34 16%|█▋ | 6716/41250 [16:13:40<83:04:37, 8.66s/it] {'loss': 0.0493, 'grad_norm': 0.9438426494598389, 'learning_rate': 3.817838041934828e-05, 'epoch': 1.63} 16%|█▋ | 6716/41250 [16:13:40<83:04:37, 8.66s/it][2025-04-26 00:11:24,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:11:24,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.79 | bwd_microstep: 5750.49 | bwd_inner_microstep: 5697.47 | bwd_allreduce_microstep: 52.98 | step_microstep: 18.73 [2025-04-26 00:11:24,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.79 | bwd: 5750.51 | bwd_inner: 5697.47 | bwd_allreduce: 53.00 | step: 18.74 16%|█▋ | 6717/41250 [16:13:49<83:07:43, 8.67s/it] {'loss': 0.1681, 'grad_norm': 1.939092755317688, 'learning_rate': 3.817772558060137e-05, 'epoch': 1.63} 16%|█▋ | 6717/41250 [16:13:49<83:07:43, 8.67s/it][2025-04-26 00:11:32,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:11:32,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.06 | bwd_microstep: 5753.46 | bwd_inner_microstep: 5634.13 | bwd_allreduce_microstep: 119.29 | step_microstep: 18.75 [2025-04-26 00:11:32,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.06 | bwd: 5753.47 | bwd_inner: 5634.13 | bwd_allreduce: 119.31 | step: 18.75 16%|█▋ | 6718/41250 [16:13:58<83:06:45, 8.66s/it] {'loss': 0.2061, 'grad_norm': 1.9491477012634277, 'learning_rate': 
3.8177070629792356e-05, 'epoch': 1.63} 16%|█▋ | 6718/41250 [16:13:58<83:06:45, 8.66s/it][2025-04-26 00:11:41,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-26 00:11:41,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.47 | bwd_microstep: 5875.37 | bwd_inner_microstep: 5647.76 | bwd_allreduce_microstep: 227.55 | step_microstep: 18.79 [2025-04-26 00:11:41,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.46 | bwd: 5875.38 | bwd_inner: 5647.76 | bwd_allreduce: 227.57 | step: 18.79 16%|█▋ | 6719/41250 [16:14:06<83:28:57, 8.70s/it] {'loss': 0.1581, 'grad_norm': 2.0971312522888184, 'learning_rate': 3.8176415566925274e-05, 'epoch': 1.63} 16%|█▋ | 6719/41250 [16:14:06<83:28:57, 8.70s/it][2025-04-26 00:11:50,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.00 | optimizer_step: 1.14 [2025-04-26 00:11:50,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.89 | bwd_microstep: 5685.63 | bwd_inner_microstep: 5637.21 | bwd_allreduce_microstep: 48.38 | step_microstep: 18.92 [2025-04-26 00:11:50,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.89 | bwd: 5685.64 | bwd_inner: 5637.21 | bwd_allreduce: 48.39 | step: 18.92 16%|█▋ | 6720/41250 [16:14:15<83:10:05, 8.67s/it] {'loss': 0.0933, 'grad_norm': 1.576606273651123, 'learning_rate': 3.817576039200417e-05, 'epoch': 1.63} 16%|█▋ | 6720/41250 [16:14:15<83:10:05, 8.67s/it][2025-04-26 00:11:58,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:11:58,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.82 | bwd_microstep: 5742.20 | bwd_inner_microstep: 5689.33 | bwd_allreduce_microstep: 52.83 | step_microstep: 18.94 [2025-04-26 00:11:58,902] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.82 | bwd: 5742.22 | bwd_inner: 5689.33 | bwd_allreduce: 52.85 | step: 18.95 16%|█▋ | 6721/41250 [16:14:24<83:09:12, 8.67s/it] {'loss': 0.361, 'grad_norm': 2.079922914505005, 'learning_rate': 3.8175105105033075e-05, 'epoch': 1.63} 16%|█▋ | 6721/41250 [16:14:24<83:09:12, 8.67s/it][2025-04-26 00:12:07,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-26 00:12:07,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.37 | bwd_microstep: 5772.09 | bwd_inner_microstep: 5677.43 | bwd_allreduce_microstep: 94.62 | step_microstep: 19.14 [2025-04-26 00:12:07,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.37 | bwd: 5772.11 | bwd_inner: 5677.43 | bwd_allreduce: 94.63 | step: 19.14 16%|█▋ | 6722/41250 [16:14:32<83:13:57, 8.68s/it] {'loss': 0.1767, 'grad_norm': 3.078165054321289, 'learning_rate': 3.817444970601603e-05, 'epoch': 1.63} 16%|█▋ | 6722/41250 [16:14:32<83:13:57, 8.68s/it][2025-04-26 00:12:16,228] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-26 00:12:16,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.20 | bwd_microstep: 5692.57 | bwd_inner_microstep: 5679.85 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.78 [2025-04-26 00:12:16,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.20 | bwd: 5692.58 | bwd_inner: 5679.85 | bwd_allreduce: 12.69 | step: 18.78 16%|█▋ | 6723/41250 [16:14:41<83:05:28, 8.66s/it] {'loss': 0.0815, 'grad_norm': 1.0252424478530884, 'learning_rate': 3.817379419495708e-05, 'epoch': 1.63} 16%|█▋ | 6723/41250 [16:14:41<83:05:28, 8.66s/it][2025-04-26 00:12:24,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-26 
00:12:24,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.22 | bwd_microstep: 5779.30 | bwd_inner_microstep: 5657.42 | bwd_allreduce_microstep: 121.84 | step_microstep: 18.69 [2025-04-26 00:12:24,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.22 | bwd: 5779.31 | bwd_inner: 5657.42 | bwd_allreduce: 121.85 | step: 18.69 16%|█▋ | 6724/41250 [16:14:50<83:09:34, 8.67s/it] {'loss': 0.0691, 'grad_norm': 1.6152678728103638, 'learning_rate': 3.817313857186027e-05, 'epoch': 1.63} 16%|█▋ | 6724/41250 [16:14:50<83:09:34, 8.67s/it][2025-04-26 00:12:33,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.95 [2025-04-26 00:12:33,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.61 | bwd_microstep: 5731.17 | bwd_inner_microstep: 5700.60 | bwd_allreduce_microstep: 30.53 | step_microstep: 18.75 [2025-04-26 00:12:33,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.61 | bwd: 5731.19 | bwd_inner: 5700.60 | bwd_allreduce: 30.55 | step: 18.75 16%|█▋ | 6725/41250 [16:14:58<83:08:11, 8.67s/it] {'loss': 0.0685, 'grad_norm': 1.1357942819595337, 'learning_rate': 3.817248283672962e-05, 'epoch': 1.63} 16%|█▋ | 6725/41250 [16:14:58<83:08:11, 8.67s/it][2025-04-26 00:12:42,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:12:42,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.84 | bwd_microstep: 5715.71 | bwd_inner_microstep: 5702.91 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.58 [2025-04-26 00:12:42,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.84 | bwd: 5715.72 | bwd_inner: 5702.91 | bwd_allreduce: 12.77 | step: 18.58 16%|█▋ | 6726/41250 [16:15:07<83:05:40, 8.66s/it] {'loss': 0.0787, 'grad_norm': 0.9357116222381592, 'learning_rate': 3.81718269895692e-05, 
'epoch': 1.63} 16%|█▋ | 6726/41250 [16:15:07<83:05:40, 8.66s/it][2025-04-26 00:12:50,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:12:50,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.80 | bwd_microstep: 5778.38 | bwd_inner_microstep: 5651.21 | bwd_allreduce_microstep: 127.13 | step_microstep: 18.53 [2025-04-26 00:12:50,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.80 | bwd: 5778.39 | bwd_inner: 5651.21 | bwd_allreduce: 127.15 | step: 18.53 16%|█▋ | 6727/41250 [16:15:16<83:09:45, 8.67s/it] {'loss': 0.0248, 'grad_norm': 0.34154626727104187, 'learning_rate': 3.8171171030383043e-05, 'epoch': 1.63} 16%|█▋ | 6727/41250 [16:15:16<83:09:45, 8.67s/it][2025-04-26 00:12:59,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.01 | optimizer_step: 1.06 [2025-04-26 00:12:59,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.01 | bwd_microstep: 5707.09 | bwd_inner_microstep: 5656.73 | bwd_allreduce_microstep: 50.31 | step_microstep: 18.88 [2025-04-26 00:12:59,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.01 | bwd: 5707.11 | bwd_inner: 5656.73 | bwd_allreduce: 50.33 | step: 18.88 16%|█▋ | 6728/41250 [16:15:24<83:00:43, 8.66s/it] {'loss': 0.0857, 'grad_norm': 0.8892666697502136, 'learning_rate': 3.8170514959175185e-05, 'epoch': 1.63} 16%|█▋ | 6728/41250 [16:15:24<83:00:43, 8.66s/it][2025-04-26 00:13:08,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.96 | optimizer_step: 1.00 [2025-04-26 00:13:08,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.38 | bwd_microstep: 5807.06 | bwd_inner_microstep: 5663.65 | bwd_allreduce_microstep: 143.37 | step_microstep: 18.55 [2025-04-26 00:13:08,266] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd: 2831.38 | bwd: 5807.08 | bwd_inner: 5663.65 | bwd_allreduce: 143.39 | step: 18.55 16%|█▋ | 6729/41250 [16:15:33<83:11:22, 8.68s/it] {'loss': 0.1456, 'grad_norm': 1.9486117362976074, 'learning_rate': 3.816985877594967e-05, 'epoch': 1.63} 16%|█▋ | 6729/41250 [16:15:33<83:11:22, 8.68s/it][2025-04-26 00:13:16,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:13:16,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.06 | bwd_microstep: 5805.10 | bwd_inner_microstep: 5654.97 | bwd_allreduce_microstep: 150.08 | step_microstep: 18.62 [2025-04-26 00:13:16,981] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.06 | bwd: 5805.11 | bwd_inner: 5654.97 | bwd_allreduce: 150.09 | step: 18.62 16%|█▋ | 6730/41250 [16:15:42<83:18:43, 8.69s/it] {'loss': 0.2656, 'grad_norm': 1.5827038288116455, 'learning_rate': 3.8169202480710557e-05, 'epoch': 1.63} 16%|█▋ | 6730/41250 [16:15:42<83:18:43, 8.69s/it][2025-04-26 00:13:25,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:13:25,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.57 | bwd_microstep: 5778.05 | bwd_inner_microstep: 5710.54 | bwd_allreduce_microstep: 67.46 | step_microstep: 18.58 [2025-04-26 00:13:25,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.57 | bwd: 5778.06 | bwd_inner: 5710.54 | bwd_allreduce: 67.48 | step: 18.59 16%|█▋ | 6731/41250 [16:15:51<83:22:05, 8.69s/it] {'loss': 0.1129, 'grad_norm': 1.519233226776123, 'learning_rate': 3.816854607346188e-05, 'epoch': 1.63} 16%|█▋ | 6731/41250 [16:15:51<83:22:05, 8.69s/it][2025-04-26 00:13:34,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.07 [2025-04-26 00:13:34,389] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.00 | bwd_microstep: 5748.85 | bwd_inner_microstep: 5678.98 | bwd_allreduce_microstep: 69.83 | step_microstep: 19.47 [2025-04-26 00:13:34,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.00 | bwd: 5748.87 | bwd_inner: 5678.98 | bwd_allreduce: 69.85 | step: 19.47 16%|█▋ | 6732/41250 [16:15:59<83:22:26, 8.70s/it] {'loss': 0.2028, 'grad_norm': 1.4752343893051147, 'learning_rate': 3.8167889554207695e-05, 'epoch': 1.63} 16%|█▋ | 6732/41250 [16:15:59<83:22:26, 8.70s/it][2025-04-26 00:13:43,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:13:43,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.32 | bwd_microstep: 5789.52 | bwd_inner_microstep: 5646.26 | bwd_allreduce_microstep: 143.21 | step_microstep: 18.61 [2025-04-26 00:13:43,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.32 | bwd: 5789.53 | bwd_inner: 5646.26 | bwd_allreduce: 143.23 | step: 18.61 16%|█▋ | 6733/41250 [16:16:08<83:22:10, 8.70s/it] {'loss': 0.0627, 'grad_norm': 2.102294683456421, 'learning_rate': 3.8167232922952046e-05, 'epoch': 1.63} 16%|█▋ | 6733/41250 [16:16:08<83:22:10, 8.70s/it][2025-04-26 00:13:51,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:13:51,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.97 | bwd_microstep: 5714.98 | bwd_inner_microstep: 5647.96 | bwd_allreduce_microstep: 66.97 | step_microstep: 18.89 [2025-04-26 00:13:51,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.97 | bwd: 5714.99 | bwd_inner: 5647.96 | bwd_allreduce: 66.99 | step: 18.89 16%|█▋ | 6734/41250 [16:16:17<83:09:16, 8.67s/it] {'loss': 0.0669, 'grad_norm': 1.0532704591751099, 'learning_rate': 3.816657617969897e-05, 'epoch': 1.63} 16%|█▋ 
| 6734/41250 [16:16:17<83:09:16, 8.67s/it][2025-04-26 00:14:00,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:14:00,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.82 | bwd_microstep: 5718.22 | bwd_inner_microstep: 5651.82 | bwd_allreduce_microstep: 66.35 | step_microstep: 18.63 [2025-04-26 00:14:00,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.82 | bwd: 5718.23 | bwd_inner: 5651.82 | bwd_allreduce: 66.37 | step: 18.64 16%|█▋ | 6735/41250 [16:16:25<83:00:50, 8.66s/it] {'loss': 0.0838, 'grad_norm': 0.9009657502174377, 'learning_rate': 3.8165919324452533e-05, 'epoch': 1.63} 16%|█▋ | 6735/41250 [16:16:25<83:00:50, 8.66s/it][2025-04-26 00:14:09,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:14:09,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.27 | bwd_microstep: 5767.15 | bwd_inner_microstep: 5665.92 | bwd_allreduce_microstep: 101.18 | step_microstep: 18.59 [2025-04-26 00:14:09,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.27 | bwd: 5767.16 | bwd_inner: 5665.92 | bwd_allreduce: 101.19 | step: 18.59 16%|█▋ | 6736/41250 [16:16:34<83:03:46, 8.66s/it] {'loss': 0.1749, 'grad_norm': 1.6849385499954224, 'learning_rate': 3.8165262357216774e-05, 'epoch': 1.63} 16%|█▋ | 6736/41250 [16:16:34<83:03:46, 8.66s/it][2025-04-26 00:14:17,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.06 | optimizer_step: 0.96 [2025-04-26 00:14:17,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.96 | bwd_microstep: 5771.33 | bwd_inner_microstep: 5661.70 | bwd_allreduce_microstep: 109.58 | step_microstep: 19.35 [2025-04-26 00:14:17,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.96 
| bwd: 5771.34 | bwd_inner: 5661.69 | bwd_allreduce: 109.60 | step: 19.36
16%|█▋ | 6737/41250 [16:16:43<83:07:12, 8.67s/it]
{'loss': 0.1778, 'grad_norm': 2.6982340812683105, 'learning_rate': 3.816460527799575e-05, 'epoch': 1.63}
16%|█▋ | 6737/41250 [16:16:43<83:07:12, 8.67s/it]
[2025-04-26 00:14:26,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 0.89
[2025-04-26 00:14:26,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.79 | bwd_microstep: 5711.94 | bwd_inner_microstep: 5699.18 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.65
[2025-04-26 00:14:26,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.79 | bwd: 5711.95 | bwd_inner: 5699.18 | bwd_allreduce: 12.73 | step: 18.65
16%|█▋ | 6738/41250 [16:16:51<83:02:30, 8.66s/it]
{'loss': 0.0697, 'grad_norm': 2.405750036239624, 'learning_rate': 3.81639480867935e-05, 'epoch': 1.63}
16%|█▋ | 6738/41250 [16:16:51<83:02:30, 8.66s/it]
[2025-04-26 00:14:35,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 0.97 | optimizer_step: 0.89
[2025-04-26 00:14:35,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.91 | bwd_microstep: 5790.57 | bwd_inner_microstep: 5660.50 | bwd_allreduce_microstep: 130.02 | step_microstep: 18.65
[2025-04-26 00:14:35,041] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.91 | bwd: 5790.58 | bwd_inner: 5660.50 | bwd_allreduce: 130.04 | step: 18.65
16%|█▋ | 6739/41250 [16:17:00<83:09:40, 8.67s/it]
{'loss': 0.1808, 'grad_norm': 1.643941879272461, 'learning_rate': 3.8163290783614086e-05, 'epoch': 1.63}
16%|█▋ | 6739/41250 [16:17:00<83:09:40, 8.67s/it]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x2021cb00] moov atom not found
[00:14:36] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Allvideos/Animate/00778.mp4, Invalid data found when processing input
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00778.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00778.mp4...
[2025-04-26 00:14:43,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.98 | optimizer_step: 0.92
[2025-04-26 00:14:43,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.02 | bwd_microstep: 5866.57 | bwd_inner_microstep: 5696.72 | bwd_allreduce_microstep: 169.80 | step_microstep: 18.54
[2025-04-26 00:14:43,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.02 | bwd: 5866.58 | bwd_inner: 5696.72 | bwd_allreduce: 169.82 | step: 18.54
16%|█▋ | 6740/41250 [16:17:09<83:29:54, 8.71s/it]
{'loss': 0.4008, 'grad_norm': 2.503296375274658, 'learning_rate': 3.816263336846156e-05, 'epoch': 1.63}
16%|█▋ | 6740/41250 [16:17:09<83:29:54, 8.71s/it]
[2025-04-26 00:14:52,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.89
[2025-04-26 00:14:52,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.96 | bwd_microstep: 5786.76 | bwd_inner_microstep: 5655.24 |
bwd_allreduce_microstep: 131.47 | step_microstep: 18.93 [2025-04-26 00:14:52,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.96 | bwd: 5786.78 | bwd_inner: 5655.24 | bwd_allreduce: 131.49 | step: 18.93 16%|█▋ | 6741/41250 [16:17:17<83:27:10, 8.71s/it] {'loss': 0.0342, 'grad_norm': 0.459414541721344, 'learning_rate': 3.816197584133997e-05, 'epoch': 1.63} 16%|█▋ | 6741/41250 [16:17:17<83:27:10, 8.71s/it][2025-04-26 00:15:01,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:15:01,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.95 | bwd_microstep: 5701.60 | bwd_inner_microstep: 5657.86 | bwd_allreduce_microstep: 43.69 | step_microstep: 18.86 [2025-04-26 00:15:01,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.95 | bwd: 5701.61 | bwd_inner: 5657.86 | bwd_allreduce: 43.71 | step: 18.86 16%|█▋ | 6742/41250 [16:17:26<83:11:02, 8.68s/it] {'loss': 0.0796, 'grad_norm': 1.9312071800231934, 'learning_rate': 3.816131820225336e-05, 'epoch': 1.63} 16%|█▋ | 6742/41250 [16:17:26<83:11:02, 8.68s/it][2025-04-26 00:15:09,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:15:09,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.63 | bwd_microstep: 5715.64 | bwd_inner_microstep: 5702.85 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.63 [2025-04-26 00:15:09,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.63 | bwd: 5715.66 | bwd_inner: 5702.85 | bwd_allreduce: 12.77 | step: 18.63 16%|█▋ | 6743/41250 [16:17:35<83:05:49, 8.67s/it] {'loss': 0.1186, 'grad_norm': 1.3811516761779785, 'learning_rate': 3.8160660451205804e-05, 'epoch': 1.63} 16%|█▋ | 6743/41250 [16:17:35<83:05:49, 8.67s/it][2025-04-26 00:15:18,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-26 00:15:18,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.74 | bwd_microstep: 5701.36 | bwd_inner_microstep: 5688.39 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.77 [2025-04-26 00:15:18,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.74 | bwd: 5701.37 | bwd_inner: 5688.39 | bwd_allreduce: 12.93 | step: 18.77 16%|█▋ | 6744/41250 [16:17:43<82:59:17, 8.66s/it] {'loss': 0.0784, 'grad_norm': 1.3723690509796143, 'learning_rate': 3.8160002588201355e-05, 'epoch': 1.63} 16%|█▋ | 6744/41250 [16:17:43<82:59:17, 8.66s/it][2025-04-26 00:15:27,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.88 [2025-04-26 00:15:27,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.90 | bwd_microstep: 5762.66 | bwd_inner_microstep: 5680.45 | bwd_allreduce_microstep: 82.16 | step_microstep: 18.43 [2025-04-26 00:15:27,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.90 | bwd: 5762.67 | bwd_inner: 5680.45 | bwd_allreduce: 82.18 | step: 18.43 16%|█▋ | 6745/41250 [16:17:52<83:06:15, 8.67s/it] {'loss': 0.3516, 'grad_norm': 3.046987771987915, 'learning_rate': 3.8159344613244054e-05, 'epoch': 1.64} 16%|█▋ | 6745/41250 [16:17:52<83:06:15, 8.67s/it][2025-04-26 00:15:35,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-26 00:15:35,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.55 | bwd_microstep: 5707.44 | bwd_inner_microstep: 5694.81 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.68 [2025-04-26 00:15:35,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.55 | bwd: 5707.45 | bwd_inner: 5694.80 | bwd_allreduce: 12.61 | step: 18.69 16%|█▋ | 6746/41250 [16:18:01<83:01:19, 8.66s/it] 
{'loss': 0.1888, 'grad_norm': 2.641197443008423, 'learning_rate': 3.8158686526337965e-05, 'epoch': 1.64} 16%|█▋ | 6746/41250 [16:18:01<83:01:19, 8.66s/it][2025-04-26 00:15:44,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 00:15:44,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.44 | bwd_microstep: 5719.80 | bwd_inner_microstep: 5698.14 | bwd_allreduce_microstep: 21.62 | step_microstep: 18.92 [2025-04-26 00:15:44,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.44 | bwd: 5719.82 | bwd_inner: 5698.14 | bwd_allreduce: 21.64 | step: 18.92 16%|█▋ | 6747/41250 [16:18:09<83:01:04, 8.66s/it] {'loss': 0.0399, 'grad_norm': 0.5482485294342041, 'learning_rate': 3.815802832748715e-05, 'epoch': 1.64} 16%|█▋ | 6747/41250 [16:18:09<83:01:04, 8.66s/it][2025-04-26 00:15:53,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:15:53,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.54 | bwd_microstep: 5720.81 | bwd_inner_microstep: 5674.80 | bwd_allreduce_microstep: 45.97 | step_microstep: 18.61 [2025-04-26 00:15:53,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.54 | bwd: 5720.82 | bwd_inner: 5674.80 | bwd_allreduce: 45.98 | step: 18.62 16%|█▋ | 6748/41250 [16:18:18<83:00:52, 8.66s/it] {'loss': 0.1707, 'grad_norm': 2.1724281311035156, 'learning_rate': 3.8157370016695656e-05, 'epoch': 1.64} 16%|█▋ | 6748/41250 [16:18:18<83:00:52, 8.66s/it][2025-04-26 00:16:01,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:16:01,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2868.12 | bwd_microstep: 5718.67 | bwd_inner_microstep: 5704.66 | bwd_allreduce_microstep: 13.97 | 
step_microstep: 18.40 [2025-04-26 00:16:01,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2868.12 | bwd: 5718.68 | bwd_inner: 5704.66 | bwd_allreduce: 13.98 | step: 18.40 16%|█▋ | 6749/41250 [16:18:27<83:02:40, 8.67s/it] {'loss': 0.0495, 'grad_norm': 0.6873926520347595, 'learning_rate': 3.815671159396755e-05, 'epoch': 1.64} 16%|█▋ | 6749/41250 [16:18:27<83:02:40, 8.67s/it][2025-04-26 00:16:10,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:16:10,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.58 | bwd_microstep: 5694.29 | bwd_inner_microstep: 5646.96 | bwd_allreduce_microstep: 47.28 | step_microstep: 19.06 [2025-04-26 00:16:10,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.58 | bwd: 5694.30 | bwd_inner: 5646.96 | bwd_allreduce: 47.30 | step: 19.06 16%|█▋ | 6750/41250 [16:18:35<82:52:17, 8.65s/it] {'loss': 0.3612, 'grad_norm': 3.7542500495910645, 'learning_rate': 3.815605305930689e-05, 'epoch': 1.64} 16%|█▋ | 6750/41250 [16:18:35<82:52:17, 8.65s/it][2025-04-26 00:16:19,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.06 | optimizer_step: 1.05 [2025-04-26 00:16:19,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.89 | bwd_microstep: 5751.37 | bwd_inner_microstep: 5695.60 | bwd_allreduce_microstep: 55.71 | step_microstep: 19.13 [2025-04-26 00:16:19,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.89 | bwd: 5751.39 | bwd_inner: 5695.60 | bwd_allreduce: 55.73 | step: 19.13 16%|█▋ | 6751/41250 [16:18:44<82:59:21, 8.66s/it] {'loss': 0.1354, 'grad_norm': 1.2723621129989624, 'learning_rate': 3.815539441271773e-05, 'epoch': 1.64} 16%|█▋ | 6751/41250 [16:18:44<82:59:21, 8.66s/it][2025-04-26 00:16:27,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | 
optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 00:16:27,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.95 | bwd_microstep: 5695.97 | bwd_inner_microstep: 5682.84 | bwd_allreduce_microstep: 13.07 | step_microstep: 18.74 [2025-04-26 00:16:27,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.95 | bwd: 5695.99 | bwd_inner: 5682.84 | bwd_allreduce: 13.09 | step: 18.74 16%|█▋ | 6752/41250 [16:18:53<82:53:19, 8.65s/it] {'loss': 0.0386, 'grad_norm': 0.5527887344360352, 'learning_rate': 3.815473565420414e-05, 'epoch': 1.64} 16%|█▋ | 6752/41250 [16:18:53<82:53:19, 8.65s/it][2025-04-26 00:16:36,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.95 | optimizer_gradients: 1.14 | optimizer_step: 1.03 [2025-04-26 00:16:36,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.35 | bwd_microstep: 5762.16 | bwd_inner_microstep: 5648.44 | bwd_allreduce_microstep: 113.68 | step_microstep: 19.13 [2025-04-26 00:16:36,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.35 | bwd: 5762.17 | bwd_inner: 5648.44 | bwd_allreduce: 113.69 | step: 19.14 16%|█▋ | 6753/41250 [16:19:01<82:57:01, 8.66s/it] {'loss': 0.0866, 'grad_norm': 1.118107795715332, 'learning_rate': 3.815407678377017e-05, 'epoch': 1.64} 16%|█▋ | 6753/41250 [16:19:01<82:57:01, 8.66s/it][2025-04-26 00:16:44,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:16:44,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.39 | bwd_microstep: 5683.58 | bwd_inner_microstep: 5644.05 | bwd_allreduce_microstep: 39.49 | step_microstep: 18.70 [2025-04-26 00:16:44,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.39 | bwd: 5683.59 | bwd_inner: 5644.05 | bwd_allreduce: 39.50 | step: 18.71 16%|█▋ | 6754/41250 [16:19:10<82:45:58, 8.64s/it] {'loss': 0.1739, 'grad_norm': 
3.5410923957824707, 'learning_rate': 3.81534178014199e-05, 'epoch': 1.64} 16%|█▋ | 6754/41250 [16:19:10<82:45:58, 8.64s/it][2025-04-26 00:16:53,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 00:16:53,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.59 | bwd_microstep: 5742.79 | bwd_inner_microstep: 5646.00 | bwd_allreduce_microstep: 96.74 | step_microstep: 18.97 [2025-04-26 00:16:53,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.59 | bwd: 5742.81 | bwd_inner: 5646.00 | bwd_allreduce: 96.77 | step: 18.98 16%|█▋ | 6755/41250 [16:19:18<82:50:16, 8.65s/it] {'loss': 0.0905, 'grad_norm': 1.533473014831543, 'learning_rate': 3.815275870715736e-05, 'epoch': 1.64} 16%|█▋ | 6755/41250 [16:19:18<82:50:16, 8.65s/it][2025-04-26 00:17:02,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 1.08 [2025-04-26 00:17:02,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.45 | bwd_microstep: 5744.44 | bwd_inner_microstep: 5677.08 | bwd_allreduce_microstep: 67.30 | step_microstep: 19.05 [2025-04-26 00:17:02,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.45 | bwd: 5744.46 | bwd_inner: 5677.08 | bwd_allreduce: 67.33 | step: 19.05 16%|█▋ | 6756/41250 [16:19:27<82:55:43, 8.65s/it] {'loss': 0.0726, 'grad_norm': 1.0478328466415405, 'learning_rate': 3.815209950098664e-05, 'epoch': 1.64} 16%|█▋ | 6756/41250 [16:19:27<82:55:43, 8.65s/it][2025-04-26 00:17:10,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-26 00:17:10,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.49 | bwd_microstep: 5718.89 | bwd_inner_microstep: 5706.11 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.84 [2025-04-26 
00:17:10,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.49 | bwd: 5718.90 | bwd_inner: 5706.11 | bwd_allreduce: 12.75 | step: 18.85 16%|█▋ | 6757/41250 [16:19:36<82:55:08, 8.65s/it] {'loss': 0.0697, 'grad_norm': 1.9335765838623047, 'learning_rate': 3.81514401829118e-05, 'epoch': 1.64} 16%|█▋ | 6757/41250 [16:19:36<82:55:08, 8.65s/it][2025-04-26 00:17:19,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:17:19,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.07 | bwd_microstep: 5763.60 | bwd_inner_microstep: 5751.01 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.82 [2025-04-26 00:17:19,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.07 | bwd: 5763.61 | bwd_inner: 5751.01 | bwd_allreduce: 12.56 | step: 18.82 16%|█▋ | 6758/41250 [16:19:44<83:07:02, 8.68s/it] {'loss': 0.1012, 'grad_norm': 1.150116205215454, 'learning_rate': 3.8150780752936903e-05, 'epoch': 1.64} 16%|█▋ | 6758/41250 [16:19:44<83:07:02, 8.68s/it][2025-04-26 00:17:28,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.04 | optimizer_step: 0.93 [2025-04-26 00:17:28,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.31 | bwd_microstep: 5734.63 | bwd_inner_microstep: 5712.86 | bwd_allreduce_microstep: 21.72 | step_microstep: 19.12 [2025-04-26 00:17:28,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.31 | bwd: 5734.64 | bwd_inner: 5712.86 | bwd_allreduce: 21.74 | step: 19.12 16%|█▋ | 6759/41250 [16:19:53<83:04:44, 8.67s/it] {'loss': 0.2151, 'grad_norm': 3.717503786087036, 'learning_rate': 3.815012121106602e-05, 'epoch': 1.64} 16%|█▋ | 6759/41250 [16:19:53<83:04:44, 8.67s/it][2025-04-26 00:17:36,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.90 
[2025-04-26 00:17:36,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.21 | bwd_microstep: 5730.84 | bwd_inner_microstep: 5646.15 | bwd_allreduce_microstep: 84.65 | step_microstep: 18.43 [2025-04-26 00:17:36,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.21 | bwd: 5730.85 | bwd_inner: 5646.15 | bwd_allreduce: 84.66 | step: 18.43 16%|█▋ | 6760/41250 [16:20:02<83:02:22, 8.67s/it] {'loss': 0.0525, 'grad_norm': 0.779930591583252, 'learning_rate': 3.81494615573032e-05, 'epoch': 1.64} 16%|█▋ | 6760/41250 [16:20:02<83:02:22, 8.67s/it][2025-04-26 00:17:45,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:17:45,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.07 | bwd_microstep: 5704.50 | bwd_inner_microstep: 5652.12 | bwd_allreduce_microstep: 52.33 | step_microstep: 18.58 [2025-04-26 00:17:45,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.07 | bwd: 5704.52 | bwd_inner: 5652.12 | bwd_allreduce: 52.36 | step: 18.58 16%|█▋ | 6761/41250 [16:20:10<82:51:34, 8.65s/it] {'loss': 0.224, 'grad_norm': 2.4295647144317627, 'learning_rate': 3.8148801791652516e-05, 'epoch': 1.64} 16%|█▋ | 6761/41250 [16:20:10<82:51:34, 8.65s/it][2025-04-26 00:17:54,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 00:17:54,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.83 | bwd_microstep: 5771.51 | bwd_inner_microstep: 5683.26 | bwd_allreduce_microstep: 88.21 | step_microstep: 19.06 [2025-04-26 00:17:54,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.83 | bwd: 5771.53 | bwd_inner: 5683.26 | bwd_allreduce: 88.23 | step: 19.06 16%|█▋ | 6762/41250 [16:20:19<82:59:17, 8.66s/it] {'loss': 0.0574, 'grad_norm': 1.3573923110961914, 'learning_rate': 
3.8148141914118045e-05, 'epoch': 1.64} 16%|█▋ | 6762/41250 [16:20:19<82:59:17, 8.66s/it][2025-04-26 00:18:02,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-26 00:18:02,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.15 | bwd_microstep: 5742.54 | bwd_inner_microstep: 5695.95 | bwd_allreduce_microstep: 46.54 | step_microstep: 18.47 [2025-04-26 00:18:02,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.15 | bwd: 5742.55 | bwd_inner: 5695.95 | bwd_allreduce: 46.56 | step: 18.47 16%|█▋ | 6763/41250 [16:20:28<83:03:37, 8.67s/it] {'loss': 0.1255, 'grad_norm': 1.2305967807769775, 'learning_rate': 3.8147481924703844e-05, 'epoch': 1.64} 16%|█▋ | 6763/41250 [16:20:28<83:03:37, 8.67s/it][2025-04-26 00:18:11,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 00:18:11,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.15 | bwd_microstep: 5879.32 | bwd_inner_microstep: 5681.10 | bwd_allreduce_microstep: 198.17 | step_microstep: 18.76 [2025-04-26 00:18:11,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.15 | bwd: 5879.34 | bwd_inner: 5681.10 | bwd_allreduce: 198.19 | step: 18.76 16%|█▋ | 6764/41250 [16:20:37<83:26:40, 8.71s/it] {'loss': 0.0555, 'grad_norm': 0.7385594844818115, 'learning_rate': 3.8146821823413985e-05, 'epoch': 1.64} 16%|█▋ | 6764/41250 [16:20:37<83:26:40, 8.71s/it][2025-04-26 00:18:20,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:18:20,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.50 | bwd_microstep: 5744.98 | bwd_inner_microstep: 5692.45 | bwd_allreduce_microstep: 52.49 | step_microstep: 18.65 [2025-04-26 00:18:20,464] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.50 | bwd: 5745.00 | bwd_inner: 5692.45 | bwd_allreduce: 52.51 | step: 18.65 16%|█▋ | 6765/41250 [16:20:45<83:21:50, 8.70s/it] {'loss': 0.0933, 'grad_norm': 2.2985126972198486, 'learning_rate': 3.814616161025255e-05, 'epoch': 1.64} 16%|█▋ | 6765/41250 [16:20:45<83:21:50, 8.70s/it][2025-04-26 00:18:29,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:18:29,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.90 | bwd_microstep: 5792.75 | bwd_inner_microstep: 5638.80 | bwd_allreduce_microstep: 153.90 | step_microstep: 18.35 [2025-04-26 00:18:29,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.90 | bwd: 5792.76 | bwd_inner: 5638.80 | bwd_allreduce: 153.92 | step: 18.35 16%|█▋ | 6766/41250 [16:20:54<83:21:06, 8.70s/it] {'loss': 0.1498, 'grad_norm': 2.7651426792144775, 'learning_rate': 3.8145501285223585e-05, 'epoch': 1.64} 16%|█▋ | 6766/41250 [16:20:54<83:21:06, 8.70s/it][2025-04-26 00:18:37,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 00:18:37,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.09 | bwd_microstep: 5742.31 | bwd_inner_microstep: 5682.33 | bwd_allreduce_microstep: 59.93 | step_microstep: 18.88 [2025-04-26 00:18:37,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.09 | bwd: 5742.32 | bwd_inner: 5682.33 | bwd_allreduce: 59.95 | step: 18.89 16%|█▋ | 6767/41250 [16:21:03<83:15:20, 8.69s/it] {'loss': 0.1874, 'grad_norm': 2.5653574466705322, 'learning_rate': 3.814484084833118e-05, 'epoch': 1.64} 16%|█▋ | 6767/41250 [16:21:03<83:15:20, 8.69s/it][2025-04-26 00:18:46,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 
00:18:46,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.23 | bwd_microstep: 5741.12 | bwd_inner_microstep: 5655.18 | bwd_allreduce_microstep: 85.89 | step_microstep: 18.36 [2025-04-26 00:18:46,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.23 | bwd: 5741.13 | bwd_inner: 5655.18 | bwd_allreduce: 85.91 | step: 18.36 16%|█▋ | 6768/41250 [16:21:11<83:09:57, 8.68s/it] {'loss': 0.0447, 'grad_norm': 2.8459560871124268, 'learning_rate': 3.81441802995794e-05, 'epoch': 1.64} 16%|█▋ | 6768/41250 [16:21:11<83:09:57, 8.68s/it][2025-04-26 00:18:55,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 00:18:55,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.67 | bwd_microstep: 5687.38 | bwd_inner_microstep: 5649.31 | bwd_allreduce_microstep: 38.02 | step_microstep: 18.75 [2025-04-26 00:18:55,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.67 | bwd: 5687.39 | bwd_inner: 5649.31 | bwd_allreduce: 38.03 | step: 18.75 16%|█▋ | 6769/41250 [16:21:20<82:57:15, 8.66s/it] {'loss': 0.2045, 'grad_norm': 3.293626308441162, 'learning_rate': 3.814351963897231e-05, 'epoch': 1.64} 16%|█▋ | 6769/41250 [16:21:20<82:57:15, 8.66s/it][2025-04-26 00:19:03,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:19:03,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.70 | bwd_microstep: 5772.81 | bwd_inner_microstep: 5641.10 | bwd_allreduce_microstep: 131.66 | step_microstep: 18.74 [2025-04-26 00:19:03,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.70 | bwd: 5772.82 | bwd_inner: 5641.10 | bwd_allreduce: 131.68 | step: 18.74 16%|█▋ | 6770/41250 [16:21:29<83:01:39, 8.67s/it] {'loss': 0.375, 'grad_norm': 3.5655930042266846, 'learning_rate': 3.8142858866513996e-05, 
'epoch': 1.64} 16%|█▋ | 6770/41250 [16:21:29<83:01:39, 8.67s/it][2025-04-26 00:19:12,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:19:12,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.28 | bwd_microstep: 5710.55 | bwd_inner_microstep: 5697.65 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.50 [2025-04-26 00:19:12,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.28 | bwd: 5710.56 | bwd_inner: 5697.65 | bwd_allreduce: 12.87 | step: 18.50 16%|█▋ | 6771/41250 [16:21:37<82:58:32, 8.66s/it] {'loss': 0.1035, 'grad_norm': 0.9993962645530701, 'learning_rate': 3.814219798220853e-05, 'epoch': 1.64} 16%|█▋ | 6771/41250 [16:21:37<82:58:32, 8.66s/it][2025-04-26 00:19:21,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:19:21,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.45 | bwd_microstep: 5768.54 | bwd_inner_microstep: 5630.58 | bwd_allreduce_microstep: 137.92 | step_microstep: 18.44 [2025-04-26 00:19:21,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.45 | bwd: 5768.56 | bwd_inner: 5630.58 | bwd_allreduce: 137.94 | step: 18.44 16%|█▋ | 6772/41250 [16:21:46<83:01:45, 8.67s/it] {'loss': 0.2285, 'grad_norm': 2.0479395389556885, 'learning_rate': 3.8141536986059974e-05, 'epoch': 1.64} 16%|█▋ | 6772/41250 [16:21:46<83:01:45, 8.67s/it][2025-04-26 00:19:29,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:19:29,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.98 | bwd_microstep: 5750.81 | bwd_inner_microstep: 5690.97 | bwd_allreduce_microstep: 59.79 | step_microstep: 18.66 [2025-04-26 00:19:29,825] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2859.98 | bwd: 5750.82 | bwd_inner: 5690.97 | bwd_allreduce: 59.81 | step: 18.66 16%|█▋ | 6773/41250 [16:21:55<83:07:15, 8.68s/it] {'loss': 0.0225, 'grad_norm': 0.4953915476799011, 'learning_rate': 3.814087587807242e-05, 'epoch': 1.64} 16%|█▋ | 6773/41250 [16:21:55<83:07:15, 8.68s/it][2025-04-26 00:19:38,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-26 00:19:38,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.66 | bwd_microstep: 5750.46 | bwd_inner_microstep: 5682.97 | bwd_allreduce_microstep: 67.44 | step_microstep: 19.29 [2025-04-26 00:19:38,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.66 | bwd: 5750.47 | bwd_inner: 5682.97 | bwd_allreduce: 67.46 | step: 19.30 16%|█▋ | 6774/41250 [16:22:03<83:06:25, 8.68s/it] {'loss': 0.1855, 'grad_norm': 1.544228434562683, 'learning_rate': 3.8140214658249926e-05, 'epoch': 1.64} 16%|█▋ | 6774/41250 [16:22:03<83:06:25, 8.68s/it][2025-04-26 00:19:47,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.93 [2025-04-26 00:19:47,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.30 | bwd_microstep: 5779.08 | bwd_inner_microstep: 5657.25 | bwd_allreduce_microstep: 121.79 | step_microstep: 18.51 [2025-04-26 00:19:47,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.30 | bwd: 5779.09 | bwd_inner: 5657.25 | bwd_allreduce: 121.81 | step: 18.51 16%|█▋ | 6775/41250 [16:22:12<83:09:13, 8.68s/it] {'loss': 0.1051, 'grad_norm': 1.5925095081329346, 'learning_rate': 3.813955332659658e-05, 'epoch': 1.64} 16%|█▋ | 6775/41250 [16:22:12<83:09:13, 8.68s/it][2025-04-26 00:19:55,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 00:19:55,853] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.43 | bwd_microstep: 5719.34 | bwd_inner_microstep: 5706.61 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.54 [2025-04-26 00:19:55,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.43 | bwd: 5719.35 | bwd_inner: 5706.61 | bwd_allreduce: 12.70 | step: 18.54 16%|█▋ | 6776/41250 [16:22:21<83:04:13, 8.67s/it] {'loss': 0.1911, 'grad_norm': 2.946005344390869, 'learning_rate': 3.813889188311645e-05, 'epoch': 1.64} 16%|█▋ | 6776/41250 [16:22:21<83:04:13, 8.67s/it][2025-04-26 00:20:04,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:20:04,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.40 | bwd_microstep: 5719.42 | bwd_inner_microstep: 5685.44 | bwd_allreduce_microstep: 33.94 | step_microstep: 18.22 [2025-04-26 00:20:04,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.40 | bwd: 5719.44 | bwd_inner: 5685.44 | bwd_allreduce: 33.95 | step: 18.22 16%|█▋ | 6777/41250 [16:22:29<82:59:11, 8.67s/it] {'loss': 0.2299, 'grad_norm': 4.1225996017456055, 'learning_rate': 3.8138230327813625e-05, 'epoch': 1.64} 16%|█▋ | 6777/41250 [16:22:29<82:59:11, 8.67s/it][2025-04-26 00:20:13,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:20:13,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.03 | bwd_microstep: 5751.50 | bwd_inner_microstep: 5709.41 | bwd_allreduce_microstep: 42.05 | step_microstep: 18.34 [2025-04-26 00:20:13,193] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.03 | bwd: 5751.51 | bwd_inner: 5709.41 | bwd_allreduce: 42.06 | step: 18.34 16%|█▋ | 6778/41250 [16:22:38<83:03:50, 8.67s/it] {'loss': 0.0533, 'grad_norm': 0.9580344557762146, 'learning_rate': 3.813756866069218e-05, 'epoch': 1.64} 16%|█▋ | 
6778/41250 [16:22:38<83:03:50, 8.67s/it][2025-04-26 00:20:21,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:20:21,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.68 | bwd_microstep: 5692.86 | bwd_inner_microstep: 5667.90 | bwd_allreduce_microstep: 24.92 | step_microstep: 18.91 [2025-04-26 00:20:21,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.68 | bwd: 5692.87 | bwd_inner: 5667.90 | bwd_allreduce: 24.93 | step: 18.92 16%|█▋ | 6779/41250 [16:22:47<82:53:36, 8.66s/it] {'loss': 0.2075, 'grad_norm': 1.6876612901687622, 'learning_rate': 3.813690688175618e-05, 'epoch': 1.64} 16%|█▋ | 6779/41250 [16:22:47<82:53:36, 8.66s/it][2025-04-26 00:20:30,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:20:30,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.86 | bwd_microstep: 5800.68 | bwd_inner_microstep: 5716.91 | bwd_allreduce_microstep: 83.72 | step_microstep: 18.69 [2025-04-26 00:20:30,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.86 | bwd: 5800.69 | bwd_inner: 5716.91 | bwd_allreduce: 83.74 | step: 18.69 16%|█▋ | 6780/41250 [16:22:55<83:07:48, 8.68s/it] {'loss': 0.0865, 'grad_norm': 1.0834929943084717, 'learning_rate': 3.8136244991009734e-05, 'epoch': 1.64} 16%|█▋ | 6780/41250 [16:22:55<83:07:48, 8.68s/it][2025-04-26 00:20:39,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 00:20:39,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.27 | bwd_microstep: 5770.33 | bwd_inner_microstep: 5662.08 | bwd_allreduce_microstep: 108.21 | step_microstep: 18.98 [2025-04-26 00:20:39,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.27 | 
bwd: 5770.35 | bwd_inner: 5662.08 | bwd_allreduce: 108.23 | step: 18.98 16%|█▋ | 6781/41250 [16:23:04<83:08:19, 8.68s/it] {'loss': 0.1403, 'grad_norm': 1.2756743431091309, 'learning_rate': 3.8135582988456894e-05, 'epoch': 1.64} 16%|█▋ | 6781/41250 [16:23:04<83:08:19, 8.68s/it][2025-04-26 00:20:48,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:20:48,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.71 | bwd_microstep: 6021.56 | bwd_inner_microstep: 5712.13 | bwd_allreduce_microstep: 309.38 | step_microstep: 18.64 [2025-04-26 00:20:48,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.71 | bwd: 6021.57 | bwd_inner: 5712.13 | bwd_allreduce: 309.40 | step: 18.64 16%|█▋ | 6782/41250 [16:23:13<83:56:49, 8.77s/it] {'loss': 0.3656, 'grad_norm': 3.213106393814087, 'learning_rate': 3.8134920874101756e-05, 'epoch': 1.64} 16%|█▋ | 6782/41250 [16:23:13<83:56:49, 8.77s/it][2025-04-26 00:20:56,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:20:56,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.31 | bwd_microstep: 5761.90 | bwd_inner_microstep: 5662.64 | bwd_allreduce_microstep: 99.22 | step_microstep: 18.55 [2025-04-26 00:20:56,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.31 | bwd: 5761.91 | bwd_inner: 5662.64 | bwd_allreduce: 99.23 | step: 18.55 16%|█▋ | 6783/41250 [16:23:22<83:42:06, 8.74s/it] {'loss': 0.0517, 'grad_norm': 1.0449382066726685, 'learning_rate': 3.8134258647948394e-05, 'epoch': 1.64} 16%|█▋ | 6783/41250 [16:23:22<83:42:06, 8.74s/it][2025-04-26 00:21:05,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:21:05,527] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd_microstep: 2845.05 | bwd_microstep: 5715.41 | bwd_inner_microstep: 5702.52 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.78 [2025-04-26 00:21:05,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.05 | bwd: 5715.42 | bwd_inner: 5702.52 | bwd_allreduce: 12.85 | step: 18.78 16%|█▋ | 6784/41250 [16:23:30<83:25:38, 8.71s/it] {'loss': 0.1475, 'grad_norm': 1.648278832435608, 'learning_rate': 3.813359631000089e-05, 'epoch': 1.64} 16%|█▋ | 6784/41250 [16:23:30<83:25:38, 8.71s/it][2025-04-26 00:21:14,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-26 00:21:14,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.87 | bwd_microstep: 5754.43 | bwd_inner_microstep: 5660.88 | bwd_allreduce_microstep: 93.50 | step_microstep: 19.16 [2025-04-26 00:21:14,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.87 | bwd: 5754.45 | bwd_inner: 5660.88 | bwd_allreduce: 93.52 | step: 19.16 16%|█▋ | 6785/41250 [16:23:39<83:19:45, 8.70s/it] {'loss': 0.0631, 'grad_norm': 1.2542098760604858, 'learning_rate': 3.813293386026334e-05, 'epoch': 1.64} 16%|█▋ | 6785/41250 [16:23:39<83:19:45, 8.70s/it][2025-04-26 00:21:22,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:21:22,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.89 | bwd_microstep: 5755.54 | bwd_inner_microstep: 5681.73 | bwd_allreduce_microstep: 73.77 | step_microstep: 18.55 [2025-04-26 00:21:22,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.89 | bwd: 5755.55 | bwd_inner: 5681.73 | bwd_allreduce: 73.78 | step: 18.55 16%|█▋ | 6786/41250 [16:23:48<83:17:54, 8.70s/it] {'loss': 0.1716, 'grad_norm': 1.9693660736083984, 'learning_rate': 3.8132271298739814e-05, 'epoch': 1.65} 16%|█▋ | 6786/41250 [16:23:48<83:17:54, 
8.70s/it][2025-04-26 00:21:31,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.30 | optimizer_step: 0.90 [2025-04-26 00:21:31,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.66 | bwd_microstep: 5703.60 | bwd_inner_microstep: 5647.53 | bwd_allreduce_microstep: 56.01 | step_microstep: 19.57 [2025-04-26 00:21:31,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.66 | bwd: 5703.61 | bwd_inner: 5647.53 | bwd_allreduce: 56.03 | step: 19.58 16%|█▋ | 6787/41250 [16:23:56<83:03:44, 8.68s/it] {'loss': 0.1512, 'grad_norm': 2.757805347442627, 'learning_rate': 3.8131608625434405e-05, 'epoch': 1.65} 16%|█▋ | 6787/41250 [16:23:56<83:03:44, 8.68s/it][2025-04-26 00:21:40,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.08 | optimizer_step: 0.96 [2025-04-26 00:21:40,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.70 | bwd_microstep: 5892.43 | bwd_inner_microstep: 5658.09 | bwd_allreduce_microstep: 234.30 | step_microstep: 18.93 [2025-04-26 00:21:40,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.70 | bwd: 5892.44 | bwd_inner: 5658.09 | bwd_allreduce: 234.31 | step: 18.94 16%|█▋ | 6788/41250 [16:24:05<83:25:30, 8.71s/it] {'loss': 0.0708, 'grad_norm': 2.520204544067383, 'learning_rate': 3.81309458403512e-05, 'epoch': 1.65} 16%|█▋ | 6788/41250 [16:24:05<83:25:30, 8.71s/it][2025-04-26 00:21:48,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:21:48,964] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.80 | bwd_microstep: 5726.55 | bwd_inner_microstep: 5646.01 | bwd_allreduce_microstep: 80.49 | step_microstep: 18.71 [2025-04-26 00:21:48,964] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.80 | bwd: 5726.56 | bwd_inner: 5646.01 | 
bwd_allreduce: 80.51 | step: 18.71 16%|█▋ | 6789/41250 [16:24:14<83:11:50, 8.69s/it] {'loss': 0.0594, 'grad_norm': 1.1258608102798462, 'learning_rate': 3.813028294349427e-05, 'epoch': 1.65} 16%|█▋ | 6789/41250 [16:24:14<83:11:50, 8.69s/it][2025-04-26 00:21:57,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-26 00:21:57,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.24 | bwd_microstep: 5755.00 | bwd_inner_microstep: 5662.87 | bwd_allreduce_microstep: 92.08 | step_microstep: 19.11 [2025-04-26 00:21:57,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.24 | bwd: 5755.01 | bwd_inner: 5662.87 | bwd_allreduce: 92.10 | step: 19.11 16%|█▋ | 6790/41250 [16:24:22<83:06:56, 8.68s/it] {'loss': 0.0263, 'grad_norm': 0.805087149143219, 'learning_rate': 3.8129619934867716e-05, 'epoch': 1.65} 16%|█▋ | 6790/41250 [16:24:22<83:06:56, 8.68s/it][2025-04-26 00:22:06,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 00:22:06,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.79 | bwd_microstep: 5714.38 | bwd_inner_microstep: 5650.31 | bwd_allreduce_microstep: 64.03 | step_microstep: 18.68 [2025-04-26 00:22:06,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.79 | bwd: 5714.40 | bwd_inner: 5650.31 | bwd_allreduce: 64.04 | step: 18.69 16%|█▋ | 6791/41250 [16:24:31<82:57:01, 8.67s/it] {'loss': 0.4458, 'grad_norm': 8.807673454284668, 'learning_rate': 3.8128956814475616e-05, 'epoch': 1.65} 16%|█▋ | 6791/41250 [16:24:31<82:57:01, 8.67s/it][2025-04-26 00:22:14,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:22:14,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.69 | 
bwd_microstep: 5707.45 | bwd_inner_microstep: 5694.94 | bwd_allreduce_microstep: 12.46 | step_microstep: 18.53 [2025-04-26 00:22:14,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.69 | bwd: 5707.46 | bwd_inner: 5694.94 | bwd_allreduce: 12.48 | step: 18.54 16%|█▋ | 6792/41250 [16:24:40<82:53:44, 8.66s/it] {'loss': 0.1195, 'grad_norm': 2.455313205718994, 'learning_rate': 3.812829358232207e-05, 'epoch': 1.65} 16%|█▋ | 6792/41250 [16:24:40<82:53:44, 8.66s/it][2025-04-26 00:22:23,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-26 00:22:23,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.10 | bwd_microstep: 5715.01 | bwd_inner_microstep: 5650.29 | bwd_allreduce_microstep: 64.67 | step_microstep: 18.62 [2025-04-26 00:22:23,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.10 | bwd: 5715.02 | bwd_inner: 5650.29 | bwd_allreduce: 64.68 | step: 18.62 16%|█▋ | 6793/41250 [16:24:48<82:47:11, 8.65s/it] {'loss': 0.0194, 'grad_norm': 0.40992021560668945, 'learning_rate': 3.812763023841116e-05, 'epoch': 1.65} 16%|█▋ | 6793/41250 [16:24:48<82:47:11, 8.65s/it][2025-04-26 00:22:32,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:22:32,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.93 | bwd_microstep: 5751.31 | bwd_inner_microstep: 5643.85 | bwd_allreduce_microstep: 107.41 | step_microstep: 18.86 [2025-04-26 00:22:32,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.93 | bwd: 5751.32 | bwd_inner: 5643.85 | bwd_allreduce: 107.43 | step: 18.86 16%|█▋ | 6794/41250 [16:24:57<82:48:48, 8.65s/it] {'loss': 0.1907, 'grad_norm': 28.006553649902344, 'learning_rate': 3.812696678274697e-05, 'epoch': 1.65} 16%|█▋ | 6794/41250 [16:24:57<82:48:48, 8.65s/it][2025-04-26 00:22:40,927] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:22:40,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.48 | bwd_microstep: 5774.13 | bwd_inner_microstep: 5761.48 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.69 [2025-04-26 00:22:40,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.48 | bwd: 5774.14 | bwd_inner: 5761.48 | bwd_allreduce: 12.62 | step: 18.70 16%|█▋ | 6795/41250 [16:25:06<83:03:34, 8.68s/it] {'loss': 0.1829, 'grad_norm': 1.4405107498168945, 'learning_rate': 3.81263032153336e-05, 'epoch': 1.65} 16%|█▋ | 6795/41250 [16:25:06<83:03:34, 8.68s/it][2025-04-26 00:22:49,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 00:22:49,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.71 | bwd_microstep: 5721.45 | bwd_inner_microstep: 5708.35 | bwd_allreduce_microstep: 13.05 | step_microstep: 18.53 [2025-04-26 00:22:49,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.71 | bwd: 5721.47 | bwd_inner: 5708.35 | bwd_allreduce: 13.07 | step: 18.53 16%|█▋ | 6796/41250 [16:25:14<83:00:26, 8.67s/it] {'loss': 0.0733, 'grad_norm': 2.497864007949829, 'learning_rate': 3.8125639536175135e-05, 'epoch': 1.65} 16%|█▋ | 6796/41250 [16:25:14<83:00:26, 8.67s/it][2025-04-26 00:22:58,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:22:58,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.63 | bwd_microstep: 5753.35 | bwd_inner_microstep: 5688.15 | bwd_allreduce_microstep: 65.15 | step_microstep: 17.92 [2025-04-26 00:22:58,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.63 | bwd: 5753.36 | bwd_inner: 5688.15 | bwd_allreduce: 65.17 | step: 17.93 
16%|█▋ | 6797/41250 [16:25:23<83:01:35, 8.68s/it] {'loss': 0.0943, 'grad_norm': 1.5368543863296509, 'learning_rate': 3.812497574527567e-05, 'epoch': 1.65} 16%|█▋ | 6797/41250 [16:25:23<83:01:35, 8.68s/it][2025-04-26 00:23:06,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.24 | optimizer_step: 1.03 [2025-04-26 00:23:06,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.63 | bwd_microstep: 5787.51 | bwd_inner_microstep: 5637.12 | bwd_allreduce_microstep: 150.35 | step_microstep: 19.72 [2025-04-26 00:23:06,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.63 | bwd: 5787.53 | bwd_inner: 5637.12 | bwd_allreduce: 150.37 | step: 19.72 16%|█▋ | 6798/41250 [16:25:32<83:04:14, 8.68s/it] {'loss': 0.2718, 'grad_norm': 1.5870431661605835, 'learning_rate': 3.8124311842639286e-05, 'epoch': 1.65} 16%|█▋ | 6798/41250 [16:25:32<83:04:14, 8.68s/it][2025-04-26 00:23:15,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:23:15,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.89 | bwd_microstep: 5704.69 | bwd_inner_microstep: 5691.80 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.26 [2025-04-26 00:23:15,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.89 | bwd: 5704.70 | bwd_inner: 5691.80 | bwd_allreduce: 12.86 | step: 18.27 16%|█▋ | 6799/41250 [16:25:40<82:55:41, 8.67s/it] {'loss': 0.1132, 'grad_norm': 2.7401764392852783, 'learning_rate': 3.812364782827009e-05, 'epoch': 1.65} 16%|█▋ | 6799/41250 [16:25:40<82:55:41, 8.67s/it][2025-04-26 00:23:24,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:23:24,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.78 | bwd_microstep: 5739.22 | 
bwd_inner_microstep: 5679.87 | bwd_allreduce_microstep: 59.30 | step_microstep: 18.43 [2025-04-26 00:23:24,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.78 | bwd: 5739.23 | bwd_inner: 5679.87 | bwd_allreduce: 59.32 | step: 18.43 16%|█▋ | 6800/41250 [16:25:49<82:56:09, 8.67s/it] {'loss': 0.1516, 'grad_norm': 1.6352832317352295, 'learning_rate': 3.812298370217217e-05, 'epoch': 1.65} 16%|█▋ | 6800/41250 [16:25:49<82:56:09, 8.67s/it][2025-04-26 00:23:32,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:23:32,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.56 | bwd_microstep: 5780.18 | bwd_inner_microstep: 5648.97 | bwd_allreduce_microstep: 131.16 | step_microstep: 18.66 [2025-04-26 00:23:32,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.56 | bwd: 5780.19 | bwd_inner: 5648.97 | bwd_allreduce: 131.18 | step: 18.67 16%|█▋ | 6801/41250 [16:25:58<82:58:55, 8.67s/it] {'loss': 0.1891, 'grad_norm': 2.41806960105896, 'learning_rate': 3.812231946434961e-05, 'epoch': 1.65} 16%|█▋ | 6801/41250 [16:25:58<82:58:55, 8.67s/it][2025-04-26 00:23:41,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:23:41,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.00 | bwd_microstep: 5742.82 | bwd_inner_microstep: 5634.62 | bwd_allreduce_microstep: 108.16 | step_microstep: 18.56 [2025-04-26 00:23:41,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.00 | bwd: 5742.83 | bwd_inner: 5634.61 | bwd_allreduce: 108.18 | step: 18.56 16%|█▋ | 6802/41250 [16:26:06<82:54:58, 8.67s/it] {'loss': 0.095, 'grad_norm': 1.1496381759643555, 'learning_rate': 3.812165511480653e-05, 'epoch': 1.65} 16%|█▋ | 6802/41250 [16:26:06<82:54:58, 8.67s/it][2025-04-26 00:23:50,278] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:23:50,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.47 | bwd_microstep: 5755.93 | bwd_inner_microstep: 5677.26 | bwd_allreduce_microstep: 78.63 | step_microstep: 18.48 [2025-04-26 00:23:50,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.47 | bwd: 5755.95 | bwd_inner: 5677.26 | bwd_allreduce: 78.65 | step: 18.48 16%|█▋ | 6803/41250 [16:26:15<82:58:11, 8.67s/it] {'loss': 0.1978, 'grad_norm': 2.161158323287964, 'learning_rate': 3.8120990653547e-05, 'epoch': 1.65} 16%|█▋ | 6803/41250 [16:26:15<82:58:11, 8.67s/it][2025-04-26 00:23:59,074] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.05 | optimizer_step: 1.10 [2025-04-26 00:23:59,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.27 | bwd_microstep: 5876.07 | bwd_inner_microstep: 5669.36 | bwd_allreduce_microstep: 206.66 | step_microstep: 19.19 [2025-04-26 00:23:59,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.27 | bwd: 5876.09 | bwd_inner: 5669.36 | bwd_allreduce: 206.68 | step: 19.19 16%|█▋ | 6804/41250 [16:26:24<83:19:33, 8.71s/it] {'loss': 0.0684, 'grad_norm': 1.6350629329681396, 'learning_rate': 3.812032608057513e-05, 'epoch': 1.65} 16%|█▋ | 6804/41250 [16:26:24<83:19:33, 8.71s/it][2025-04-26 00:24:07,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.93 | optimizer_step: 0.89 [2025-04-26 00:24:07,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2814.28 | bwd_microstep: 5872.23 | bwd_inner_microstep: 5631.50 | bwd_allreduce_microstep: 240.69 | step_microstep: 18.33 [2025-04-26 00:24:07,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2814.28 | bwd: 5872.25 | bwd_inner: 5631.50 | bwd_allreduce: 240.70 | step: 18.34 16%|█▋ 
| 6805/41250 [16:26:33<83:29:45, 8.73s/it] {'loss': 0.0644, 'grad_norm': 2.1302621364593506, 'learning_rate': 3.811966139589502e-05, 'epoch': 1.65} 16%|█▋ | 6805/41250 [16:26:33<83:29:45, 8.73s/it][2025-04-26 00:24:16,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.86 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:24:16,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2813.65 | bwd_microstep: 5794.08 | bwd_inner_microstep: 5625.53 | bwd_allreduce_microstep: 168.51 | step_microstep: 17.83 [2025-04-26 00:24:16,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2813.65 | bwd: 5794.10 | bwd_inner: 5625.53 | bwd_allreduce: 168.53 | step: 17.83 16%|█▋ | 6806/41250 [16:26:41<83:23:32, 8.72s/it] {'loss': 0.1323, 'grad_norm': 4.455916404724121, 'learning_rate': 3.811899659951076e-05, 'epoch': 1.65} 16%|█▋ | 6806/41250 [16:26:41<83:23:32, 8.72s/it][2025-04-26 00:24:25,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 00:24:25,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.39 | bwd_microstep: 5747.10 | bwd_inner_microstep: 5676.43 | bwd_allreduce_microstep: 70.63 | step_microstep: 18.62 [2025-04-26 00:24:25,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.39 | bwd: 5747.12 | bwd_inner: 5676.43 | bwd_allreduce: 70.65 | step: 18.63 17%|█▋ | 6807/41250 [16:26:50<83:15:39, 8.70s/it] {'loss': 0.0209, 'grad_norm': 0.5975077152252197, 'learning_rate': 3.8118331691426444e-05, 'epoch': 1.65} 17%|█▋ | 6807/41250 [16:26:50<83:15:39, 8.70s/it][2025-04-26 00:24:33,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:24:33,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.53 | bwd_microstep: 5757.13 | bwd_inner_microstep: 
5676.87 | bwd_allreduce_microstep: 80.21 | step_microstep: 18.46 [2025-04-26 00:24:33,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.53 | bwd: 5757.15 | bwd_inner: 5676.87 | bwd_allreduce: 80.23 | step: 18.46 17%|█▋ | 6808/41250 [16:26:59<83:11:51, 8.70s/it] {'loss': 0.0426, 'grad_norm': 1.2046566009521484, 'learning_rate': 3.811766667164618e-05, 'epoch': 1.65} 17%|█▋ | 6808/41250 [16:26:59<83:11:51, 8.70s/it][2025-04-26 00:24:42,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 00:24:42,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.44 | bwd_microstep: 5746.63 | bwd_inner_microstep: 5695.97 | bwd_allreduce_microstep: 50.62 | step_microstep: 18.73 [2025-04-26 00:24:42,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.44 | bwd: 5746.65 | bwd_inner: 5695.97 | bwd_allreduce: 50.64 | step: 18.74 17%|█▋ | 6809/41250 [16:27:07<83:09:42, 8.69s/it] {'loss': 0.0544, 'grad_norm': 1.442014455795288, 'learning_rate': 3.811700154017406e-05, 'epoch': 1.65} 17%|█▋ | 6809/41250 [16:27:07<83:09:42, 8.69s/it][2025-04-26 00:24:51,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:24:51,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.61 | bwd_microstep: 5716.94 | bwd_inner_microstep: 5692.39 | bwd_allreduce_microstep: 24.50 | step_microstep: 18.86 [2025-04-26 00:24:51,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.61 | bwd: 5716.95 | bwd_inner: 5692.39 | bwd_allreduce: 24.52 | step: 18.86 17%|█▋ | 6810/41250 [16:27:16<83:03:10, 8.68s/it] {'loss': 0.1038, 'grad_norm': 3.0665338039398193, 'learning_rate': 3.8116336297014195e-05, 'epoch': 1.65} 17%|█▋ | 6810/41250 [16:27:16<83:03:10, 8.68s/it][2025-04-26 00:24:59,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:24:59,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.90 | bwd_microstep: 5711.64 | bwd_inner_microstep: 5698.89 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.52 [2025-04-26 00:24:59,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.90 | bwd: 5711.65 | bwd_inner: 5698.89 | bwd_allreduce: 12.72 | step: 18.53 17%|█▋ | 6811/41250 [16:27:25<82:56:34, 8.67s/it] {'loss': 0.1365, 'grad_norm': 1.8964636325836182, 'learning_rate': 3.811567094217068e-05, 'epoch': 1.65} 17%|█▋ | 6811/41250 [16:27:25<82:56:34, 8.67s/it][2025-04-26 00:25:08,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.09 | optimizer_step: 0.98 [2025-04-26 00:25:08,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.06 | bwd_microstep: 5906.02 | bwd_inner_microstep: 5633.94 | bwd_allreduce_microstep: 272.03 | step_microstep: 18.74 [2025-04-26 00:25:08,687] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.06 | bwd: 5906.03 | bwd_inner: 5633.94 | bwd_allreduce: 272.05 | step: 18.74 17%|█▋ | 6812/41250 [16:27:34<83:21:36, 8.71s/it] {'loss': 0.1065, 'grad_norm': 1.6985102891921997, 'learning_rate': 3.8115005475647615e-05, 'epoch': 1.65} 17%|█▋ | 6812/41250 [16:27:34<83:21:36, 8.71s/it][2025-04-26 00:25:17,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 1.05 [2025-04-26 00:25:17,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.75 | bwd_microstep: 5687.87 | bwd_inner_microstep: 5647.12 | bwd_allreduce_microstep: 40.70 | step_microstep: 18.73 [2025-04-26 00:25:17,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.75 | bwd: 5687.88 | bwd_inner: 5647.12 | bwd_allreduce: 40.72 | step: 18.74 17%|█▋ | 6813/41250 [16:27:42<83:01:36, 8.68s/it] 
{'loss': 0.1363, 'grad_norm': 1.9155409336090088, 'learning_rate': 3.81143398974491e-05, 'epoch': 1.65} 17%|█▋ | 6813/41250 [16:27:42<83:01:36, 8.68s/it][2025-04-26 00:25:25,940] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 00:25:25,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.78 | bwd_microstep: 5719.41 | bwd_inner_microstep: 5706.36 | bwd_allreduce_microstep: 12.99 | step_microstep: 18.79 [2025-04-26 00:25:25,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.78 | bwd: 5719.43 | bwd_inner: 5706.36 | bwd_allreduce: 13.01 | step: 18.79 17%|█▋ | 6814/41250 [16:27:51<82:58:01, 8.67s/it] {'loss': 0.1328, 'grad_norm': 2.841989517211914, 'learning_rate': 3.8113674207579243e-05, 'epoch': 1.65} 17%|█▋ | 6814/41250 [16:27:51<82:58:01, 8.67s/it][2025-04-26 00:25:34,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:25:34,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.39 | bwd_microstep: 5758.06 | bwd_inner_microstep: 5704.71 | bwd_allreduce_microstep: 53.31 | step_microstep: 18.62 [2025-04-26 00:25:34,638] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.39 | bwd: 5758.07 | bwd_inner: 5704.71 | bwd_allreduce: 53.33 | step: 18.62 17%|█▋ | 6815/41250 [16:27:59<83:01:06, 8.68s/it] {'loss': 0.1093, 'grad_norm': 2.2548348903656006, 'learning_rate': 3.811300840604216e-05, 'epoch': 1.65} 17%|█▋ | 6815/41250 [16:27:59<83:01:06, 8.68s/it][2025-04-26 00:25:43,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:25:43,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.49 | bwd_microstep: 5714.33 | bwd_inner_microstep: 5701.48 | bwd_allreduce_microstep: 12.81 | 
step_microstep: 18.83 [2025-04-26 00:25:43,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.49 | bwd: 5714.35 | bwd_inner: 5701.48 | bwd_allreduce: 12.82 | step: 18.83 17%|█▋ | 6816/41250 [16:28:08<82:56:46, 8.67s/it] {'loss': 0.1228, 'grad_norm': 1.9409748315811157, 'learning_rate': 3.811234249284193e-05, 'epoch': 1.65} 17%|█▋ | 6816/41250 [16:28:08<82:56:46, 8.67s/it][2025-04-26 00:25:51,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-26 00:25:51,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.13 | bwd_microstep: 5784.50 | bwd_inner_microstep: 5659.50 | bwd_allreduce_microstep: 124.95 | step_microstep: 19.38 [2025-04-26 00:25:51,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.13 | bwd: 5784.51 | bwd_inner: 5659.50 | bwd_allreduce: 124.97 | step: 19.38 17%|█▋ | 6817/41250 [16:28:17<83:01:19, 8.68s/it] {'loss': 0.044, 'grad_norm': 1.197939157485962, 'learning_rate': 3.811167646798267e-05, 'epoch': 1.65} 17%|█▋ | 6817/41250 [16:28:17<83:01:19, 8.68s/it][2025-04-26 00:26:00,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:26:00,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.80 | bwd_microstep: 5696.87 | bwd_inner_microstep: 5676.29 | bwd_allreduce_microstep: 20.54 | step_microstep: 18.78 [2025-04-26 00:26:00,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.80 | bwd: 5696.88 | bwd_inner: 5676.29 | bwd_allreduce: 20.55 | step: 18.78 17%|█▋ | 6818/41250 [16:28:25<82:51:28, 8.66s/it] {'loss': 0.3802, 'grad_norm': 1.9465070962905884, 'learning_rate': 3.81110103314685e-05, 'epoch': 1.65} 17%|█▋ | 6818/41250 [16:28:25<82:51:28, 8.66s/it][2025-04-26 00:26:09,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | 
optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:26:09,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.22 | bwd_microstep: 5808.33 | bwd_inner_microstep: 5657.05 | bwd_allreduce_microstep: 151.22 | step_microstep: 19.04 [2025-04-26 00:26:09,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.22 | bwd: 5808.35 | bwd_inner: 5657.05 | bwd_allreduce: 151.24 | step: 19.05 17%|█▋ | 6819/41250 [16:28:34<83:03:07, 8.68s/it] {'loss': 0.2203, 'grad_norm': 2.119314193725586, 'learning_rate': 3.8110344083303495e-05, 'epoch': 1.65} 17%|█▋ | 6819/41250 [16:28:34<83:03:07, 8.68s/it][2025-04-26 00:26:18,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:26:18,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.92 | bwd_microstep: 5767.85 | bwd_inner_microstep: 5644.20 | bwd_allreduce_microstep: 123.62 | step_microstep: 18.04 [2025-04-26 00:26:18,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.92 | bwd: 5767.87 | bwd_inner: 5644.19 | bwd_allreduce: 123.63 | step: 18.04 17%|█▋ | 6820/41250 [16:28:43<83:01:54, 8.68s/it] {'loss': 0.1831, 'grad_norm': 2.146859884262085, 'learning_rate': 3.810967772349179e-05, 'epoch': 1.65} 17%|█▋ | 6820/41250 [16:28:43<83:01:54, 8.68s/it][2025-04-26 00:26:26,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:26:26,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2871.34 | bwd_microstep: 5721.30 | bwd_inner_microstep: 5708.64 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.50 [2025-04-26 00:26:26,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2871.34 | bwd: 5721.31 | bwd_inner: 5708.64 | bwd_allreduce: 12.63 | step: 18.50 17%|█▋ | 6821/41250 [16:28:52<83:00:40, 8.68s/it] {'loss': 0.3641, 'grad_norm': 
3.448904514312744, 'learning_rate': 3.8109011252037487e-05, 'epoch': 1.65} 17%|█▋ | 6821/41250 [16:28:52<83:00:40, 8.68s/it][2025-04-26 00:26:35,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 1.13 [2025-04-26 00:26:35,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.16 | bwd_microstep: 5789.29 | bwd_inner_microstep: 5697.74 | bwd_allreduce_microstep: 91.50 | step_microstep: 19.14 [2025-04-26 00:26:35,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.16 | bwd: 5789.30 | bwd_inner: 5697.74 | bwd_allreduce: 91.52 | step: 19.15 17%|█▋ | 6822/41250 [16:29:00<83:07:15, 8.69s/it] {'loss': 0.1253, 'grad_norm': 2.498757839202881, 'learning_rate': 3.810834466894469e-05, 'epoch': 1.65} 17%|█▋ | 6822/41250 [16:29:00<83:07:15, 8.69s/it][2025-04-26 00:26:44,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:26:44,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.98 | bwd_microstep: 5772.84 | bwd_inner_microstep: 5701.86 | bwd_allreduce_microstep: 70.94 | step_microstep: 18.19 [2025-04-26 00:26:44,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.98 | bwd: 5772.85 | bwd_inner: 5701.86 | bwd_allreduce: 70.96 | step: 18.19 17%|█▋ | 6823/41250 [16:29:09<83:09:21, 8.70s/it] {'loss': 0.0992, 'grad_norm': 1.4623663425445557, 'learning_rate': 3.810767797421751e-05, 'epoch': 1.65} 17%|█▋ | 6823/41250 [16:29:09<83:09:21, 8.70s/it][2025-04-26 00:26:52,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 00:26:52,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.65 | bwd_microstep: 5769.47 | bwd_inner_microstep: 5667.12 | bwd_allreduce_microstep: 102.29 | step_microstep: 19.06 [2025-04-26 
00:26:52,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.65 | bwd: 5769.48 | bwd_inner: 5667.12 | bwd_allreduce: 102.32 | step: 19.07 17%|█▋ | 6824/41250 [16:29:18<83:08:09, 8.69s/it] {'loss': 0.1744, 'grad_norm': 1.8846700191497803, 'learning_rate': 3.810701116786005e-05, 'epoch': 1.65} 17%|█▋ | 6824/41250 [16:29:18<83:08:09, 8.69s/it][2025-04-26 00:27:01,456] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:27:01,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.32 | bwd_microstep: 5717.32 | bwd_inner_microstep: 5648.95 | bwd_allreduce_microstep: 68.33 | step_microstep: 18.33 [2025-04-26 00:27:01,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.32 | bwd: 5717.33 | bwd_inner: 5648.95 | bwd_allreduce: 68.34 | step: 18.33 17%|█▋ | 6825/41250 [16:29:26<82:59:13, 8.68s/it] {'loss': 0.0298, 'grad_norm': 0.34940552711486816, 'learning_rate': 3.810634424987644e-05, 'epoch': 1.65} 17%|█▋ | 6825/41250 [16:29:26<82:59:13, 8.68s/it][2025-04-26 00:27:10,136] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:27:10,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.34 | bwd_microstep: 5770.12 | bwd_inner_microstep: 5652.19 | bwd_allreduce_microstep: 117.89 | step_microstep: 18.78 [2025-04-26 00:27:10,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.34 | bwd: 5770.13 | bwd_inner: 5652.19 | bwd_allreduce: 117.90 | step: 18.78 17%|█▋ | 6826/41250 [16:29:35<82:59:23, 8.68s/it] {'loss': 0.1156, 'grad_norm': 1.1530327796936035, 'learning_rate': 3.8105677220270776e-05, 'epoch': 1.65} 17%|█▋ | 6826/41250 [16:29:35<82:59:23, 8.68s/it][2025-04-26 00:27:18,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | 
optimizer_step: 0.89 [2025-04-26 00:27:18,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.84 | bwd_microstep: 5916.63 | bwd_inner_microstep: 5653.71 | bwd_allreduce_microstep: 262.87 | step_microstep: 18.75 [2025-04-26 00:27:18,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.84 | bwd: 5916.64 | bwd_inner: 5653.71 | bwd_allreduce: 262.89 | step: 18.75 17%|█▋ | 6827/41250 [16:29:44<83:25:15, 8.72s/it] {'loss': 0.0736, 'grad_norm': 1.4110562801361084, 'learning_rate': 3.8105010079047164e-05, 'epoch': 1.66} 17%|█▋ | 6827/41250 [16:29:44<83:25:15, 8.72s/it][2025-04-26 00:27:27,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-26 00:27:27,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.31 | bwd_microstep: 5745.36 | bwd_inner_microstep: 5719.32 | bwd_allreduce_microstep: 25.99 | step_microstep: 19.69 [2025-04-26 00:27:27,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.31 | bwd: 5745.37 | bwd_inner: 5719.32 | bwd_allreduce: 26.01 | step: 19.69 17%|█▋ | 6828/41250 [16:29:52<83:18:55, 8.71s/it] {'loss': 0.027, 'grad_norm': 0.564909040927887, 'learning_rate': 3.810434282620973e-05, 'epoch': 1.66} 17%|█▋ | 6828/41250 [16:29:52<83:18:55, 8.71s/it][2025-04-26 00:27:36,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 00:27:36,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.27 | bwd_microstep: 5710.58 | bwd_inner_microstep: 5656.38 | bwd_allreduce_microstep: 54.16 | step_microstep: 19.63 [2025-04-26 00:27:36,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.27 | bwd: 5710.60 | bwd_inner: 5656.38 | bwd_allreduce: 54.18 | step: 19.63 17%|█▋ | 6829/41250 [16:30:01<83:03:55, 8.69s/it] {'loss': 0.1588, 'grad_norm': 2.369352340698242, 
'learning_rate': 3.8103675461762584e-05, 'epoch': 1.66} 17%|█▋ | 6829/41250 [16:30:01<83:03:55, 8.69s/it][2025-04-26 00:27:44,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.92 [2025-04-26 00:27:44,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.00 | bwd_microstep: 5778.16 | bwd_inner_microstep: 5653.89 | bwd_allreduce_microstep: 124.22 | step_microstep: 19.14 [2025-04-26 00:27:44,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.00 | bwd: 5778.17 | bwd_inner: 5653.89 | bwd_allreduce: 124.24 | step: 19.15 17%|█▋ | 6830/41250 [16:30:10<83:05:00, 8.69s/it] {'loss': 0.1972, 'grad_norm': 2.595562219619751, 'learning_rate': 3.810300798570984e-05, 'epoch': 1.66} 17%|█▋ | 6830/41250 [16:30:10<83:05:00, 8.69s/it][2025-04-26 00:27:53,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:27:53,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.73 | bwd_microstep: 5801.25 | bwd_inner_microstep: 5654.41 | bwd_allreduce_microstep: 146.80 | step_microstep: 18.84 [2025-04-26 00:27:53,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.73 | bwd: 5801.26 | bwd_inner: 5654.41 | bwd_allreduce: 146.82 | step: 18.84 17%|█▋ | 6831/41250 [16:30:19<83:09:00, 8.70s/it] {'loss': 0.1909, 'grad_norm': 1.4146413803100586, 'learning_rate': 3.810234039805561e-05, 'epoch': 1.66} 17%|█▋ | 6831/41250 [16:30:19<83:09:00, 8.70s/it][2025-04-26 00:28:02,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-26 00:28:02,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.69 | bwd_microstep: 5721.77 | bwd_inner_microstep: 5709.01 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.45 [2025-04-26 00:28:02,350] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.69 | bwd: 5721.79 | bwd_inner: 5709.01 | bwd_allreduce: 12.74 | step: 19.45 17%|█▋ | 6832/41250 [16:30:27<83:02:54, 8.69s/it] {'loss': 0.1769, 'grad_norm': 2.5033743381500244, 'learning_rate': 3.8101672698804016e-05, 'epoch': 1.66} 17%|█▋ | 6832/41250 [16:30:27<83:02:54, 8.69s/it][2025-04-26 00:28:11,032] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.21 | optimizer_step: 0.92 [2025-04-26 00:28:11,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.24 | bwd_microstep: 5766.38 | bwd_inner_microstep: 5660.12 | bwd_allreduce_microstep: 106.21 | step_microstep: 19.17 [2025-04-26 00:28:11,033] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.24 | bwd: 5766.40 | bwd_inner: 5660.12 | bwd_allreduce: 106.23 | step: 19.17 17%|█▋ | 6833/41250 [16:30:36<83:01:52, 8.69s/it] {'loss': 0.0132, 'grad_norm': 0.3386911153793335, 'learning_rate': 3.810100488795917e-05, 'epoch': 1.66} 17%|█▋ | 6833/41250 [16:30:36<83:01:52, 8.69s/it][2025-04-26 00:28:19,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:28:19,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.69 | bwd_microstep: 5713.21 | bwd_inner_microstep: 5647.93 | bwd_allreduce_microstep: 65.23 | step_microstep: 18.72 [2025-04-26 00:28:19,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.69 | bwd: 5713.22 | bwd_inner: 5647.93 | bwd_allreduce: 65.24 | step: 18.72 17%|█▋ | 6834/41250 [16:30:44<82:50:54, 8.67s/it] {'loss': 0.0689, 'grad_norm': 1.3846358060836792, 'learning_rate': 3.810033696552519e-05, 'epoch': 1.66} 17%|█▋ | 6834/41250 [16:30:44<82:50:54, 8.67s/it][2025-04-26 00:28:28,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.14 | optimizer_step: 0.91 [2025-04-26 
00:28:28,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.81 | bwd_microstep: 5771.77 | bwd_inner_microstep: 5654.83 | bwd_allreduce_microstep: 116.88 | step_microstep: 18.80 [2025-04-26 00:28:28,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.81 | bwd: 5771.78 | bwd_inner: 5654.83 | bwd_allreduce: 116.90 | step: 18.80 17%|█▋ | 6835/41250 [16:30:53<82:53:12, 8.67s/it] {'loss': 0.1618, 'grad_norm': 1.5245182514190674, 'learning_rate': 3.809966893150619e-05, 'epoch': 1.66} 17%|█▋ | 6835/41250 [16:30:53<82:53:12, 8.67s/it][2025-04-26 00:28:37,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 00:28:37,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.77 | bwd_microstep: 5764.41 | bwd_inner_microstep: 5653.01 | bwd_allreduce_microstep: 111.35 | step_microstep: 18.88 [2025-04-26 00:28:37,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.77 | bwd: 5764.42 | bwd_inner: 5653.01 | bwd_allreduce: 111.37 | step: 18.88 17%|█▋ | 6836/41250 [16:31:02<82:55:14, 8.67s/it] {'loss': 0.0599, 'grad_norm': 1.1994946002960205, 'learning_rate': 3.8099000785906295e-05, 'epoch': 1.66} 17%|█▋ | 6836/41250 [16:31:02<82:55:14, 8.67s/it][2025-04-26 00:28:45,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.11 | optimizer_step: 1.05 [2025-04-26 00:28:45,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.28 | bwd_microstep: 5732.73 | bwd_inner_microstep: 5692.78 | bwd_allreduce_microstep: 39.89 | step_microstep: 20.00 [2025-04-26 00:28:45,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.28 | bwd: 5732.75 | bwd_inner: 5692.78 | bwd_allreduce: 39.92 | step: 20.00 17%|█▋ | 6837/41250 [16:31:11<82:53:48, 8.67s/it] {'loss': 0.0474, 'grad_norm': 0.8227823972702026, 'learning_rate': 
3.8098332528729616e-05, 'epoch': 1.66} 17%|█▋ | 6837/41250 [16:31:11<82:53:48, 8.67s/it][2025-04-26 00:28:54,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 00:28:54,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.37 | bwd_microstep: 5755.49 | bwd_inner_microstep: 5654.49 | bwd_allreduce_microstep: 100.95 | step_microstep: 18.91 [2025-04-26 00:28:54,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.37 | bwd: 5755.50 | bwd_inner: 5654.49 | bwd_allreduce: 100.97 | step: 18.92 17%|█▋ | 6838/41250 [16:31:19<82:52:17, 8.67s/it] {'loss': 0.1651, 'grad_norm': 2.268512487411499, 'learning_rate': 3.809766415998028e-05, 'epoch': 1.66} 17%|█▋ | 6838/41250 [16:31:19<82:52:17, 8.67s/it][2025-04-26 00:29:03,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:29:03,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.31 | bwd_microstep: 5714.93 | bwd_inner_microstep: 5698.61 | bwd_allreduce_microstep: 16.28 | step_microstep: 18.70 [2025-04-26 00:29:03,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.31 | bwd: 5714.95 | bwd_inner: 5698.61 | bwd_allreduce: 16.30 | step: 18.70 17%|█▋ | 6839/41250 [16:31:28<82:49:05, 8.66s/it] {'loss': 0.0697, 'grad_norm': 1.0917353630065918, 'learning_rate': 3.80969956796624e-05, 'epoch': 1.66} 17%|█▋ | 6839/41250 [16:31:28<82:49:05, 8.66s/it][2025-04-26 00:29:11,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 00:29:11,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.33 | bwd_microstep: 5766.65 | bwd_inner_microstep: 5643.50 | bwd_allreduce_microstep: 123.10 | step_microstep: 19.03 [2025-04-26 00:29:11,684] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.33 | bwd: 5766.66 | bwd_inner: 5643.50 | bwd_allreduce: 123.12 | step: 19.03 17%|█▋ | 6840/41250 [16:31:37<82:52:06, 8.67s/it] {'loss': 0.0811, 'grad_norm': 1.3762931823730469, 'learning_rate': 3.80963270877801e-05, 'epoch': 1.66} 17%|█▋ | 6840/41250 [16:31:37<82:52:06, 8.67s/it][2025-04-26 00:29:20,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:29:20,358] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.73 | bwd_microstep: 5747.44 | bwd_inner_microstep: 5688.63 | bwd_allreduce_microstep: 58.76 | step_microstep: 18.63 [2025-04-26 00:29:20,359] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.74 | bwd: 5747.45 | bwd_inner: 5688.63 | bwd_allreduce: 58.77 | step: 18.63 17%|█▋ | 6841/41250 [16:31:45<82:52:40, 8.67s/it] {'loss': 0.1241, 'grad_norm': 1.3006565570831299, 'learning_rate': 3.809565838433751e-05, 'epoch': 1.66} 17%|█▋ | 6841/41250 [16:31:45<82:52:40, 8.67s/it][2025-04-26 00:29:28,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:29:28,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.74 | bwd_microstep: 5688.72 | bwd_inner_microstep: 5675.96 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.79 [2025-04-26 00:29:28,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.74 | bwd: 5688.73 | bwd_inner: 5675.96 | bwd_allreduce: 12.72 | step: 18.79 17%|█▋ | 6842/41250 [16:31:54<82:43:05, 8.65s/it] {'loss': 0.2234, 'grad_norm': 3.25276255607605, 'learning_rate': 3.8094989569338735e-05, 'epoch': 1.66} 17%|█▋ | 6842/41250 [16:31:54<82:43:05, 8.65s/it][2025-04-26 00:29:37,640] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.05 | optimizer_step: 0.95 [2025-04-26 
00:29:37,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.56 | bwd_microstep: 5748.90 | bwd_inner_microstep: 5643.42 | bwd_allreduce_microstep: 105.43 | step_microstep: 19.37 [2025-04-26 00:29:37,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.56 | bwd: 5748.91 | bwd_inner: 5643.42 | bwd_allreduce: 105.45 | step: 19.37 17%|█▋ | 6843/41250 [16:32:02<82:45:10, 8.66s/it] {'loss': 0.1351, 'grad_norm': 1.7531460523605347, 'learning_rate': 3.809432064278791e-05, 'epoch': 1.66} 17%|█▋ | 6843/41250 [16:32:02<82:45:10, 8.66s/it][2025-04-26 00:29:46,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.19 | optimizer_step: 0.92 [2025-04-26 00:29:46,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.05 | bwd_microstep: 5690.71 | bwd_inner_microstep: 5668.00 | bwd_allreduce_microstep: 22.66 | step_microstep: 19.44 [2025-04-26 00:29:46,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.05 | bwd: 5690.72 | bwd_inner: 5668.00 | bwd_allreduce: 22.68 | step: 19.44 17%|█▋ | 6844/41250 [16:32:11<82:36:19, 8.64s/it] {'loss': 0.2295, 'grad_norm': 3.6771273612976074, 'learning_rate': 3.809365160468917e-05, 'epoch': 1.66} 17%|█▋ | 6844/41250 [16:32:11<82:36:19, 8.64s/it][2025-04-26 00:29:54,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 00:29:54,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.13 | bwd_microstep: 5695.17 | bwd_inner_microstep: 5682.14 | bwd_allreduce_microstep: 12.99 | step_microstep: 19.14 [2025-04-26 00:29:54,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.13 | bwd: 5695.19 | bwd_inner: 5682.13 | bwd_allreduce: 13.01 | step: 19.14 17%|█▋ | 6845/41250 [16:32:20<82:32:47, 8.64s/it] {'loss': 0.1212, 'grad_norm': 2.0400965213775635, 'learning_rate': 3.809298245504661e-05, 
'epoch': 1.66} 17%|█▋ | 6845/41250 [16:32:20<82:32:47, 8.64s/it][2025-04-26 00:30:03,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:30:03,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.57 | bwd_microstep: 5692.10 | bwd_inner_microstep: 5663.45 | bwd_allreduce_microstep: 28.60 | step_microstep: 18.98 [2025-04-26 00:30:03,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.57 | bwd: 5692.11 | bwd_inner: 5663.45 | bwd_allreduce: 28.62 | step: 18.98 17%|█▋ | 6846/41250 [16:32:28<82:28:28, 8.63s/it] {'loss': 0.2621, 'grad_norm': 3.8546695709228516, 'learning_rate': 3.809231319386439e-05, 'epoch': 1.66} 17%|█▋ | 6846/41250 [16:32:28<82:28:28, 8.63s/it][2025-04-26 00:30:12,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:30:12,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.49 | bwd_microstep: 5937.72 | bwd_inner_microstep: 5746.83 | bwd_allreduce_microstep: 190.84 | step_microstep: 18.68 [2025-04-26 00:30:12,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.49 | bwd: 5937.73 | bwd_inner: 5746.83 | bwd_allreduce: 190.86 | step: 18.68 17%|█▋ | 6847/41250 [16:32:37<83:13:37, 8.71s/it] {'loss': 0.13, 'grad_norm': 1.3097938299179077, 'learning_rate': 3.80916438211466e-05, 'epoch': 1.66} 17%|█▋ | 6847/41250 [16:32:37<83:13:37, 8.71s/it][2025-04-26 00:30:21,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:30:21,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.27 | bwd_microstep: 5728.80 | bwd_inner_microstep: 5666.98 | bwd_allreduce_microstep: 61.77 | step_microstep: 18.72 [2025-04-26 00:30:21,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2840.27 | bwd: 5728.81 | bwd_inner: 5666.98 | bwd_allreduce: 61.79 | step: 18.72 17%|█▋ | 6848/41250 [16:32:46<83:03:26, 8.69s/it] {'loss': 0.2004, 'grad_norm': 2.5378644466400146, 'learning_rate': 3.80909743368974e-05, 'epoch': 1.66} 17%|█▋ | 6848/41250 [16:32:46<83:03:26, 8.69s/it][2025-04-26 00:30:29,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.02 | optimizer_step: 1.12 [2025-04-26 00:30:29,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.00 | bwd_microstep: 5753.62 | bwd_inner_microstep: 5740.86 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.97 [2025-04-26 00:30:29,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.00 | bwd: 5753.63 | bwd_inner: 5740.86 | bwd_allreduce: 12.73 | step: 18.97 17%|█▋ | 6849/41250 [16:32:55<83:06:42, 8.70s/it] {'loss': 0.246, 'grad_norm': 4.872077941894531, 'learning_rate': 3.809030474112089e-05, 'epoch': 1.66} 17%|█▋ | 6849/41250 [16:32:55<83:06:42, 8.70s/it][2025-04-26 00:30:38,350] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:30:38,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.43 | bwd_microstep: 5684.66 | bwd_inner_microstep: 5669.86 | bwd_allreduce_microstep: 14.75 | step_microstep: 18.52 [2025-04-26 00:30:38,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.43 | bwd: 5684.68 | bwd_inner: 5669.86 | bwd_allreduce: 14.77 | step: 18.53 17%|█▋ | 6850/41250 [16:33:03<82:51:13, 8.67s/it] {'loss': 0.2321, 'grad_norm': 2.951045513153076, 'learning_rate': 3.808963503382122e-05, 'epoch': 1.66} 17%|█▋ | 6850/41250 [16:33:03<82:51:13, 8.67s/it][2025-04-26 00:30:47,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:30:47,018] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2841.48 | bwd_microstep: 5744.98 | bwd_inner_microstep: 5675.82 | bwd_allreduce_microstep: 69.11 | step_microstep: 18.63 [2025-04-26 00:30:47,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.48 | bwd: 5744.99 | bwd_inner: 5675.82 | bwd_allreduce: 69.13 | step: 18.63 17%|█▋ | 6851/41250 [16:33:12<82:50:39, 8.67s/it] {'loss': 0.0153, 'grad_norm': 0.2244829684495926, 'learning_rate': 3.808896521500249e-05, 'epoch': 1.66} 17%|█▋ | 6851/41250 [16:33:12<82:50:39, 8.67s/it][2025-04-26 00:30:55,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.97 | optimizer_step: 0.98 [2025-04-26 00:30:55,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.30 | bwd_microstep: 5783.52 | bwd_inner_microstep: 5770.79 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.51 [2025-04-26 00:30:55,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.30 | bwd: 5783.53 | bwd_inner: 5770.79 | bwd_allreduce: 12.70 | step: 18.52 17%|█▋ | 6852/41250 [16:33:21<83:04:41, 8.69s/it] {'loss': 0.0444, 'grad_norm': 1.271704077720642, 'learning_rate': 3.808829528466886e-05, 'epoch': 1.66} 17%|█▋ | 6852/41250 [16:33:21<83:04:41, 8.69s/it][2025-04-26 00:31:04,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:31:04,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.37 | bwd_microstep: 5737.50 | bwd_inner_microstep: 5711.80 | bwd_allreduce_microstep: 25.66 | step_microstep: 18.70 [2025-04-26 00:31:04,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.37 | bwd: 5737.52 | bwd_inner: 5711.80 | bwd_allreduce: 25.68 | step: 18.70 17%|█▋ | 6853/41250 [16:33:29<83:01:00, 8.69s/it] {'loss': 0.1113, 'grad_norm': 2.1077287197113037, 'learning_rate': 3.808762524282445e-05, 'epoch': 1.66} 17%|█▋ | 6853/41250 [16:33:29<83:01:00, 
8.69s/it][2025-04-26 00:31:13,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 00:31:13,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.92 | bwd_microstep: 6044.83 | bwd_inner_microstep: 5654.88 | bwd_allreduce_microstep: 389.90 | step_microstep: 19.04 [2025-04-26 00:31:13,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.92 | bwd: 6044.84 | bwd_inner: 5654.88 | bwd_allreduce: 389.91 | step: 19.04 17%|█▋ | 6854/41250 [16:33:38<83:46:39, 8.77s/it] {'loss': 0.0134, 'grad_norm': 0.3653024435043335, 'learning_rate': 3.808695508947338e-05, 'epoch': 1.66} 17%|█▋ | 6854/41250 [16:33:38<83:46:39, 8.77s/it][2025-04-26 00:31:22,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:31:22,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.93 | bwd_microstep: 5743.87 | bwd_inner_microstep: 5685.63 | bwd_allreduce_microstep: 58.19 | step_microstep: 18.67 [2025-04-26 00:31:22,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.93 | bwd: 5743.88 | bwd_inner: 5685.63 | bwd_allreduce: 58.21 | step: 18.67 17%|█▋ | 6855/41250 [16:33:47<83:29:59, 8.74s/it] {'loss': 0.0736, 'grad_norm': 1.229202389717102, 'learning_rate': 3.8086284824619796e-05, 'epoch': 1.66} 17%|█▋ | 6855/41250 [16:33:47<83:29:59, 8.74s/it][2025-04-26 00:31:30,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:31:30,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.85 | bwd_microstep: 5739.70 | bwd_inner_microstep: 5687.10 | bwd_allreduce_microstep: 52.55 | step_microstep: 18.64 [2025-04-26 00:31:30,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.85 | bwd: 5739.71 | bwd_inner: 5687.10 
| bwd_allreduce: 52.57 | step: 18.65 17%|█▋ | 6856/41250 [16:33:56<83:17:41, 8.72s/it] {'loss': 0.143, 'grad_norm': 3.223684787750244, 'learning_rate': 3.808561444826782e-05, 'epoch': 1.66} 17%|█▋ | 6856/41250 [16:33:56<83:17:41, 8.72s/it][2025-04-26 00:31:39,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:31:39,417] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.49 | bwd_microstep: 5742.72 | bwd_inner_microstep: 5694.20 | bwd_allreduce_microstep: 48.47 | step_microstep: 18.38 [2025-04-26 00:31:39,417] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.49 | bwd: 5742.73 | bwd_inner: 5694.20 | bwd_allreduce: 48.49 | step: 18.38 17%|█▋ | 6857/41250 [16:34:04<83:10:15, 8.71s/it] {'loss': 0.0779, 'grad_norm': 2.417795419692993, 'learning_rate': 3.808494396042159e-05, 'epoch': 1.66} 17%|█▋ | 6857/41250 [16:34:04<83:10:15, 8.71s/it][2025-04-26 00:31:48,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-26 00:31:48,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.92 | bwd_microstep: 5763.65 | bwd_inner_microstep: 5697.64 | bwd_allreduce_microstep: 65.97 | step_microstep: 18.94 [2025-04-26 00:31:48,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.92 | bwd: 5763.66 | bwd_inner: 5697.64 | bwd_allreduce: 65.98 | step: 18.95 17%|█▋ | 6858/41250 [16:34:13<83:09:37, 8.70s/it] {'loss': 0.2746, 'grad_norm': 5.000942230224609, 'learning_rate': 3.8084273361085236e-05, 'epoch': 1.66} 17%|█▋ | 6858/41250 [16:34:13<83:09:37, 8.70s/it][2025-04-26 00:31:56,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-26 00:31:56,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.92 | 
bwd_microstep: 5710.35 | bwd_inner_microstep: 5697.66 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.52 [2025-04-26 00:31:56,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.92 | bwd: 5710.36 | bwd_inner: 5697.66 | bwd_allreduce: 12.67 | step: 18.52 17%|█▋ | 6859/41250 [16:34:22<82:59:05, 8.69s/it] {'loss': 0.1036, 'grad_norm': 2.2365658283233643, 'learning_rate': 3.80836026502629e-05, 'epoch': 1.66} 17%|█▋ | 6859/41250 [16:34:22<82:59:05, 8.69s/it][2025-04-26 00:32:05,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 00:32:05,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.28 | bwd_microstep: 5765.20 | bwd_inner_microstep: 5675.12 | bwd_allreduce_microstep: 90.03 | step_microstep: 18.67 [2025-04-26 00:32:05,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.28 | bwd: 5765.21 | bwd_inner: 5675.12 | bwd_allreduce: 90.05 | step: 18.67 17%|█▋ | 6860/41250 [16:34:30<83:00:27, 8.69s/it] {'loss': 0.0362, 'grad_norm': 1.5174026489257812, 'learning_rate': 3.80829318279587e-05, 'epoch': 1.66} 17%|█▋ | 6860/41250 [16:34:30<83:00:27, 8.69s/it][2025-04-26 00:32:14,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:32:14,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.07 | bwd_microstep: 5722.90 | bwd_inner_microstep: 5653.26 | bwd_allreduce_microstep: 69.60 | step_microstep: 18.27 [2025-04-26 00:32:14,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.07 | bwd: 5722.91 | bwd_inner: 5653.26 | bwd_allreduce: 69.62 | step: 18.27 17%|█▋ | 6861/41250 [16:34:39<82:50:37, 8.67s/it] {'loss': 0.0507, 'grad_norm': 1.8664488792419434, 'learning_rate': 3.8082260894176785e-05, 'epoch': 1.66} 17%|█▋ | 6861/41250 [16:34:39<82:50:37, 8.67s/it][2025-04-26 00:32:22,805] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:32:22,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.41 | bwd_microstep: 5802.41 | bwd_inner_microstep: 5646.22 | bwd_allreduce_microstep: 156.15 | step_microstep: 18.66 [2025-04-26 00:32:22,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.41 | bwd: 5802.42 | bwd_inner: 5646.22 | bwd_allreduce: 156.16 | step: 18.66 17%|█▋ | 6862/41250 [16:34:48<82:57:18, 8.68s/it] {'loss': 0.1864, 'grad_norm': 1.8861761093139648, 'learning_rate': 3.808158984892129e-05, 'epoch': 1.66} 17%|█▋ | 6862/41250 [16:34:48<82:57:18, 8.68s/it][2025-04-26 00:32:31,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:32:31,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2899.34 | bwd_microstep: 5796.42 | bwd_inner_microstep: 5783.68 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.50 [2025-04-26 00:32:31,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2899.34 | bwd: 5796.43 | bwd_inner: 5783.68 | bwd_allreduce: 12.71 | step: 18.50 17%|█▋ | 6863/41250 [16:34:56<83:13:10, 8.71s/it] {'loss': 0.2427, 'grad_norm': 3.980003833770752, 'learning_rate': 3.808091869219635e-05, 'epoch': 1.66} 17%|█▋ | 6863/41250 [16:34:56<83:13:10, 8.71s/it][2025-04-26 00:32:40,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:32:40,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.00 | bwd_microstep: 5761.74 | bwd_inner_microstep: 5697.92 | bwd_allreduce_microstep: 63.78 | step_microstep: 18.57 [2025-04-26 00:32:40,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.00 | bwd: 5761.76 | bwd_inner: 5697.92 | bwd_allreduce: 63.80 | step: 18.58 17%|█▋ 
| 6864/41250 [16:35:05<83:10:36, 8.71s/it] {'loss': 0.0391, 'grad_norm': 2.130321502685547, 'learning_rate': 3.80802474240061e-05, 'epoch': 1.66} 17%|█▋ | 6864/41250 [16:35:05<83:10:36, 8.71s/it][2025-04-26 00:32:48,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:32:48,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.84 | bwd_microstep: 5764.59 | bwd_inner_microstep: 5670.68 | bwd_allreduce_microstep: 93.87 | step_microstep: 18.46 [2025-04-26 00:32:48,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.84 | bwd: 5764.60 | bwd_inner: 5670.68 | bwd_allreduce: 93.89 | step: 18.47 17%|█▋ | 6865/41250 [16:35:14<83:05:17, 8.70s/it] {'loss': 0.0331, 'grad_norm': 0.6191770434379578, 'learning_rate': 3.8079576044354685e-05, 'epoch': 1.66} 17%|█▋ | 6865/41250 [16:35:14<83:05:17, 8.70s/it][2025-04-26 00:32:57,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:32:57,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.55 | bwd_microstep: 5793.27 | bwd_inner_microstep: 5662.00 | bwd_allreduce_microstep: 131.23 | step_microstep: 18.42 [2025-04-26 00:32:57,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.55 | bwd: 5793.29 | bwd_inner: 5662.00 | bwd_allreduce: 131.25 | step: 18.42 17%|█▋ | 6866/41250 [16:35:22<83:07:14, 8.70s/it] {'loss': 0.0327, 'grad_norm': 0.5166125893592834, 'learning_rate': 3.8078904553246234e-05, 'epoch': 1.66} 17%|█▋ | 6866/41250 [16:35:22<83:07:14, 8.70s/it][2025-04-26 00:33:06,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-26 00:33:06,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.23 | bwd_microstep: 5783.37 | bwd_inner_microstep: 
5699.49 | bwd_allreduce_microstep: 83.83 | step_microstep: 18.54 [2025-04-26 00:33:06,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.23 | bwd: 5783.38 | bwd_inner: 5699.49 | bwd_allreduce: 83.85 | step: 18.54 17%|█▋ | 6867/41250 [16:35:31<83:09:26, 8.71s/it] {'loss': 0.1649, 'grad_norm': 2.9104087352752686, 'learning_rate': 3.8078232950684896e-05, 'epoch': 1.66} 17%|█▋ | 6867/41250 [16:35:31<83:09:26, 8.71s/it][2025-04-26 00:33:15,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.21 | optimizer_step: 1.00 [2025-04-26 00:33:15,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.88 | bwd_microstep: 5793.05 | bwd_inner_microstep: 5661.58 | bwd_allreduce_microstep: 131.41 | step_microstep: 19.46 [2025-04-26 00:33:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.88 | bwd: 5793.06 | bwd_inner: 5661.58 | bwd_allreduce: 131.44 | step: 19.46 17%|█▋ | 6868/41250 [16:35:40<83:08:41, 8.71s/it] {'loss': 0.1105, 'grad_norm': 2.8956706523895264, 'learning_rate': 3.807756123667481e-05, 'epoch': 1.66} 17%|█▋ | 6868/41250 [16:35:40<83:08:41, 8.71s/it][2025-04-26 00:33:23,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 00:33:23,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.12 | bwd_microstep: 5709.76 | bwd_inner_microstep: 5697.04 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.65 [2025-04-26 00:33:23,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.12 | bwd: 5709.78 | bwd_inner: 5697.04 | bwd_allreduce: 12.70 | step: 18.65 17%|█▋ | 6869/41250 [16:35:49<82:57:29, 8.69s/it] {'loss': 0.1733, 'grad_norm': 1.8719490766525269, 'learning_rate': 3.8076889411220114e-05, 'epoch': 1.67} 17%|█▋ | 6869/41250 [16:35:49<82:57:29, 8.69s/it][2025-04-26 00:33:32,367] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:33:32,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.50 | bwd_microstep: 5724.08 | bwd_inner_microstep: 5638.03 | bwd_allreduce_microstep: 86.01 | step_microstep: 18.63 [2025-04-26 00:33:32,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.50 | bwd: 5724.10 | bwd_inner: 5638.03 | bwd_allreduce: 86.03 | step: 18.63 17%|█▋ | 6870/41250 [16:35:57<82:48:43, 8.67s/it] {'loss': 0.0304, 'grad_norm': 2.7691211700439453, 'learning_rate': 3.807621747432495e-05, 'epoch': 1.67} 17%|█▋ | 6870/41250 [16:35:57<82:48:43, 8.67s/it][2025-04-26 00:33:40,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:33:40,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.48 | bwd_microstep: 5687.48 | bwd_inner_microstep: 5648.33 | bwd_allreduce_microstep: 39.10 | step_microstep: 18.31 [2025-04-26 00:33:40,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.48 | bwd: 5687.49 | bwd_inner: 5648.33 | bwd_allreduce: 39.12 | step: 18.31 17%|█▋ | 6871/41250 [16:36:06<82:37:01, 8.65s/it] {'loss': 0.272, 'grad_norm': 2.8919870853424072, 'learning_rate': 3.807554542599346e-05, 'epoch': 1.67} 17%|█▋ | 6871/41250 [16:36:06<82:37:01, 8.65s/it][2025-04-26 00:33:49,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:33:49,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.05 | bwd_microstep: 5748.44 | bwd_inner_microstep: 5698.86 | bwd_allreduce_microstep: 49.54 | step_microstep: 18.74 [2025-04-26 00:33:49,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.05 | bwd: 5748.45 | bwd_inner: 5698.86 | bwd_allreduce: 49.55 | step: 18.75 17%|█▋ | 6872/41250 [16:36:14<82:41:46, 
8.66s/it] {'loss': 0.0304, 'grad_norm': 1.3990471363067627, 'learning_rate': 3.807487326622979e-05, 'epoch': 1.67} 17%|█▋ | 6872/41250 [16:36:14<82:41:46, 8.66s/it][2025-04-26 00:33:58,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 00:33:58,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.82 | bwd_microstep: 5713.94 | bwd_inner_microstep: 5701.58 | bwd_allreduce_microstep: 12.31 | step_microstep: 18.56 [2025-04-26 00:33:58,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.82 | bwd: 5713.95 | bwd_inner: 5701.58 | bwd_allreduce: 12.33 | step: 18.56 17%|█▋ | 6873/41250 [16:36:23<82:40:06, 8.66s/it] {'loss': 0.0764, 'grad_norm': 0.896733283996582, 'learning_rate': 3.807420099503808e-05, 'epoch': 1.67} 17%|█▋ | 6873/41250 [16:36:23<82:40:06, 8.66s/it][2025-04-26 00:34:06,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:34:06,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.04 | bwd_microstep: 5758.80 | bwd_inner_microstep: 5700.86 | bwd_allreduce_microstep: 57.89 | step_microstep: 18.76 [2025-04-26 00:34:06,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.04 | bwd: 5758.82 | bwd_inner: 5700.86 | bwd_allreduce: 57.91 | step: 18.76 17%|█▋ | 6874/41250 [16:36:32<82:46:03, 8.67s/it] {'loss': 0.2427, 'grad_norm': 2.6409640312194824, 'learning_rate': 3.807352861242247e-05, 'epoch': 1.67} 17%|█▋ | 6874/41250 [16:36:32<82:46:03, 8.67s/it][2025-04-26 00:34:15,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-26 00:34:15,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.74 | bwd_microstep: 5710.64 | bwd_inner_microstep: 5697.77 | bwd_allreduce_microstep: 12.81 | 
step_microstep: 18.72 [2025-04-26 00:34:15,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.74 | bwd: 5710.65 | bwd_inner: 5697.77 | bwd_allreduce: 12.83 | step: 18.72 17%|█▋ | 6875/41250 [16:36:40<82:42:56, 8.66s/it] {'loss': 0.1624, 'grad_norm': 2.448498487472534, 'learning_rate': 3.807285611838712e-05, 'epoch': 1.67} 17%|█▋ | 6875/41250 [16:36:40<82:42:56, 8.66s/it][2025-04-26 00:34:24,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:34:24,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.46 | bwd_microstep: 5705.66 | bwd_inner_microstep: 5692.76 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.55 [2025-04-26 00:34:24,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.46 | bwd: 5705.68 | bwd_inner: 5692.76 | bwd_allreduce: 12.88 | step: 18.56 17%|█▋ | 6876/41250 [16:36:49<82:38:09, 8.65s/it] {'loss': 0.1472, 'grad_norm': 1.9988142251968384, 'learning_rate': 3.8072183512936154e-05, 'epoch': 1.67} 17%|█▋ | 6876/41250 [16:36:49<82:38:09, 8.65s/it][2025-04-26 00:34:32,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-26 00:34:32,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.33 | bwd_microstep: 5687.70 | bwd_inner_microstep: 5675.08 | bwd_allreduce_microstep: 12.58 | step_microstep: 19.51 [2025-04-26 00:34:32,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.33 | bwd: 5687.72 | bwd_inner: 5675.08 | bwd_allreduce: 12.60 | step: 19.51 17%|█▋ | 6877/41250 [16:36:58<82:33:01, 8.65s/it] {'loss': 0.0573, 'grad_norm': 1.2651047706604004, 'learning_rate': 3.807151079607375e-05, 'epoch': 1.67} 17%|█▋ | 6877/41250 [16:36:58<82:33:01, 8.65s/it][2025-04-26 00:34:41,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | 
optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:34:41,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.42 | bwd_microstep: 5704.77 | bwd_inner_microstep: 5691.85 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.92 [2025-04-26 00:34:41,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.42 | bwd: 5704.78 | bwd_inner: 5691.85 | bwd_allreduce: 12.89 | step: 18.93 17%|█▋ | 6878/41250 [16:37:06<82:31:33, 8.64s/it] {'loss': 0.3488, 'grad_norm': 2.880899667739868, 'learning_rate': 3.807083796780402e-05, 'epoch': 1.67} 17%|█▋ | 6878/41250 [16:37:06<82:31:33, 8.64s/it][2025-04-26 00:34:50,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:34:50,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.76 | bwd_microstep: 5730.85 | bwd_inner_microstep: 5694.25 | bwd_allreduce_microstep: 36.56 | step_microstep: 18.64 [2025-04-26 00:34:50,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.76 | bwd: 5730.87 | bwd_inner: 5694.25 | bwd_allreduce: 36.57 | step: 18.65 17%|█▋ | 6879/41250 [16:37:15<82:34:54, 8.65s/it] {'loss': 0.0538, 'grad_norm': 0.9801366925239563, 'learning_rate': 3.8070165028131134e-05, 'epoch': 1.67} 17%|█▋ | 6879/41250 [16:37:15<82:34:54, 8.65s/it][2025-04-26 00:34:58,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:34:58,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.18 | bwd_microstep: 5768.88 | bwd_inner_microstep: 5645.26 | bwd_allreduce_microstep: 123.57 | step_microstep: 18.61 [2025-04-26 00:34:58,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.18 | bwd: 5768.89 | bwd_inner: 5645.26 | bwd_allreduce: 123.59 | step: 18.62 17%|█▋ | 6880/41250 [16:37:24<82:39:48, 8.66s/it] {'loss': 0.2095, 'grad_norm': 
2.011505126953125, 'learning_rate': 3.8069491977059234e-05, 'epoch': 1.67} 17%|█▋ | 6880/41250 [16:37:24<82:39:48, 8.66s/it][2025-04-26 00:35:07,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:35:07,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.20 | bwd_microstep: 5760.09 | bwd_inner_microstep: 5644.09 | bwd_allreduce_microstep: 115.95 | step_microstep: 18.53 [2025-04-26 00:35:07,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.20 | bwd: 5760.11 | bwd_inner: 5644.09 | bwd_allreduce: 115.97 | step: 18.54 17%|█▋ | 6881/41250 [16:37:32<82:41:31, 8.66s/it] {'loss': 0.0482, 'grad_norm': 0.8567681908607483, 'learning_rate': 3.806881881459247e-05, 'epoch': 1.67} 17%|█▋ | 6881/41250 [16:37:32<82:41:31, 8.66s/it][2025-04-26 00:35:16,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-26 00:35:16,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.84 | bwd_microstep: 5717.31 | bwd_inner_microstep: 5688.44 | bwd_allreduce_microstep: 28.83 | step_microstep: 18.97 [2025-04-26 00:35:16,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.84 | bwd: 5717.32 | bwd_inner: 5688.44 | bwd_allreduce: 28.84 | step: 18.97 17%|█▋ | 6882/41250 [16:37:41<82:38:36, 8.66s/it] {'loss': 0.085, 'grad_norm': 1.6263160705566406, 'learning_rate': 3.8068145540735e-05, 'epoch': 1.67} 17%|█▋ | 6882/41250 [16:37:41<82:38:36, 8.66s/it][2025-04-26 00:35:24,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:35:24,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.54 | bwd_microstep: 5734.78 | bwd_inner_microstep: 5700.21 | bwd_allreduce_microstep: 34.52 | step_microstep: 18.84 [2025-04-26 
00:35:24,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.54 | bwd: 5734.79 | bwd_inner: 5700.21 | bwd_allreduce: 34.54 | step: 18.84 17%|█▋ | 6883/41250 [16:37:50<82:40:03, 8.66s/it] {'loss': 0.2062, 'grad_norm': 2.305143117904663, 'learning_rate': 3.8067472155490954e-05, 'epoch': 1.67} 17%|█▋ | 6883/41250 [16:37:50<82:40:03, 8.66s/it][2025-04-26 00:35:33,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-26 00:35:33,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.59 | bwd_microstep: 5733.34 | bwd_inner_microstep: 5685.29 | bwd_allreduce_microstep: 48.01 | step_microstep: 18.77 [2025-04-26 00:35:33,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.59 | bwd: 5733.35 | bwd_inner: 5685.29 | bwd_allreduce: 48.02 | step: 18.78 17%|█▋ | 6884/41250 [16:37:58<82:39:39, 8.66s/it] {'loss': 0.0716, 'grad_norm': 2.1556780338287354, 'learning_rate': 3.80667986588645e-05, 'epoch': 1.67} 17%|█▋ | 6884/41250 [16:37:58<82:39:39, 8.66s/it][2025-04-26 00:35:42,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:35:42,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.58 | bwd_microstep: 5718.42 | bwd_inner_microstep: 5699.76 | bwd_allreduce_microstep: 18.61 | step_microstep: 18.42 [2025-04-26 00:35:42,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.58 | bwd: 5718.43 | bwd_inner: 5699.76 | bwd_allreduce: 18.63 | step: 18.42 17%|█▋ | 6885/41250 [16:38:07<82:37:58, 8.66s/it] {'loss': 0.1135, 'grad_norm': 1.521303653717041, 'learning_rate': 3.806612505085979e-05, 'epoch': 1.67} 17%|█▋ | 6885/41250 [16:38:07<82:37:58, 8.66s/it][2025-04-26 00:35:50,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.96 | optimizer_step: 0.89 
[2025-04-26 00:35:50,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.05 | bwd_microstep: 5760.88 | bwd_inner_microstep: 5650.66 | bwd_allreduce_microstep: 110.17 | step_microstep: 18.56 [2025-04-26 00:35:50,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.05 | bwd: 5760.89 | bwd_inner: 5650.66 | bwd_allreduce: 110.19 | step: 18.56 17%|█▋ | 6886/41250 [16:38:16<82:38:45, 8.66s/it] {'loss': 0.0212, 'grad_norm': 0.8572463989257812, 'learning_rate': 3.806545133148097e-05, 'epoch': 1.67} 17%|█▋ | 6886/41250 [16:38:16<82:38:45, 8.66s/it][2025-04-26 00:35:59,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:35:59,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2813.53 | bwd_microstep: 5695.68 | bwd_inner_microstep: 5634.71 | bwd_allreduce_microstep: 60.93 | step_microstep: 18.49 [2025-04-26 00:35:59,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2813.53 | bwd: 5695.69 | bwd_inner: 5634.71 | bwd_allreduce: 60.94 | step: 18.49 17%|█▋ | 6887/41250 [16:38:24<82:26:58, 8.64s/it] {'loss': 0.2074, 'grad_norm': 1.7776703834533691, 'learning_rate': 3.806477750073219e-05, 'epoch': 1.67} 17%|█▋ | 6887/41250 [16:38:24<82:26:58, 8.64s/it][2025-04-26 00:36:08,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:36:08,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.01 | bwd_microstep: 5771.52 | bwd_inner_microstep: 5651.10 | bwd_allreduce_microstep: 120.37 | step_microstep: 18.62 [2025-04-26 00:36:08,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.01 | bwd: 5771.53 | bwd_inner: 5651.10 | bwd_allreduce: 120.39 | step: 18.62 17%|█▋ | 6888/41250 [16:38:33<82:32:43, 8.65s/it] {'loss': 0.1187, 'grad_norm': 3.3492703437805176, 'learning_rate': 
3.806410355861762e-05, 'epoch': 1.67} 17%|█▋ | 6888/41250 [16:38:33<82:32:43, 8.65s/it][2025-04-26 00:36:16,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:36:16,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.72 | bwd_microstep: 5750.33 | bwd_inner_microstep: 5670.82 | bwd_allreduce_microstep: 79.47 | step_microstep: 18.66 [2025-04-26 00:36:16,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.72 | bwd: 5750.34 | bwd_inner: 5670.81 | bwd_allreduce: 79.49 | step: 18.66 17%|█▋ | 6889/41250 [16:38:42<82:37:18, 8.66s/it] {'loss': 0.4195, 'grad_norm': 2.9247117042541504, 'learning_rate': 3.80634295051414e-05, 'epoch': 1.67} 17%|█▋ | 6889/41250 [16:38:42<82:37:18, 8.66s/it][2025-04-26 00:36:25,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:36:25,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.37 | bwd_microstep: 5719.51 | bwd_inner_microstep: 5697.65 | bwd_allreduce_microstep: 21.82 | step_microstep: 18.34 [2025-04-26 00:36:25,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.37 | bwd: 5719.52 | bwd_inner: 5697.65 | bwd_allreduce: 21.84 | step: 18.34 17%|█▋ | 6890/41250 [16:38:50<82:36:46, 8.66s/it] {'loss': 0.3397, 'grad_norm': 3.197667121887207, 'learning_rate': 3.806275534030769e-05, 'epoch': 1.67} 17%|█▋ | 6890/41250 [16:38:50<82:36:46, 8.66s/it][2025-04-26 00:36:34,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-26 00:36:34,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.82 | bwd_microstep: 5690.42 | bwd_inner_microstep: 5677.63 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.52 [2025-04-26 00:36:34,058] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.82 | bwd: 5690.43 | bwd_inner: 5677.63 | bwd_allreduce: 12.76 | step: 18.52 17%|█▋ | 6891/41250 [16:38:59<82:31:56, 8.65s/it] {'loss': 0.0359, 'grad_norm': 0.6634089946746826, 'learning_rate': 3.806208106412065e-05, 'epoch': 1.67} 17%|█▋ | 6891/41250 [16:38:59<82:31:56, 8.65s/it][2025-04-26 00:36:42,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:36:42,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.08 | bwd_microstep: 5766.13 | bwd_inner_microstep: 5630.18 | bwd_allreduce_microstep: 135.91 | step_microstep: 18.27 [2025-04-26 00:36:42,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.08 | bwd: 5766.15 | bwd_inner: 5630.18 | bwd_allreduce: 135.92 | step: 18.27 17%|█▋ | 6892/41250 [16:39:08<82:37:43, 8.66s/it] {'loss': 0.0339, 'grad_norm': 0.38788551092147827, 'learning_rate': 3.806140667658443e-05, 'epoch': 1.67} 17%|█▋ | 6892/41250 [16:39:08<82:37:43, 8.66s/it][2025-04-26 00:36:51,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 00:36:51,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.35 | bwd_microstep: 5751.34 | bwd_inner_microstep: 5676.00 | bwd_allreduce_microstep: 75.29 | step_microstep: 18.51 [2025-04-26 00:36:51,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.35 | bwd: 5751.35 | bwd_inner: 5676.00 | bwd_allreduce: 75.31 | step: 18.52 17%|█▋ | 6893/41250 [16:39:16<82:39:56, 8.66s/it] {'loss': 0.1383, 'grad_norm': 2.009521007537842, 'learning_rate': 3.806073217770318e-05, 'epoch': 1.67} 17%|█▋ | 6893/41250 [16:39:16<82:39:56, 8.66s/it][2025-04-26 00:37:00,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 
00:37:00,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.57 | bwd_microstep: 5761.65 | bwd_inner_microstep: 5646.56 | bwd_allreduce_microstep: 115.04 | step_microstep: 18.73 [2025-04-26 00:37:00,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.57 | bwd: 5761.67 | bwd_inner: 5646.56 | bwd_allreduce: 115.06 | step: 18.73 17%|█▋ | 6894/41250 [16:39:25<82:42:23, 8.67s/it] {'loss': 0.1172, 'grad_norm': 1.2775171995162964, 'learning_rate': 3.806005756748108e-05, 'epoch': 1.67} 17%|█▋ | 6894/41250 [16:39:25<82:42:23, 8.67s/it][2025-04-26 00:37:08,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 00:37:08,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.47 | bwd_microstep: 5691.67 | bwd_inner_microstep: 5679.01 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.45 [2025-04-26 00:37:08,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.47 | bwd: 5691.69 | bwd_inner: 5679.01 | bwd_allreduce: 12.64 | step: 18.46 17%|█▋ | 6895/41250 [16:39:34<82:32:40, 8.65s/it] {'loss': 0.0547, 'grad_norm': 0.8279218077659607, 'learning_rate': 3.805938284592228e-05, 'epoch': 1.67} 17%|█▋ | 6895/41250 [16:39:34<82:32:40, 8.65s/it][2025-04-26 00:37:17,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.04 | optimizer_step: 0.91 [2025-04-26 00:37:17,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.67 | bwd_microstep: 5739.65 | bwd_inner_microstep: 5693.52 | bwd_allreduce_microstep: 46.09 | step_microstep: 19.26 [2025-04-26 00:37:17,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.67 | bwd: 5739.67 | bwd_inner: 5693.52 | bwd_allreduce: 46.11 | step: 19.26 17%|█▋ | 6896/41250 [16:39:42<82:35:46, 8.66s/it] {'loss': 0.0686, 'grad_norm': 1.2687994241714478, 'learning_rate': 3.8058708013030935e-05, 
'epoch': 1.67} 17%|█▋ | 6896/41250 [16:39:42<82:35:46, 8.66s/it][2025-04-26 00:37:26,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:37:26,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.28 | bwd_microstep: 5724.72 | bwd_inner_microstep: 5712.05 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.55 [2025-04-26 00:37:26,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.28 | bwd: 5724.74 | bwd_inner: 5712.05 | bwd_allreduce: 12.65 | step: 18.56 17%|█▋ | 6897/41250 [16:39:51<82:37:46, 8.66s/it] {'loss': 0.1054, 'grad_norm': 2.137465715408325, 'learning_rate': 3.80580330688112e-05, 'epoch': 1.67} 17%|█▋ | 6897/41250 [16:39:51<82:37:46, 8.66s/it][2025-04-26 00:37:34,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.67 | optimizer_step: 1.13 [2025-04-26 00:37:34,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.21 | bwd_microstep: 5770.00 | bwd_inner_microstep: 5648.62 | bwd_allreduce_microstep: 121.33 | step_microstep: 19.28 [2025-04-26 00:37:34,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.21 | bwd: 5770.01 | bwd_inner: 5648.62 | bwd_allreduce: 121.35 | step: 19.29 17%|█▋ | 6898/41250 [16:40:00<82:40:32, 8.66s/it] {'loss': 0.1021, 'grad_norm': 2.177665948867798, 'learning_rate': 3.8057358013267244e-05, 'epoch': 1.67} 17%|█▋ | 6898/41250 [16:40:00<82:40:32, 8.66s/it][2025-04-26 00:37:43,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-26 00:37:43,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.09 | bwd_microstep: 5707.13 | bwd_inner_microstep: 5693.90 | bwd_allreduce_microstep: 13.17 | step_microstep: 18.86 [2025-04-26 00:37:43,352] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2850.09 | bwd: 5707.16 | bwd_inner: 5693.90 | bwd_allreduce: 13.20 | step: 18.86 17%|█▋ | 6899/41250 [16:40:08<82:36:44, 8.66s/it] {'loss': 0.1303, 'grad_norm': 1.9427961111068726, 'learning_rate': 3.805668284640323e-05, 'epoch': 1.67} 17%|█▋ | 6899/41250 [16:40:08<82:36:44, 8.66s/it][2025-04-26 00:37:52,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:37:52,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.04 | bwd_microstep: 5742.10 | bwd_inner_microstep: 5703.53 | bwd_allreduce_microstep: 38.52 | step_microstep: 18.73 [2025-04-26 00:37:52,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.04 | bwd: 5742.11 | bwd_inner: 5703.53 | bwd_allreduce: 38.54 | step: 18.73 17%|█▋ | 6900/41250 [16:40:17<82:39:33, 8.66s/it] {'loss': 0.1031, 'grad_norm': 2.1245694160461426, 'learning_rate': 3.805600756822332e-05, 'epoch': 1.67} 17%|█▋ | 6900/41250 [16:40:17<82:39:33, 8.66s/it][2025-04-26 00:38:00,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:38:00,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.22 | bwd_microstep: 5701.49 | bwd_inner_microstep: 5688.76 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.60 [2025-04-26 00:38:00,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.22 | bwd: 5701.51 | bwd_inner: 5688.76 | bwd_allreduce: 12.70 | step: 18.61 17%|█▋ | 6901/41250 [16:40:25<82:33:53, 8.65s/it] {'loss': 0.0894, 'grad_norm': 1.0539590120315552, 'learning_rate': 3.805533217873167e-05, 'epoch': 1.67} 17%|█▋ | 6901/41250 [16:40:25<82:33:53, 8.65s/it][2025-04-26 00:38:09,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 1.14 [2025-04-26 00:38:09,279] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2826.25 | bwd_microstep: 5709.43 | bwd_inner_microstep: 5652.10 | bwd_allreduce_microstep: 57.28 | step_microstep: 19.00 [2025-04-26 00:38:09,280] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.25 | bwd: 5709.44 | bwd_inner: 5652.10 | bwd_allreduce: 57.30 | step: 19.00 17%|█▋ | 6902/41250 [16:40:34<82:27:39, 8.64s/it] {'loss': 0.2551, 'grad_norm': 6.5947418212890625, 'learning_rate': 3.8054656677932455e-05, 'epoch': 1.67} 17%|█▋ | 6902/41250 [16:40:34<82:27:39, 8.64s/it][2025-04-26 00:38:17,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:38:17,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.01 | bwd_microstep: 5745.86 | bwd_inner_microstep: 5713.89 | bwd_allreduce_microstep: 31.92 | step_microstep: 18.70 [2025-04-26 00:38:17,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.01 | bwd: 5745.87 | bwd_inner: 5713.89 | bwd_allreduce: 31.94 | step: 18.70 17%|█▋ | 6903/41250 [16:40:43<82:33:39, 8.65s/it] {'loss': 0.0823, 'grad_norm': 1.7723268270492554, 'learning_rate': 3.805398106582983e-05, 'epoch': 1.67} 17%|█▋ | 6903/41250 [16:40:43<82:33:39, 8.65s/it][2025-04-26 00:38:26,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.04 | optimizer_step: 0.92 [2025-04-26 00:38:26,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.53 | bwd_microstep: 5784.54 | bwd_inner_microstep: 5659.42 | bwd_allreduce_microstep: 125.07 | step_microstep: 19.23 [2025-04-26 00:38:26,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.53 | bwd: 5784.56 | bwd_inner: 5659.42 | bwd_allreduce: 125.09 | step: 19.23 17%|█▋ | 6904/41250 [16:40:51<82:41:40, 8.67s/it] {'loss': 0.1107, 'grad_norm': 2.373622417449951, 'learning_rate': 3.805330534242796e-05, 'epoch': 1.67} 17%|█▋ | 6904/41250 
[16:40:51<82:41:40, 8.67s/it][2025-04-26 00:38:35,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:38:35,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.65 | bwd_microstep: 5764.30 | bwd_inner_microstep: 5656.37 | bwd_allreduce_microstep: 107.90 | step_microstep: 18.37 [2025-04-26 00:38:35,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.65 | bwd: 5764.32 | bwd_inner: 5656.37 | bwd_allreduce: 107.91 | step: 18.37 17%|█▋ | 6905/41250 [16:41:00<82:42:12, 8.67s/it] {'loss': 0.2295, 'grad_norm': 1.4764505624771118, 'learning_rate': 3.805262950773101e-05, 'epoch': 1.67} 17%|█▋ | 6905/41250 [16:41:00<82:42:12, 8.67s/it][2025-04-26 00:38:43,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:38:43,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.00 | bwd_microstep: 5766.87 | bwd_inner_microstep: 5642.74 | bwd_allreduce_microstep: 124.09 | step_microstep: 18.47 [2025-04-26 00:38:43,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.00 | bwd: 5766.89 | bwd_inner: 5642.74 | bwd_allreduce: 124.10 | step: 18.47 17%|█▋ | 6906/41250 [16:41:09<82:41:53, 8.67s/it] {'loss': 0.1895, 'grad_norm': 1.7805896997451782, 'learning_rate': 3.805195356174316e-05, 'epoch': 1.67} 17%|█▋ | 6906/41250 [16:41:09<82:41:53, 8.67s/it][2025-04-26 00:38:52,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:38:52,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.37 | bwd_microstep: 5737.58 | bwd_inner_microstep: 5660.21 | bwd_allreduce_microstep: 77.32 | step_microstep: 18.56 [2025-04-26 00:38:52,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.37 | bwd: 5737.59 
| bwd_inner: 5660.21 | bwd_allreduce: 77.34 | step: 18.56 17%|█▋ | 6907/41250 [16:41:17<82:37:22, 8.66s/it] {'loss': 0.1758, 'grad_norm': 3.8530213832855225, 'learning_rate': 3.805127750446856e-05, 'epoch': 1.67} 17%|█▋ | 6907/41250 [16:41:17<82:37:22, 8.66s/it][2025-04-26 00:39:01,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:39:01,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.36 | bwd_microstep: 5753.71 | bwd_inner_microstep: 5707.92 | bwd_allreduce_microstep: 45.75 | step_microstep: 18.38 [2025-04-26 00:39:01,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.36 | bwd: 5753.73 | bwd_inner: 5707.92 | bwd_allreduce: 45.77 | step: 18.38 17%|█▋ | 6908/41250 [16:41:26<82:41:39, 8.67s/it] {'loss': 0.2212, 'grad_norm': 2.9543075561523438, 'learning_rate': 3.805060133591139e-05, 'epoch': 1.67} 17%|█▋ | 6908/41250 [16:41:26<82:41:39, 8.67s/it][2025-04-26 00:39:09,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:39:09,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.40 | bwd_microstep: 5712.11 | bwd_inner_microstep: 5699.53 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.60 [2025-04-26 00:39:09,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.40 | bwd: 5712.12 | bwd_inner: 5699.53 | bwd_allreduce: 12.55 | step: 18.61 17%|█▋ | 6909/41250 [16:41:35<82:36:56, 8.66s/it] {'loss': 0.2107, 'grad_norm': 3.012986898422241, 'learning_rate': 3.804992505607581e-05, 'epoch': 1.67} 17%|█▋ | 6909/41250 [16:41:35<82:36:56, 8.66s/it][2025-04-26 00:39:18,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-26 00:39:18,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2829.46 | bwd_microstep: 5725.40 | bwd_inner_microstep: 5657.69 | bwd_allreduce_microstep: 67.66 | step_microstep: 18.21 [2025-04-26 00:39:18,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.46 | bwd: 5725.41 | bwd_inner: 5657.69 | bwd_allreduce: 67.68 | step: 18.21 17%|█▋ | 6910/41250 [16:41:43<82:32:26, 8.65s/it] {'loss': 0.1493, 'grad_norm': 1.491206407546997, 'learning_rate': 3.804924866496599e-05, 'epoch': 1.68} 17%|█▋ | 6910/41250 [16:41:43<82:32:26, 8.65s/it][2025-04-26 00:39:27,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 00:39:27,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.25 | bwd_microstep: 5746.15 | bwd_inner_microstep: 5703.31 | bwd_allreduce_microstep: 42.79 | step_microstep: 18.75 [2025-04-26 00:39:27,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.25 | bwd: 5746.16 | bwd_inner: 5703.31 | bwd_allreduce: 42.81 | step: 18.75 17%|█▋ | 6911/41250 [16:41:52<82:37:51, 8.66s/it] {'loss': 0.3782, 'grad_norm': 3.453822612762451, 'learning_rate': 3.804857216258611e-05, 'epoch': 1.68} 17%|█▋ | 6911/41250 [16:41:52<82:37:51, 8.66s/it][2025-04-26 00:39:35,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.91 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:39:35,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.24 | bwd_microstep: 5796.67 | bwd_inner_microstep: 5654.85 | bwd_allreduce_microstep: 141.77 | step_microstep: 17.92 [2025-04-26 00:39:35,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.24 | bwd: 5796.68 | bwd_inner: 5654.85 | bwd_allreduce: 141.78 | step: 17.92 17%|█▋ | 6912/41250 [16:42:01<82:44:44, 8.68s/it] {'loss': 0.0845, 'grad_norm': 1.382641077041626, 'learning_rate': 3.804789554894033e-05, 'epoch': 1.68} 17%|█▋ | 6912/41250 [16:42:01<82:44:44, 8.68s/it][2025-04-26 
00:39:44,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:39:44,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.32 | bwd_microstep: 5709.72 | bwd_inner_microstep: 5696.92 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.41 [2025-04-26 00:39:44,634] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.32 | bwd: 5709.73 | bwd_inner: 5696.92 | bwd_allreduce: 12.77 | step: 18.41 17%|█▋ | 6913/41250 [16:42:09<82:38:15, 8.66s/it] {'loss': 0.1203, 'grad_norm': 1.8298540115356445, 'learning_rate': 3.804721882403281e-05, 'epoch': 1.68} 17%|█▋ | 6913/41250 [16:42:09<82:38:15, 8.66s/it][2025-04-26 00:39:53,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:39:53,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.52 | bwd_microstep: 5745.80 | bwd_inner_microstep: 5686.52 | bwd_allreduce_microstep: 59.23 | step_microstep: 18.66 [2025-04-26 00:39:53,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.52 | bwd: 5745.81 | bwd_inner: 5686.52 | bwd_allreduce: 59.25 | step: 18.66 17%|█▋ | 6914/41250 [16:42:18<82:41:04, 8.67s/it] {'loss': 0.1529, 'grad_norm': 1.5493303537368774, 'learning_rate': 3.804654198786775e-05, 'epoch': 1.68} 17%|█▋ | 6914/41250 [16:42:18<82:41:04, 8.67s/it][2025-04-26 00:40:02,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:40:02,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.87 | bwd_microstep: 5779.79 | bwd_inner_microstep: 5662.33 | bwd_allreduce_microstep: 117.42 | step_microstep: 18.41 [2025-04-26 00:40:02,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.87 | bwd: 5779.80 | bwd_inner: 5662.33 | bwd_allreduce: 
117.43 | step: 18.41 17%|█▋ | 6915/41250 [16:42:27<82:45:02, 8.68s/it] {'loss': 0.4839, 'grad_norm': 1.8985509872436523, 'learning_rate': 3.80458650404493e-05, 'epoch': 1.68} 17%|█▋ | 6915/41250 [16:42:27<82:45:02, 8.68s/it][2025-04-26 00:40:10,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:40:10,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.77 | bwd_microstep: 5715.49 | bwd_inner_microstep: 5644.05 | bwd_allreduce_microstep: 71.40 | step_microstep: 18.37 [2025-04-26 00:40:10,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.77 | bwd: 5715.51 | bwd_inner: 5644.05 | bwd_allreduce: 71.42 | step: 18.37 17%|█▋ | 6916/41250 [16:42:35<82:35:53, 8.66s/it] {'loss': 0.2243, 'grad_norm': 3.498418092727661, 'learning_rate': 3.804518798178165e-05, 'epoch': 1.68} 17%|█▋ | 6916/41250 [16:42:35<82:35:53, 8.66s/it][2025-04-26 00:40:19,319] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.95 | optimizer_step: 1.00 [2025-04-26 00:40:19,319] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.80 | bwd_microstep: 5782.92 | bwd_inner_microstep: 5640.51 | bwd_allreduce_microstep: 142.36 | step_microstep: 18.49 [2025-04-26 00:40:19,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.80 | bwd: 5782.93 | bwd_inner: 5640.51 | bwd_allreduce: 142.38 | step: 18.49 17%|█▋ | 6917/41250 [16:42:44<82:40:23, 8.67s/it] {'loss': 0.0796, 'grad_norm': 1.6063055992126465, 'learning_rate': 3.804451081186896e-05, 'epoch': 1.68} 17%|█▋ | 6917/41250 [16:42:44<82:40:23, 8.67s/it][2025-04-26 00:40:27,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:40:27,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.03 | bwd_microstep: 5691.59 | 
bwd_inner_microstep: 5658.75 | bwd_allreduce_microstep: 32.79 | step_microstep: 18.21 [2025-04-26 00:40:27,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.03 | bwd: 5691.60 | bwd_inner: 5658.75 | bwd_allreduce: 32.81 | step: 18.22 17%|█▋ | 6918/41250 [16:42:53<82:27:55, 8.65s/it] {'loss': 0.1805, 'grad_norm': 5.531785011291504, 'learning_rate': 3.804383353071541e-05, 'epoch': 1.68} 17%|█▋ | 6918/41250 [16:42:53<82:27:55, 8.65s/it][2025-04-26 00:40:36,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:40:36,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.81 | bwd_microstep: 5804.68 | bwd_inner_microstep: 5633.56 | bwd_allreduce_microstep: 171.07 | step_microstep: 18.62 [2025-04-26 00:40:36,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.81 | bwd: 5804.69 | bwd_inner: 5633.56 | bwd_allreduce: 171.09 | step: 18.63 17%|█▋ | 6919/41250 [16:43:01<82:38:01, 8.67s/it] {'loss': 0.255, 'grad_norm': 3.8471667766571045, 'learning_rate': 3.8043156138325185e-05, 'epoch': 1.68} 17%|█▋ | 6919/41250 [16:43:01<82:38:01, 8.67s/it][2025-04-26 00:40:45,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:40:45,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.80 | bwd_microstep: 5709.32 | bwd_inner_microstep: 5651.99 | bwd_allreduce_microstep: 57.28 | step_microstep: 17.98 [2025-04-26 00:40:45,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.80 | bwd: 5709.33 | bwd_inner: 5651.99 | bwd_allreduce: 57.30 | step: 17.99 17%|█▋ | 6920/41250 [16:43:10<82:29:27, 8.65s/it] {'loss': 0.095, 'grad_norm': 2.096122980117798, 'learning_rate': 3.8042478634702435e-05, 'epoch': 1.68} 17%|█▋ | 6920/41250 [16:43:10<82:29:27, 8.65s/it][2025-04-26 00:40:53,873] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:40:53,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.23 | bwd_microstep: 5699.18 | bwd_inner_microstep: 5686.48 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.32 [2025-04-26 00:40:53,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.23 | bwd: 5699.20 | bwd_inner: 5686.48 | bwd_allreduce: 12.68 | step: 18.33 17%|█▋ | 6921/41250 [16:43:19<82:26:26, 8.65s/it] {'loss': 0.0747, 'grad_norm': 1.210184097290039, 'learning_rate': 3.8041801019851364e-05, 'epoch': 1.68} 17%|█▋ | 6921/41250 [16:43:19<82:26:26, 8.65s/it][2025-04-26 00:41:02,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:41:02,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.76 | bwd_microstep: 5756.93 | bwd_inner_microstep: 5640.00 | bwd_allreduce_microstep: 116.89 | step_microstep: 18.56 [2025-04-26 00:41:02,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.76 | bwd: 5756.94 | bwd_inner: 5640.00 | bwd_allreduce: 116.90 | step: 18.57 17%|█▋ | 6922/41250 [16:43:27<82:29:50, 8.65s/it] {'loss': 0.0888, 'grad_norm': 1.2820383310317993, 'learning_rate': 3.804112329377613e-05, 'epoch': 1.68} 17%|█▋ | 6922/41250 [16:43:27<82:29:50, 8.65s/it][2025-04-26 00:41:11,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:41:11,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.17 | bwd_microstep: 5703.79 | bwd_inner_microstep: 5691.02 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.77 [2025-04-26 00:41:11,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.17 | bwd: 5703.80 | bwd_inner: 5691.02 | bwd_allreduce: 12.73 | step: 18.77 
17%|█▋ | 6923/41250 [16:43:36<82:25:55, 8.64s/it] {'loss': 0.2755, 'grad_norm': 2.153012990951538, 'learning_rate': 3.804044545648093e-05, 'epoch': 1.68} 17%|█▋ | 6923/41250 [16:43:36<82:25:55, 8.64s/it][2025-04-26 00:41:19,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.17 | optimizer_step: 0.95 [2025-04-26 00:41:19,866] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.77 | bwd_microstep: 5797.27 | bwd_inner_microstep: 5631.30 | bwd_allreduce_microstep: 165.89 | step_microstep: 19.87 [2025-04-26 00:41:19,866] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.77 | bwd: 5797.29 | bwd_inner: 5631.30 | bwd_allreduce: 165.92 | step: 19.87 17%|█▋ | 6924/41250 [16:43:45<82:36:02, 8.66s/it] {'loss': 0.0668, 'grad_norm': 3.5347955226898193, 'learning_rate': 3.8039767507969925e-05, 'epoch': 1.68} 17%|█▋ | 6924/41250 [16:43:45<82:36:02, 8.66s/it][2025-04-26 00:41:28,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:41:28,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.61 | bwd_microstep: 5703.30 | bwd_inner_microstep: 5646.26 | bwd_allreduce_microstep: 57.00 | step_microstep: 18.67 [2025-04-26 00:41:28,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.61 | bwd: 5703.31 | bwd_inner: 5646.26 | bwd_allreduce: 57.01 | step: 18.67 17%|█▋ | 6925/41250 [16:43:53<82:26:34, 8.65s/it] {'loss': 0.0629, 'grad_norm': 0.7358373999595642, 'learning_rate': 3.8039089448247304e-05, 'epoch': 1.68} 17%|█▋ | 6925/41250 [16:43:53<82:26:34, 8.65s/it][2025-04-26 00:41:37,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.17 | optimizer_step: 1.02 [2025-04-26 00:41:37,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.01 | bwd_microstep: 5698.85 | 
bwd_inner_microstep: 5644.07 | bwd_allreduce_microstep: 54.73 | step_microstep: 19.63 [2025-04-26 00:41:37,085] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.01 | bwd: 5698.86 | bwd_inner: 5644.07 | bwd_allreduce: 54.75 | step: 19.63 17%|█▋ | 6926/41250 [16:44:02<82:19:09, 8.63s/it] {'loss': 0.2019, 'grad_norm': 3.8524467945098877, 'learning_rate': 3.803841127731725e-05, 'epoch': 1.68} 17%|█▋ | 6926/41250 [16:44:02<82:19:09, 8.63s/it][2025-04-26 00:41:45,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:41:45,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.27 | bwd_microstep: 5705.53 | bwd_inner_microstep: 5692.82 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.44 [2025-04-26 00:41:45,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.27 | bwd: 5705.54 | bwd_inner: 5692.82 | bwd_allreduce: 12.68 | step: 18.44 17%|█▋ | 6927/41250 [16:44:11<82:18:40, 8.63s/it] {'loss': 0.1297, 'grad_norm': 2.6519665718078613, 'learning_rate': 3.803773299518393e-05, 'epoch': 1.68} 17%|█▋ | 6927/41250 [16:44:11<82:18:40, 8.63s/it][2025-04-26 00:41:54,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:41:54,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.62 | bwd_microstep: 5700.16 | bwd_inner_microstep: 5687.53 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.46 [2025-04-26 00:41:54,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.62 | bwd: 5700.17 | bwd_inner: 5687.53 | bwd_allreduce: 12.60 | step: 18.47 17%|█▋ | 6928/41250 [16:44:19<82:17:05, 8.63s/it] {'loss': 0.1383, 'grad_norm': 3.5988476276397705, 'learning_rate': 3.803705460185153e-05, 'epoch': 1.68} 17%|█▋ | 6928/41250 [16:44:19<82:17:05, 8.63s/it][2025-04-26 00:42:02,998] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.96 | optimizer_step: 0.97 [2025-04-26 00:42:02,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.93 | bwd_microstep: 5751.70 | bwd_inner_microstep: 5640.81 | bwd_allreduce_microstep: 110.85 | step_microstep: 18.30 [2025-04-26 00:42:02,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.93 | bwd: 5751.72 | bwd_inner: 5640.81 | bwd_allreduce: 110.87 | step: 18.30 17%|█▋ | 6929/41250 [16:44:28<82:21:13, 8.64s/it] {'loss': 0.2329, 'grad_norm': 2.4032540321350098, 'learning_rate': 3.803637609732424e-05, 'epoch': 1.68} 17%|█▋ | 6929/41250 [16:44:28<82:21:13, 8.64s/it][2025-04-26 00:42:11,646] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:42:11,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2811.71 | bwd_microstep: 5752.18 | bwd_inner_microstep: 5626.46 | bwd_allreduce_microstep: 125.68 | step_microstep: 18.57 [2025-04-26 00:42:11,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2811.71 | bwd: 5752.20 | bwd_inner: 5626.46 | bwd_allreduce: 125.70 | step: 18.58 17%|█▋ | 6930/41250 [16:44:36<82:22:45, 8.64s/it] {'loss': 0.0219, 'grad_norm': 0.4401237666606903, 'learning_rate': 3.803569748160625e-05, 'epoch': 1.68} 17%|█▋ | 6930/41250 [16:44:36<82:22:45, 8.64s/it][2025-04-26 00:42:20,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:42:20,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.39 | bwd_microstep: 5732.63 | bwd_inner_microstep: 5674.39 | bwd_allreduce_microstep: 58.20 | step_microstep: 18.41 [2025-04-26 00:42:20,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.39 | bwd: 5732.65 | bwd_inner: 5674.39 | bwd_allreduce: 58.22 | step: 18.41 
17%|█▋ | 6931/41250 [16:44:45<82:24:08, 8.64s/it] {'loss': 0.1624, 'grad_norm': 3.209563732147217, 'learning_rate': 3.803501875470172e-05, 'epoch': 1.68} 17%|█▋ | 6931/41250 [16:44:45<82:24:08, 8.64s/it][2025-04-26 00:42:28,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:42:28,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.49 | bwd_microstep: 5694.90 | bwd_inner_microstep: 5682.20 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.82 [2025-04-26 00:42:28,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.49 | bwd: 5694.92 | bwd_inner: 5682.20 | bwd_allreduce: 12.68 | step: 18.82 17%|█▋ | 6932/41250 [16:44:54<82:19:59, 8.64s/it] {'loss': 0.1438, 'grad_norm': 1.9190378189086914, 'learning_rate': 3.803433991661485e-05, 'epoch': 1.68} 17%|█▋ | 6932/41250 [16:44:54<82:19:59, 8.64s/it][2025-04-26 00:42:37,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:42:37,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.35 | bwd_microstep: 5733.62 | bwd_inner_microstep: 5641.31 | bwd_allreduce_microstep: 92.26 | step_microstep: 18.54 [2025-04-26 00:42:37,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.35 | bwd: 5733.63 | bwd_inner: 5641.31 | bwd_allreduce: 92.28 | step: 18.54 17%|█▋ | 6933/41250 [16:45:02<82:20:45, 8.64s/it] {'loss': 0.193, 'grad_norm': 1.6762491464614868, 'learning_rate': 3.803366096734983e-05, 'epoch': 1.68} 17%|█▋ | 6933/41250 [16:45:02<82:20:45, 8.64s/it][2025-04-26 00:42:46,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:42:46,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.80 | bwd_microstep: 5698.27 | bwd_inner_microstep: 
5685.53 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.39 [2025-04-26 00:42:46,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.80 | bwd: 5698.29 | bwd_inner: 5685.53 | bwd_allreduce: 12.71 | step: 18.39 17%|█▋ | 6934/41250 [16:45:11<82:17:41, 8.63s/it] {'loss': 0.0633, 'grad_norm': 1.0400930643081665, 'learning_rate': 3.803298190691082e-05, 'epoch': 1.68} 17%|█▋ | 6934/41250 [16:45:11<82:17:41, 8.63s/it][2025-04-26 00:42:54,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:42:54,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.64 | bwd_microstep: 5760.94 | bwd_inner_microstep: 5633.55 | bwd_allreduce_microstep: 127.35 | step_microstep: 18.59 [2025-04-26 00:42:54,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.64 | bwd: 5760.95 | bwd_inner: 5633.55 | bwd_allreduce: 127.36 | step: 18.59 17%|█▋ | 6935/41250 [16:45:20<82:23:03, 8.64s/it] {'loss': 0.1101, 'grad_norm': 1.4478055238723755, 'learning_rate': 3.8032302735302035e-05, 'epoch': 1.68} 17%|█▋ | 6935/41250 [16:45:20<82:23:03, 8.64s/it][2025-04-26 00:43:03,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 00:43:03,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.19 | bwd_microstep: 5769.96 | bwd_inner_microstep: 5648.89 | bwd_allreduce_microstep: 121.03 | step_microstep: 18.96 [2025-04-26 00:43:03,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.19 | bwd: 5769.97 | bwd_inner: 5648.89 | bwd_allreduce: 121.04 | step: 18.96 17%|█▋ | 6936/41250 [16:45:28<82:27:53, 8.65s/it] {'loss': 0.326, 'grad_norm': 1.55611252784729, 'learning_rate': 3.803162345252765e-05, 'epoch': 1.68} 17%|█▋ | 6936/41250 [16:45:28<82:27:53, 8.65s/it][2025-04-26 00:43:12,186] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.95 | optimizer_step: 1.05 [2025-04-26 00:43:12,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.75 | bwd_microstep: 5765.70 | bwd_inner_microstep: 5641.29 | bwd_allreduce_microstep: 124.37 | step_microstep: 18.29 [2025-04-26 00:43:12,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.75 | bwd: 5765.72 | bwd_inner: 5641.29 | bwd_allreduce: 124.39 | step: 18.29 17%|█▋ | 6937/41250 [16:45:37<82:30:36, 8.66s/it] {'loss': 0.411, 'grad_norm': 3.4784209728240967, 'learning_rate': 3.803094405859185e-05, 'epoch': 1.68} 17%|█▋ | 6937/41250 [16:45:37<82:30:36, 8.66s/it][2025-04-26 00:43:20,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:43:20,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.82 | bwd_microstep: 5709.09 | bwd_inner_microstep: 5652.05 | bwd_allreduce_microstep: 57.00 | step_microstep: 18.45 [2025-04-26 00:43:20,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.82 | bwd: 5709.10 | bwd_inner: 5652.05 | bwd_allreduce: 57.01 | step: 18.46 17%|█▋ | 6938/41250 [16:45:46<82:22:22, 8.64s/it] {'loss': 0.0999, 'grad_norm': 1.0832020044326782, 'learning_rate': 3.8030264553498825e-05, 'epoch': 1.68} 17%|█▋ | 6938/41250 [16:45:46<82:22:22, 8.64s/it][2025-04-26 00:43:29,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:43:29,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.57 | bwd_microstep: 5706.91 | bwd_inner_microstep: 5647.24 | bwd_allreduce_microstep: 59.62 | step_microstep: 18.45 [2025-04-26 00:43:29,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.57 | bwd: 5706.92 | bwd_inner: 5647.24 | bwd_allreduce: 59.64 | step: 18.45 17%|█▋ | 6939/41250 [16:45:54<82:17:11, 
8.63s/it] {'loss': 0.1544, 'grad_norm': 1.4553736448287964, 'learning_rate': 3.802958493725277e-05, 'epoch': 1.68} 17%|█▋ | 6939/41250 [16:45:54<82:17:11, 8.63s/it][2025-04-26 00:43:38,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:43:38,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.85 | bwd_microstep: 5751.13 | bwd_inner_microstep: 5700.17 | bwd_allreduce_microstep: 50.91 | step_microstep: 18.67 [2025-04-26 00:43:38,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.85 | bwd: 5751.14 | bwd_inner: 5700.17 | bwd_allreduce: 50.93 | step: 18.67 17%|█▋ | 6940/41250 [16:46:03<82:26:06, 8.65s/it] {'loss': 0.0336, 'grad_norm': 0.42445802688598633, 'learning_rate': 3.802890520985787e-05, 'epoch': 1.68} 17%|█▋ | 6940/41250 [16:46:03<82:26:06, 8.65s/it][2025-04-26 00:43:46,779] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.07 | optimizer_step: 1.11 [2025-04-26 00:43:46,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2775.92 | bwd_microstep: 5826.09 | bwd_inner_microstep: 5558.89 | bwd_allreduce_microstep: 267.15 | step_microstep: 19.45 [2025-04-26 00:43:46,780] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2775.92 | bwd: 5826.11 | bwd_inner: 5558.89 | bwd_allreduce: 267.17 | step: 19.45 17%|█▋ | 6941/41250 [16:46:12<82:32:00, 8.66s/it] {'loss': 0.2035, 'grad_norm': 2.1388940811157227, 'learning_rate': 3.802822537131831e-05, 'epoch': 1.68} 17%|█▋ | 6941/41250 [16:46:12<82:32:00, 8.66s/it][2025-04-26 00:43:55,424] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:43:55,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.57 | bwd_microstep: 5713.97 | bwd_inner_microstep: 5701.28 | bwd_allreduce_microstep: 
12.65 | step_microstep: 18.91 [2025-04-26 00:43:55,425] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.57 | bwd: 5713.98 | bwd_inner: 5701.28 | bwd_allreduce: 12.66 | step: 18.91 17%|█▋ | 6942/41250 [16:46:20<82:29:14, 8.66s/it] {'loss': 0.0313, 'grad_norm': 0.5967629551887512, 'learning_rate': 3.802754542163829e-05, 'epoch': 1.68} 17%|█▋ | 6942/41250 [16:46:20<82:29:14, 8.66s/it][2025-04-26 00:44:04,364] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-26 00:44:04,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.29 | bwd_microstep: 6012.10 | bwd_inner_microstep: 5693.39 | bwd_allreduce_microstep: 318.66 | step_microstep: 18.68 [2025-04-26 00:44:04,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.29 | bwd: 6012.11 | bwd_inner: 5693.39 | bwd_allreduce: 318.68 | step: 18.68 17%|█▋ | 6943/41250 [16:46:29<83:17:48, 8.74s/it] {'loss': 0.1625, 'grad_norm': 2.167724370956421, 'learning_rate': 3.8026865360822e-05, 'epoch': 1.68} 17%|█▋ | 6943/41250 [16:46:29<83:17:48, 8.74s/it][2025-04-26 00:44:12,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:44:12,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.98 | bwd_microstep: 5715.80 | bwd_inner_microstep: 5652.59 | bwd_allreduce_microstep: 63.16 | step_microstep: 18.85 [2025-04-26 00:44:12,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.98 | bwd: 5715.81 | bwd_inner: 5652.59 | bwd_allreduce: 63.18 | step: 18.85 17%|█▋ | 6944/41250 [16:46:38<82:58:40, 8.71s/it] {'loss': 0.1789, 'grad_norm': 1.648089051246643, 'learning_rate': 3.802618518887363e-05, 'epoch': 1.68} 17%|█▋ | 6944/41250 [16:46:38<82:58:40, 8.71s/it][2025-04-26 00:44:21,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | 
optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-26 00:44:21,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.90 | bwd_microstep: 5694.23 | bwd_inner_microstep: 5655.11 | bwd_allreduce_microstep: 39.07 | step_microstep: 19.01 [2025-04-26 00:44:21,601] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.90 | bwd: 5694.25 | bwd_inner: 5655.10 | bwd_allreduce: 39.09 | step: 19.02 17%|█▋ | 6945/41250 [16:46:46<82:41:30, 8.68s/it] {'loss': 0.0967, 'grad_norm': 0.9955821633338928, 'learning_rate': 3.802550490579737e-05, 'epoch': 1.68} 17%|█▋ | 6945/41250 [16:46:46<82:41:30, 8.68s/it][2025-04-26 00:44:30,269] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.06 [2025-04-26 00:44:30,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.91 | bwd_microstep: 5755.32 | bwd_inner_microstep: 5657.90 | bwd_allreduce_microstep: 97.37 | step_microstep: 19.21 [2025-04-26 00:44:30,270] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.91 | bwd: 5755.33 | bwd_inner: 5657.90 | bwd_allreduce: 97.39 | step: 19.22 17%|█▋ | 6946/41250 [16:46:55<82:39:31, 8.67s/it] {'loss': 0.0684, 'grad_norm': 1.1669479608535767, 'learning_rate': 3.8024824511597425e-05, 'epoch': 1.68} 17%|█▋ | 6946/41250 [16:46:55<82:39:31, 8.67s/it][2025-04-26 00:44:38,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 00:44:38,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.13 | bwd_microstep: 5710.17 | bwd_inner_microstep: 5697.07 | bwd_allreduce_microstep: 13.05 | step_microstep: 19.04 [2025-04-26 00:44:38,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.13 | bwd: 5710.18 | bwd_inner: 5697.07 | bwd_allreduce: 13.07 | step: 19.04 17%|█▋ | 6947/41250 [16:47:04<82:34:30, 8.67s/it] {'loss': 0.1443, 'grad_norm': 
1.4398727416992188, 'learning_rate': 3.802414400627798e-05, 'epoch': 1.68} 17%|█▋ | 6947/41250 [16:47:04<82:34:30, 8.67s/it][2025-04-26 00:44:47,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:44:47,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.95 | bwd_microstep: 5769.68 | bwd_inner_microstep: 5703.87 | bwd_allreduce_microstep: 65.77 | step_microstep: 18.86 [2025-04-26 00:44:47,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.95 | bwd: 5769.69 | bwd_inner: 5703.87 | bwd_allreduce: 65.78 | step: 18.86 17%|█▋ | 6948/41250 [16:47:12<82:40:30, 8.68s/it] {'loss': 0.0531, 'grad_norm': 1.3741321563720703, 'learning_rate': 3.8023463389843226e-05, 'epoch': 1.68} 17%|█▋ | 6948/41250 [16:47:12<82:40:30, 8.68s/it][2025-04-26 00:44:56,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:44:56,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.08 | bwd_microstep: 5752.87 | bwd_inner_microstep: 5707.76 | bwd_allreduce_microstep: 45.07 | step_microstep: 18.52 [2025-04-26 00:44:56,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.08 | bwd: 5752.89 | bwd_inner: 5707.76 | bwd_allreduce: 45.08 | step: 18.52 17%|█▋ | 6949/41250 [16:47:21<82:42:31, 8.68s/it] {'loss': 0.1746, 'grad_norm': 3.8203511238098145, 'learning_rate': 3.802278266229737e-05, 'epoch': 1.68} 17%|█▋ | 6949/41250 [16:47:21<82:42:31, 8.68s/it][2025-04-26 00:45:05,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-26 00:45:05,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2937.79 | bwd_microstep: 5906.51 | bwd_inner_microstep: 5893.74 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.86 [2025-04-26 
00:45:05,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2937.79 | bwd: 5906.53 | bwd_inner: 5893.73 | bwd_allreduce: 12.75 | step: 18.86 17%|█▋ | 6950/41250 [16:47:30<83:24:36, 8.75s/it] {'loss': 0.1132, 'grad_norm': 1.2503150701522827, 'learning_rate': 3.8022101823644605e-05, 'epoch': 1.68} 17%|█▋ | 6950/41250 [16:47:30<83:24:36, 8.75s/it][2025-04-26 00:45:13,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 00:45:13,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.54 | bwd_microstep: 5768.41 | bwd_inner_microstep: 5657.95 | bwd_allreduce_microstep: 110.41 | step_microstep: 18.93 [2025-04-26 00:45:13,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.54 | bwd: 5768.42 | bwd_inner: 5657.95 | bwd_allreduce: 110.43 | step: 18.93 17%|█▋ | 6951/41250 [16:47:39<83:12:17, 8.73s/it] {'loss': 0.2513, 'grad_norm': 1.6788763999938965, 'learning_rate': 3.8021420873889126e-05, 'epoch': 1.69} 17%|█▋ | 6951/41250 [16:47:39<83:12:17, 8.73s/it][2025-04-26 00:45:22,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.02 [2025-04-26 00:45:22,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.25 | bwd_microstep: 5918.09 | bwd_inner_microstep: 5662.31 | bwd_allreduce_microstep: 255.73 | step_microstep: 18.55 [2025-04-26 00:45:22,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.25 | bwd: 5918.10 | bwd_inner: 5662.31 | bwd_allreduce: 255.75 | step: 18.56 17%|█▋ | 6952/41250 [16:47:48<83:29:10, 8.76s/it] {'loss': 0.0697, 'grad_norm': 1.3730154037475586, 'learning_rate': 3.8020739813035125e-05, 'epoch': 1.69} 17%|█▋ | 6952/41250 [16:47:48<83:29:10, 8.76s/it][2025-04-26 00:45:31,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | 
optimizer_step: 0.91 [2025-04-26 00:45:31,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.47 | bwd_microstep: 5759.66 | bwd_inner_microstep: 5659.15 | bwd_allreduce_microstep: 100.46 | step_microstep: 19.04 [2025-04-26 00:45:31,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.47 | bwd: 5759.67 | bwd_inner: 5659.15 | bwd_allreduce: 100.48 | step: 19.04 17%|█▋ | 6953/41250 [16:47:56<83:15:35, 8.74s/it] {'loss': 0.0714, 'grad_norm': 1.317674160003662, 'learning_rate': 3.802005864108681e-05, 'epoch': 1.69} 17%|█▋ | 6953/41250 [16:47:56<83:15:35, 8.74s/it][2025-04-26 00:45:40,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 1.09 [2025-04-26 00:45:40,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.52 | bwd_microstep: 5754.64 | bwd_inner_microstep: 5695.83 | bwd_allreduce_microstep: 58.77 | step_microstep: 19.00 [2025-04-26 00:45:40,125] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.52 | bwd: 5754.66 | bwd_inner: 5695.83 | bwd_allreduce: 58.78 | step: 19.00 17%|█▋ | 6954/41250 [16:48:05<83:07:22, 8.73s/it] {'loss': 0.204, 'grad_norm': 1.2892504930496216, 'learning_rate': 3.801937735804838e-05, 'epoch': 1.69} 17%|█▋ | 6954/41250 [16:48:05<83:07:22, 8.73s/it][2025-04-26 00:45:48,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:45:48,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.60 | bwd_microstep: 5711.66 | bwd_inner_microstep: 5657.50 | bwd_allreduce_microstep: 54.12 | step_microstep: 18.43 [2025-04-26 00:45:48,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.60 | bwd: 5711.67 | bwd_inner: 5657.50 | bwd_allreduce: 54.13 | step: 18.43 17%|█▋ | 6955/41250 [16:48:14<82:49:48, 8.69s/it] {'loss': 0.0928, 'grad_norm': 0.9404202699661255, 
'learning_rate': 3.8018695963924045e-05, 'epoch': 1.69} 17%|█▋ | 6955/41250 [16:48:14<82:49:48, 8.69s/it][2025-04-26 00:45:57,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-26 00:45:57,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.19 | bwd_microstep: 5698.06 | bwd_inner_microstep: 5685.17 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.77 [2025-04-26 00:45:57,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.19 | bwd: 5698.07 | bwd_inner: 5685.17 | bwd_allreduce: 12.86 | step: 18.77 17%|█▋ | 6956/41250 [16:48:22<82:38:43, 8.68s/it] {'loss': 0.0529, 'grad_norm': 0.9517133831977844, 'learning_rate': 3.8018014458717976e-05, 'epoch': 1.69} 17%|█▋ | 6956/41250 [16:48:22<82:38:43, 8.68s/it][2025-04-26 00:46:06,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-26 00:46:06,005] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.81 | bwd_microstep: 5692.88 | bwd_inner_microstep: 5660.76 | bwd_allreduce_microstep: 32.07 | step_microstep: 19.42 [2025-04-26 00:46:06,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.81 | bwd: 5692.90 | bwd_inner: 5660.76 | bwd_allreduce: 32.09 | step: 19.42 17%|█▋ | 6957/41250 [16:48:31<82:30:01, 8.66s/it] {'loss': 0.5726, 'grad_norm': 3.874814033508301, 'learning_rate': 3.8017332842434406e-05, 'epoch': 1.69} 17%|█▋ | 6957/41250 [16:48:31<82:30:01, 8.66s/it][2025-04-26 00:46:14,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:46:14,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.01 | bwd_microstep: 5753.97 | bwd_inner_microstep: 5677.04 | bwd_allreduce_microstep: 76.89 | step_microstep: 18.64 [2025-04-26 00:46:14,693] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.01 | bwd: 5753.99 | bwd_inner: 5677.04 | bwd_allreduce: 76.91 | step: 18.64 17%|█▋ | 6958/41250 [16:48:40<82:34:20, 8.67s/it] {'loss': 0.1348, 'grad_norm': 1.9239212274551392, 'learning_rate': 3.801665111507751e-05, 'epoch': 1.69} 17%|█▋ | 6958/41250 [16:48:40<82:34:20, 8.67s/it][2025-04-26 00:46:23,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:46:23,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.62 | bwd_microstep: 5712.63 | bwd_inner_microstep: 5655.45 | bwd_allreduce_microstep: 57.13 | step_microstep: 18.95 [2025-04-26 00:46:23,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.62 | bwd: 5712.65 | bwd_inner: 5655.45 | bwd_allreduce: 57.15 | step: 18.95 17%|█▋ | 6959/41250 [16:48:48<82:27:14, 8.66s/it] {'loss': 0.1123, 'grad_norm': 2.4923832416534424, 'learning_rate': 3.801596927665151e-05, 'epoch': 1.69} 17%|█▋ | 6959/41250 [16:48:48<82:27:14, 8.66s/it][2025-04-26 00:46:32,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-26 00:46:32,003] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.05 | bwd_microstep: 5770.03 | bwd_inner_microstep: 5642.69 | bwd_allreduce_microstep: 127.29 | step_microstep: 18.41 [2025-04-26 00:46:32,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.05 | bwd: 5770.05 | bwd_inner: 5642.69 | bwd_allreduce: 127.31 | step: 18.41 17%|█▋ | 6960/41250 [16:48:57<82:31:20, 8.66s/it] {'loss': 0.1611, 'grad_norm': 1.343345284461975, 'learning_rate': 3.80152873271606e-05, 'epoch': 1.69} 17%|█▋ | 6960/41250 [16:48:57<82:31:20, 8.66s/it][2025-04-26 00:46:40,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 
00:46:40,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.77 | bwd_microstep: 5776.03 | bwd_inner_microstep: 5635.74 | bwd_allreduce_microstep: 140.25 | step_microstep: 18.55 [2025-04-26 00:46:40,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.77 | bwd: 5776.05 | bwd_inner: 5635.73 | bwd_allreduce: 140.27 | step: 18.55 17%|█▋ | 6961/41250 [16:49:06<82:34:08, 8.67s/it] {'loss': 0.2467, 'grad_norm': 2.616295576095581, 'learning_rate': 3.801460526660899e-05, 'epoch': 1.69} 17%|█▋ | 6961/41250 [16:49:06<82:34:08, 8.67s/it][2025-04-26 00:46:49,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-26 00:46:49,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.62 | bwd_microstep: 5904.48 | bwd_inner_microstep: 5645.19 | bwd_allreduce_microstep: 259.24 | step_microstep: 19.20 [2025-04-26 00:46:49,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.62 | bwd: 5904.49 | bwd_inner: 5645.19 | bwd_allreduce: 259.26 | step: 19.20 17%|█▋ | 6962/41250 [16:49:14<83:00:35, 8.72s/it] {'loss': 0.0345, 'grad_norm': 0.5761358141899109, 'learning_rate': 3.801392309500088e-05, 'epoch': 1.69} 17%|█▋ | 6962/41250 [16:49:14<83:00:35, 8.72s/it][2025-04-26 00:46:58,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 00:46:58,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.95 | bwd_microstep: 5699.68 | bwd_inner_microstep: 5655.32 | bwd_allreduce_microstep: 44.31 | step_microstep: 19.04 [2025-04-26 00:46:58,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.95 | bwd: 5699.69 | bwd_inner: 5655.32 | bwd_allreduce: 44.32 | step: 19.04 17%|█▋ | 6963/41250 [16:49:23<82:42:34, 8.68s/it] {'loss': 0.1444, 'grad_norm': 1.220411777496338, 'learning_rate': 3.801324081234048e-05, 
'epoch': 1.69} 17%|█▋ | 6963/41250 [16:49:23<82:42:34, 8.68s/it][2025-04-26 00:47:06,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.15 | optimizer_step: 0.96 [2025-04-26 00:47:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.71 | bwd_microstep: 5735.77 | bwd_inner_microstep: 5687.84 | bwd_allreduce_microstep: 47.88 | step_microstep: 19.09 [2025-04-26 00:47:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.71 | bwd: 5735.78 | bwd_inner: 5687.84 | bwd_allreduce: 47.90 | step: 19.10 17%|█▋ | 6964/41250 [16:49:32<82:40:24, 8.68s/it] {'loss': 0.1209, 'grad_norm': 3.6806693077087402, 'learning_rate': 3.801255841863199e-05, 'epoch': 1.69} 17%|█▋ | 6964/41250 [16:49:32<82:40:24, 8.68s/it][2025-04-26 00:47:15,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:47:15,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.03 | bwd_microstep: 5672.68 | bwd_inner_microstep: 5645.94 | bwd_allreduce_microstep: 26.69 | step_microstep: 18.88 [2025-04-26 00:47:15,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.03 | bwd: 5672.69 | bwd_inner: 5645.94 | bwd_allreduce: 26.71 | step: 18.88 17%|█▋ | 6965/41250 [16:49:40<82:26:51, 8.66s/it] {'loss': 0.0681, 'grad_norm': 1.5274924039840698, 'learning_rate': 3.801187591387962e-05, 'epoch': 1.69} 17%|█▋ | 6965/41250 [16:49:40<82:26:51, 8.66s/it][2025-04-26 00:47:24,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:47:24,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.35 | bwd_microstep: 5732.97 | bwd_inner_microstep: 5684.70 | bwd_allreduce_microstep: 48.22 | step_microstep: 18.84 [2025-04-26 00:47:24,060] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2851.35 | bwd: 5732.99 | bwd_inner: 5684.70 | bwd_allreduce: 48.24 | step: 18.85 17%|█▋ | 6966/41250 [16:49:49<82:28:16, 8.66s/it] {'loss': 0.1773, 'grad_norm': 1.9475648403167725, 'learning_rate': 3.801119329808758e-05, 'epoch': 1.69} 17%|█▋ | 6966/41250 [16:49:49<82:28:16, 8.66s/it][2025-04-26 00:47:32,662] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:47:32,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.10 | bwd_microstep: 5699.20 | bwd_inner_microstep: 5639.29 | bwd_allreduce_microstep: 59.85 | step_microstep: 19.03 [2025-04-26 00:47:32,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.10 | bwd: 5699.21 | bwd_inner: 5639.29 | bwd_allreduce: 59.87 | step: 19.04 17%|█▋ | 6967/41250 [16:49:57<82:18:40, 8.64s/it] {'loss': 0.0839, 'grad_norm': 2.1449666023254395, 'learning_rate': 3.801051057126007e-05, 'epoch': 1.69} 17%|█▋ | 6967/41250 [16:49:57<82:18:40, 8.64s/it][2025-04-26 00:47:41,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 00:47:41,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.92 | bwd_microstep: 5729.48 | bwd_inner_microstep: 5693.87 | bwd_allreduce_microstep: 35.57 | step_microstep: 19.09 [2025-04-26 00:47:41,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.92 | bwd: 5729.50 | bwd_inner: 5693.87 | bwd_allreduce: 35.59 | step: 19.09 17%|█▋ | 6968/41250 [16:50:06<82:23:20, 8.65s/it] {'loss': 0.353, 'grad_norm': 5.508453845977783, 'learning_rate': 3.800982773340131e-05, 'epoch': 1.69} 17%|█▋ | 6968/41250 [16:50:06<82:23:20, 8.65s/it][2025-04-26 00:47:49,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 00:47:49,979] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2855.62 | bwd_microstep: 5702.30 | bwd_inner_microstep: 5689.20 | bwd_allreduce_microstep: 13.05 | step_microstep: 19.25 [2025-04-26 00:47:49,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.62 | bwd: 5702.31 | bwd_inner: 5689.20 | bwd_allreduce: 13.07 | step: 19.25 17%|█▋ | 6969/41250 [16:50:15<82:21:38, 8.65s/it] {'loss': 0.2618, 'grad_norm': 2.8453335762023926, 'learning_rate': 3.8009144784515514e-05, 'epoch': 1.69} 17%|█▋ | 6969/41250 [16:50:15<82:21:38, 8.65s/it][2025-04-26 00:47:58,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 00:47:58,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.63 | bwd_microstep: 5748.17 | bwd_inner_microstep: 5681.35 | bwd_allreduce_microstep: 66.77 | step_microstep: 18.98 [2025-04-26 00:47:58,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.63 | bwd: 5748.18 | bwd_inner: 5681.35 | bwd_allreduce: 66.79 | step: 18.98 17%|█▋ | 6970/41250 [16:50:23<82:26:41, 8.66s/it] {'loss': 0.1501, 'grad_norm': 1.7398149967193604, 'learning_rate': 3.800846172460687e-05, 'epoch': 1.69} 17%|█▋ | 6970/41250 [16:50:23<82:26:41, 8.66s/it][2025-04-26 00:48:07,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:48:07,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.09 | bwd_microstep: 5689.08 | bwd_inner_microstep: 5645.52 | bwd_allreduce_microstep: 43.52 | step_microstep: 18.60 [2025-04-26 00:48:07,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.09 | bwd: 5689.10 | bwd_inner: 5645.52 | bwd_allreduce: 43.54 | step: 18.60 17%|█▋ | 6971/41250 [16:50:32<82:17:05, 8.64s/it] {'loss': 0.1705, 'grad_norm': 1.5990275144577026, 'learning_rate': 3.8007778553679605e-05, 'epoch': 1.69} 17%|█▋ | 6971/41250 
[16:50:32<82:17:05, 8.64s/it][2025-04-26 00:48:16,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:48:16,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.34 | bwd_microstep: 5834.37 | bwd_inner_microstep: 5682.63 | bwd_allreduce_microstep: 151.70 | step_microstep: 18.60 [2025-04-26 00:48:16,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.34 | bwd: 5834.38 | bwd_inner: 5682.63 | bwd_allreduce: 151.72 | step: 18.60 17%|█▋ | 6972/41250 [16:50:41<82:37:19, 8.68s/it] {'loss': 0.125, 'grad_norm': 1.5790280103683472, 'learning_rate': 3.8007095271737924e-05, 'epoch': 1.69} 17%|█▋ | 6972/41250 [16:50:41<82:37:19, 8.68s/it][2025-04-26 00:48:24,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 00:48:24,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.74 | bwd_microstep: 5686.65 | bwd_inner_microstep: 5639.54 | bwd_allreduce_microstep: 47.07 | step_microstep: 18.44 [2025-04-26 00:48:24,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.74 | bwd: 5686.67 | bwd_inner: 5639.54 | bwd_allreduce: 47.08 | step: 18.44 17%|█▋ | 6973/41250 [16:50:49<82:22:54, 8.65s/it] {'loss': 0.1035, 'grad_norm': 2.1652705669403076, 'learning_rate': 3.800641187878605e-05, 'epoch': 1.69} 17%|█▋ | 6973/41250 [16:50:49<82:22:54, 8.65s/it][2025-04-26 00:48:33,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-26 00:48:33,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.94 | bwd_microstep: 5696.93 | bwd_inner_microstep: 5684.31 | bwd_allreduce_microstep: 12.57 | step_microstep: 19.14 [2025-04-26 00:48:33,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.94 | bwd: 5696.94 | 
bwd_inner: 5684.31 | bwd_allreduce: 12.59 | step: 19.15 17%|█▋ | 6974/41250 [16:50:58<82:18:00, 8.64s/it] {'loss': 0.1825, 'grad_norm': 1.7627238035202026, 'learning_rate': 3.800572837482818e-05, 'epoch': 1.69} 17%|█▋ | 6974/41250 [16:50:58<82:18:00, 8.64s/it][2025-04-26 00:48:41,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:48:41,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.62 | bwd_microstep: 5757.01 | bwd_inner_microstep: 5678.72 | bwd_allreduce_microstep: 78.23 | step_microstep: 18.94 [2025-04-26 00:48:41,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.62 | bwd: 5757.02 | bwd_inner: 5678.72 | bwd_allreduce: 78.25 | step: 18.94 17%|█▋ | 6975/41250 [16:51:07<82:24:29, 8.66s/it] {'loss': 0.1752, 'grad_norm': 5.374935626983643, 'learning_rate': 3.800504475986854e-05, 'epoch': 1.69} 17%|█▋ | 6975/41250 [16:51:07<82:24:29, 8.66s/it][2025-04-26 00:48:50,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:48:50,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.42 | bwd_microstep: 5688.85 | bwd_inner_microstep: 5675.88 | bwd_allreduce_microstep: 12.93 | step_microstep: 19.16 [2025-04-26 00:48:50,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.42 | bwd: 5688.87 | bwd_inner: 5675.88 | bwd_allreduce: 12.94 | step: 19.16 17%|█▋ | 6976/41250 [16:51:15<82:17:45, 8.64s/it] {'loss': 0.1103, 'grad_norm': 1.441941261291504, 'learning_rate': 3.8004361033911335e-05, 'epoch': 1.69} 17%|█▋ | 6976/41250 [16:51:15<82:17:45, 8.64s/it][2025-04-26 00:48:59,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.05 | optimizer_step: 0.92 [2025-04-26 00:48:59,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2848.32 | bwd_microstep: 5695.21 | bwd_inner_microstep: 5682.33 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.74 [2025-04-26 00:48:59,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.32 | bwd: 5695.23 | bwd_inner: 5682.33 | bwd_allreduce: 12.85 | step: 19.75 17%|█▋ | 6977/41250 [16:51:24<82:15:48, 8.64s/it] {'loss': 0.1516, 'grad_norm': 2.6289172172546387, 'learning_rate': 3.8003677196960795e-05, 'epoch': 1.69} 17%|█▋ | 6977/41250 [16:51:24<82:15:48, 8.64s/it][2025-04-26 00:49:07,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:49:07,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.26 | bwd_microstep: 5747.96 | bwd_inner_microstep: 5668.50 | bwd_allreduce_microstep: 79.42 | step_microstep: 19.22 [2025-04-26 00:49:07,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.26 | bwd: 5747.98 | bwd_inner: 5668.50 | bwd_allreduce: 79.44 | step: 19.22 17%|█▋ | 6978/41250 [16:51:33<82:21:00, 8.65s/it] {'loss': 0.2836, 'grad_norm': 4.727651119232178, 'learning_rate': 3.800299324902112e-05, 'epoch': 1.69} 17%|█▋ | 6978/41250 [16:51:33<82:21:00, 8.65s/it][2025-04-26 00:49:16,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-26 00:49:16,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.03 | bwd_microstep: 5773.18 | bwd_inner_microstep: 5643.14 | bwd_allreduce_microstep: 129.99 | step_microstep: 19.69 [2025-04-26 00:49:16,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.03 | bwd: 5773.19 | bwd_inner: 5643.14 | bwd_allreduce: 130.02 | step: 19.70 17%|█▋ | 6979/41250 [16:51:41<82:26:21, 8.66s/it] {'loss': 0.197, 'grad_norm': 2.0654497146606445, 'learning_rate': 3.800230919009653e-05, 'epoch': 1.69} 17%|█▋ | 6979/41250 [16:51:41<82:26:21, 8.66s/it][2025-04-26 
00:49:25,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 00:49:25,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.90 | bwd_microstep: 5705.98 | bwd_inner_microstep: 5692.90 | bwd_allreduce_microstep: 13.03 | step_microstep: 19.17 [2025-04-26 00:49:25,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.90 | bwd: 5706.00 | bwd_inner: 5692.89 | bwd_allreduce: 13.06 | step: 19.17 17%|█▋ | 6980/41250 [16:51:50<82:22:39, 8.65s/it] {'loss': 0.0305, 'grad_norm': 0.6259673833847046, 'learning_rate': 3.8001625020191253e-05, 'epoch': 1.69} 17%|█▋ | 6980/41250 [16:51:50<82:22:39, 8.65s/it][2025-04-26 00:49:33,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 00:49:33,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.70 | bwd_microstep: 5686.26 | bwd_inner_microstep: 5647.70 | bwd_allreduce_microstep: 38.52 | step_microstep: 18.76 [2025-04-26 00:49:33,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.70 | bwd: 5686.27 | bwd_inner: 5647.70 | bwd_allreduce: 38.53 | step: 18.77 17%|█▋ | 6981/41250 [16:51:59<82:12:09, 8.64s/it] {'loss': 0.0782, 'grad_norm': 1.342300534248352, 'learning_rate': 3.80009407393095e-05, 'epoch': 1.69} 17%|█▋ | 6981/41250 [16:51:59<82:12:09, 8.64s/it][2025-04-26 00:49:42,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:49:42,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.75 | bwd_microstep: 5701.02 | bwd_inner_microstep: 5688.33 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.57 [2025-04-26 00:49:42,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.75 | bwd: 5701.03 | bwd_inner: 5688.33 | bwd_allreduce: 12.66 | 
step: 18.57 17%|█▋ | 6982/41250 [16:52:07<82:09:40, 8.63s/it] {'loss': 0.131, 'grad_norm': 1.9550195932388306, 'learning_rate': 3.800025634745548e-05, 'epoch': 1.69} 17%|█▋ | 6982/41250 [16:52:07<82:09:40, 8.63s/it][2025-04-26 00:49:50,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.24 | optimizer_step: 0.90 [2025-04-26 00:49:50,992] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.26 | bwd_microstep: 5694.59 | bwd_inner_microstep: 5660.07 | bwd_allreduce_microstep: 34.46 | step_microstep: 19.00 [2025-04-26 00:49:50,992] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.27 | bwd: 5694.60 | bwd_inner: 5660.07 | bwd_allreduce: 34.48 | step: 18.99 17%|█▋ | 6983/41250 [16:52:16<82:06:22, 8.63s/it] {'loss': 0.1771, 'grad_norm': 1.517397165298462, 'learning_rate': 3.799957184463342e-05, 'epoch': 1.69} 17%|█▋ | 6983/41250 [16:52:16<82:06:22, 8.63s/it][2025-04-26 00:49:59,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-26 00:49:59,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.97 | bwd_microstep: 5736.27 | bwd_inner_microstep: 5686.60 | bwd_allreduce_microstep: 49.63 | step_microstep: 19.29 [2025-04-26 00:49:59,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.97 | bwd: 5736.29 | bwd_inner: 5686.60 | bwd_allreduce: 49.65 | step: 19.30 17%|█▋ | 6984/41250 [16:52:24<82:15:36, 8.64s/it] {'loss': 0.1156, 'grad_norm': 0.968220591545105, 'learning_rate': 3.7998887230847545e-05, 'epoch': 1.69} 17%|█▋ | 6984/41250 [16:52:25<82:15:36, 8.64s/it][2025-04-26 00:50:08,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.11 | optimizer_step: 1.01 [2025-04-26 00:50:08,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.59 | bwd_microstep: 5729.88 | 
bwd_inner_microstep: 5695.89 | bwd_allreduce_microstep: 33.94 | step_microstep: 19.19 [2025-04-26 00:50:08,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.59 | bwd: 5729.89 | bwd_inner: 5695.89 | bwd_allreduce: 33.96 | step: 19.20 17%|█▋ | 6985/41250 [16:52:33<82:20:03, 8.65s/it] {'loss': 0.0598, 'grad_norm': 0.6925025582313538, 'learning_rate': 3.799820250610207e-05, 'epoch': 1.69} 17%|█▋ | 6985/41250 [16:52:33<82:20:03, 8.65s/it][2025-04-26 00:50:16,954] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:50:16,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.52 | bwd_microstep: 5694.65 | bwd_inner_microstep: 5656.13 | bwd_allreduce_microstep: 38.47 | step_microstep: 19.01 [2025-04-26 00:50:16,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.52 | bwd: 5694.66 | bwd_inner: 5656.13 | bwd_allreduce: 38.49 | step: 19.01 17%|█▋ | 6986/41250 [16:52:42<82:13:00, 8.64s/it] {'loss': 0.1181, 'grad_norm': 2.442380428314209, 'learning_rate': 3.799751767040121e-05, 'epoch': 1.69} 17%|█▋ | 6986/41250 [16:52:42<82:13:00, 8.64s/it][2025-04-26 00:50:25,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:50:25,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.20 | bwd_microstep: 5899.88 | bwd_inner_microstep: 5644.89 | bwd_allreduce_microstep: 254.94 | step_microstep: 18.51 [2025-04-26 00:50:25,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.20 | bwd: 5899.89 | bwd_inner: 5644.89 | bwd_allreduce: 254.96 | step: 18.51 17%|█▋ | 6987/41250 [16:52:51<82:41:42, 8.69s/it] {'loss': 0.2268, 'grad_norm': 4.173859119415283, 'learning_rate': 3.79968327237492e-05, 'epoch': 1.69} 17%|█▋ | 6987/41250 [16:52:51<82:41:42, 8.69s/it][2025-04-26 00:50:34,452] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.06 | optimizer_step: 1.06 [2025-04-26 00:50:34,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.50 | bwd_microstep: 5758.89 | bwd_inner_microstep: 5702.35 | bwd_allreduce_microstep: 56.49 | step_microstep: 19.66 [2025-04-26 00:50:34,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.50 | bwd: 5758.90 | bwd_inner: 5702.35 | bwd_allreduce: 56.51 | step: 19.66 17%|█▋ | 6988/41250 [16:52:59<82:42:19, 8.69s/it] {'loss': 0.0896, 'grad_norm': 2.1698408126831055, 'learning_rate': 3.7996147666150256e-05, 'epoch': 1.69} 17%|█▋ | 6988/41250 [16:52:59<82:42:19, 8.69s/it][2025-04-26 00:50:43,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:50:43,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.14 | bwd_microstep: 5715.52 | bwd_inner_microstep: 5702.90 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.66 [2025-04-26 00:50:43,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.14 | bwd: 5715.54 | bwd_inner: 5702.90 | bwd_allreduce: 12.60 | step: 18.66 17%|█▋ | 6989/41250 [16:53:08<82:35:12, 8.68s/it] {'loss': 0.1762, 'grad_norm': 4.293949604034424, 'learning_rate': 3.7995462497608595e-05, 'epoch': 1.69} 17%|█▋ | 6989/41250 [16:53:08<82:35:12, 8.68s/it][2025-04-26 00:50:51,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:50:51,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.55 | bwd_microstep: 5695.96 | bwd_inner_microstep: 5664.16 | bwd_allreduce_microstep: 31.75 | step_microstep: 18.48 [2025-04-26 00:50:51,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.55 | bwd: 5695.97 | bwd_inner: 5664.16 | bwd_allreduce: 31.77 | step: 18.48 17%|█▋ | 6990/41250 
[16:53:17<82:22:26, 8.66s/it] {'loss': 0.1071, 'grad_norm': 2.4090356826782227, 'learning_rate': 3.7994777218128454e-05, 'epoch': 1.69} 17%|█▋ | 6990/41250 [16:53:17<82:22:26, 8.66s/it][2025-04-26 00:51:00,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:51:00,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.01 | bwd_microstep: 5757.88 | bwd_inner_microstep: 5687.30 | bwd_allreduce_microstep: 70.54 | step_microstep: 18.41 [2025-04-26 00:51:00,395] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.01 | bwd: 5757.89 | bwd_inner: 5687.30 | bwd_allreduce: 70.55 | step: 18.41 17%|█▋ | 6991/41250 [16:53:25<82:27:29, 8.66s/it] {'loss': 0.0882, 'grad_norm': 2.013277053833008, 'learning_rate': 3.7994091827714055e-05, 'epoch': 1.69} 17%|█▋ | 6991/41250 [16:53:25<82:27:29, 8.66s/it][2025-04-26 00:51:09,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:51:09,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.79 | bwd_microstep: 5706.69 | bwd_inner_microstep: 5694.02 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.60 [2025-04-26 00:51:09,034] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.79 | bwd: 5706.71 | bwd_inner: 5694.02 | bwd_allreduce: 12.64 | step: 18.61 17%|█▋ | 6992/41250 [16:53:34<82:22:56, 8.66s/it] {'loss': 0.3188, 'grad_norm': 2.645777702331543, 'learning_rate': 3.7993406326369605e-05, 'epoch': 1.7} 17%|█▋ | 6992/41250 [16:53:34<82:22:56, 8.66s/it][2025-04-26 00:51:17,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:51:17,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.80 | bwd_microstep: 5759.89 | bwd_inner_microstep: 5661.77 | 
bwd_allreduce_microstep: 98.07 | step_microstep: 18.54 [2025-04-26 00:51:17,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.80 | bwd: 5759.91 | bwd_inner: 5661.77 | bwd_allreduce: 98.09 | step: 18.54 17%|█▋ | 6993/41250 [16:53:43<82:25:35, 8.66s/it] {'loss': 0.1372, 'grad_norm': 2.235297679901123, 'learning_rate': 3.7992720714099354e-05, 'epoch': 1.7} 17%|█▋ | 6993/41250 [16:53:43<82:25:35, 8.66s/it][2025-04-26 00:51:26,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:51:26,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5723.83 | bwd_inner_microstep: 5642.35 | bwd_allreduce_microstep: 81.44 | step_microstep: 18.57 [2025-04-26 00:51:26,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.00 | bwd: 5723.84 | bwd_inner: 5642.35 | bwd_allreduce: 81.46 | step: 18.57 17%|█▋ | 6994/41250 [16:53:51<82:20:21, 8.65s/it] {'loss': 0.0898, 'grad_norm': 1.1491159200668335, 'learning_rate': 3.7992034990907515e-05, 'epoch': 1.7} 17%|█▋ | 6994/41250 [16:53:51<82:20:21, 8.65s/it][2025-04-26 00:51:34,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-26 00:51:34,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.92 | bwd_microstep: 5710.85 | bwd_inner_microstep: 5698.24 | bwd_allreduce_microstep: 12.56 | step_microstep: 19.06 [2025-04-26 00:51:34,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.93 | bwd: 5710.86 | bwd_inner: 5698.24 | bwd_allreduce: 12.58 | step: 19.07 17%|█▋ | 6995/41250 [16:54:00<82:17:41, 8.65s/it] {'loss': 0.2908, 'grad_norm': 2.0931479930877686, 'learning_rate': 3.7991349156798325e-05, 'epoch': 1.7} 17%|█▋ | 6995/41250 [16:54:00<82:17:41, 8.65s/it][2025-04-26 00:51:43,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:51:43,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2863.05 | bwd_microstep: 5753.97 | bwd_inner_microstep: 5712.09 | bwd_allreduce_microstep: 41.84 | step_microstep: 18.30 [2025-04-26 00:51:43,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2863.05 | bwd: 5753.98 | bwd_inner: 5712.09 | bwd_allreduce: 41.86 | step: 18.31 17%|█▋ | 6996/41250 [16:54:09<82:26:54, 8.67s/it] {'loss': 0.0238, 'grad_norm': 0.3598145544528961, 'learning_rate': 3.7990663211776e-05, 'epoch': 1.7} 17%|█▋ | 6996/41250 [16:54:09<82:26:54, 8.67s/it][2025-04-26 00:51:52,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 00:51:52,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.85 | bwd_microstep: 5774.79 | bwd_inner_microstep: 5762.29 | bwd_allreduce_microstep: 12.46 | step_microstep: 18.27 [2025-04-26 00:51:52,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.86 | bwd: 5774.81 | bwd_inner: 5762.29 | bwd_allreduce: 12.48 | step: 18.27 17%|█▋ | 6997/41250 [16:54:17<82:41:52, 8.69s/it] {'loss': 0.0793, 'grad_norm': 0.9679085612297058, 'learning_rate': 3.798997715584477e-05, 'epoch': 1.7} 17%|█▋ | 6997/41250 [16:54:17<82:41:52, 8.69s/it][2025-04-26 00:52:01,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:52:01,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.18 | bwd_microstep: 5784.25 | bwd_inner_microstep: 5665.44 | bwd_allreduce_microstep: 118.76 | step_microstep: 18.81 [2025-04-26 00:52:01,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.18 | bwd: 5784.27 | bwd_inner: 5665.44 | bwd_allreduce: 118.79 | step: 18.81 17%|█▋ | 6998/41250 [16:54:26<82:44:08, 8.70s/it] 
{'loss': 0.0476, 'grad_norm': 0.9449998736381531, 'learning_rate': 3.798929098900888e-05, 'epoch': 1.7} 17%|█▋ | 6998/41250 [16:54:26<82:44:08, 8.70s/it][2025-04-26 00:52:09,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:52:09,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.91 | bwd_microstep: 5745.49 | bwd_inner_microstep: 5709.32 | bwd_allreduce_microstep: 36.13 | step_microstep: 18.67 [2025-04-26 00:52:09,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.91 | bwd: 5745.51 | bwd_inner: 5709.32 | bwd_allreduce: 36.14 | step: 18.68 17%|█▋ | 6999/41250 [16:54:35<82:42:24, 8.69s/it] {'loss': 0.1713, 'grad_norm': 1.958443284034729, 'learning_rate': 3.798860471127254e-05, 'epoch': 1.7} 17%|█▋ | 6999/41250 [16:54:35<82:42:24, 8.69s/it][2025-04-26 00:52:18,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:52:18,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2896.66 | bwd_microstep: 5780.57 | bwd_inner_microstep: 5767.76 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.84 [2025-04-26 00:52:18,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2896.66 | bwd: 5780.58 | bwd_inner: 5767.76 | bwd_allreduce: 12.77 | step: 18.84 17%|█▋ | 7000/41250 [16:54:43<82:53:49, 8.71s/it] {'loss': 0.2557, 'grad_norm': 2.4017186164855957, 'learning_rate': 3.798791832264e-05, 'epoch': 1.7} 17%|█▋ | 7000/41250 [16:54:43<82:53:49, 8.71s/it][2025-04-26 00:52:27,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:52:27,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.45 | bwd_microstep: 5708.20 | bwd_inner_microstep: 5695.27 | bwd_allreduce_microstep: 12.87 | step_microstep: 
18.97 [2025-04-26 00:52:27,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.45 | bwd: 5708.21 | bwd_inner: 5695.27 | bwd_allreduce: 12.89 | step: 18.98 17%|█▋ | 7001/41250 [16:54:52<82:40:47, 8.69s/it] {'loss': 0.1828, 'grad_norm': 3.371244430541992, 'learning_rate': 3.798723182311548e-05, 'epoch': 1.7} 17%|█▋ | 7001/41250 [16:54:52<82:40:47, 8.69s/it][2025-04-26 00:52:35,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.03 | optimizer_step: 1.09 [2025-04-26 00:52:35,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.06 | bwd_microstep: 5760.50 | bwd_inner_microstep: 5648.79 | bwd_allreduce_microstep: 111.65 | step_microstep: 18.99 [2025-04-26 00:52:35,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.06 | bwd: 5760.51 | bwd_inner: 5648.79 | bwd_allreduce: 111.67 | step: 19.00 17%|█▋ | 7002/41250 [16:55:01<82:37:10, 8.68s/it] {'loss': 0.0474, 'grad_norm': 2.5778450965881348, 'learning_rate': 3.798654521270321e-05, 'epoch': 1.7} 17%|█▋ | 7002/41250 [16:55:01<82:37:10, 8.68s/it][2025-04-26 00:52:44,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.21 | optimizer_step: 1.04 [2025-04-26 00:52:44,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.10 | bwd_microstep: 5769.00 | bwd_inner_microstep: 5653.28 | bwd_allreduce_microstep: 115.67 | step_microstep: 19.81 [2025-04-26 00:52:44,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.10 | bwd: 5769.02 | bwd_inner: 5653.28 | bwd_allreduce: 115.69 | step: 19.82 17%|█▋ | 7003/41250 [16:55:09<82:36:23, 8.68s/it] {'loss': 0.0791, 'grad_norm': 1.652101755142212, 'learning_rate': 3.798585849140743e-05, 'epoch': 1.7} 17%|█▋ | 7003/41250 [16:55:09<82:36:23, 8.68s/it][2025-04-26 00:52:53,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | 
optimizer_step: 0.93 [2025-04-26 00:52:53,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.51 | bwd_microstep: 5920.70 | bwd_inner_microstep: 5665.15 | bwd_allreduce_microstep: 255.50 | step_microstep: 18.70 [2025-04-26 00:52:53,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.51 | bwd: 5920.71 | bwd_inner: 5665.15 | bwd_allreduce: 255.52 | step: 18.70 17%|█▋ | 7004/41250 [16:55:18<83:02:34, 8.73s/it] {'loss': 0.1071, 'grad_norm': 1.4332436323165894, 'learning_rate': 3.7985171659232364e-05, 'epoch': 1.7} 17%|█▋ | 7004/41250 [16:55:18<83:02:34, 8.73s/it][2025-04-26 00:53:02,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 00:53:02,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.99 | bwd_microstep: 5775.94 | bwd_inner_microstep: 5645.00 | bwd_allreduce_microstep: 130.89 | step_microstep: 18.72 [2025-04-26 00:53:02,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.99 | bwd: 5775.96 | bwd_inner: 5645.00 | bwd_allreduce: 130.91 | step: 18.72 17%|█▋ | 7005/41250 [16:55:27<82:54:15, 8.72s/it] {'loss': 0.2065, 'grad_norm': 3.654470920562744, 'learning_rate': 3.798448471618225e-05, 'epoch': 1.7} 17%|█▋ | 7005/41250 [16:55:27<82:54:15, 8.72s/it][2025-04-26 00:53:10,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.24 | optimizer_step: 0.90 [2025-04-26 00:53:10,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.41 | bwd_microstep: 5791.16 | bwd_inner_microstep: 5654.50 | bwd_allreduce_microstep: 136.62 | step_microstep: 19.33 [2025-04-26 00:53:10,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.41 | bwd: 5791.18 | bwd_inner: 5654.50 | bwd_allreduce: 136.64 | step: 19.34 17%|█▋ | 7006/41250 [16:55:36<82:51:36, 8.71s/it] {'loss': 0.3367, 'grad_norm': 3.6412339210510254, 
'learning_rate': 3.7983797662261335e-05, 'epoch': 1.7} 17%|█▋ | 7006/41250 [16:55:36<82:51:36, 8.71s/it][2025-04-26 00:53:19,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 00:53:19,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.22 | bwd_microstep: 5691.72 | bwd_inner_microstep: 5678.90 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.60 [2025-04-26 00:53:19,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.22 | bwd: 5691.73 | bwd_inner: 5678.89 | bwd_allreduce: 12.79 | step: 18.61 17%|█▋ | 7007/41250 [16:55:44<82:37:05, 8.69s/it] {'loss': 0.1691, 'grad_norm': 2.209867238998413, 'learning_rate': 3.798311049747384e-05, 'epoch': 1.7} 17%|█▋ | 7007/41250 [16:55:44<82:37:05, 8.69s/it][2025-04-26 00:53:28,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.06 | optimizer_step: 0.99 [2025-04-26 00:53:28,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.32 | bwd_microstep: 5791.42 | bwd_inner_microstep: 5647.25 | bwd_allreduce_microstep: 144.13 | step_microstep: 19.25 [2025-04-26 00:53:28,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.32 | bwd: 5791.44 | bwd_inner: 5647.25 | bwd_allreduce: 144.15 | step: 19.25 17%|█▋ | 7008/41250 [16:55:53<82:38:16, 8.69s/it] {'loss': 0.0464, 'grad_norm': 1.00788414478302, 'learning_rate': 3.7982423221824004e-05, 'epoch': 1.7} 17%|█▋ | 7008/41250 [16:55:53<82:38:16, 8.69s/it][2025-04-26 00:53:36,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.96 [2025-04-26 00:53:36,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.31 | bwd_microstep: 5754.21 | bwd_inner_microstep: 5689.99 | bwd_allreduce_microstep: 64.18 | step_microstep: 18.63 [2025-04-26 00:53:36,798] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.31 | bwd: 5754.23 | bwd_inner: 5689.99 | bwd_allreduce: 64.20 | step: 18.64 17%|█▋ | 7009/41250 [16:56:02<82:37:03, 8.69s/it] {'loss': 0.0362, 'grad_norm': 0.7331605553627014, 'learning_rate': 3.798173583531607e-05, 'epoch': 1.7} 17%|█▋ | 7009/41250 [16:56:02<82:37:03, 8.69s/it][2025-04-26 00:53:45,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:53:45,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.68 | bwd_microstep: 5697.77 | bwd_inner_microstep: 5648.41 | bwd_allreduce_microstep: 49.32 | step_microstep: 18.91 [2025-04-26 00:53:45,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.68 | bwd: 5697.79 | bwd_inner: 5648.41 | bwd_allreduce: 49.34 | step: 18.92 17%|█▋ | 7010/41250 [16:56:10<82:23:33, 8.66s/it] {'loss': 0.0949, 'grad_norm': 1.7672899961471558, 'learning_rate': 3.798104833795427e-05, 'epoch': 1.7} 17%|█▋ | 7010/41250 [16:56:10<82:23:33, 8.66s/it][2025-04-26 00:53:53,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:53:53,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.50 | bwd_microstep: 5686.98 | bwd_inner_microstep: 5633.31 | bwd_allreduce_microstep: 53.63 | step_microstep: 18.75 [2025-04-26 00:53:53,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.50 | bwd: 5687.00 | bwd_inner: 5633.31 | bwd_allreduce: 53.64 | step: 18.76 17%|█▋ | 7011/41250 [16:56:19<82:11:31, 8.64s/it] {'loss': 0.0852, 'grad_norm': 1.9846165180206299, 'learning_rate': 3.798036072974284e-05, 'epoch': 1.7} 17%|█▋ | 7011/41250 [16:56:19<82:11:31, 8.64s/it][2025-04-26 00:54:02,641] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.02 | optimizer_step: 1.09 [2025-04-26 
00:54:02,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.40 | bwd_microstep: 5710.96 | bwd_inner_microstep: 5698.14 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.92 [2025-04-26 00:54:02,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.40 | bwd: 5710.97 | bwd_inner: 5698.14 | bwd_allreduce: 12.79 | step: 18.93 17%|█▋ | 7012/41250 [16:56:27<82:11:30, 8.64s/it] {'loss': 0.0941, 'grad_norm': 1.8414827585220337, 'learning_rate': 3.797967301068603e-05, 'epoch': 1.7} 17%|█▋ | 7012/41250 [16:56:27<82:11:30, 8.64s/it][2025-04-26 00:54:11,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.07 | optimizer_step: 0.91 [2025-04-26 00:54:11,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.73 | bwd_microstep: 5736.41 | bwd_inner_microstep: 5693.54 | bwd_allreduce_microstep: 42.81 | step_microstep: 19.02 [2025-04-26 00:54:11,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.73 | bwd: 5736.42 | bwd_inner: 5693.54 | bwd_allreduce: 42.83 | step: 19.01 17%|█▋ | 7013/41250 [16:56:36<82:17:24, 8.65s/it] {'loss': 0.1787, 'grad_norm': 2.5969836711883545, 'learning_rate': 3.7978985180788065e-05, 'epoch': 1.7} 17%|█▋ | 7013/41250 [16:56:36<82:17:24, 8.65s/it][2025-04-26 00:54:19,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 00:54:19,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.76 | bwd_microstep: 5743.56 | bwd_inner_microstep: 5695.06 | bwd_allreduce_microstep: 48.45 | step_microstep: 19.15 [2025-04-26 00:54:19,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.76 | bwd: 5743.57 | bwd_inner: 5695.06 | bwd_allreduce: 48.47 | step: 19.15 17%|█▋ | 7014/41250 [16:56:45<82:21:36, 8.66s/it] {'loss': 0.0746, 'grad_norm': 1.34222412109375, 'learning_rate': 3.797829724005319e-05, 
'epoch': 1.7} 17%|█▋ | 7014/41250 [16:56:45<82:21:36, 8.66s/it][2025-04-26 00:54:28,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 00:54:28,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.86 | bwd_microstep: 5700.80 | bwd_inner_microstep: 5641.71 | bwd_allreduce_microstep: 59.04 | step_microstep: 18.84 [2025-04-26 00:54:28,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.86 | bwd: 5700.81 | bwd_inner: 5641.71 | bwd_allreduce: 59.06 | step: 18.84 17%|█▋ | 7015/41250 [16:56:53<82:12:04, 8.64s/it] {'loss': 0.2725, 'grad_norm': 3.7859346866607666, 'learning_rate': 3.797760918848566e-05, 'epoch': 1.7} 17%|█▋ | 7015/41250 [16:56:53<82:12:04, 8.64s/it][2025-04-26 00:54:37,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 00:54:37,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.44 | bwd_microstep: 5782.12 | bwd_inner_microstep: 5639.27 | bwd_allreduce_microstep: 142.79 | step_microstep: 19.29 [2025-04-26 00:54:37,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.44 | bwd: 5782.13 | bwd_inner: 5639.27 | bwd_allreduce: 142.81 | step: 19.30 17%|█▋ | 7016/41250 [16:57:02<82:19:33, 8.66s/it] {'loss': 0.2548, 'grad_norm': 3.954528331756592, 'learning_rate': 3.79769210260897e-05, 'epoch': 1.7} 17%|█▋ | 7016/41250 [16:57:02<82:19:33, 8.66s/it][2025-04-26 00:54:45,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 00:54:45,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.35 | bwd_microstep: 5770.00 | bwd_inner_microstep: 5635.59 | bwd_allreduce_microstep: 134.36 | step_microstep: 18.79 [2025-04-26 00:54:45,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2821.35 | bwd: 5770.01 | bwd_inner: 5635.59 | bwd_allreduce: 134.38 | step: 18.79 17%|█▋ | 7017/41250 [16:57:11<82:22:08, 8.66s/it] {'loss': 0.0347, 'grad_norm': 1.4085867404937744, 'learning_rate': 3.797623275286955e-05, 'epoch': 1.7} 17%|█▋ | 7017/41250 [16:57:11<82:22:08, 8.66s/it][2025-04-26 00:54:54,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-26 00:54:54,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.94 | bwd_microstep: 5742.76 | bwd_inner_microstep: 5675.91 | bwd_allreduce_microstep: 66.80 | step_microstep: 19.49 [2025-04-26 00:54:54,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.94 | bwd: 5742.78 | bwd_inner: 5675.91 | bwd_allreduce: 66.82 | step: 19.49 17%|█▋ | 7018/41250 [16:57:19<82:23:17, 8.66s/it] {'loss': 0.2669, 'grad_norm': 2.1174979209899902, 'learning_rate': 3.7975544368829475e-05, 'epoch': 1.7} 17%|█▋ | 7018/41250 [16:57:19<82:23:17, 8.66s/it][2025-04-26 00:55:03,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 00:55:03,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.05 | bwd_microstep: 5686.04 | bwd_inner_microstep: 5642.02 | bwd_allreduce_microstep: 43.98 | step_microstep: 18.90 [2025-04-26 00:55:03,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.05 | bwd: 5686.05 | bwd_inner: 5642.02 | bwd_allreduce: 44.00 | step: 18.91 17%|█▋ | 7019/41250 [16:57:28<82:10:02, 8.64s/it] {'loss': 0.0511, 'grad_norm': 1.434583306312561, 'learning_rate': 3.7974855873973696e-05, 'epoch': 1.7} 17%|█▋ | 7019/41250 [16:57:28<82:10:02, 8.64s/it][2025-04-26 00:55:11,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:55:11,886] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2839.05 | bwd_microstep: 5740.31 | bwd_inner_microstep: 5666.66 | bwd_allreduce_microstep: 73.60 | step_microstep: 19.17 [2025-04-26 00:55:11,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.05 | bwd: 5740.33 | bwd_inner: 5666.66 | bwd_allreduce: 73.62 | step: 19.17 17%|█▋ | 7020/41250 [16:57:37<82:13:43, 8.65s/it] {'loss': 0.1478, 'grad_norm': 2.187274694442749, 'learning_rate': 3.797416726830647e-05, 'epoch': 1.7} 17%|█▋ | 7020/41250 [16:57:37<82:13:43, 8.65s/it][2025-04-26 00:55:20,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 1.05 [2025-04-26 00:55:20,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.02 | bwd_microstep: 5684.82 | bwd_inner_microstep: 5672.16 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.82 [2025-04-26 00:55:20,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.02 | bwd: 5684.84 | bwd_inner: 5672.16 | bwd_allreduce: 12.63 | step: 18.82 17%|█▋ | 7021/41250 [16:57:45<82:07:20, 8.64s/it] {'loss': 0.0714, 'grad_norm': 1.92880117893219, 'learning_rate': 3.797347855183203e-05, 'epoch': 1.7} 17%|█▋ | 7021/41250 [16:57:45<82:07:20, 8.64s/it][2025-04-26 00:55:29,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.13 | optimizer_step: 1.07 [2025-04-26 00:55:29,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.56 | bwd_microstep: 5779.30 | bwd_inner_microstep: 5640.02 | bwd_allreduce_microstep: 139.22 | step_microstep: 20.04 [2025-04-26 00:55:29,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.56 | bwd: 5779.32 | bwd_inner: 5640.02 | bwd_allreduce: 139.25 | step: 20.05 17%|█▋ | 7022/41250 [16:57:54<82:15:05, 8.65s/it] {'loss': 0.3093, 'grad_norm': 4.567498683929443, 'learning_rate': 3.797278972455464e-05, 'epoch': 1.7} 17%|█▋ | 7022/41250 [16:57:54<82:15:05, 
8.65s/it][2025-04-26 00:55:37,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.06 | optimizer_step: 0.90 [2025-04-26 00:55:37,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.29 | bwd_microstep: 5742.30 | bwd_inner_microstep: 5687.68 | bwd_allreduce_microstep: 54.56 | step_microstep: 18.83 [2025-04-26 00:55:37,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.29 | bwd: 5742.32 | bwd_inner: 5687.68 | bwd_allreduce: 54.59 | step: 18.84 17%|█▋ | 7023/41250 [16:58:03<82:20:39, 8.66s/it] {'loss': 0.2636, 'grad_norm': 3.793158769607544, 'learning_rate': 3.797210078647853e-05, 'epoch': 1.7} 17%|█▋ | 7023/41250 [16:58:03<82:20:39, 8.66s/it][2025-04-26 00:55:46,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 00:55:46,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.73 | bwd_microstep: 5735.26 | bwd_inner_microstep: 5693.70 | bwd_allreduce_microstep: 41.51 | step_microstep: 19.32 [2025-04-26 00:55:46,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.73 | bwd: 5735.27 | bwd_inner: 5693.70 | bwd_allreduce: 41.53 | step: 19.32 17%|█▋ | 7024/41250 [16:58:11<82:19:29, 8.66s/it] {'loss': 0.2823, 'grad_norm': 3.1399192810058594, 'learning_rate': 3.7971411737607956e-05, 'epoch': 1.7} 17%|█▋ | 7024/41250 [16:58:11<82:19:29, 8.66s/it][2025-04-26 00:55:55,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-26 00:55:55,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.12 | bwd_microstep: 5745.03 | bwd_inner_microstep: 5709.58 | bwd_allreduce_microstep: 35.40 | step_microstep: 19.44 [2025-04-26 00:55:55,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.12 | bwd: 5745.04 | bwd_inner: 5709.58 | 
bwd_allreduce: 35.42 | step: 19.44 17%|█▋ | 7025/41250 [16:58:20<82:23:42, 8.67s/it] {'loss': 0.1565, 'grad_norm': 2.6501197814941406, 'learning_rate': 3.797072257794717e-05, 'epoch': 1.7} 17%|█▋ | 7025/41250 [16:58:20<82:23:42, 8.67s/it][2025-04-26 00:56:03,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:56:03,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.25 | bwd_microstep: 5698.20 | bwd_inner_microstep: 5685.46 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.92 [2025-04-26 00:56:03,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.25 | bwd: 5698.21 | bwd_inner: 5685.46 | bwd_allreduce: 12.71 | step: 18.93 17%|█▋ | 7026/41250 [16:58:29<82:16:12, 8.65s/it] {'loss': 0.1606, 'grad_norm': 2.4433512687683105, 'learning_rate': 3.7970033307500405e-05, 'epoch': 1.7} 17%|█▋ | 7026/41250 [16:58:29<82:16:12, 8.65s/it][2025-04-26 00:56:12,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:56:12,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.51 | bwd_microstep: 5703.50 | bwd_inner_microstep: 5690.58 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.92 [2025-04-26 00:56:12,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.51 | bwd: 5703.51 | bwd_inner: 5690.58 | bwd_allreduce: 12.89 | step: 18.92 17%|█▋ | 7027/41250 [16:58:37<82:12:38, 8.65s/it] {'loss': 0.1468, 'grad_norm': 1.1939469575881958, 'learning_rate': 3.796934392627192e-05, 'epoch': 1.7} 17%|█▋ | 7027/41250 [16:58:37<82:12:38, 8.65s/it][2025-04-26 00:56:21,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 00:56:21,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.59 | bwd_microstep: 
5739.78 | bwd_inner_microstep: 5710.34 | bwd_allreduce_microstep: 29.39 | step_microstep: 19.37 [2025-04-26 00:56:21,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.59 | bwd: 5739.79 | bwd_inner: 5710.34 | bwd_allreduce: 29.41 | step: 19.37 17%|█▋ | 7028/41250 [16:58:46<82:17:11, 8.66s/it] {'loss': 0.3679, 'grad_norm': 1.8687437772750854, 'learning_rate': 3.796865443426596e-05, 'epoch': 1.7} 17%|█▋ | 7028/41250 [16:58:46<82:17:11, 8.66s/it][2025-04-26 00:56:29,822] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 00:56:29,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.89 | bwd_microstep: 5776.02 | bwd_inner_microstep: 5634.32 | bwd_allreduce_microstep: 141.66 | step_microstep: 18.48 [2025-04-26 00:56:29,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.89 | bwd: 5776.03 | bwd_inner: 5634.32 | bwd_allreduce: 141.68 | step: 18.49 17%|█▋ | 7029/41250 [16:58:55<82:21:54, 8.66s/it] {'loss': 0.0601, 'grad_norm': 1.3135274648666382, 'learning_rate': 3.7967964831486784e-05, 'epoch': 1.7} 17%|█▋ | 7029/41250 [16:58:55<82:21:54, 8.66s/it][2025-04-26 00:56:38,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 00:56:38,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.02 | bwd_microstep: 5764.79 | bwd_inner_microstep: 5752.03 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.72 [2025-04-26 00:56:38,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.02 | bwd: 5764.80 | bwd_inner: 5752.03 | bwd_allreduce: 12.73 | step: 18.72 17%|█▋ | 7030/41250 [16:59:03<82:31:43, 8.68s/it] {'loss': 0.2223, 'grad_norm': 2.076925039291382, 'learning_rate': 3.796727511793864e-05, 'epoch': 1.7} 17%|█▋ | 7030/41250 [16:59:03<82:31:43, 8.68s/it][2025-04-26 00:56:47,173] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:56:47,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.64 | bwd_microstep: 5716.45 | bwd_inner_microstep: 5644.98 | bwd_allreduce_microstep: 71.42 | step_microstep: 18.47 [2025-04-26 00:56:47,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.64 | bwd: 5716.46 | bwd_inner: 5644.98 | bwd_allreduce: 71.44 | step: 18.48 17%|█▋ | 7031/41250 [16:59:12<82:22:15, 8.67s/it] {'loss': 0.3293, 'grad_norm': 2.8108103275299072, 'learning_rate': 3.796658529362578e-05, 'epoch': 1.7} 17%|█▋ | 7031/41250 [16:59:12<82:22:15, 8.67s/it][2025-04-26 00:56:55,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:56:55,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2879.10 | bwd_microstep: 5774.55 | bwd_inner_microstep: 5761.98 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.57 [2025-04-26 00:56:55,911] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2879.10 | bwd: 5774.56 | bwd_inner: 5761.98 | bwd_allreduce: 12.55 | step: 18.57 17%|█▋ | 7032/41250 [16:59:21<82:34:10, 8.69s/it] {'loss': 0.1085, 'grad_norm': 1.1664676666259766, 'learning_rate': 3.7965895358552457e-05, 'epoch': 1.7} 17%|█▋ | 7032/41250 [16:59:21<82:34:10, 8.69s/it][2025-04-26 00:57:04,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-26 00:57:04,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.03 | bwd_microstep: 5725.46 | bwd_inner_microstep: 5649.13 | bwd_allreduce_microstep: 76.28 | step_microstep: 18.80 [2025-04-26 00:57:04,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.03 | bwd: 5725.48 | bwd_inner: 5649.13 | bwd_allreduce: 76.30 | step: 18.80 17%|█▋ | 
7033/41250 [16:59:29<82:25:04, 8.67s/it] {'loss': 0.0496, 'grad_norm': 0.5667318105697632, 'learning_rate': 3.796520531272292e-05, 'epoch': 1.7} 17%|█▋ | 7033/41250 [16:59:29<82:25:04, 8.67s/it][2025-04-26 00:57:13,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:57:13,310] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.41 | bwd_microstep: 5790.75 | bwd_inner_microstep: 5778.26 | bwd_allreduce_microstep: 12.45 | step_microstep: 18.52 [2025-04-26 00:57:13,310] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.41 | bwd: 5790.77 | bwd_inner: 5778.26 | bwd_allreduce: 12.46 | step: 18.52 17%|█▋ | 7034/41250 [16:59:38<82:41:02, 8.70s/it] {'loss': 0.1617, 'grad_norm': 1.2958424091339111, 'learning_rate': 3.796451515614142e-05, 'epoch': 1.71} 17%|█▋ | 7034/41250 [16:59:38<82:41:02, 8.70s/it][2025-04-26 00:57:22,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-26 00:57:22,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.04 | bwd_microstep: 5779.61 | bwd_inner_microstep: 5657.32 | bwd_allreduce_microstep: 122.25 | step_microstep: 19.00 [2025-04-26 00:57:22,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.04 | bwd: 5779.62 | bwd_inner: 5657.32 | bwd_allreduce: 122.26 | step: 19.01 17%|█▋ | 7035/41250 [16:59:47<82:40:23, 8.70s/it] {'loss': 0.0655, 'grad_norm': 1.3965504169464111, 'learning_rate': 3.796382488881223e-05, 'epoch': 1.71} 17%|█▋ | 7035/41250 [16:59:47<82:40:23, 8.70s/it][2025-04-26 00:57:30,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 00:57:30,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.36 | bwd_microstep: 5724.43 | bwd_inner_microstep: 5711.29 | 
bwd_allreduce_microstep: 13.08 | step_microstep: 19.27 [2025-04-26 00:57:30,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.36 | bwd: 5724.44 | bwd_inner: 5711.29 | bwd_allreduce: 13.11 | step: 19.28 17%|█▋ | 7036/41250 [16:59:55<82:34:15, 8.69s/it] {'loss': 0.1227, 'grad_norm': 2.257054328918457, 'learning_rate': 3.7963134510739585e-05, 'epoch': 1.71} 17%|█▋ | 7036/41250 [16:59:55<82:34:15, 8.69s/it][2025-04-26 00:57:39,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 00:57:39,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.04 | bwd_microstep: 5792.17 | bwd_inner_microstep: 5658.46 | bwd_allreduce_microstep: 133.66 | step_microstep: 18.85 [2025-04-26 00:57:39,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.04 | bwd: 5792.19 | bwd_inner: 5658.46 | bwd_allreduce: 133.68 | step: 18.85 17%|█▋ | 7037/41250 [17:00:04<82:37:23, 8.69s/it] {'loss': 0.2032, 'grad_norm': 1.2513153553009033, 'learning_rate': 3.796244402192775e-05, 'epoch': 1.71} 17%|█▋ | 7037/41250 [17:00:04<82:37:23, 8.69s/it][2025-04-26 00:57:48,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:57:48,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.57 | bwd_microstep: 5729.49 | bwd_inner_microstep: 5716.75 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.65 [2025-04-26 00:57:48,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.57 | bwd: 5729.50 | bwd_inner: 5716.75 | bwd_allreduce: 12.71 | step: 18.66 17%|█▋ | 7038/41250 [17:00:13<82:32:23, 8.69s/it] {'loss': 0.2095, 'grad_norm': 1.4077506065368652, 'learning_rate': 3.796175342238097e-05, 'epoch': 1.71} 17%|█▋ | 7038/41250 [17:00:13<82:32:23, 8.69s/it][2025-04-26 00:57:56,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:57:56,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.88 | bwd_microstep: 5729.06 | bwd_inner_microstep: 5655.06 | bwd_allreduce_microstep: 73.95 | step_microstep: 18.46 [2025-04-26 00:57:56,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.88 | bwd: 5729.07 | bwd_inner: 5655.06 | bwd_allreduce: 73.97 | step: 18.46 17%|█▋ | 7039/41250 [17:00:22<82:24:14, 8.67s/it] {'loss': 0.1809, 'grad_norm': 2.704439640045166, 'learning_rate': 3.7961062712103534e-05, 'epoch': 1.71} 17%|█▋ | 7039/41250 [17:00:22<82:24:14, 8.67s/it][2025-04-26 00:58:05,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 00:58:05,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.01 | bwd_microstep: 5719.44 | bwd_inner_microstep: 5686.37 | bwd_allreduce_microstep: 33.03 | step_microstep: 18.31 [2025-04-26 00:58:05,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.01 | bwd: 5719.46 | bwd_inner: 5686.37 | bwd_allreduce: 33.05 | step: 18.31 17%|█▋ | 7040/41250 [17:00:30<82:20:06, 8.66s/it] {'loss': 0.1029, 'grad_norm': 1.3263033628463745, 'learning_rate': 3.796037189109967e-05, 'epoch': 1.71} 17%|█▋ | 7040/41250 [17:00:30<82:20:06, 8.66s/it][2025-04-26 00:58:13,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 00:58:13,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.75 | bwd_microstep: 5705.66 | bwd_inner_microstep: 5654.27 | bwd_allreduce_microstep: 51.34 | step_microstep: 18.56 [2025-04-26 00:58:13,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.75 | bwd: 5705.67 | bwd_inner: 5654.27 | bwd_allreduce: 51.36 | step: 18.57 17%|█▋ | 7041/41250 [17:00:39<82:11:39, 8.65s/it] 
{'loss': 0.0497, 'grad_norm': 1.1007457971572876, 'learning_rate': 3.795968095937365e-05, 'epoch': 1.71} 17%|█▋ | 7041/41250 [17:00:39<82:11:39, 8.65s/it][2025-04-26 00:58:22,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 00:58:22,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.79 | bwd_microstep: 5756.37 | bwd_inner_microstep: 5714.16 | bwd_allreduce_microstep: 42.16 | step_microstep: 18.39 [2025-04-26 00:58:22,633] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.79 | bwd: 5756.39 | bwd_inner: 5714.16 | bwd_allreduce: 42.18 | step: 18.41 17%|█▋ | 7042/41250 [17:00:47<82:17:58, 8.66s/it] {'loss': 0.1179, 'grad_norm': 1.6440733671188354, 'learning_rate': 3.795898991692972e-05, 'epoch': 1.71} 17%|█▋ | 7042/41250 [17:00:47<82:17:58, 8.66s/it][2025-04-26 00:58:31,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.94 | optimizer_gradients: 1.00 | optimizer_step: 1.11 [2025-04-26 00:58:31,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.49 | bwd_microstep: 5719.44 | bwd_inner_microstep: 5706.71 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.84 [2025-04-26 00:58:31,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.49 | bwd: 5719.46 | bwd_inner: 5706.71 | bwd_allreduce: 12.71 | step: 18.85 17%|█▋ | 7043/41250 [17:00:56<82:16:39, 8.66s/it] {'loss': 0.0416, 'grad_norm': 0.5681148171424866, 'learning_rate': 3.795829876377215e-05, 'epoch': 1.71} 17%|█▋ | 7043/41250 [17:00:56<82:16:39, 8.66s/it][2025-04-26 00:58:39,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 00:58:39,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.41 | bwd_microstep: 5751.24 | bwd_inner_microstep: 5699.43 | bwd_allreduce_microstep: 51.77 | 
step_microstep: 18.59 [2025-04-26 00:58:39,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.41 | bwd: 5751.25 | bwd_inner: 5699.43 | bwd_allreduce: 51.78 | step: 18.59 17%|█▋ | 7044/41250 [17:01:05<82:20:52, 8.67s/it] {'loss': 0.3254, 'grad_norm': 1.671541452407837, 'learning_rate': 3.79576074999052e-05, 'epoch': 1.71} 17%|█▋ | 7044/41250 [17:01:05<82:20:52, 8.67s/it][2025-04-26 00:58:48,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 00:58:48,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.58 | bwd_microstep: 5786.27 | bwd_inner_microstep: 5773.63 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.49 [2025-04-26 00:58:48,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.58 | bwd: 5786.28 | bwd_inner: 5773.63 | bwd_allreduce: 12.61 | step: 18.49 17%|█▋ | 7045/41250 [17:01:14<82:36:55, 8.70s/it] {'loss': 0.0598, 'grad_norm': 0.9211239218711853, 'learning_rate': 3.795691612533314e-05, 'epoch': 1.71} 17%|█▋ | 7045/41250 [17:01:14<82:36:55, 8.70s/it][2025-04-26 00:58:57,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:58:57,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.38 | bwd_microstep: 5694.22 | bwd_inner_microstep: 5681.59 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.74 [2025-04-26 00:58:57,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.38 | bwd: 5694.24 | bwd_inner: 5681.59 | bwd_allreduce: 12.61 | step: 18.74 17%|█▋ | 7046/41250 [17:01:22<82:23:52, 8.67s/it] {'loss': 0.1605, 'grad_norm': 2.265822410583496, 'learning_rate': 3.7956224640060216e-05, 'epoch': 1.71} 17%|█▋ | 7046/41250 [17:01:22<82:23:52, 8.67s/it][2025-04-26 00:59:05,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | 
optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 00:59:05,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.13 | bwd_microstep: 5710.27 | bwd_inner_microstep: 5697.51 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.89 [2025-04-26 00:59:05,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.13 | bwd: 5710.29 | bwd_inner: 5697.51 | bwd_allreduce: 12.74 | step: 18.90 17%|█▋ | 7047/41250 [17:01:31<82:18:16, 8.66s/it] {'loss': 0.3021, 'grad_norm': 1.9999219179153442, 'learning_rate': 3.7955533044090704e-05, 'epoch': 1.71} 17%|█▋ | 7047/41250 [17:01:31<82:18:16, 8.66s/it][2025-04-26 00:59:14,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 00:59:14,623] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.72 | bwd_microstep: 5723.79 | bwd_inner_microstep: 5650.84 | bwd_allreduce_microstep: 72.91 | step_microstep: 18.30 [2025-04-26 00:59:14,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.72 | bwd: 5723.80 | bwd_inner: 5650.84 | bwd_allreduce: 72.92 | step: 18.30 17%|█▋ | 7048/41250 [17:01:39<82:12:22, 8.65s/it] {'loss': 0.2233, 'grad_norm': 2.030385971069336, 'learning_rate': 3.7954841337428866e-05, 'epoch': 1.71} 17%|█▋ | 7048/41250 [17:01:39<82:12:22, 8.65s/it][2025-04-26 00:59:23,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.03 | optimizer_step: 1.06 [2025-04-26 00:59:23,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.83 | bwd_microstep: 5728.05 | bwd_inner_microstep: 5698.24 | bwd_allreduce_microstep: 29.76 | step_microstep: 18.59 [2025-04-26 00:59:23,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.83 | bwd: 5728.06 | bwd_inner: 5698.24 | bwd_allreduce: 29.78 | step: 18.59 17%|█▋ | 7049/41250 [17:01:48<82:13:46, 8.66s/it] {'loss': 0.3644, 'grad_norm': 
2.9135279655456543, 'learning_rate': 3.795414952007895e-05, 'epoch': 1.71} 17%|█▋ | 7049/41250 [17:01:48<82:13:46, 8.66s/it][2025-04-26 00:59:31,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 00:59:31,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.07 | bwd_microstep: 5778.81 | bwd_inner_microstep: 5652.66 | bwd_allreduce_microstep: 126.10 | step_microstep: 19.12 [2025-04-26 00:59:31,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.07 | bwd: 5778.83 | bwd_inner: 5652.66 | bwd_allreduce: 126.12 | step: 19.12 17%|█▋ | 7050/41250 [17:01:57<82:19:08, 8.67s/it] {'loss': 0.2092, 'grad_norm': 1.9865286350250244, 'learning_rate': 3.795345759204525e-05, 'epoch': 1.71} 17%|█▋ | 7050/41250 [17:01:57<82:19:08, 8.67s/it][2025-04-26 00:59:40,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.32 | optimizer_step: 1.04 [2025-04-26 00:59:40,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.67 | bwd_microstep: 5688.39 | bwd_inner_microstep: 5647.48 | bwd_allreduce_microstep: 40.86 | step_microstep: 19.98 [2025-04-26 00:59:40,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.67 | bwd: 5688.41 | bwd_inner: 5647.48 | bwd_allreduce: 40.88 | step: 19.98 17%|█▋ | 7051/41250 [17:02:05<82:06:50, 8.64s/it] {'loss': 0.3999, 'grad_norm': 3.66530442237854, 'learning_rate': 3.7952765553332e-05, 'epoch': 1.71} 17%|█▋ | 7051/41250 [17:02:05<82:06:50, 8.64s/it][2025-04-26 00:59:49,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.25 | optimizer_step: 0.90 [2025-04-26 00:59:49,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.66 | bwd_microstep: 5845.78 | bwd_inner_microstep: 5691.39 | bwd_allreduce_microstep: 154.33 | step_microstep: 19.04 [2025-04-26 
00:59:49,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.66 | bwd: 5845.79 | bwd_inner: 5691.39 | bwd_allreduce: 154.35 | step: 19.04 17%|█▋ | 7052/41250 [17:02:14<82:29:59, 8.68s/it] {'loss': 0.1811, 'grad_norm': 2.2829110622406006, 'learning_rate': 3.795207340394349e-05, 'epoch': 1.71} 17%|█▋ | 7052/41250 [17:02:14<82:29:59, 8.68s/it][2025-04-26 00:59:58,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.26 | optimizer_step: 0.90 [2025-04-26 00:59:58,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.68 | bwd_microstep: 5788.33 | bwd_inner_microstep: 5775.36 | bwd_allreduce_microstep: 12.93 | step_microstep: 19.47 [2025-04-26 00:59:58,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.68 | bwd: 5788.35 | bwd_inner: 5775.36 | bwd_allreduce: 12.95 | step: 19.47 17%|█▋ | 7053/41250 [17:02:23<82:43:57, 8.71s/it] {'loss': 0.0861, 'grad_norm': 1.3033877611160278, 'learning_rate': 3.7951381143883974e-05, 'epoch': 1.71} 17%|█▋ | 7053/41250 [17:02:23<82:43:57, 8.71s/it][2025-04-26 01:00:06,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.27 | optimizer_step: 0.90 [2025-04-26 01:00:06,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.93 | bwd_microstep: 5738.61 | bwd_inner_microstep: 5648.59 | bwd_allreduce_microstep: 89.97 | step_microstep: 19.36 [2025-04-26 01:00:06,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.93 | bwd: 5738.63 | bwd_inner: 5648.59 | bwd_allreduce: 89.99 | step: 19.37 17%|█▋ | 7054/41250 [17:02:32<82:33:40, 8.69s/it] {'loss': 0.3598, 'grad_norm': 3.472356081008911, 'learning_rate': 3.795068877315773e-05, 'epoch': 1.71} 17%|█▋ | 7054/41250 [17:02:32<82:33:40, 8.69s/it][2025-04-26 01:00:15,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.19 | optimizer_step: 
0.91 [2025-04-26 01:00:15,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2875.06 | bwd_microstep: 5765.73 | bwd_inner_microstep: 5752.79 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.25 [2025-04-26 01:00:15,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2875.06 | bwd: 5765.75 | bwd_inner: 5752.79 | bwd_allreduce: 12.91 | step: 19.27 17%|█▋ | 7055/41250 [17:02:40<82:39:22, 8.70s/it] {'loss': 0.0801, 'grad_norm': 0.7055622339248657, 'learning_rate': 3.794999629176902e-05, 'epoch': 1.71} 17%|█▋ | 7055/41250 [17:02:40<82:39:22, 8.70s/it][2025-04-26 01:00:24,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.08 | optimizer_step: 0.90 [2025-04-26 01:00:24,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2928.01 | bwd_microstep: 5871.81 | bwd_inner_microstep: 5858.61 | bwd_allreduce_microstep: 13.13 | step_microstep: 19.33 [2025-04-26 01:00:24,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2928.01 | bwd: 5871.82 | bwd_inner: 5858.61 | bwd_allreduce: 13.16 | step: 19.34 17%|█▋ | 7056/41250 [17:02:49<83:10:30, 8.76s/it] {'loss': 0.0905, 'grad_norm': 1.0514405965805054, 'learning_rate': 3.794930369972211e-05, 'epoch': 1.71} 17%|█▋ | 7056/41250 [17:02:49<83:10:30, 8.76s/it][2025-04-26 01:00:33,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 01:00:33,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.84 | bwd_microstep: 5734.00 | bwd_inner_microstep: 5645.88 | bwd_allreduce_microstep: 88.07 | step_microstep: 18.96 [2025-04-26 01:00:33,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.84 | bwd: 5734.02 | bwd_inner: 5645.88 | bwd_allreduce: 88.09 | step: 18.96 17%|█▋ | 7057/41250 [17:02:58<82:52:54, 8.73s/it] {'loss': 0.0354, 'grad_norm': 0.6883544325828552, 'learning_rate': 
3.794861099702127e-05, 'epoch': 1.71} 17%|█▋ | 7057/41250 [17:02:58<82:52:54, 8.73s/it][2025-04-26 01:00:41,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.32 | optimizer_step: 0.99 [2025-04-26 01:00:41,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.65 | bwd_microstep: 5743.57 | bwd_inner_microstep: 5646.26 | bwd_allreduce_microstep: 97.25 | step_microstep: 19.96 [2025-04-26 01:00:41,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.65 | bwd: 5743.58 | bwd_inner: 5646.26 | bwd_allreduce: 97.28 | step: 19.96 17%|█▋ | 7058/41250 [17:03:07<82:40:13, 8.70s/it] {'loss': 0.1974, 'grad_norm': 2.3389463424682617, 'learning_rate': 3.794791818367077e-05, 'epoch': 1.71} 17%|█▋ | 7058/41250 [17:03:07<82:40:13, 8.70s/it][2025-04-26 01:00:50,256] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:00:50,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.38 | bwd_microstep: 5673.28 | bwd_inner_microstep: 5640.25 | bwd_allreduce_microstep: 32.98 | step_microstep: 18.48 [2025-04-26 01:00:50,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.38 | bwd: 5673.29 | bwd_inner: 5640.25 | bwd_allreduce: 33.00 | step: 18.49 17%|█▋ | 7059/41250 [17:03:15<82:17:55, 8.67s/it] {'loss': 0.0899, 'grad_norm': 1.1894468069076538, 'learning_rate': 3.7947225259674894e-05, 'epoch': 1.71} 17%|█▋ | 7059/41250 [17:03:15<82:17:55, 8.67s/it][2025-04-26 01:00:58,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 01:00:58,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.38 | bwd_microstep: 5768.28 | bwd_inner_microstep: 5755.59 | bwd_allreduce_microstep: 12.65 | step_microstep: 19.04 [2025-04-26 01:00:58,991] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.38 | bwd: 5768.30 | bwd_inner: 5755.59 | bwd_allreduce: 12.67 | step: 19.05 17%|█▋ | 7060/41250 [17:03:24<82:29:34, 8.69s/it] {'loss': 0.1762, 'grad_norm': 1.5145111083984375, 'learning_rate': 3.7946532225037905e-05, 'epoch': 1.71} 17%|█▋ | 7060/41250 [17:03:24<82:29:34, 8.69s/it][2025-04-26 01:01:07,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:01:07,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.66 | bwd_microstep: 5694.96 | bwd_inner_microstep: 5643.79 | bwd_allreduce_microstep: 51.12 | step_microstep: 18.76 [2025-04-26 01:01:07,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.66 | bwd: 5694.97 | bwd_inner: 5643.79 | bwd_allreduce: 51.14 | step: 18.77 17%|█▋ | 7061/41250 [17:03:32<82:15:02, 8.66s/it] {'loss': 0.2123, 'grad_norm': 2.537604570388794, 'learning_rate': 3.7945839079764075e-05, 'epoch': 1.71} 17%|█▋ | 7061/41250 [17:03:32<82:15:02, 8.66s/it][2025-04-26 01:01:16,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:01:16,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.71 | bwd_microstep: 5718.67 | bwd_inner_microstep: 5705.87 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.59 [2025-04-26 01:01:16,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.71 | bwd: 5718.68 | bwd_inner: 5705.87 | bwd_allreduce: 12.77 | step: 18.60 17%|█▋ | 7062/41250 [17:03:41<82:13:04, 8.66s/it] {'loss': 0.1366, 'grad_norm': 1.2553917169570923, 'learning_rate': 3.794514582385767e-05, 'epoch': 1.71} 17%|█▋ | 7062/41250 [17:03:41<82:13:04, 8.66s/it][2025-04-26 01:01:24,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 
01:01:24,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.38 | bwd_microstep: 5780.71 | bwd_inner_microstep: 5637.01 | bwd_allreduce_microstep: 143.65 | step_microstep: 18.33 [2025-04-26 01:01:24,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.38 | bwd: 5780.72 | bwd_inner: 5637.01 | bwd_allreduce: 143.67 | step: 18.34 17%|█▋ | 7063/41250 [17:03:50<82:16:38, 8.66s/it] {'loss': 0.1658, 'grad_norm': 1.2543373107910156, 'learning_rate': 3.794445245732297e-05, 'epoch': 1.71} 17%|█▋ | 7063/41250 [17:03:50<82:16:38, 8.66s/it][2025-04-26 01:01:33,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:01:33,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.49 | bwd_microstep: 5747.61 | bwd_inner_microstep: 5651.18 | bwd_allreduce_microstep: 96.39 | step_microstep: 18.11 [2025-04-26 01:01:33,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.49 | bwd: 5747.62 | bwd_inner: 5651.18 | bwd_allreduce: 96.40 | step: 18.11 17%|█▋ | 7064/41250 [17:03:58<82:14:34, 8.66s/it] {'loss': 0.2393, 'grad_norm': 5.216795444488525, 'learning_rate': 3.794375898016426e-05, 'epoch': 1.71} 17%|█▋ | 7064/41250 [17:03:58<82:14:34, 8.66s/it][2025-04-26 01:01:42,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 1.04 [2025-04-26 01:01:42,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.81 | bwd_microstep: 5700.22 | bwd_inner_microstep: 5663.86 | bwd_allreduce_microstep: 36.32 | step_microstep: 18.54 [2025-04-26 01:01:42,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.81 | bwd: 5700.23 | bwd_inner: 5663.86 | bwd_allreduce: 36.33 | step: 18.54 17%|█▋ | 7065/41250 [17:04:07<82:05:28, 8.64s/it] {'loss': 0.1243, 'grad_norm': 1.6073555946350098, 'learning_rate': 3.79430653923858e-05, 
'epoch': 1.71} 17%|█▋ | 7065/41250 [17:04:07<82:05:28, 8.64s/it][2025-04-26 01:01:50,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 01:01:50,864] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.74 | bwd_microstep: 5772.69 | bwd_inner_microstep: 5644.67 | bwd_allreduce_microstep: 127.98 | step_microstep: 18.28 [2025-04-26 01:01:50,865] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.74 | bwd: 5772.70 | bwd_inner: 5644.67 | bwd_allreduce: 127.99 | step: 18.28 17%|█▋ | 7066/41250 [17:04:16<82:11:23, 8.66s/it] {'loss': 0.2138, 'grad_norm': 5.101917266845703, 'learning_rate': 3.7942371693991866e-05, 'epoch': 1.71} 17%|█▋ | 7066/41250 [17:04:16<82:11:23, 8.66s/it][2025-04-26 01:01:59,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.96 | optimizer_step: 1.05 [2025-04-26 01:01:59,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2870.91 | bwd_microstep: 5719.85 | bwd_inner_microstep: 5707.26 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.23 [2025-04-26 01:01:59,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2870.91 | bwd: 5719.86 | bwd_inner: 5707.26 | bwd_allreduce: 12.55 | step: 18.23 17%|█▋ | 7067/41250 [17:04:24<82:14:27, 8.66s/it] {'loss': 0.2515, 'grad_norm': 2.894940137863159, 'learning_rate': 3.7941677884986745e-05, 'epoch': 1.71} 17%|█▋ | 7067/41250 [17:04:24<82:14:27, 8.66s/it][2025-04-26 01:02:08,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-26 01:02:08,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.50 | bwd_microstep: 5699.91 | bwd_inner_microstep: 5657.88 | bwd_allreduce_microstep: 41.98 | step_microstep: 18.94 [2025-04-26 01:02:08,150] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2826.50 | bwd: 5699.92 | bwd_inner: 5657.88 | bwd_allreduce: 42.00 | step: 18.94 17%|█▋ | 7068/41250 [17:04:33<82:05:48, 8.65s/it] {'loss': 0.0955, 'grad_norm': 1.8760932683944702, 'learning_rate': 3.7940983965374714e-05, 'epoch': 1.71} 17%|█▋ | 7068/41250 [17:04:33<82:05:48, 8.65s/it][2025-04-26 01:02:16,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-26 01:02:16,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.69 | bwd_microstep: 5691.27 | bwd_inner_microstep: 5652.63 | bwd_allreduce_microstep: 38.59 | step_microstep: 18.97 [2025-04-26 01:02:16,751] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.69 | bwd: 5691.28 | bwd_inner: 5652.63 | bwd_allreduce: 38.60 | step: 18.98 17%|█▋ | 7069/41250 [17:04:42<81:57:57, 8.63s/it] {'loss': 0.3147, 'grad_norm': 3.034046173095703, 'learning_rate': 3.7940289935160034e-05, 'epoch': 1.71} 17%|█▋ | 7069/41250 [17:04:42<81:57:57, 8.63s/it][2025-04-26 01:02:25,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.25 | optimizer_step: 0.96 [2025-04-26 01:02:25,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.42 | bwd_microstep: 6004.19 | bwd_inner_microstep: 5656.91 | bwd_allreduce_microstep: 347.23 | step_microstep: 19.67 [2025-04-26 01:02:25,680] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.43 | bwd: 6004.21 | bwd_inner: 5656.91 | bwd_allreduce: 347.25 | step: 19.68 17%|█▋ | 7070/41250 [17:04:51<82:48:36, 8.72s/it] {'loss': 0.0458, 'grad_norm': 0.7000483274459839, 'learning_rate': 3.7939595794347e-05, 'epoch': 1.71} 17%|█▋ | 7070/41250 [17:04:51<82:48:36, 8.72s/it][2025-04-26 01:02:34,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:02:34,287] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2828.40 | bwd_microstep: 5694.85 | bwd_inner_microstep: 5653.47 | bwd_allreduce_microstep: 41.33 | step_microstep: 18.80 [2025-04-26 01:02:34,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.40 | bwd: 5694.86 | bwd_inner: 5653.47 | bwd_allreduce: 41.35 | step: 18.80 17%|█▋ | 7071/41250 [17:04:59<82:28:37, 8.69s/it] {'loss': 0.2367, 'grad_norm': 1.609775185585022, 'learning_rate': 3.7938901542939886e-05, 'epoch': 1.71} 17%|█▋ | 7071/41250 [17:04:59<82:28:37, 8.69s/it][2025-04-26 01:02:42,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 01:02:42,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.64 | bwd_microstep: 5784.55 | bwd_inner_microstep: 5654.34 | bwd_allreduce_microstep: 130.17 | step_microstep: 18.35 [2025-04-26 01:02:42,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.64 | bwd: 5784.57 | bwd_inner: 5654.34 | bwd_allreduce: 130.19 | step: 18.35 17%|█▋ | 7072/41250 [17:05:08<82:30:59, 8.69s/it] {'loss': 0.1252, 'grad_norm': 0.8820052742958069, 'learning_rate': 3.7938207180942976e-05, 'epoch': 1.71} 17%|█▋ | 7072/41250 [17:05:08<82:30:59, 8.69s/it][2025-04-26 01:02:51,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:02:51,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.73 | bwd_microstep: 5776.23 | bwd_inner_microstep: 5688.39 | bwd_allreduce_microstep: 87.78 | step_microstep: 18.70 [2025-04-26 01:02:51,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.73 | bwd: 5776.25 | bwd_inner: 5688.39 | bwd_allreduce: 87.81 | step: 18.70 17%|█▋ | 7073/41250 [17:05:17<82:33:49, 8.70s/it] {'loss': 0.258, 'grad_norm': 2.389002799987793, 'learning_rate': 3.793751270836055e-05, 'epoch': 1.71} 17%|█▋ | 7073/41250 
[17:05:17<82:33:49, 8.70s/it][2025-04-26 01:03:00,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 01:03:00,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.18 | bwd_microstep: 5764.58 | bwd_inner_microstep: 5663.05 | bwd_allreduce_microstep: 101.48 | step_microstep: 18.92 [2025-04-26 01:03:00,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.18 | bwd: 5764.59 | bwd_inner: 5663.05 | bwd_allreduce: 101.49 | step: 18.92 17%|█▋ | 7074/41250 [17:05:25<82:29:58, 8.69s/it] {'loss': 0.0502, 'grad_norm': 0.7934807538986206, 'learning_rate': 3.793681812519688e-05, 'epoch': 1.71} 17%|█▋ | 7074/41250 [17:05:25<82:29:58, 8.69s/it][2025-04-26 01:03:09,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.02 | optimizer_step: 1.08 [2025-04-26 01:03:09,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.15 | bwd_microstep: 6024.82 | bwd_inner_microstep: 5702.41 | bwd_allreduce_microstep: 322.36 | step_microstep: 18.94 [2025-04-26 01:03:09,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.15 | bwd: 6024.84 | bwd_inner: 5702.22 | bwd_allreduce: 322.38 | step: 18.94 17%|█▋ | 7075/41250 [17:05:34<83:16:06, 8.77s/it] {'loss': 0.2137, 'grad_norm': 3.1725993156433105, 'learning_rate': 3.793612343145625e-05, 'epoch': 1.72} 17%|█▋ | 7075/41250 [17:05:34<83:16:06, 8.77s/it][2025-04-26 01:03:18,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:03:18,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.66 | bwd_microstep: 5760.41 | bwd_inner_microstep: 5689.22 | bwd_allreduce_microstep: 71.15 | step_microstep: 18.78 [2025-04-26 01:03:18,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.66 | bwd: 5760.42 
| bwd_inner: 5689.22 | bwd_allreduce: 71.16 | step: 18.78 17%|█▋ | 7076/41250 [17:05:43<83:02:53, 8.75s/it] {'loss': 0.0818, 'grad_norm': 1.323919653892517, 'learning_rate': 3.7935428627142956e-05, 'epoch': 1.72} 17%|█▋ | 7076/41250 [17:05:43<83:02:53, 8.75s/it][2025-04-26 01:03:26,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:03:26,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.16 | bwd_microstep: 5767.56 | bwd_inner_microstep: 5707.79 | bwd_allreduce_microstep: 59.73 | step_microstep: 18.47 [2025-04-26 01:03:26,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.16 | bwd: 5767.57 | bwd_inner: 5707.79 | bwd_allreduce: 59.74 | step: 18.47 17%|█▋ | 7077/41250 [17:05:52<82:53:43, 8.73s/it] {'loss': 0.1245, 'grad_norm': 2.420724868774414, 'learning_rate': 3.793473371226126e-05, 'epoch': 1.72} 17%|█▋ | 7077/41250 [17:05:52<82:53:43, 8.73s/it][2025-04-26 01:03:35,342] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-26 01:03:35,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.53 | bwd_microstep: 5707.03 | bwd_inner_microstep: 5651.81 | bwd_allreduce_microstep: 55.18 | step_microstep: 18.69 [2025-04-26 01:03:35,343] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.53 | bwd: 5707.04 | bwd_inner: 5651.81 | bwd_allreduce: 55.19 | step: 18.70 17%|█▋ | 7078/41250 [17:06:00<82:34:02, 8.70s/it] {'loss': 0.1775, 'grad_norm': 1.8409175872802734, 'learning_rate': 3.793403868681547e-05, 'epoch': 1.72} 17%|█▋ | 7078/41250 [17:06:00<82:34:02, 8.70s/it][mov,mp4,m4a,3gp,3g2,mj2 @ 0x29f396c0] moov atom not found [01:03:35] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Allvideos/Animate/00777.mp4, Invalid data found when processing input petrel_client is not 
installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00777.mp4... sharegpt4v_instruct_gpt4-vision_cap100k Traceback (most recent call last): File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__ ret = self.video_get_item(data_item) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item sampled_video = self.load_video_fast(video_path) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast video_reader = VideoReader(video_path) File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__ raise RuntimeError("Error reading " + uri + "...") RuntimeError: Error reading /home/wangjiarui/AIGV6K/Allvideos/Animate/00777.mp4... [2025-04-26 01:03:44,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-26 01:03:44,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.67 | bwd_microstep: 5795.98 | bwd_inner_microstep: 5665.45 | bwd_allreduce_microstep: 130.48 | step_microstep: 18.83 [2025-04-26 01:03:44,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.67 | bwd: 5795.99 | bwd_inner: 5665.45 | bwd_allreduce: 130.50 | step: 18.83 17%|█▋ | 7079/41250 [17:06:09<82:36:30, 8.70s/it] {'loss': 0.0755, 'grad_norm': 1.182334303855896, 'learning_rate': 3.7933343550809855e-05, 'epoch': 1.72} 17%|█▋ | 7079/41250 [17:06:09<82:36:30, 8.70s/it][2025-04-26 01:03:52,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:03:52,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.63 | bwd_microstep: 5725.91 | bwd_inner_microstep: 5713.18 | 
bwd_allreduce_microstep: 12.68 | step_microstep: 18.79 [2025-04-26 01:03:52,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.63 | bwd: 5725.93 | bwd_inner: 5713.18 | bwd_allreduce: 12.70 | step: 18.79 17%|█▋ | 7080/41250 [17:06:18<82:29:52, 8.69s/it] {'loss': 0.2987, 'grad_norm': 2.960986375808716, 'learning_rate': 3.7932648304248707e-05, 'epoch': 1.72} 17%|█▋ | 7080/41250 [17:06:18<82:29:52, 8.69s/it][2025-04-26 01:04:01,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:04:01,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.08 | bwd_microstep: 5800.40 | bwd_inner_microstep: 5787.91 | bwd_allreduce_microstep: 12.45 | step_microstep: 18.90 [2025-04-26 01:04:01,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.08 | bwd: 5800.42 | bwd_inner: 5787.91 | bwd_allreduce: 12.46 | step: 18.90 17%|█▋ | 7081/41250 [17:06:26<82:44:07, 8.72s/it] {'loss': 0.0888, 'grad_norm': 1.0106022357940674, 'learning_rate': 3.793195294713631e-05, 'epoch': 1.72} 17%|█▋ | 7081/41250 [17:06:26<82:44:07, 8.72s/it][2025-04-26 01:04:10,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:04:10,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.40 | bwd_microstep: 5787.19 | bwd_inner_microstep: 5653.54 | bwd_allreduce_microstep: 133.60 | step_microstep: 18.63 [2025-04-26 01:04:10,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.40 | bwd: 5787.20 | bwd_inner: 5653.54 | bwd_allreduce: 133.61 | step: 18.63 17%|█▋ | 7082/41250 [17:06:35<82:39:46, 8.71s/it] {'loss': 0.1771, 'grad_norm': 1.4071922302246094, 'learning_rate': 3.793125747947694e-05, 'epoch': 1.72} 17%|█▋ | 7082/41250 [17:06:35<82:39:46, 8.71s/it][2025-04-26 01:04:18,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.42 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-26 01:04:18,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.04 | bwd_microstep: 5786.52 | bwd_inner_microstep: 5689.08 | bwd_allreduce_microstep: 97.40 | step_microstep: 19.44 [2025-04-26 01:04:18,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.04 | bwd: 5786.54 | bwd_inner: 5689.08 | bwd_allreduce: 97.41 | step: 19.44 17%|█▋ | 7083/41250 [17:06:44<82:41:17, 8.71s/it] {'loss': 0.1211, 'grad_norm': 2.4090213775634766, 'learning_rate': 3.79305619012749e-05, 'epoch': 1.72} 17%|█▋ | 7083/41250 [17:06:44<82:41:17, 8.71s/it][2025-04-26 01:04:27,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:04:27,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.32 | bwd_microstep: 5712.52 | bwd_inner_microstep: 5699.76 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.87 [2025-04-26 01:04:27,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.32 | bwd: 5712.54 | bwd_inner: 5699.76 | bwd_allreduce: 12.73 | step: 18.87 17%|█▋ | 7084/41250 [17:06:52<82:30:34, 8.69s/it] {'loss': 0.2347, 'grad_norm': 1.359980583190918, 'learning_rate': 3.792986621253448e-05, 'epoch': 1.72} 17%|█▋ | 7084/41250 [17:06:52<82:30:34, 8.69s/it][2025-04-26 01:04:36,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.05 | optimizer_step: 1.00 [2025-04-26 01:04:36,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.80 | bwd_microstep: 5765.01 | bwd_inner_microstep: 5644.17 | bwd_allreduce_microstep: 120.79 | step_microstep: 19.22 [2025-04-26 01:04:36,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.80 | bwd: 5765.02 | bwd_inner: 5644.17 | bwd_allreduce: 120.81 | step: 19.22 17%|█▋ | 7085/41250 [17:07:01<82:28:04, 8.69s/it] 
{'loss': 0.0515, 'grad_norm': 0.49186521768569946, 'learning_rate': 3.7929170413259956e-05, 'epoch': 1.72} 17%|█▋ | 7085/41250 [17:07:01<82:28:04, 8.69s/it][2025-04-26 01:04:44,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:04:44,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.63 | bwd_microstep: 5782.09 | bwd_inner_microstep: 5645.41 | bwd_allreduce_microstep: 136.63 | step_microstep: 18.86 [2025-04-26 01:04:44,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.63 | bwd: 5782.10 | bwd_inner: 5645.41 | bwd_allreduce: 136.65 | step: 18.86 17%|█▋ | 7086/41250 [17:07:10<82:28:08, 8.69s/it] {'loss': 0.1296, 'grad_norm': 1.2118732929229736, 'learning_rate': 3.792847450345562e-05, 'epoch': 1.72} 17%|█▋ | 7086/41250 [17:07:10<82:28:08, 8.69s/it][2025-04-26 01:04:53,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:04:53,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2903.01 | bwd_microstep: 5791.20 | bwd_inner_microstep: 5778.19 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.98 [2025-04-26 01:04:53,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2903.01 | bwd: 5791.21 | bwd_inner: 5778.19 | bwd_allreduce: 12.97 | step: 18.98 17%|█▋ | 7087/41250 [17:07:19<82:43:01, 8.72s/it] {'loss': 0.1274, 'grad_norm': 1.1806210279464722, 'learning_rate': 3.792777848312577e-05, 'epoch': 1.72} 17%|█▋ | 7087/41250 [17:07:19<82:43:01, 8.72s/it][2025-04-26 01:05:02,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 1.18 [2025-04-26 01:05:02,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.40 | bwd_microstep: 5700.51 | bwd_inner_microstep: 5658.04 | bwd_allreduce_microstep: 42.42 | 
step_microstep: 19.37 [2025-04-26 01:05:02,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.40 | bwd: 5700.53 | bwd_inner: 5658.04 | bwd_allreduce: 42.44 | step: 19.37 17%|█▋ | 7088/41250 [17:07:27<82:25:20, 8.69s/it] {'loss': 0.0683, 'grad_norm': 1.245516300201416, 'learning_rate': 3.792708235227469e-05, 'epoch': 1.72} 17%|█▋ | 7088/41250 [17:07:27<82:25:20, 8.69s/it][2025-04-26 01:05:10,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.17 | optimizer_step: 0.92 [2025-04-26 01:05:10,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.47 | bwd_microstep: 5704.59 | bwd_inner_microstep: 5662.65 | bwd_allreduce_microstep: 41.89 | step_microstep: 19.12 [2025-04-26 01:05:10,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.47 | bwd: 5704.60 | bwd_inner: 5662.65 | bwd_allreduce: 41.91 | step: 19.13 17%|█▋ | 7089/41250 [17:07:36<82:13:38, 8.67s/it] {'loss': 0.1245, 'grad_norm': 1.6877096891403198, 'learning_rate': 3.792638611090667e-05, 'epoch': 1.72} 17%|█▋ | 7089/41250 [17:07:36<82:13:38, 8.67s/it][2025-04-26 01:05:19,580] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:05:19,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.54 | bwd_microstep: 5710.30 | bwd_inner_microstep: 5697.45 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.86 [2025-04-26 01:05:19,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.54 | bwd: 5710.31 | bwd_inner: 5697.45 | bwd_allreduce: 12.82 | step: 18.86 17%|█▋ | 7090/41250 [17:07:44<82:09:12, 8.66s/it] {'loss': 0.0933, 'grad_norm': 1.655260443687439, 'learning_rate': 3.792568975902601e-05, 'epoch': 1.72} 17%|█▋ | 7090/41250 [17:07:44<82:09:12, 8.66s/it][2025-04-26 01:05:28,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | 
optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 01:05:28,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.82 | bwd_microstep: 5701.67 | bwd_inner_microstep: 5654.58 | bwd_allreduce_microstep: 47.05 | step_microstep: 18.62 [2025-04-26 01:05:28,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.83 | bwd: 5701.69 | bwd_inner: 5654.58 | bwd_allreduce: 47.07 | step: 18.62 17%|█▋ | 7091/41250 [17:07:53<82:03:00, 8.65s/it] {'loss': 0.0942, 'grad_norm': 1.0173901319503784, 'learning_rate': 3.7924993296637e-05, 'epoch': 1.72} 17%|█▋ | 7091/41250 [17:07:53<82:03:00, 8.65s/it][2025-04-26 01:05:36,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 0.89 [2025-04-26 01:05:36,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.31 | bwd_microstep: 5764.60 | bwd_inner_microstep: 5686.56 | bwd_allreduce_microstep: 77.98 | step_microstep: 19.01 [2025-04-26 01:05:36,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.31 | bwd: 5764.62 | bwd_inner: 5686.56 | bwd_allreduce: 78.00 | step: 19.01 17%|█▋ | 7092/41250 [17:08:02<82:11:02, 8.66s/it] {'loss': 0.0355, 'grad_norm': 0.9444786906242371, 'learning_rate': 3.792429672374392e-05, 'epoch': 1.72} 17%|█▋ | 7092/41250 [17:08:02<82:11:02, 8.66s/it][2025-04-26 01:05:45,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-26 01:05:45,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.56 | bwd_microstep: 5873.60 | bwd_inner_microstep: 5693.68 | bwd_allreduce_microstep: 179.88 | step_microstep: 19.15 [2025-04-26 01:05:45,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.56 | bwd: 5873.61 | bwd_inner: 5693.68 | bwd_allreduce: 179.90 | step: 19.15 17%|█▋ | 7093/41250 [17:08:11<82:35:24, 8.70s/it] {'loss': 0.1096, 'grad_norm': 
1.1785496473312378, 'learning_rate': 3.792360004035109e-05, 'epoch': 1.72} 17%|█▋ | 7093/41250 [17:08:11<82:35:24, 8.70s/it][2025-04-26 01:05:54,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.25 | optimizer_step: 0.93 [2025-04-26 01:05:54,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.17 | bwd_microstep: 5685.99 | bwd_inner_microstep: 5649.01 | bwd_allreduce_microstep: 36.93 | step_microstep: 19.23 [2025-04-26 01:05:54,309] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.17 | bwd: 5686.01 | bwd_inner: 5649.01 | bwd_allreduce: 36.95 | step: 19.24 17%|█▋ | 7094/41250 [17:08:19<82:18:13, 8.67s/it] {'loss': 0.0953, 'grad_norm': 1.8708349466323853, 'learning_rate': 3.792290324646279e-05, 'epoch': 1.72} 17%|█▋ | 7094/41250 [17:08:19<82:18:13, 8.67s/it][2025-04-26 01:06:02,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:06:02,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.15 | bwd_microstep: 5708.44 | bwd_inner_microstep: 5642.34 | bwd_allreduce_microstep: 66.06 | step_microstep: 18.92 [2025-04-26 01:06:02,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.15 | bwd: 5708.45 | bwd_inner: 5642.34 | bwd_allreduce: 66.07 | step: 18.92 17%|█▋ | 7095/41250 [17:08:28<82:07:59, 8.66s/it] {'loss': 0.1084, 'grad_norm': 1.45547354221344, 'learning_rate': 3.792220634208331e-05, 'epoch': 1.72} 17%|█▋ | 7095/41250 [17:08:28<82:07:59, 8.66s/it][2025-04-26 01:06:11,601] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:06:11,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.14 | bwd_microstep: 5782.80 | bwd_inner_microstep: 5636.18 | bwd_allreduce_microstep: 146.57 | step_microstep: 18.88 [2025-04-26 
01:06:11,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.14 | bwd: 5782.82 | bwd_inner: 5636.18 | bwd_allreduce: 146.59 | step: 18.88 17%|█▋ | 7096/41250 [17:08:36<82:11:47, 8.66s/it] {'loss': 0.1764, 'grad_norm': 3.7820804119110107, 'learning_rate': 3.792150932721695e-05, 'epoch': 1.72} 17%|█▋ | 7096/41250 [17:08:36<82:11:47, 8.66s/it][2025-04-26 01:06:20,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.30 | optimizer_step: 1.04 [2025-04-26 01:06:20,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.25 | bwd_microstep: 5749.91 | bwd_inner_microstep: 5695.19 | bwd_allreduce_microstep: 54.65 | step_microstep: 20.15 [2025-04-26 01:06:20,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.25 | bwd: 5749.92 | bwd_inner: 5695.19 | bwd_allreduce: 54.68 | step: 20.15 17%|█▋ | 7097/41250 [17:08:45<82:14:12, 8.67s/it] {'loss': 0.113, 'grad_norm': 2.4533486366271973, 'learning_rate': 3.792081220186802e-05, 'epoch': 1.72} 17%|█▋ | 7097/41250 [17:08:45<82:14:12, 8.67s/it][2025-04-26 01:06:28,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-26 01:06:28,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.36 | bwd_microstep: 5705.58 | bwd_inner_microstep: 5692.89 | bwd_allreduce_microstep: 12.65 | step_microstep: 19.01 [2025-04-26 01:06:28,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.36 | bwd: 5705.59 | bwd_inner: 5692.89 | bwd_allreduce: 12.67 | step: 19.01 17%|█▋ | 7098/41250 [17:08:54<82:08:37, 8.66s/it] {'loss': 0.1752, 'grad_norm': 1.5292415618896484, 'learning_rate': 3.792011496604081e-05, 'epoch': 1.72} 17%|█▋ | 7098/41250 [17:08:54<82:08:37, 8.66s/it][2025-04-26 01:06:37,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-26 01:06:37,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.27 | bwd_microstep: 5877.61 | bwd_inner_microstep: 5650.55 | bwd_allreduce_microstep: 227.02 | step_microstep: 18.42 [2025-04-26 01:06:37,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.27 | bwd: 5877.62 | bwd_inner: 5650.55 | bwd_allreduce: 227.03 | step: 18.42 17%|█▋ | 7099/41250 [17:09:03<82:31:20, 8.70s/it] {'loss': 0.2143, 'grad_norm': 2.2278940677642822, 'learning_rate': 3.7919417619739606e-05, 'epoch': 1.72} 17%|█▋ | 7099/41250 [17:09:03<82:31:20, 8.70s/it][2025-04-26 01:06:46,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:06:46,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.00 | bwd_microstep: 5726.28 | bwd_inner_microstep: 5693.31 | bwd_allreduce_microstep: 32.92 | step_microstep: 18.48 [2025-04-26 01:06:46,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.00 | bwd: 5726.29 | bwd_inner: 5693.31 | bwd_allreduce: 32.94 | step: 18.48 17%|█▋ | 7100/41250 [17:09:11<82:26:17, 8.69s/it] {'loss': 0.1511, 'grad_norm': 1.2670996189117432, 'learning_rate': 3.7918720162968714e-05, 'epoch': 1.72} 17%|█▋ | 7100/41250 [17:09:11<82:26:17, 8.69s/it][2025-04-26 01:06:55,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:06:55,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.05 | bwd_microstep: 5711.11 | bwd_inner_microstep: 5667.68 | bwd_allreduce_microstep: 43.38 | step_microstep: 18.81 [2025-04-26 01:06:55,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.05 | bwd: 5711.13 | bwd_inner: 5667.68 | bwd_allreduce: 43.40 | step: 18.81 17%|█▋ | 7101/41250 [17:09:20<82:17:51, 8.68s/it] {'loss': 0.182, 'grad_norm': 2.755117893218994, 'learning_rate': 
3.7918022595732446e-05, 'epoch': 1.72} 17%|█▋ | 7101/41250 [17:09:20<82:17:51, 8.68s/it][2025-04-26 01:07:03,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-26 01:07:03,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.05 | bwd_microstep: 5781.34 | bwd_inner_microstep: 5627.83 | bwd_allreduce_microstep: 153.45 | step_microstep: 19.23 [2025-04-26 01:07:03,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5781.35 | bwd_inner: 5627.83 | bwd_allreduce: 153.48 | step: 19.23 17%|█▋ | 7102/41250 [17:09:29<82:20:01, 8.68s/it] {'loss': 0.0384, 'grad_norm': 0.563278079032898, 'learning_rate': 3.791732491803509e-05, 'epoch': 1.72} 17%|█▋ | 7102/41250 [17:09:29<82:20:01, 8.68s/it][2025-04-26 01:07:12,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 01:07:12,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.72 | bwd_microstep: 5703.54 | bwd_inner_microstep: 5619.93 | bwd_allreduce_microstep: 83.57 | step_microstep: 18.76 [2025-04-26 01:07:12,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.72 | bwd: 5703.56 | bwd_inner: 5619.93 | bwd_allreduce: 83.58 | step: 18.76 17%|█▋ | 7103/41250 [17:09:37<82:07:27, 8.66s/it] {'loss': 0.0838, 'grad_norm': 2.118283271789551, 'learning_rate': 3.7916627129880945e-05, 'epoch': 1.72} 17%|█▋ | 7103/41250 [17:09:37<82:07:27, 8.66s/it][2025-04-26 01:07:20,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 01:07:20,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.73 | bwd_microstep: 5692.47 | bwd_inner_microstep: 5679.54 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.17 [2025-04-26 01:07:20,943] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.73 | bwd: 5692.48 | bwd_inner: 5679.54 | bwd_allreduce: 12.90 | step: 19.17 17%|█▋ | 7104/41250 [17:09:46<82:01:08, 8.65s/it] {'loss': 0.0812, 'grad_norm': 1.0221014022827148, 'learning_rate': 3.791592923127433e-05, 'epoch': 1.72} 17%|█▋ | 7104/41250 [17:09:46<82:01:08, 8.65s/it][2025-04-26 01:07:29,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-26 01:07:29,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.22 | bwd_microstep: 5719.36 | bwd_inner_microstep: 5706.02 | bwd_allreduce_microstep: 13.29 | step_microstep: 18.68 [2025-04-26 01:07:29,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.22 | bwd: 5719.38 | bwd_inner: 5706.02 | bwd_allreduce: 13.31 | step: 18.68 17%|█▋ | 7105/41250 [17:09:54<82:00:29, 8.65s/it] {'loss': 0.2144, 'grad_norm': 3.1536166667938232, 'learning_rate': 3.7915231222219525e-05, 'epoch': 1.72} 17%|█▋ | 7105/41250 [17:09:54<82:00:29, 8.65s/it][2025-04-26 01:07:38,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-26 01:07:38,250] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.41 | bwd_microstep: 5754.32 | bwd_inner_microstep: 5642.11 | bwd_allreduce_microstep: 112.16 | step_microstep: 18.68 [2025-04-26 01:07:38,250] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.41 | bwd: 5754.34 | bwd_inner: 5642.11 | bwd_allreduce: 112.18 | step: 18.69 17%|█▋ | 7106/41250 [17:10:03<82:03:15, 8.65s/it] {'loss': 0.1195, 'grad_norm': 4.252877235412598, 'learning_rate': 3.7914533102720845e-05, 'epoch': 1.72} 17%|█▋ | 7106/41250 [17:10:03<82:03:15, 8.65s/it][2025-04-26 01:07:46,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-26 
01:07:46,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.56 | bwd_microstep: 5702.59 | bwd_inner_microstep: 5689.82 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.57 [2025-04-26 01:07:46,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.56 | bwd: 5702.60 | bwd_inner: 5689.82 | bwd_allreduce: 12.74 | step: 18.58 17%|█▋ | 7107/41250 [17:10:12<81:59:18, 8.64s/it] {'loss': 0.1944, 'grad_norm': 2.699481248855591, 'learning_rate': 3.7913834872782595e-05, 'epoch': 1.72} 17%|█▋ | 7107/41250 [17:10:12<81:59:18, 8.64s/it][2025-04-26 01:07:55,481] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:07:55,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.80 | bwd_microstep: 5697.61 | bwd_inner_microstep: 5646.41 | bwd_allreduce_microstep: 51.16 | step_microstep: 18.90 [2025-04-26 01:07:55,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.80 | bwd: 5697.63 | bwd_inner: 5646.41 | bwd_allreduce: 51.17 | step: 18.90 17%|█▋ | 7108/41250 [17:10:20<81:52:11, 8.63s/it] {'loss': 0.1746, 'grad_norm': 1.848864197731018, 'learning_rate': 3.791313653240907e-05, 'epoch': 1.72} 17%|█▋ | 7108/41250 [17:10:20<81:52:11, 8.63s/it][2025-04-26 01:08:04,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 01:08:04,086] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.39 | bwd_microstep: 5694.43 | bwd_inner_microstep: 5649.55 | bwd_allreduce_microstep: 44.83 | step_microstep: 19.35 [2025-04-26 01:08:04,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.39 | bwd: 5694.45 | bwd_inner: 5649.55 | bwd_allreduce: 44.85 | step: 19.36 17%|█▋ | 7109/41250 [17:10:29<81:47:06, 8.62s/it] {'loss': 0.0699, 'grad_norm': 1.5347546339035034, 'learning_rate': 3.791243808160459e-05, 
'epoch': 1.72} 17%|█▋ | 7109/41250 [17:10:29<81:47:06, 8.62s/it][2025-04-26 01:08:12,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-26 01:08:12,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.52 | bwd_microstep: 5746.17 | bwd_inner_microstep: 5686.51 | bwd_allreduce_microstep: 59.62 | step_microstep: 18.66 [2025-04-26 01:08:12,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.52 | bwd: 5746.19 | bwd_inner: 5686.51 | bwd_allreduce: 59.64 | step: 18.66 17%|█▋ | 7110/41250 [17:10:38<81:55:02, 8.64s/it] {'loss': 0.0729, 'grad_norm': 3.826460838317871, 'learning_rate': 3.791173952037345e-05, 'epoch': 1.72} 17%|█▋ | 7110/41250 [17:10:38<81:55:02, 8.64s/it][2025-04-26 01:08:21,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:08:21,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.60 | bwd_microstep: 5693.42 | bwd_inner_microstep: 5646.68 | bwd_allreduce_microstep: 46.69 | step_microstep: 18.61 [2025-04-26 01:08:21,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.60 | bwd: 5693.44 | bwd_inner: 5646.69 | bwd_allreduce: 46.71 | step: 18.61 17%|█▋ | 7111/41250 [17:10:46<81:48:50, 8.63s/it] {'loss': 0.0867, 'grad_norm': 0.9792299866676331, 'learning_rate': 3.791104084871995e-05, 'epoch': 1.72} 17%|█▋ | 7111/41250 [17:10:46<81:48:50, 8.63s/it][2025-04-26 01:08:30,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:08:30,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.87 | bwd_microstep: 5728.11 | bwd_inner_microstep: 5694.19 | bwd_allreduce_microstep: 33.88 | step_microstep: 18.93 [2025-04-26 01:08:30,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2855.87 | bwd: 5728.12 | bwd_inner: 5694.19 | bwd_allreduce: 33.89 | step: 18.93 17%|█▋ | 7112/41250 [17:10:55<81:55:40, 8.64s/it] {'loss': 0.1259, 'grad_norm': 2.506441354751587, 'learning_rate': 3.791034206664842e-05, 'epoch': 1.72} 17%|█▋ | 7112/41250 [17:10:55<81:55:40, 8.64s/it][2025-04-26 01:08:38,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:08:38,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.03 | bwd_microstep: 5772.52 | bwd_inner_microstep: 5689.84 | bwd_allreduce_microstep: 82.63 | step_microstep: 18.81 [2025-04-26 01:08:38,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.03 | bwd: 5772.53 | bwd_inner: 5689.84 | bwd_allreduce: 82.65 | step: 18.81 17%|█▋ | 7113/41250 [17:11:04<82:06:26, 8.66s/it] {'loss': 0.0373, 'grad_norm': 1.016188144683838, 'learning_rate': 3.7909643174163145e-05, 'epoch': 1.72} 17%|█▋ | 7113/41250 [17:11:04<82:06:26, 8.66s/it][2025-04-26 01:08:47,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:08:47,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 3091.00 | bwd_microstep: 5709.62 | bwd_inner_microstep: 5696.76 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.78 [2025-04-26 01:08:47,614] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 3091.00 | bwd: 5709.64 | bwd_inner: 5696.76 | bwd_allreduce: 12.84 | step: 18.79 17%|█▋ | 7114/41250 [17:11:12<82:44:12, 8.73s/it] {'loss': 0.0621, 'grad_norm': 1.2671442031860352, 'learning_rate': 3.790894417126844e-05, 'epoch': 1.72} 17%|█▋ | 7114/41250 [17:11:12<82:44:12, 8.73s/it][2025-04-26 01:08:56,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 01:08:56,236] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2843.35 | bwd_microstep: 5697.31 | bwd_inner_microstep: 5684.58 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.64 [2025-04-26 01:08:56,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.35 | bwd: 5697.32 | bwd_inner: 5684.58 | bwd_allreduce: 12.70 | step: 18.65 17%|█▋ | 7115/41250 [17:11:21<82:26:32, 8.69s/it] {'loss': 0.0702, 'grad_norm': 5.02650785446167, 'learning_rate': 3.7908245057968625e-05, 'epoch': 1.72} 17%|█▋ | 7115/41250 [17:11:21<82:26:32, 8.69s/it][2025-04-26 01:09:04,948] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-26 01:09:04,948] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.74 | bwd_microstep: 5800.78 | bwd_inner_microstep: 5662.66 | bwd_allreduce_microstep: 138.07 | step_microstep: 18.76 [2025-04-26 01:09:04,948] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.74 | bwd: 5800.79 | bwd_inner: 5662.66 | bwd_allreduce: 138.09 | step: 18.77 17%|█▋ | 7116/41250 [17:11:30<82:29:20, 8.70s/it] {'loss': 0.023, 'grad_norm': 0.2958524525165558, 'learning_rate': 3.7907545834268004e-05, 'epoch': 1.73} 17%|█▋ | 7116/41250 [17:11:30<82:29:20, 8.70s/it][2025-04-26 01:09:13,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:09:13,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.50 | bwd_microstep: 5701.69 | bwd_inner_microstep: 5648.97 | bwd_allreduce_microstep: 52.67 | step_microstep: 18.76 [2025-04-26 01:09:13,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.50 | bwd: 5701.70 | bwd_inner: 5648.97 | bwd_allreduce: 52.69 | step: 18.76 17%|█▋ | 7117/41250 [17:11:38<82:15:00, 8.67s/it] {'loss': 0.0749, 'grad_norm': 1.645468831062317, 'learning_rate': 3.7906846500170875e-05, 'epoch': 1.73} 17%|█▋ | 7117/41250 [17:11:38<82:15:00, 
8.67s/it][2025-04-26 01:09:22,250] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.06 | optimizer_step: 0.96 [2025-04-26 01:09:22,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.76 | bwd_microstep: 5756.83 | bwd_inner_microstep: 5693.73 | bwd_allreduce_microstep: 63.05 | step_microstep: 19.16 [2025-04-26 01:09:22,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.76 | bwd: 5756.84 | bwd_inner: 5693.73 | bwd_allreduce: 63.07 | step: 19.16 17%|█▋ | 7118/41250 [17:11:47<82:16:54, 8.68s/it] {'loss': 0.1177, 'grad_norm': 1.068730354309082, 'learning_rate': 3.790614705568156e-05, 'epoch': 1.73} 17%|█▋ | 7118/41250 [17:11:47<82:16:54, 8.68s/it][2025-04-26 01:09:30,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:09:30,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.57 | bwd_microstep: 5755.40 | bwd_inner_microstep: 5679.30 | bwd_allreduce_microstep: 76.05 | step_microstep: 18.64 [2025-04-26 01:09:30,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.57 | bwd: 5755.41 | bwd_inner: 5679.30 | bwd_allreduce: 76.08 | step: 18.64 17%|█▋ | 7119/41250 [17:11:56<82:17:58, 8.68s/it] {'loss': 0.2518, 'grad_norm': 2.138831615447998, 'learning_rate': 3.7905447500804374e-05, 'epoch': 1.73} 17%|█▋ | 7119/41250 [17:11:56<82:17:58, 8.68s/it][2025-04-26 01:09:39,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:09:39,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.81 | bwd_microstep: 5713.40 | bwd_inner_microstep: 5700.52 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.94 [2025-04-26 01:09:39,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.81 | bwd: 5713.42 | bwd_inner: 5700.52 | 
bwd_allreduce: 12.85 | step: 18.94 17%|█▋ | 7120/41250 [17:12:04<82:13:52, 8.67s/it] {'loss': 0.0929, 'grad_norm': 1.3612003326416016, 'learning_rate': 3.790474783554363e-05, 'epoch': 1.73} 17%|█▋ | 7120/41250 [17:12:04<82:13:52, 8.67s/it][2025-04-26 01:09:48,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.06 | optimizer_step: 1.06 [2025-04-26 01:09:48,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.66 | bwd_microstep: 5886.28 | bwd_inner_microstep: 5698.46 | bwd_allreduce_microstep: 187.78 | step_microstep: 19.56 [2025-04-26 01:09:48,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.66 | bwd: 5886.29 | bwd_inner: 5698.46 | bwd_allreduce: 187.80 | step: 19.56 17%|█▋ | 7121/41250 [17:12:13<82:39:06, 8.72s/it] {'loss': 0.3394, 'grad_norm': 2.2614827156066895, 'learning_rate': 3.790404805990363e-05, 'epoch': 1.73} 17%|█▋ | 7121/41250 [17:12:13<82:39:06, 8.72s/it][2025-04-26 01:09:57,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.30 | optimizer_step: 1.03 [2025-04-26 01:09:57,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.70 | bwd_microstep: 5785.88 | bwd_inner_microstep: 5656.66 | bwd_allreduce_microstep: 129.17 | step_microstep: 19.88 [2025-04-26 01:09:57,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.70 | bwd: 5785.90 | bwd_inner: 5656.66 | bwd_allreduce: 129.20 | step: 19.89 17%|█▋ | 7122/41250 [17:12:22<82:36:19, 8.71s/it] {'loss': 0.2331, 'grad_norm': 1.2955410480499268, 'learning_rate': 3.7903348173888706e-05, 'epoch': 1.73} 17%|█▋ | 7122/41250 [17:12:22<82:36:19, 8.71s/it][2025-04-26 01:10:05,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.99 [2025-04-26 01:10:05,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.52 | 
bwd_microstep: 5761.26 | bwd_inner_microstep: 5714.23 | bwd_allreduce_microstep: 46.98 | step_microstep: 19.23 [2025-04-26 01:10:05,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.52 | bwd: 5761.27 | bwd_inner: 5714.23 | bwd_allreduce: 47.00 | step: 19.23 17%|█▋ | 7123/41250 [17:12:31<82:33:39, 8.71s/it] {'loss': 0.4487, 'grad_norm': 3.1624374389648438, 'learning_rate': 3.790264817750315e-05, 'epoch': 1.73} 17%|█▋ | 7123/41250 [17:12:31<82:33:39, 8.71s/it][2025-04-26 01:10:14,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-26 01:10:14,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.49 | bwd_microstep: 6037.08 | bwd_inner_microstep: 5644.47 | bwd_allreduce_microstep: 392.56 | step_microstep: 18.79 [2025-04-26 01:10:14,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.49 | bwd: 6037.09 | bwd_inner: 5644.47 | bwd_allreduce: 392.58 | step: 18.79 17%|█▋ | 7124/41250 [17:12:40<83:17:17, 8.79s/it] {'loss': 0.0341, 'grad_norm': 0.8708192110061646, 'learning_rate': 3.7901948070751295e-05, 'epoch': 1.73} 17%|█▋ | 7124/41250 [17:12:40<83:17:17, 8.79s/it][2025-04-26 01:10:23,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-26 01:10:23,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.46 | bwd_microstep: 5701.70 | bwd_inner_microstep: 5654.03 | bwd_allreduce_microstep: 47.63 | step_microstep: 18.66 [2025-04-26 01:10:23,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.46 | bwd: 5701.72 | bwd_inner: 5654.03 | bwd_allreduce: 47.64 | step: 18.66 17%|█▋ | 7125/41250 [17:12:48<82:48:41, 8.74s/it] {'loss': 0.2448, 'grad_norm': 3.395456314086914, 'learning_rate': 3.7901247853637444e-05, 'epoch': 1.73} 17%|█▋ | 7125/41250 [17:12:48<82:48:41, 8.74s/it][2025-04-26 01:10:32,112] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.05 | optimizer_step: 0.98 [2025-04-26 01:10:32,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.59 | bwd_microstep: 5767.62 | bwd_inner_microstep: 5693.34 | bwd_allreduce_microstep: 74.23 | step_microstep: 19.10 [2025-04-26 01:10:32,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.59 | bwd: 5767.63 | bwd_inner: 5693.34 | bwd_allreduce: 74.25 | step: 19.10 17%|█▋ | 7126/41250 [17:12:57<82:43:59, 8.73s/it] {'loss': 0.1488, 'grad_norm': 1.7952522039413452, 'learning_rate': 3.790054752616593e-05, 'epoch': 1.73} 17%|█▋ | 7126/41250 [17:12:57<82:43:59, 8.73s/it][2025-04-26 01:10:40,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 01:10:40,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.93 | bwd_microstep: 5704.62 | bwd_inner_microstep: 5691.86 | bwd_allreduce_microstep: 12.71 | step_microstep: 19.07 [2025-04-26 01:10:40,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.93 | bwd: 5704.63 | bwd_inner: 5691.86 | bwd_allreduce: 12.73 | step: 19.07 17%|█▋ | 7127/41250 [17:13:06<82:27:41, 8.70s/it] {'loss': 0.1555, 'grad_norm': 2.5031349658966064, 'learning_rate': 3.789984708834106e-05, 'epoch': 1.73} 17%|█▋ | 7127/41250 [17:13:06<82:27:41, 8.70s/it][2025-04-26 01:10:49,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-26 01:10:49,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.35 | bwd_microstep: 5703.06 | bwd_inner_microstep: 5689.88 | bwd_allreduce_microstep: 13.13 | step_microstep: 18.98 [2025-04-26 01:10:49,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.35 | bwd: 5703.07 | bwd_inner: 5689.88 | bwd_allreduce: 13.15 | step: 18.98 
17%|█▋ | 7128/41250 [17:13:14<82:16:31, 8.68s/it] {'loss': 0.4257, 'grad_norm': 2.509925127029419, 'learning_rate': 3.789914654016715e-05, 'epoch': 1.73} 17%|█▋ | 7128/41250 [17:13:14<82:16:31, 8.68s/it][2025-04-26 01:10:57,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-26 01:10:57,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.92 | bwd_microstep: 5699.84 | bwd_inner_microstep: 5660.98 | bwd_allreduce_microstep: 38.82 | step_microstep: 18.86 [2025-04-26 01:10:57,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.92 | bwd: 5699.86 | bwd_inner: 5660.98 | bwd_allreduce: 38.83 | step: 18.86 17%|█▋ | 7129/41250 [17:13:23<82:05:27, 8.66s/it] {'loss': 0.1301, 'grad_norm': 3.648496150970459, 'learning_rate': 3.789844588164853e-05, 'epoch': 1.73} 17%|█▋ | 7129/41250 [17:13:23<82:05:27, 8.66s/it][2025-04-26 01:11:06,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-26 01:11:06,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.48 | bwd_microstep: 5874.03 | bwd_inner_microstep: 5648.97 | bwd_allreduce_microstep: 225.00 | step_microstep: 18.85 [2025-04-26 01:11:06,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.48 | bwd: 5874.04 | bwd_inner: 5648.97 | bwd_allreduce: 225.03 | step: 18.85 17%|█▋ | 7130/41250 [17:13:32<82:26:42, 8.70s/it] {'loss': 0.2623, 'grad_norm': 3.8762123584747314, 'learning_rate': 3.78977451127895e-05, 'epoch': 1.73} 17%|█▋ | 7130/41250 [17:13:32<82:26:42, 8.70s/it][2025-04-26 01:11:15,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 01:11:15,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.54 | bwd_microstep: 5734.79 | bwd_inner_microstep: 
5707.68 | bwd_allreduce_microstep: 27.06 | step_microstep: 19.37 [2025-04-26 01:11:15,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.54 | bwd: 5734.80 | bwd_inner: 5707.68 | bwd_allreduce: 27.08 | step: 19.37 17%|█▋ | 7131/41250 [17:13:40<82:21:40, 8.69s/it] {'loss': 0.1622, 'grad_norm': 2.0942397117614746, 'learning_rate': 3.78970442335944e-05, 'epoch': 1.73} 17%|█▋ | 7131/41250 [17:13:40<82:21:40, 8.69s/it][2025-04-26 01:11:24,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.92 [2025-04-26 01:11:24,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.27 | bwd_microstep: 6040.65 | bwd_inner_microstep: 5653.06 | bwd_allreduce_microstep: 387.54 | step_microstep: 19.23 [2025-04-26 01:11:24,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.27 | bwd: 6040.66 | bwd_inner: 5653.06 | bwd_allreduce: 387.56 | step: 19.23 17%|█▋ | 7132/41250 [17:13:49<83:05:56, 8.77s/it] {'loss': 0.0334, 'grad_norm': 0.49295270442962646, 'learning_rate': 3.789634324406754e-05, 'epoch': 1.73} 17%|█▋ | 7132/41250 [17:13:49<83:05:56, 8.77s/it][2025-04-26 01:11:33,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 01:11:33,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.22 | bwd_microstep: 5717.91 | bwd_inner_microstep: 5705.17 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.75 [2025-04-26 01:11:33,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.22 | bwd: 5717.92 | bwd_inner: 5705.17 | bwd_allreduce: 12.71 | step: 18.75 17%|█▋ | 7133/41250 [17:13:58<82:46:19, 8.73s/it] {'loss': 0.2032, 'grad_norm': 1.869437336921692, 'learning_rate': 3.789564214421324e-05, 'epoch': 1.73} 17%|█▋ | 7133/41250 [17:13:58<82:46:19, 8.73s/it][2025-04-26 01:11:41,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:11:41,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.10 | bwd_microstep: 5697.40 | bwd_inner_microstep: 5651.13 | bwd_allreduce_microstep: 46.22 | step_microstep: 18.93 [2025-04-26 01:11:41,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.10 | bwd: 5697.42 | bwd_inner: 5651.13 | bwd_allreduce: 46.24 | step: 18.93 17%|█▋ | 7134/41250 [17:14:06<82:25:40, 8.70s/it] {'loss': 0.1818, 'grad_norm': 4.3543171882629395, 'learning_rate': 3.789494093403583e-05, 'epoch': 1.73} 17%|█▋ | 7134/41250 [17:14:06<82:25:40, 8.70s/it][2025-04-26 01:11:50,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.14 | optimizer_step: 1.11 [2025-04-26 01:11:50,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.73 | bwd_microstep: 5745.02 | bwd_inner_microstep: 5678.95 | bwd_allreduce_microstep: 66.02 | step_microstep: 19.21 [2025-04-26 01:11:50,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.73 | bwd: 5745.03 | bwd_inner: 5678.95 | bwd_allreduce: 66.04 | step: 19.21 17%|█▋ | 7135/41250 [17:14:15<82:22:17, 8.69s/it] {'loss': 0.1972, 'grad_norm': 3.2552013397216797, 'learning_rate': 3.789423961353963e-05, 'epoch': 1.73} 17%|█▋ | 7135/41250 [17:14:15<82:22:17, 8.69s/it][2025-04-26 01:11:58,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:11:58,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.17 | bwd_microstep: 5692.12 | bwd_inner_microstep: 5679.59 | bwd_allreduce_microstep: 12.48 | step_microstep: 18.55 [2025-04-26 01:11:58,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.17 | bwd: 5692.13 | bwd_inner: 5679.59 | bwd_allreduce: 12.49 | step: 18.56 17%|█▋ | 7136/41250 [17:14:24<82:10:17, 8.67s/it] 
{'loss': 0.1703, 'grad_norm': 3.0218944549560547, 'learning_rate': 3.7893538182728964e-05, 'epoch': 1.73} 17%|█▋ | 7136/41250 [17:14:24<82:10:17, 8.67s/it][2025-04-26 01:12:07,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 01:12:07,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.54 | bwd_microstep: 5691.55 | bwd_inner_microstep: 5675.91 | bwd_allreduce_microstep: 15.59 | step_microstep: 19.03 [2025-04-26 01:12:07,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.54 | bwd: 5691.57 | bwd_inner: 5675.91 | bwd_allreduce: 15.61 | step: 19.03 17%|█▋ | 7137/41250 [17:14:32<82:00:28, 8.65s/it] {'loss': 0.058, 'grad_norm': 1.5273315906524658, 'learning_rate': 3.789283664160815e-05, 'epoch': 1.73} 17%|█▋ | 7137/41250 [17:14:32<82:00:28, 8.65s/it][2025-04-26 01:12:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:12:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.22 | bwd_microstep: 5719.31 | bwd_inner_microstep: 5697.35 | bwd_allreduce_microstep: 21.92 | step_microstep: 18.86 [2025-04-26 01:12:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.22 | bwd: 5719.33 | bwd_inner: 5697.35 | bwd_allreduce: 21.93 | step: 18.86 17%|█▋ | 7138/41250 [17:14:41<81:59:50, 8.65s/it] {'loss': 0.1311, 'grad_norm': 0.5758270025253296, 'learning_rate': 3.789213499018152e-05, 'epoch': 1.73} 17%|█▋ | 7138/41250 [17:14:41<81:59:50, 8.65s/it][2025-04-26 01:12:24,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:12:24,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.78 | bwd_microstep: 5720.63 | bwd_inner_microstep: 5691.44 | bwd_allreduce_microstep: 29.14 | 
step_microstep: 18.66 [2025-04-26 01:12:24,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.78 | bwd: 5720.64 | bwd_inner: 5691.44 | bwd_allreduce: 29.16 | step: 18.66 17%|█▋ | 7139/41250 [17:14:50<81:59:54, 8.65s/it] {'loss': 0.1669, 'grad_norm': 1.9620836973190308, 'learning_rate': 3.7891433228453397e-05, 'epoch': 1.73} 17%|█▋ | 7139/41250 [17:14:50<81:59:54, 8.65s/it][2025-04-26 01:12:33,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.06 | optimizer_step: 1.01 [2025-04-26 01:12:33,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.85 | bwd_microstep: 5877.86 | bwd_inner_microstep: 5642.24 | bwd_allreduce_microstep: 235.57 | step_microstep: 19.30 [2025-04-26 01:12:33,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.85 | bwd: 5877.88 | bwd_inner: 5642.24 | bwd_allreduce: 235.60 | step: 19.30 17%|█▋ | 7140/41250 [17:14:59<82:25:08, 8.70s/it] {'loss': 0.2484, 'grad_norm': 1.235915184020996, 'learning_rate': 3.78907313564281e-05, 'epoch': 1.73} 17%|█▋ | 7140/41250 [17:14:59<82:25:08, 8.70s/it][2025-04-26 01:12:42,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 01:12:42,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.22 | bwd_microstep: 5772.81 | bwd_inner_microstep: 5641.18 | bwd_allreduce_microstep: 131.58 | step_microstep: 18.87 [2025-04-26 01:12:42,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.22 | bwd: 5772.82 | bwd_inner: 5641.18 | bwd_allreduce: 131.60 | step: 18.87 17%|█▋ | 7141/41250 [17:15:07<82:22:13, 8.69s/it] {'loss': 0.0352, 'grad_norm': 0.8389608263969421, 'learning_rate': 3.7890029374109974e-05, 'epoch': 1.73} 17%|█▋ | 7141/41250 [17:15:07<82:22:13, 8.69s/it][2025-04-26 01:12:51,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | 
optimizer_gradients: 1.45 | optimizer_step: 0.90 [2025-04-26 01:12:51,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.14 | bwd_microstep: 5769.37 | bwd_inner_microstep: 5633.20 | bwd_allreduce_microstep: 136.12 | step_microstep: 19.76 [2025-04-26 01:12:51,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.14 | bwd: 5769.39 | bwd_inner: 5633.20 | bwd_allreduce: 136.14 | step: 19.77 17%|█▋ | 7142/41250 [17:15:16<82:19:32, 8.69s/it] {'loss': 0.135, 'grad_norm': 1.214301586151123, 'learning_rate': 3.788932728150332e-05, 'epoch': 1.73} 17%|█▋ | 7142/41250 [17:15:16<82:19:32, 8.69s/it][2025-04-26 01:12:59,718] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:12:59,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.82 | bwd_microstep: 5726.46 | bwd_inner_microstep: 5681.42 | bwd_allreduce_microstep: 44.98 | step_microstep: 18.60 [2025-04-26 01:12:59,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.82 | bwd: 5726.47 | bwd_inner: 5681.42 | bwd_allreduce: 45.00 | step: 18.60 17%|█▋ | 7143/41250 [17:15:25<82:14:22, 8.68s/it] {'loss': 0.0403, 'grad_norm': 0.6242594122886658, 'learning_rate': 3.7888625078612494e-05, 'epoch': 1.73} 17%|█▋ | 7143/41250 [17:15:25<82:14:22, 8.68s/it][2025-04-26 01:13:08,324] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 01:13:08,324] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.46 | bwd_microstep: 5686.83 | bwd_inner_microstep: 5616.40 | bwd_allreduce_microstep: 70.39 | step_microstep: 18.67 [2025-04-26 01:13:08,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.46 | bwd: 5686.84 | bwd_inner: 5616.40 | bwd_allreduce: 70.40 | step: 18.67 17%|█▋ | 7144/41250 [17:15:33<82:01:23, 8.66s/it] {'loss': 0.0864, 'grad_norm': 
1.4260562658309937, 'learning_rate': 3.7887922765441815e-05, 'epoch': 1.73} 17%|█▋ | 7144/41250 [17:15:33<82:01:23, 8.66s/it][2025-04-26 01:13:16,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.04 [2025-04-26 01:13:16,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.76 | bwd_microstep: 5684.13 | bwd_inner_microstep: 5670.99 | bwd_allreduce_microstep: 13.10 | step_microstep: 18.85 [2025-04-26 01:13:16,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.76 | bwd: 5684.15 | bwd_inner: 5670.99 | bwd_allreduce: 13.11 | step: 18.85 17%|█▋ | 7145/41250 [17:15:42<81:52:50, 8.64s/it] {'loss': 0.1657, 'grad_norm': 2.2219607830047607, 'learning_rate': 3.788722034199561e-05, 'epoch': 1.73} 17%|█▋ | 7145/41250 [17:15:42<81:52:50, 8.64s/it][2025-04-26 01:13:25,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.10 | optimizer_step: 1.02 [2025-04-26 01:13:25,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.66 | bwd_microstep: 5706.79 | bwd_inner_microstep: 5693.41 | bwd_allreduce_microstep: 13.32 | step_microstep: 18.98 [2025-04-26 01:13:25,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.66 | bwd: 5706.81 | bwd_inner: 5693.41 | bwd_allreduce: 13.35 | step: 18.99 17%|█▋ | 7146/41250 [17:15:50<81:54:01, 8.65s/it] {'loss': 0.0565, 'grad_norm': 0.9879442453384399, 'learning_rate': 3.7886517808278205e-05, 'epoch': 1.73} 17%|█▋ | 7146/41250 [17:15:50<81:54:01, 8.65s/it][2025-04-26 01:13:34,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 01:13:34,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.07 | bwd_microstep: 5722.63 | bwd_inner_microstep: 5676.79 | bwd_allreduce_microstep: 45.79 | step_microstep: 19.04 [2025-04-26 
01:13:34,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.07 | bwd: 5722.65 | bwd_inner: 5676.79 | bwd_allreduce: 45.81 | step: 19.04 17%|█▋ | 7147/41250 [17:15:59<81:55:03, 8.65s/it] {'loss': 0.0948, 'grad_norm': 1.4240745306015015, 'learning_rate': 3.7885815164293936e-05, 'epoch': 1.73} 17%|█▋ | 7147/41250 [17:15:59<81:55:03, 8.65s/it][2025-04-26 01:13:42,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:13:42,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.68 | bwd_microstep: 5708.69 | bwd_inner_microstep: 5695.81 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.69 [2025-04-26 01:13:42,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.68 | bwd: 5708.70 | bwd_inner: 5695.81 | bwd_allreduce: 12.86 | step: 18.69 17%|█▋ | 7148/41250 [17:16:08<81:53:58, 8.65s/it] {'loss': 0.0783, 'grad_norm': 1.3048642873764038, 'learning_rate': 3.788511241004714e-05, 'epoch': 1.73} 17%|█▋ | 7148/41250 [17:16:08<81:53:58, 8.65s/it][2025-04-26 01:13:51,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-26 01:13:51,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.94 | bwd_microstep: 5717.29 | bwd_inner_microstep: 5649.38 | bwd_allreduce_microstep: 67.87 | step_microstep: 18.23 [2025-04-26 01:13:51,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.94 | bwd: 5717.31 | bwd_inner: 5649.38 | bwd_allreduce: 67.88 | step: 18.23 17%|█▋ | 7149/41250 [17:16:16<81:49:20, 8.64s/it] {'loss': 0.084, 'grad_norm': 1.0403445959091187, 'learning_rate': 3.788440954554214e-05, 'epoch': 1.73} 17%|█▋ | 7149/41250 [17:16:16<81:49:20, 8.64s/it][2025-04-26 01:14:00,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-26 01:14:00,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.30 | bwd_microstep: 5697.93 | bwd_inner_microstep: 5685.04 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.51 [2025-04-26 01:14:00,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.30 | bwd: 5697.94 | bwd_inner: 5685.04 | bwd_allreduce: 12.86 | step: 18.52 17%|█▋ | 7150/41250 [17:16:25<81:49:45, 8.64s/it] {'loss': 0.0213, 'grad_norm': 0.2637373208999634, 'learning_rate': 3.788370657078327e-05, 'epoch': 1.73} 17%|█▋ | 7150/41250 [17:16:25<81:49:45, 8.64s/it][2025-04-26 01:14:08,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-26 01:14:08,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.65 | bwd_microstep: 5726.25 | bwd_inner_microstep: 5703.95 | bwd_allreduce_microstep: 22.25 | step_microstep: 18.98 [2025-04-26 01:14:08,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.65 | bwd: 5726.26 | bwd_inner: 5703.95 | bwd_allreduce: 22.26 | step: 18.98 17%|█▋ | 7151/41250 [17:16:34<81:54:55, 8.65s/it] {'loss': 0.1653, 'grad_norm': 0.8042116761207581, 'learning_rate': 3.788300348577487e-05, 'epoch': 1.73} 17%|█▋ | 7151/41250 [17:16:34<81:54:55, 8.65s/it][2025-04-26 01:14:17,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:14:17,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.91 | bwd_microstep: 5743.83 | bwd_inner_microstep: 5710.69 | bwd_allreduce_microstep: 33.10 | step_microstep: 18.71 [2025-04-26 01:14:17,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.91 | bwd: 5743.85 | bwd_inner: 5710.69 | bwd_allreduce: 33.12 | step: 18.72 17%|█▋ | 7152/41250 [17:16:42<82:00:09, 8.66s/it] {'loss': 0.1118, 'grad_norm': 3.367612838745117, 'learning_rate': 
3.788230029052127e-05, 'epoch': 1.73} 17%|█▋ | 7152/41250 [17:16:42<82:00:09, 8.66s/it][2025-04-26 01:14:26,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:14:26,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.10 | bwd_microstep: 5736.37 | bwd_inner_microstep: 5643.71 | bwd_allreduce_microstep: 92.62 | step_microstep: 18.45 [2025-04-26 01:14:26,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.10 | bwd: 5736.39 | bwd_inner: 5643.71 | bwd_allreduce: 92.64 | step: 18.45 17%|█▋ | 7153/41250 [17:16:51<82:00:31, 8.66s/it] {'loss': 0.2113, 'grad_norm': 24.89377212524414, 'learning_rate': 3.788159698502681e-05, 'epoch': 1.73} 17%|█▋ | 7153/41250 [17:16:51<82:00:31, 8.66s/it][2025-04-26 01:14:34,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:14:34,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.02 | bwd_microstep: 5714.06 | bwd_inner_microstep: 5701.15 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.69 [2025-04-26 01:14:34,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.02 | bwd: 5714.08 | bwd_inner: 5701.15 | bwd_allreduce: 12.89 | step: 18.69 17%|█▋ | 7154/41250 [17:17:00<81:58:35, 8.66s/it] {'loss': 0.0845, 'grad_norm': 1.157112717628479, 'learning_rate': 3.788089356929582e-05, 'epoch': 1.73} 17%|█▋ | 7154/41250 [17:17:00<81:58:35, 8.66s/it][2025-04-26 01:14:43,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:14:43,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.87 | bwd_microstep: 5734.08 | bwd_inner_microstep: 5700.14 | bwd_allreduce_microstep: 33.90 | step_microstep: 18.37 [2025-04-26 01:14:43,471] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.87 | bwd: 5734.10 | bwd_inner: 5700.14 | bwd_allreduce: 33.92 | step: 18.38 17%|█▋ | 7155/41250 [17:17:08<82:01:10, 8.66s/it] {'loss': 0.0179, 'grad_norm': 0.49352723360061646, 'learning_rate': 3.788019004333263e-05, 'epoch': 1.73} 17%|█▋ | 7155/41250 [17:17:08<82:01:10, 8.66s/it][2025-04-26 01:14:52,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:14:52,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.34 | bwd_microstep: 5711.77 | bwd_inner_microstep: 5698.77 | bwd_allreduce_microstep: 12.95 | step_microstep: 18.29 [2025-04-26 01:14:52,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.34 | bwd: 5711.78 | bwd_inner: 5698.77 | bwd_allreduce: 12.97 | step: 18.29 17%|█▋ | 7156/41250 [17:17:17<81:57:53, 8.65s/it] {'loss': 0.1114, 'grad_norm': 1.2460300922393799, 'learning_rate': 3.7879486407141596e-05, 'epoch': 1.73} 17%|█▋ | 7156/41250 [17:17:17<81:57:53, 8.65s/it][2025-04-26 01:15:00,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:15:00,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.30 | bwd_microstep: 5722.74 | bwd_inner_microstep: 5709.70 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.42 [2025-04-26 01:15:00,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.30 | bwd: 5722.76 | bwd_inner: 5709.70 | bwd_allreduce: 13.02 | step: 18.42 17%|█▋ | 7157/41250 [17:17:26<81:57:13, 8.65s/it] {'loss': 0.1173, 'grad_norm': 2.4878170490264893, 'learning_rate': 3.787878266072704e-05, 'epoch': 1.74} 17%|█▋ | 7157/41250 [17:17:26<81:57:13, 8.65s/it][2025-04-26 01:15:09,386] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 
01:15:09,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.24 | bwd_microstep: 5704.99 | bwd_inner_microstep: 5659.40 | bwd_allreduce_microstep: 45.54 | step_microstep: 18.41 [2025-04-26 01:15:09,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.24 | bwd: 5705.01 | bwd_inner: 5659.40 | bwd_allreduce: 45.56 | step: 18.42 17%|█▋ | 7158/41250 [17:17:34<81:52:33, 8.65s/it] {'loss': 0.3255, 'grad_norm': 4.1654863357543945, 'learning_rate': 3.78780788040933e-05, 'epoch': 1.74} 17%|█▋ | 7158/41250 [17:17:34<81:52:33, 8.65s/it][2025-04-26 01:15:18,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.30 | optimizer_step: 1.04 [2025-04-26 01:15:18,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.82 | bwd_microstep: 5704.52 | bwd_inner_microstep: 5690.63 | bwd_allreduce_microstep: 13.83 | step_microstep: 20.20 [2025-04-26 01:15:18,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.82 | bwd: 5704.54 | bwd_inner: 5690.63 | bwd_allreduce: 13.86 | step: 20.21 17%|█▋ | 7159/41250 [17:17:43<81:51:29, 8.64s/it] {'loss': 0.0824, 'grad_norm': 1.4054968357086182, 'learning_rate': 3.787737483724473e-05, 'epoch': 1.74} 17%|█▋ | 7159/41250 [17:17:43<81:51:29, 8.64s/it][2025-04-26 01:15:26,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.96 [2025-04-26 01:15:26,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.65 | bwd_microstep: 5787.96 | bwd_inner_microstep: 5656.26 | bwd_allreduce_microstep: 131.65 | step_microstep: 18.67 [2025-04-26 01:15:26,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.65 | bwd: 5787.97 | bwd_inner: 5656.26 | bwd_allreduce: 131.67 | step: 18.68 17%|█▋ | 7160/41250 [17:17:52<82:00:45, 8.66s/it] {'loss': 0.0343, 'grad_norm': 0.7637341022491455, 'learning_rate': 3.787667076018566e-05, 
'epoch': 1.74} 17%|█▋ | 7160/41250 [17:17:52<82:00:45, 8.66s/it][2025-04-26 01:15:35,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:15:35,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.17 | bwd_microstep: 5764.68 | bwd_inner_microstep: 5654.87 | bwd_allreduce_microstep: 109.76 | step_microstep: 18.26 [2025-04-26 01:15:35,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.17 | bwd: 5764.69 | bwd_inner: 5654.87 | bwd_allreduce: 109.78 | step: 18.27 17%|█▋ | 7161/41250 [17:18:00<82:02:36, 8.66s/it] {'loss': 0.16, 'grad_norm': 2.292487144470215, 'learning_rate': 3.787596657292042e-05, 'epoch': 1.74} 17%|█▋ | 7161/41250 [17:18:00<82:02:36, 8.66s/it][2025-04-26 01:15:44,065] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:15:44,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.30 | bwd_microstep: 5723.70 | bwd_inner_microstep: 5711.08 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.62 [2025-04-26 01:15:44,066] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.30 | bwd: 5723.71 | bwd_inner: 5711.08 | bwd_allreduce: 12.59 | step: 18.62 17%|█▋ | 7162/41250 [17:18:09<82:02:08, 8.66s/it] {'loss': 0.2045, 'grad_norm': 2.3152174949645996, 'learning_rate': 3.7875262275453375e-05, 'epoch': 1.74} 17%|█▋ | 7162/41250 [17:18:09<82:02:08, 8.66s/it][2025-04-26 01:15:52,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:15:52,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.75 | bwd_microstep: 5706.36 | bwd_inner_microstep: 5662.75 | bwd_allreduce_microstep: 43.56 | step_microstep: 18.68 [2025-04-26 01:15:52,685] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2831.75 | bwd: 5706.38 | bwd_inner: 5662.75 | bwd_allreduce: 43.58 | step: 18.68 17%|█▋ | 7163/41250 [17:18:18<81:54:24, 8.65s/it] {'loss': 0.0891, 'grad_norm': 1.4978351593017578, 'learning_rate': 3.7874557867788846e-05, 'epoch': 1.74} 17%|█▋ | 7163/41250 [17:18:18<81:54:24, 8.65s/it][2025-04-26 01:16:01,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:16:01,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.70 | bwd_microstep: 5790.47 | bwd_inner_microstep: 5688.88 | bwd_allreduce_microstep: 101.54 | step_microstep: 18.49 [2025-04-26 01:16:01,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.70 | bwd: 5790.48 | bwd_inner: 5688.88 | bwd_allreduce: 101.56 | step: 18.49 17%|█▋ | 7164/41250 [17:18:26<82:05:40, 8.67s/it] {'loss': 0.0236, 'grad_norm': 0.47722944617271423, 'learning_rate': 3.7873853349931186e-05, 'epoch': 1.74} 17%|█▋ | 7164/41250 [17:18:26<82:05:40, 8.67s/it][2025-04-26 01:16:10,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 01:16:10,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.52 | bwd_microstep: 5768.16 | bwd_inner_microstep: 5708.63 | bwd_allreduce_microstep: 59.49 | step_microstep: 19.36 [2025-04-26 01:16:10,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.52 | bwd: 5768.17 | bwd_inner: 5708.63 | bwd_allreduce: 59.51 | step: 19.36 17%|█▋ | 7165/41250 [17:18:35<82:11:23, 8.68s/it] {'loss': 0.0051, 'grad_norm': 0.12203103303909302, 'learning_rate': 3.787314872188474e-05, 'epoch': 1.74} 17%|█▋ | 7165/41250 [17:18:35<82:11:23, 8.68s/it][2025-04-26 01:16:18,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:16:18,768] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.63 | bwd_microstep: 5725.10 | bwd_inner_microstep: 5712.35 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.56 [2025-04-26 01:16:18,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.63 | bwd: 5725.12 | bwd_inner: 5712.35 | bwd_allreduce: 12.72 | step: 18.56 17%|█▋ | 7166/41250 [17:18:44<82:07:49, 8.67s/it] {'loss': 0.1164, 'grad_norm': 1.7620210647583008, 'learning_rate': 3.787244398365384e-05, 'epoch': 1.74} 17%|█▋ | 7166/41250 [17:18:44<82:07:49, 8.67s/it][2025-04-26 01:16:27,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 01:16:27,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.82 | bwd_microstep: 5710.96 | bwd_inner_microstep: 5698.42 | bwd_allreduce_microstep: 12.50 | step_microstep: 18.15 [2025-04-26 01:16:27,408] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.82 | bwd: 5710.98 | bwd_inner: 5698.42 | bwd_allreduce: 12.52 | step: 18.15 17%|█▋ | 7167/41250 [17:18:52<82:01:42, 8.66s/it] {'loss': 0.1769, 'grad_norm': 3.987546682357788, 'learning_rate': 3.787173913524284e-05, 'epoch': 1.74} 17%|█▋ | 7167/41250 [17:18:52<82:01:42, 8.66s/it][2025-04-26 01:16:36,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 01:16:36,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.28 | bwd_microstep: 5721.20 | bwd_inner_microstep: 5708.26 | bwd_allreduce_microstep: 12.89 | step_microstep: 18.46 [2025-04-26 01:16:36,061] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.28 | bwd: 5721.21 | bwd_inner: 5708.26 | bwd_allreduce: 12.91 | step: 18.46 17%|█▋ | 7168/41250 [17:19:01<81:59:41, 8.66s/it] {'loss': 0.3482, 'grad_norm': 6.865823745727539, 'learning_rate': 3.7871034176656084e-05, 'epoch': 1.74} 17%|█▋ | 
7168/41250 [17:19:01<81:59:41, 8.66s/it][2025-04-26 01:16:44,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:16:44,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.86 | bwd_microstep: 5716.21 | bwd_inner_microstep: 5703.58 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.70 [2025-04-26 01:16:44,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.86 | bwd: 5716.22 | bwd_inner: 5703.58 | bwd_allreduce: 12.60 | step: 18.70 17%|█▋ | 7169/41250 [17:19:10<81:57:28, 8.66s/it] {'loss': 0.063, 'grad_norm': 2.097896099090576, 'learning_rate': 3.7870329107897925e-05, 'epoch': 1.74} 17%|█▋ | 7169/41250 [17:19:10<81:57:28, 8.66s/it][2025-04-26 01:16:53,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 01:16:53,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.28 | bwd_microstep: 5726.68 | bwd_inner_microstep: 5642.55 | bwd_allreduce_microstep: 84.09 | step_microstep: 18.65 [2025-04-26 01:16:53,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.28 | bwd: 5726.70 | bwd_inner: 5642.55 | bwd_allreduce: 84.11 | step: 18.66 17%|█▋ | 7170/41250 [17:19:18<81:54:08, 8.65s/it] {'loss': 0.1445, 'grad_norm': 1.697943925857544, 'learning_rate': 3.786962392897269e-05, 'epoch': 1.74} 17%|█▋ | 7170/41250 [17:19:18<81:54:08, 8.65s/it][2025-04-26 01:17:02,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:17:02,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.94 | bwd_microstep: 5773.59 | bwd_inner_microstep: 5693.50 | bwd_allreduce_microstep: 80.04 | step_microstep: 18.51 [2025-04-26 01:17:02,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.94 | bwd: 
5773.60 | bwd_inner: 5693.50 | bwd_allreduce: 80.06 | step: 18.51 17%|█▋ | 7171/41250 [17:19:27<82:03:08, 8.67s/it] {'loss': 0.1447, 'grad_norm': 6.59114933013916, 'learning_rate': 3.786891863988475e-05, 'epoch': 1.74} 17%|█▋ | 7171/41250 [17:19:27<82:03:08, 8.67s/it][2025-04-26 01:17:10,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.98 | optimizer_step: 1.01 [2025-04-26 01:17:10,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.73 | bwd_microstep: 5768.08 | bwd_inner_microstep: 5656.82 | bwd_allreduce_microstep: 111.22 | step_microstep: 18.50 [2025-04-26 01:17:10,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.73 | bwd: 5768.10 | bwd_inner: 5656.82 | bwd_allreduce: 111.24 | step: 18.50 17%|█▋ | 7172/41250 [17:19:36<82:04:56, 8.67s/it] {'loss': 0.1036, 'grad_norm': 1.4164670705795288, 'learning_rate': 3.786821324063843e-05, 'epoch': 1.74} 17%|█▋ | 7172/41250 [17:19:36<82:04:56, 8.67s/it][2025-04-26 01:17:19,344] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:17:19,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.06 | bwd_microstep: 5705.56 | bwd_inner_microstep: 5645.81 | bwd_allreduce_microstep: 59.71 | step_microstep: 18.38 [2025-04-26 01:17:19,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.06 | bwd: 5705.58 | bwd_inner: 5645.81 | bwd_allreduce: 59.73 | step: 18.39 17%|█▋ | 7173/41250 [17:19:44<81:54:55, 8.65s/it] {'loss': 0.2264, 'grad_norm': 2.4890010356903076, 'learning_rate': 3.786750773123809e-05, 'epoch': 1.74} 17%|█▋ | 7173/41250 [17:19:44<81:54:55, 8.65s/it][2025-04-26 01:17:28,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 01:17:28,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2856.69 | bwd_microstep: 5745.34 | bwd_inner_microstep: 5700.45 | bwd_allreduce_microstep: 44.85 | step_microstep: 18.54 [2025-04-26 01:17:28,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.69 | bwd: 5745.36 | bwd_inner: 5700.45 | bwd_allreduce: 44.87 | step: 18.54 17%|█▋ | 7174/41250 [17:19:53<82:00:07, 8.66s/it] {'loss': 0.0238, 'grad_norm': 0.5343949198722839, 'learning_rate': 3.7866802111688084e-05, 'epoch': 1.74} 17%|█▋ | 7174/41250 [17:19:53<82:00:07, 8.66s/it][2025-04-26 01:17:36,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 01:17:36,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.99 | bwd_microstep: 5748.39 | bwd_inner_microstep: 5711.21 | bwd_allreduce_microstep: 37.14 | step_microstep: 18.32 [2025-04-26 01:17:36,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.99 | bwd: 5748.41 | bwd_inner: 5711.21 | bwd_allreduce: 37.15 | step: 18.32 17%|█▋ | 7175/41250 [17:20:02<82:02:49, 8.67s/it] {'loss': 0.0654, 'grad_norm': 2.2274909019470215, 'learning_rate': 3.7866096381992766e-05, 'epoch': 1.74} 17%|█▋ | 7175/41250 [17:20:02<82:02:49, 8.67s/it][2025-04-26 01:17:45,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-26 01:17:45,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.62 | bwd_microstep: 5731.06 | bwd_inner_microstep: 5696.52 | bwd_allreduce_microstep: 34.50 | step_microstep: 19.62 [2025-04-26 01:17:45,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.62 | bwd: 5731.08 | bwd_inner: 5696.52 | bwd_allreduce: 34.52 | step: 19.62 17%|█▋ | 7176/41250 [17:20:10<82:03:15, 8.67s/it] {'loss': 0.4989, 'grad_norm': 4.474792003631592, 'learning_rate': 3.7865390542156466e-05, 'epoch': 1.74} 17%|█▋ | 7176/41250 [17:20:10<82:03:15, 
8.67s/it][2025-04-26 01:17:54,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-26 01:17:54,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.22 | bwd_microstep: 5891.85 | bwd_inner_microstep: 5654.83 | bwd_allreduce_microstep: 236.98 | step_microstep: 19.13 [2025-04-26 01:17:54,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.22 | bwd: 5891.86 | bwd_inner: 5654.83 | bwd_allreduce: 236.99 | step: 19.13 17%|█▋ | 7177/41250 [17:20:19<82:25:51, 8.71s/it] {'loss': 0.1829, 'grad_norm': 1.8532142639160156, 'learning_rate': 3.786468459218354e-05, 'epoch': 1.74} 17%|█▋ | 7177/41250 [17:20:19<82:25:51, 8.71s/it][2025-04-26 01:18:02,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-26 01:18:02,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.80 | bwd_microstep: 5701.15 | bwd_inner_microstep: 5654.68 | bwd_allreduce_microstep: 46.42 | step_microstep: 18.96 [2025-04-26 01:18:02,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.80 | bwd: 5701.17 | bwd_inner: 5654.68 | bwd_allreduce: 46.44 | step: 18.97 17%|█▋ | 7178/41250 [17:20:28<82:08:38, 8.68s/it] {'loss': 0.0887, 'grad_norm': 2.9803659915924072, 'learning_rate': 3.7863978532078364e-05, 'epoch': 1.74} 17%|█▋ | 7178/41250 [17:20:28<82:08:38, 8.68s/it][2025-04-26 01:18:11,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:18:11,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.62 | bwd_microstep: 5722.50 | bwd_inner_microstep: 5709.62 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.88 [2025-04-26 01:18:11,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.62 | bwd: 5722.52 | bwd_inner: 5709.62 
| bwd_allreduce: 12.85 | step: 18.88 17%|█▋ | 7179/41250 [17:20:36<82:05:32, 8.67s/it] {'loss': 0.0367, 'grad_norm': 0.5166900753974915, 'learning_rate': 3.786327236184527e-05, 'epoch': 1.74} 17%|█▋ | 7179/41250 [17:20:36<82:05:32, 8.67s/it][2025-04-26 01:18:20,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-26 01:18:20,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.71 | bwd_microstep: 5759.12 | bwd_inner_microstep: 5681.51 | bwd_allreduce_microstep: 77.56 | step_microstep: 19.22 [2025-04-26 01:18:20,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.71 | bwd: 5759.13 | bwd_inner: 5681.51 | bwd_allreduce: 77.58 | step: 19.22 17%|█▋ | 7180/41250 [17:20:45<82:07:42, 8.68s/it] {'loss': 0.0981, 'grad_norm': 2.049466133117676, 'learning_rate': 3.7862566081488614e-05, 'epoch': 1.74} 17%|█▋ | 7180/41250 [17:20:45<82:07:42, 8.68s/it][2025-04-26 01:18:28,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:18:28,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.51 | bwd_microstep: 5692.40 | bwd_inner_microstep: 5679.29 | bwd_allreduce_microstep: 13.06 | step_microstep: 18.83 [2025-04-26 01:18:28,770] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.51 | bwd: 5692.42 | bwd_inner: 5679.29 | bwd_allreduce: 13.08 | step: 18.83 17%|█▋ | 7181/41250 [17:20:54<81:59:20, 8.66s/it] {'loss': 0.4496, 'grad_norm': 4.769265651702881, 'learning_rate': 3.786185969101275e-05, 'epoch': 1.74} 17%|█▋ | 7181/41250 [17:20:54<81:59:20, 8.66s/it][2025-04-26 01:18:37,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.05 | optimizer_step: 1.05 [2025-04-26 01:18:37,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.55 | 
bwd_microstep: 5710.75 | bwd_inner_microstep: 5697.85 | bwd_allreduce_microstep: 12.85 | step_microstep: 19.41 [2025-04-26 01:18:37,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.56 | bwd: 5710.76 | bwd_inner: 5697.85 | bwd_allreduce: 12.86 | step: 19.42 17%|█▋ | 7182/41250 [17:21:02<81:56:36, 8.66s/it] {'loss': 0.0863, 'grad_norm': 2.0010087490081787, 'learning_rate': 3.786115319042203e-05, 'epoch': 1.74} 17%|█▋ | 7182/41250 [17:21:02<81:56:36, 8.66s/it][2025-04-26 01:18:46,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:18:46,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.06 | bwd_microstep: 5984.58 | bwd_inner_microstep: 5696.93 | bwd_allreduce_microstep: 287.61 | step_microstep: 18.88 [2025-04-26 01:18:46,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.06 | bwd: 5984.59 | bwd_inner: 5696.93 | bwd_allreduce: 287.63 | step: 18.88 17%|█▋ | 7183/41250 [17:21:11<82:41:05, 8.74s/it] {'loss': 0.0255, 'grad_norm': 0.7470961213111877, 'learning_rate': 3.786044657972082e-05, 'epoch': 1.74} 17%|█▋ | 7183/41250 [17:21:11<82:41:05, 8.74s/it][2025-04-26 01:18:55,014] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:18:55,015] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.50 | bwd_microstep: 5765.53 | bwd_inner_microstep: 5649.79 | bwd_allreduce_microstep: 115.69 | step_microstep: 18.99 [2025-04-26 01:18:55,015] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.50 | bwd: 5765.54 | bwd_inner: 5649.79 | bwd_allreduce: 115.71 | step: 18.99 17%|█▋ | 7184/41250 [17:21:20<82:30:13, 8.72s/it] {'loss': 0.0524, 'grad_norm': 0.9489001631736755, 'learning_rate': 3.785973985891347e-05, 'epoch': 1.74} 17%|█▋ | 7184/41250 [17:21:20<82:30:13, 8.72s/it][2025-04-26 01:19:03,700] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:19:03,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.02 | bwd_microstep: 5754.63 | bwd_inner_microstep: 5694.10 | bwd_allreduce_microstep: 60.49 | step_microstep: 19.19 [2025-04-26 01:19:03,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.02 | bwd: 5754.65 | bwd_inner: 5694.10 | bwd_allreduce: 60.51 | step: 19.20 17%|█▋ | 7185/41250 [17:21:29<82:24:36, 8.71s/it] {'loss': 0.2372, 'grad_norm': 3.233020544052124, 'learning_rate': 3.7859033028004334e-05, 'epoch': 1.74} 17%|█▋ | 7185/41250 [17:21:29<82:24:36, 8.71s/it][2025-04-26 01:19:12,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 01:19:12,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.23 | bwd_microstep: 5705.17 | bwd_inner_microstep: 5692.22 | bwd_allreduce_microstep: 12.91 | step_microstep: 19.15 [2025-04-26 01:19:12,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.23 | bwd: 5705.19 | bwd_inner: 5692.22 | bwd_allreduce: 12.92 | step: 19.15 17%|█▋ | 7186/41250 [17:21:37<82:11:41, 8.69s/it] {'loss': 0.0522, 'grad_norm': 1.0372843742370605, 'learning_rate': 3.785832608699777e-05, 'epoch': 1.74} 17%|█▋ | 7186/41250 [17:21:37<82:11:41, 8.69s/it][2025-04-26 01:19:20,993] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.04 | optimizer_step: 1.19 [2025-04-26 01:19:20,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.35 | bwd_microstep: 5750.22 | bwd_inner_microstep: 5634.60 | bwd_allreduce_microstep: 115.56 | step_microstep: 19.51 [2025-04-26 01:19:20,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.35 | bwd: 5750.23 | bwd_inner: 5634.60 | bwd_allreduce: 115.58 | step: 
19.51 17%|█▋ | 7187/41250 [17:21:46<82:06:37, 8.68s/it] {'loss': 0.0318, 'grad_norm': 1.045538306236267, 'learning_rate': 3.785761903589814e-05, 'epoch': 1.74} 17%|█▋ | 7187/41250 [17:21:46<82:06:37, 8.68s/it][2025-04-26 01:19:29,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:19:29,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.31 | bwd_microstep: 5750.48 | bwd_inner_microstep: 5646.92 | bwd_allreduce_microstep: 103.51 | step_microstep: 18.81 [2025-04-26 01:19:29,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.31 | bwd: 5750.50 | bwd_inner: 5646.92 | bwd_allreduce: 103.53 | step: 18.81 17%|█▋ | 7188/41250 [17:21:54<82:03:13, 8.67s/it] {'loss': 0.0955, 'grad_norm': 1.7443342208862305, 'learning_rate': 3.78569118747098e-05, 'epoch': 1.74} 17%|█▋ | 7188/41250 [17:21:54<82:03:13, 8.67s/it][2025-04-26 01:19:38,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.09 [2025-04-26 01:19:38,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.49 | bwd_microstep: 5744.11 | bwd_inner_microstep: 5679.43 | bwd_allreduce_microstep: 64.63 | step_microstep: 19.35 [2025-04-26 01:19:38,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.49 | bwd: 5744.13 | bwd_inner: 5679.43 | bwd_allreduce: 64.65 | step: 19.36 17%|█▋ | 7189/41250 [17:22:03<82:03:25, 8.67s/it] {'loss': 0.0383, 'grad_norm': 2.1026296615600586, 'learning_rate': 3.7856204603437116e-05, 'epoch': 1.74} 17%|█▋ | 7189/41250 [17:22:03<82:03:25, 8.67s/it][2025-04-26 01:19:46,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:19:46,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.18 | bwd_microstep: 5758.53 | 
bwd_inner_microstep: 5646.66 | bwd_allreduce_microstep: 111.83 | step_microstep: 18.70 [2025-04-26 01:19:46,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.18 | bwd: 5758.54 | bwd_inner: 5646.66 | bwd_allreduce: 111.84 | step: 18.70 17%|█▋ | 7190/41250 [17:22:12<82:02:20, 8.67s/it] {'loss': 0.0523, 'grad_norm': 4.185266971588135, 'learning_rate': 3.785549722208444e-05, 'epoch': 1.74} 17%|█▋ | 7190/41250 [17:22:12<82:02:20, 8.67s/it][2025-04-26 01:19:55,588] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:19:55,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.32 | bwd_microstep: 5688.49 | bwd_inner_microstep: 5634.41 | bwd_allreduce_microstep: 54.03 | step_microstep: 18.73 [2025-04-26 01:19:55,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.32 | bwd: 5688.50 | bwd_inner: 5634.41 | bwd_allreduce: 54.05 | step: 18.73 17%|█▋ | 7191/41250 [17:22:20<81:48:41, 8.65s/it] {'loss': 0.0071, 'grad_norm': 0.16819573938846588, 'learning_rate': 3.7854789730656135e-05, 'epoch': 1.74} 17%|█▋ | 7191/41250 [17:22:20<81:48:41, 8.65s/it][2025-04-26 01:20:04,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:20:04,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.94 | bwd_microstep: 5775.56 | bwd_inner_microstep: 5641.54 | bwd_allreduce_microstep: 133.97 | step_microstep: 18.76 [2025-04-26 01:20:04,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.94 | bwd: 5775.57 | bwd_inner: 5641.54 | bwd_allreduce: 133.99 | step: 18.76 17%|█▋ | 7192/41250 [17:22:29<81:54:47, 8.66s/it] {'loss': 0.0635, 'grad_norm': 2.655128240585327, 'learning_rate': 3.7854082129156564e-05, 'epoch': 1.74} 17%|█▋ | 7192/41250 [17:22:29<81:54:47, 8.66s/it][2025-04-26 01:20:12,945] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.99 | optimizer_step: 1.13 [2025-04-26 01:20:12,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.04 | bwd_microstep: 5766.97 | bwd_inner_microstep: 5646.63 | bwd_allreduce_microstep: 120.29 | step_microstep: 18.90 [2025-04-26 01:20:12,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.04 | bwd: 5766.98 | bwd_inner: 5646.63 | bwd_allreduce: 120.31 | step: 18.90 17%|█▋ | 7193/41250 [17:22:38<81:57:18, 8.66s/it] {'loss': 0.1161, 'grad_norm': 1.6586754322052002, 'learning_rate': 3.7853374417590084e-05, 'epoch': 1.74} 17%|█▋ | 7193/41250 [17:22:38<81:57:18, 8.66s/it][2025-04-26 01:20:21,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-26 01:20:21,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.51 | bwd_microstep: 5731.88 | bwd_inner_microstep: 5690.98 | bwd_allreduce_microstep: 40.85 | step_microstep: 18.44 [2025-04-26 01:20:21,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.51 | bwd: 5731.89 | bwd_inner: 5690.98 | bwd_allreduce: 40.87 | step: 18.44 17%|█▋ | 7194/41250 [17:22:46<81:58:52, 8.67s/it] {'loss': 0.0223, 'grad_norm': 0.6152781844139099, 'learning_rate': 3.785266659596107e-05, 'epoch': 1.74} 17%|█▋ | 7194/41250 [17:22:46<81:58:52, 8.67s/it][2025-04-26 01:20:30,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-26 01:20:30,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.12 | bwd_microstep: 5695.93 | bwd_inner_microstep: 5683.23 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.59 [2025-04-26 01:20:30,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.12 | bwd: 5695.94 | bwd_inner: 5683.23 | bwd_allreduce: 12.67 | step: 18.59 
17%|█▋ | 7195/41250 [17:22:55<81:51:30, 8.65s/it] {'loss': 0.3403, 'grad_norm': 1.6522778272628784, 'learning_rate': 3.785195866427387e-05, 'epoch': 1.74} 17%|█▋ | 7195/41250 [17:22:55<81:51:30, 8.65s/it][2025-04-26 01:20:38,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:20:38,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.68 | bwd_microstep: 5766.97 | bwd_inner_microstep: 5655.87 | bwd_allreduce_microstep: 111.05 | step_microstep: 18.98 [2025-04-26 01:20:38,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.68 | bwd: 5766.98 | bwd_inner: 5655.87 | bwd_allreduce: 111.07 | step: 18.98 17%|█▋ | 7196/41250 [17:23:04<81:55:12, 8.66s/it] {'loss': 0.0354, 'grad_norm': 0.6880890130996704, 'learning_rate': 3.785125062253286e-05, 'epoch': 1.74} 17%|█▋ | 7196/41250 [17:23:04<81:55:12, 8.66s/it][2025-04-26 01:20:47,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:20:47,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.60 | bwd_microstep: 5766.35 | bwd_inner_microstep: 5710.02 | bwd_allreduce_microstep: 56.28 | step_microstep: 18.26 [2025-04-26 01:20:47,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.61 | bwd: 5766.36 | bwd_inner: 5710.02 | bwd_allreduce: 56.30 | step: 18.27 17%|█▋ | 7197/41250 [17:23:12<82:01:26, 8.67s/it] {'loss': 0.3677, 'grad_norm': 2.69596529006958, 'learning_rate': 3.7850542470742404e-05, 'epoch': 1.74} 17%|█▋ | 7197/41250 [17:23:12<82:01:26, 8.67s/it][2025-04-26 01:20:56,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:20:56,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.05 | bwd_microstep: 5689.81 | bwd_inner_microstep: 
5653.12 | bwd_allreduce_microstep: 36.63 | step_microstep: 18.73 [2025-04-26 01:20:56,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.05 | bwd: 5689.82 | bwd_inner: 5653.12 | bwd_allreduce: 36.66 | step: 18.74 17%|█▋ | 7198/41250 [17:23:21<81:48:42, 8.65s/it] {'loss': 0.0106, 'grad_norm': 0.13082551956176758, 'learning_rate': 3.784983420890686e-05, 'epoch': 1.74} 17%|█▋ | 7198/41250 [17:23:21<81:48:42, 8.65s/it][2025-04-26 01:21:04,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.33 | optimizer_step: 1.05 [2025-04-26 01:21:04,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5764.57 | bwd_inner_microstep: 5701.68 | bwd_allreduce_microstep: 62.83 | step_microstep: 20.26 [2025-04-26 01:21:04,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.78 | bwd: 5764.59 | bwd_inner: 5701.68 | bwd_allreduce: 62.86 | step: 20.26 17%|█▋ | 7199/41250 [17:23:30<81:56:37, 8.66s/it] {'loss': 0.1543, 'grad_norm': 2.916114568710327, 'learning_rate': 3.78491258370306e-05, 'epoch': 1.75} 17%|█▋ | 7199/41250 [17:23:30<81:56:37, 8.66s/it][2025-04-26 01:21:13,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.05 | optimizer_step: 0.91 [2025-04-26 01:21:13,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.07 | bwd_microstep: 5793.04 | bwd_inner_microstep: 5653.63 | bwd_allreduce_microstep: 139.36 | step_microstep: 19.15 [2025-04-26 01:21:13,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.07 | bwd: 5793.06 | bwd_inner: 5653.63 | bwd_allreduce: 139.38 | step: 19.15 17%|█▋ | 7200/41250 [17:23:38<82:02:33, 8.67s/it] {'loss': 0.1669, 'grad_norm': 1.2921584844589233, 'learning_rate': 3.784841735511799e-05, 'epoch': 1.75} 17%|█▋ | 7200/41250 [17:23:38<82:02:33, 8.67s/it][2025-04-26 01:21:22,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:21:22,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.86 | bwd_microstep: 5706.85 | bwd_inner_microstep: 5693.87 | bwd_allreduce_microstep: 12.94 | step_microstep: 18.90 [2025-04-26 01:21:22,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.86 | bwd: 5706.86 | bwd_inner: 5693.87 | bwd_allreduce: 12.95 | step: 18.90 17%|█▋ | 7201/41250 [17:23:47<81:56:52, 8.66s/it] {'loss': 0.1027, 'grad_norm': 1.0119730234146118, 'learning_rate': 3.78477087631734e-05, 'epoch': 1.75} 17%|█▋ | 7201/41250 [17:23:47<81:56:52, 8.66s/it][2025-04-26 01:21:30,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 01:21:30,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.66 | bwd_microstep: 5698.72 | bwd_inner_microstep: 5657.57 | bwd_allreduce_microstep: 41.10 | step_microstep: 18.62 [2025-04-26 01:21:30,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.66 | bwd: 5698.73 | bwd_inner: 5657.57 | bwd_allreduce: 41.12 | step: 18.62 17%|█▋ | 7202/41250 [17:23:56<81:47:02, 8.65s/it] {'loss': 0.1127, 'grad_norm': 2.1186676025390625, 'learning_rate': 3.784700006120119e-05, 'epoch': 1.75} 17%|█▋ | 7202/41250 [17:23:56<81:47:02, 8.65s/it][2025-04-26 01:21:39,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:21:39,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.75 | bwd_microstep: 5711.84 | bwd_inner_microstep: 5699.03 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.87 [2025-04-26 01:21:39,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.75 | bwd: 5711.86 | bwd_inner: 5699.03 | bwd_allreduce: 12.78 | step: 18.87 17%|█▋ | 7203/41250 [17:24:04<81:46:57, 8.65s/it] 
{'loss': 0.1959, 'grad_norm': 2.2747037410736084, 'learning_rate': 3.784629124920574e-05, 'epoch': 1.75} 17%|█▋ | 7203/41250 [17:24:04<81:46:57, 8.65s/it][2025-04-26 01:21:48,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:21:48,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.64 | bwd_microstep: 5763.82 | bwd_inner_microstep: 5663.35 | bwd_allreduce_microstep: 100.42 | step_microstep: 18.88 [2025-04-26 01:21:48,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.64 | bwd: 5763.84 | bwd_inner: 5663.35 | bwd_allreduce: 100.44 | step: 18.89 17%|█▋ | 7204/41250 [17:24:13<81:52:49, 8.66s/it] {'loss': 0.1535, 'grad_norm': 1.6289682388305664, 'learning_rate': 3.784558232719141e-05, 'epoch': 1.75} 17%|█▋ | 7204/41250 [17:24:13<81:52:49, 8.66s/it][2025-04-26 01:21:56,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 01:21:56,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.87 | bwd_microstep: 5712.55 | bwd_inner_microstep: 5699.82 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.01 [2025-04-26 01:21:56,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.87 | bwd: 5712.57 | bwd_inner: 5699.82 | bwd_allreduce: 12.71 | step: 19.01 17%|█▋ | 7205/41250 [17:24:22<81:50:14, 8.65s/it] {'loss': 0.0513, 'grad_norm': 1.6720925569534302, 'learning_rate': 3.784487329516258e-05, 'epoch': 1.75} 17%|█▋ | 7205/41250 [17:24:22<81:50:14, 8.65s/it][2025-04-26 01:22:05,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:22:05,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.62 | bwd_microstep: 6010.71 | bwd_inner_microstep: 5698.51 | bwd_allreduce_microstep: 312.16 | 
step_microstep: 18.88 [2025-04-26 01:22:05,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.62 | bwd: 6010.73 | bwd_inner: 5698.51 | bwd_allreduce: 312.18 | step: 18.88 17%|█▋ | 7206/41250 [17:24:31<82:39:30, 8.74s/it] {'loss': 0.0767, 'grad_norm': 1.4095935821533203, 'learning_rate': 3.7844164153123615e-05, 'epoch': 1.75} 17%|█▋ | 7206/41250 [17:24:31<82:39:30, 8.74s/it][2025-04-26 01:22:14,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-26 01:22:14,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.66 | bwd_microstep: 5714.74 | bwd_inner_microstep: 5701.71 | bwd_allreduce_microstep: 12.98 | step_microstep: 18.99 [2025-04-26 01:22:14,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.66 | bwd: 5714.76 | bwd_inner: 5701.71 | bwd_allreduce: 13.00 | step: 18.99 17%|█▋ | 7207/41250 [17:24:39<82:26:23, 8.72s/it] {'loss': 0.3363, 'grad_norm': 2.2652230262756348, 'learning_rate': 3.7843454901078886e-05, 'epoch': 1.75} 17%|█▋ | 7207/41250 [17:24:39<82:26:23, 8.72s/it][2025-04-26 01:22:23,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:22:23,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.74 | bwd_microstep: 5918.73 | bwd_inner_microstep: 5708.33 | bwd_allreduce_microstep: 210.36 | step_microstep: 18.47 [2025-04-26 01:22:23,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.74 | bwd: 5918.75 | bwd_inner: 5708.33 | bwd_allreduce: 210.37 | step: 18.47 17%|█▋ | 7208/41250 [17:24:48<82:49:33, 8.76s/it] {'loss': 0.1698, 'grad_norm': 2.7418582439422607, 'learning_rate': 3.784274553903277e-05, 'epoch': 1.75} 17%|█▋ | 7208/41250 [17:24:48<82:49:33, 8.76s/it][2025-04-26 01:22:31,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | 
optimizer_gradients: 1.05 | optimizer_step: 0.91 [2025-04-26 01:22:31,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.87 | bwd_microstep: 5719.54 | bwd_inner_microstep: 5658.33 | bwd_allreduce_microstep: 61.16 | step_microstep: 18.67 [2025-04-26 01:22:31,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.87 | bwd: 5719.56 | bwd_inner: 5658.33 | bwd_allreduce: 61.18 | step: 18.67 17%|█▋ | 7209/41250 [17:24:57<82:28:56, 8.72s/it] {'loss': 0.0901, 'grad_norm': 1.388259768486023, 'learning_rate': 3.784203606698963e-05, 'epoch': 1.75} 17%|█▋ | 7209/41250 [17:24:57<82:28:56, 8.72s/it][2025-04-26 01:22:40,636] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-26 01:22:40,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.47 | bwd_microstep: 5751.53 | bwd_inner_microstep: 5694.33 | bwd_allreduce_microstep: 57.16 | step_microstep: 19.12 [2025-04-26 01:22:40,637] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.47 | bwd: 5751.54 | bwd_inner: 5694.33 | bwd_allreduce: 57.18 | step: 19.12 17%|█▋ | 7210/41250 [17:25:05<82:25:19, 8.72s/it] {'loss': 0.1646, 'grad_norm': 1.9459211826324463, 'learning_rate': 3.7841326484953865e-05, 'epoch': 1.75} 17%|█▋ | 7210/41250 [17:25:05<82:25:19, 8.72s/it][2025-04-26 01:22:49,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:22:49,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.10 | bwd_microstep: 5884.15 | bwd_inner_microstep: 5700.56 | bwd_allreduce_microstep: 183.56 | step_microstep: 18.73 [2025-04-26 01:22:49,466] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.10 | bwd: 5884.17 | bwd_inner: 5700.56 | bwd_allreduce: 183.57 | step: 18.73 17%|█▋ | 7211/41250 [17:25:14<82:44:23, 8.75s/it] {'loss': 0.0433, 'grad_norm': 
1.0709031820297241, 'learning_rate': 3.784061679292981e-05, 'epoch': 1.75} 17%|█▋ | 7211/41250 [17:25:14<82:44:23, 8.75s/it][2025-04-26 01:22:58,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:22:58,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.26 | bwd_microstep: 5698.89 | bwd_inner_microstep: 5654.88 | bwd_allreduce_microstep: 43.96 | step_microstep: 18.58 [2025-04-26 01:22:58,080] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.26 | bwd: 5698.90 | bwd_inner: 5654.89 | bwd_allreduce: 43.98 | step: 18.58 17%|█▋ | 7212/41250 [17:25:23<82:20:57, 8.71s/it] {'loss': 0.0675, 'grad_norm': 1.901116132736206, 'learning_rate': 3.7839906990921876e-05, 'epoch': 1.75} 17%|█▋ | 7212/41250 [17:25:23<82:20:57, 8.71s/it][2025-04-26 01:23:06,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:23:06,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.78 | bwd_microstep: 5683.62 | bwd_inner_microstep: 5653.95 | bwd_allreduce_microstep: 29.63 | step_microstep: 18.87 [2025-04-26 01:23:06,686] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.78 | bwd: 5683.64 | bwd_inner: 5653.95 | bwd_allreduce: 29.65 | step: 18.87 17%|█▋ | 7213/41250 [17:25:32<82:02:57, 8.68s/it] {'loss': 0.1643, 'grad_norm': 3.0812723636627197, 'learning_rate': 3.783919707893442e-05, 'epoch': 1.75} 17%|█▋ | 7213/41250 [17:25:32<82:02:57, 8.68s/it][2025-04-26 01:23:15,312] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 01:23:15,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.94 | bwd_microstep: 5699.17 | bwd_inner_microstep: 5686.56 | bwd_allreduce_microstep: 12.56 | step_microstep: 18.70 [2025-04-26 
01:23:15,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.94 | bwd: 5699.19 | bwd_inner: 5686.56 | bwd_allreduce: 12.58 | step: 18.70 17%|█▋ | 7214/41250 [17:25:40<81:54:06, 8.66s/it] {'loss': 0.1509, 'grad_norm': 1.9147319793701172, 'learning_rate': 3.783848705697182e-05, 'epoch': 1.75} 17%|█▋ | 7214/41250 [17:25:40<81:54:06, 8.66s/it][2025-04-26 01:23:23,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:23:23,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.63 | bwd_microstep: 5710.94 | bwd_inner_microstep: 5643.23 | bwd_allreduce_microstep: 67.66 | step_microstep: 18.68 [2025-04-26 01:23:23,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.63 | bwd: 5710.95 | bwd_inner: 5643.23 | bwd_allreduce: 67.68 | step: 18.68 17%|█▋ | 7215/41250 [17:25:49<81:47:33, 8.65s/it] {'loss': 0.2554, 'grad_norm': 4.168917179107666, 'learning_rate': 3.7837776925038454e-05, 'epoch': 1.75} 17%|█▋ | 7215/41250 [17:25:49<81:47:33, 8.65s/it][2025-04-26 01:23:32,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-26 01:23:32,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.90 | bwd_microstep: 5686.49 | bwd_inner_microstep: 5646.03 | bwd_allreduce_microstep: 40.41 | step_microstep: 18.85 [2025-04-26 01:23:32,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.90 | bwd: 5686.51 | bwd_inner: 5646.03 | bwd_allreduce: 40.43 | step: 18.85 17%|█▋ | 7216/41250 [17:25:57<81:38:05, 8.64s/it] {'loss': 0.0326, 'grad_norm': 0.5557559728622437, 'learning_rate': 3.783706668313871e-05, 'epoch': 1.75} 17%|█▋ | 7216/41250 [17:25:57<81:38:05, 8.64s/it][2025-04-26 01:23:41,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-26 01:23:41,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.56 | bwd_microstep: 5704.26 | bwd_inner_microstep: 5649.60 | bwd_allreduce_microstep: 54.62 | step_microstep: 18.62 [2025-04-26 01:23:41,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.56 | bwd: 5704.28 | bwd_inner: 5649.60 | bwd_allreduce: 54.64 | step: 18.63 17%|█▋ | 7217/41250 [17:26:06<81:36:21, 8.63s/it] {'loss': 0.2933, 'grad_norm': 3.2903192043304443, 'learning_rate': 3.7836356331276946e-05, 'epoch': 1.75} 17%|█▋ | 7217/41250 [17:26:06<81:36:21, 8.63s/it][2025-04-26 01:23:49,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:23:49,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.19 | bwd_microstep: 5685.78 | bwd_inner_microstep: 5643.05 | bwd_allreduce_microstep: 42.69 | step_microstep: 18.49 [2025-04-26 01:23:49,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.19 | bwd: 5685.80 | bwd_inner: 5643.05 | bwd_allreduce: 42.70 | step: 18.49 17%|█▋ | 7218/41250 [17:26:15<81:30:47, 8.62s/it] {'loss': 0.044, 'grad_norm': 0.8091095089912415, 'learning_rate': 3.7835645869457566e-05, 'epoch': 1.75} 17%|█▋ | 7218/41250 [17:26:15<81:30:47, 8.62s/it][2025-04-26 01:23:58,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 01:23:58,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.63 | bwd_microstep: 5815.04 | bwd_inner_microstep: 5777.74 | bwd_allreduce_microstep: 37.25 | step_microstep: 18.56 [2025-04-26 01:23:58,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.63 | bwd: 5815.05 | bwd_inner: 5777.74 | bwd_allreduce: 37.27 | step: 18.56 18%|█▊ | 7219/41250 [17:26:23<81:58:27, 8.67s/it] {'loss': 0.1313, 'grad_norm': 1.773522973060608, 'learning_rate': 
3.783493529768493e-05, 'epoch': 1.75} 18%|█▊ | 7219/41250 [17:26:23<81:58:27, 8.67s/it][2025-04-26 01:24:07,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 1.12 [2025-04-26 01:24:07,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.28 | bwd_microstep: 5760.81 | bwd_inner_microstep: 5692.63 | bwd_allreduce_microstep: 68.13 | step_microstep: 19.25 [2025-04-26 01:24:07,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.28 | bwd: 5760.82 | bwd_inner: 5692.63 | bwd_allreduce: 68.15 | step: 19.25 18%|█▊ | 7220/41250 [17:26:32<82:01:58, 8.68s/it] {'loss': 0.2666, 'grad_norm': 2.0109429359436035, 'learning_rate': 3.7834224615963427e-05, 'epoch': 1.75} 18%|█▊ | 7220/41250 [17:26:32<82:01:58, 8.68s/it][2025-04-26 01:24:15,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:24:15,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.36 | bwd_microstep: 5692.49 | bwd_inner_microstep: 5643.76 | bwd_allreduce_microstep: 48.69 | step_microstep: 18.76 [2025-04-26 01:24:15,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.36 | bwd: 5692.50 | bwd_inner: 5643.76 | bwd_allreduce: 48.70 | step: 18.76 18%|█▊ | 7221/41250 [17:26:41<81:48:29, 8.65s/it] {'loss': 0.1769, 'grad_norm': 1.8771657943725586, 'learning_rate': 3.7833513824297436e-05, 'epoch': 1.75} 18%|█▊ | 7221/41250 [17:26:41<81:48:29, 8.65s/it][2025-04-26 01:24:24,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:24:24,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.73 | bwd_microstep: 5765.01 | bwd_inner_microstep: 5677.33 | bwd_allreduce_microstep: 87.63 | step_microstep: 18.65 [2025-04-26 01:24:24,530] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.73 | bwd: 5765.02 | bwd_inner: 5677.33 | bwd_allreduce: 87.65 | step: 18.66 18%|█▊ | 7222/41250 [17:26:49<81:54:20, 8.67s/it] {'loss': 0.0272, 'grad_norm': 0.4410299062728882, 'learning_rate': 3.7832802922691347e-05, 'epoch': 1.75} 18%|█▊ | 7222/41250 [17:26:49<81:54:20, 8.67s/it][2025-04-26 01:24:33,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 01:24:33,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.84 | bwd_microstep: 5700.69 | bwd_inner_microstep: 5687.96 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.95 [2025-04-26 01:24:33,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.84 | bwd: 5700.71 | bwd_inner: 5687.96 | bwd_allreduce: 12.71 | step: 18.95 18%|█▊ | 7223/41250 [17:26:58<81:48:46, 8.66s/it] {'loss': 0.1543, 'grad_norm': 0.6407178044319153, 'learning_rate': 3.783209191114953e-05, 'epoch': 1.75} 18%|█▊ | 7223/41250 [17:26:58<81:48:46, 8.66s/it][2025-04-26 01:24:41,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:24:41,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.95 | bwd_microstep: 5706.12 | bwd_inner_microstep: 5693.26 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.69 [2025-04-26 01:24:41,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.95 | bwd: 5706.13 | bwd_inner: 5693.26 | bwd_allreduce: 12.83 | step: 18.69 18%|█▊ | 7224/41250 [17:27:07<81:47:07, 8.65s/it] {'loss': 0.1339, 'grad_norm': 1.3068640232086182, 'learning_rate': 3.783138078967637e-05, 'epoch': 1.75} 18%|█▊ | 7224/41250 [17:27:07<81:47:07, 8.65s/it][2025-04-26 01:24:50,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-26 
01:24:50,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.34 | bwd_microstep: 5747.18 | bwd_inner_microstep: 5652.38 | bwd_allreduce_microstep: 94.76 | step_microstep: 18.84 [2025-04-26 01:24:50,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.34 | bwd: 5747.19 | bwd_inner: 5652.38 | bwd_allreduce: 94.78 | step: 18.84 18%|█▊ | 7225/41250 [17:27:15<81:48:41, 8.66s/it] {'loss': 0.2366, 'grad_norm': 6.417294979095459, 'learning_rate': 3.783066955827626e-05, 'epoch': 1.75} 18%|█▊ | 7225/41250 [17:27:15<81:48:41, 8.66s/it][2025-04-26 01:24:59,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:24:59,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.97 | bwd_microstep: 5742.08 | bwd_inner_microstep: 5640.08 | bwd_allreduce_microstep: 101.95 | step_microstep: 18.53 [2025-04-26 01:24:59,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.97 | bwd: 5742.09 | bwd_inner: 5640.08 | bwd_allreduce: 101.97 | step: 18.53 18%|█▊ | 7226/41250 [17:27:24<81:47:15, 8.65s/it] {'loss': 0.1425, 'grad_norm': 3.5140020847320557, 'learning_rate': 3.782995821695358e-05, 'epoch': 1.75} 18%|█▊ | 7226/41250 [17:27:24<81:47:15, 8.65s/it][2025-04-26 01:25:07,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.14 | optimizer_step: 1.05 [2025-04-26 01:25:07,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.08 | bwd_microstep: 5751.60 | bwd_inner_microstep: 5683.59 | bwd_allreduce_microstep: 67.94 | step_microstep: 20.32 [2025-04-26 01:25:07,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.08 | bwd: 5751.61 | bwd_inner: 5683.59 | bwd_allreduce: 67.97 | step: 20.32 18%|█▊ | 7227/41250 [17:27:33<81:52:27, 8.66s/it] {'loss': 0.1373, 'grad_norm': 2.6805012226104736, 'learning_rate': 3.7829246765712715e-05, 
'epoch': 1.75} 18%|█▊ | 7227/41250 [17:27:33<81:52:27, 8.66s/it][2025-04-26 01:25:16,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-26 01:25:16,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.05 | bwd_microstep: 5697.85 | bwd_inner_microstep: 5638.96 | bwd_allreduce_microstep: 58.84 | step_microstep: 18.56 [2025-04-26 01:25:16,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.05 | bwd: 5697.86 | bwd_inner: 5638.96 | bwd_allreduce: 58.86 | step: 18.56 18%|█▊ | 7228/41250 [17:27:41<81:41:05, 8.64s/it] {'loss': 0.1674, 'grad_norm': 1.8439584970474243, 'learning_rate': 3.7828535204558045e-05, 'epoch': 1.75} 18%|█▊ | 7228/41250 [17:27:41<81:41:05, 8.64s/it][2025-04-26 01:25:25,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:25:25,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.81 | bwd_microstep: 5831.44 | bwd_inner_microstep: 5706.68 | bwd_allreduce_microstep: 124.71 | step_microstep: 18.65 [2025-04-26 01:25:25,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.81 | bwd: 5831.45 | bwd_inner: 5706.68 | bwd_allreduce: 124.73 | step: 18.66 18%|█▊ | 7229/41250 [17:27:50<82:01:30, 8.68s/it] {'loss': 0.0253, 'grad_norm': 0.32203108072280884, 'learning_rate': 3.782782353349397e-05, 'epoch': 1.75} 18%|█▊ | 7229/41250 [17:27:50<82:01:30, 8.68s/it][2025-04-26 01:25:33,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 01:25:33,767] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.78 | bwd_microstep: 5695.96 | bwd_inner_microstep: 5646.78 | bwd_allreduce_microstep: 49.13 | step_microstep: 18.92 [2025-04-26 01:25:33,767] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2820.78 | bwd: 5695.98 | bwd_inner: 5646.78 | bwd_allreduce: 49.15 | step: 18.92 18%|█▊ | 7230/41250 [17:27:59<81:47:40, 8.66s/it] {'loss': 0.2065, 'grad_norm': 2.0850822925567627, 'learning_rate': 3.7827111752524866e-05, 'epoch': 1.75} 18%|█▊ | 7230/41250 [17:27:59<81:47:40, 8.66s/it][2025-04-26 01:25:42,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-26 01:25:42,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.04 | bwd_microstep: 5696.27 | bwd_inner_microstep: 5639.38 | bwd_allreduce_microstep: 56.84 | step_microstep: 18.71 [2025-04-26 01:25:42,367] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5696.28 | bwd_inner: 5639.38 | bwd_allreduce: 56.86 | step: 18.72 18%|█▊ | 7231/41250 [17:28:07<81:37:58, 8.64s/it] {'loss': 0.1069, 'grad_norm': 2.3141162395477295, 'learning_rate': 3.782639986165512e-05, 'epoch': 1.75} 18%|█▊ | 7231/41250 [17:28:07<81:37:58, 8.64s/it][2025-04-26 01:25:51,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:25:51,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.31 | bwd_microstep: 5853.20 | bwd_inner_microstep: 5699.90 | bwd_allreduce_microstep: 153.26 | step_microstep: 18.90 [2025-04-26 01:25:51,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.31 | bwd: 5853.21 | bwd_inner: 5699.90 | bwd_allreduce: 153.28 | step: 18.91 18%|█▊ | 7232/41250 [17:28:16<82:02:58, 8.68s/it] {'loss': 0.1514, 'grad_norm': 2.2008824348449707, 'learning_rate': 3.782568786088913e-05, 'epoch': 1.75} 18%|█▊ | 7232/41250 [17:28:16<82:02:58, 8.68s/it][2025-04-26 01:25:59,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:25:59,829] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.52 | bwd_microstep: 5764.36 | bwd_inner_microstep: 5648.98 | bwd_allreduce_microstep: 115.34 | step_microstep: 18.81 [2025-04-26 01:25:59,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.52 | bwd: 5764.38 | bwd_inner: 5648.98 | bwd_allreduce: 115.36 | step: 18.81 18%|█▊ | 7233/41250 [17:28:25<82:01:38, 8.68s/it] {'loss': 0.2183, 'grad_norm': 3.8612759113311768, 'learning_rate': 3.782497575023128e-05, 'epoch': 1.75} 18%|█▊ | 7233/41250 [17:28:25<82:01:38, 8.68s/it][2025-04-26 01:26:08,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.34 | optimizer_step: 1.05 [2025-04-26 01:26:08,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.27 | bwd_microstep: 5758.64 | bwd_inner_microstep: 5628.04 | bwd_allreduce_microstep: 130.53 | step_microstep: 20.30 [2025-04-26 01:26:08,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.27 | bwd: 5758.65 | bwd_inner: 5628.04 | bwd_allreduce: 130.57 | step: 20.30 18%|█▊ | 7234/41250 [17:28:33<81:58:15, 8.68s/it] {'loss': 0.1779, 'grad_norm': 3.2248504161834717, 'learning_rate': 3.782426352968596e-05, 'epoch': 1.75} 18%|█▊ | 7234/41250 [17:28:33<81:58:15, 8.68s/it][2025-04-26 01:26:17,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 01:26:17,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.73 | bwd_microstep: 5721.75 | bwd_inner_microstep: 5673.22 | bwd_allreduce_microstep: 48.48 | step_microstep: 18.79 [2025-04-26 01:26:17,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.73 | bwd: 5721.76 | bwd_inner: 5673.22 | bwd_allreduce: 48.50 | step: 18.79 18%|█▊ | 7235/41250 [17:28:42<81:53:48, 8.67s/it] {'loss': 0.1313, 'grad_norm': 1.7361140251159668, 'learning_rate': 3.7823551199257556e-05, 'epoch': 1.75} 
18%|█▊ | 7235/41250 [17:28:42<81:53:48, 8.67s/it][2025-04-26 01:26:25,816] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-26 01:26:25,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.06 | bwd_microstep: 5763.44 | bwd_inner_microstep: 5657.89 | bwd_allreduce_microstep: 105.50 | step_microstep: 18.99 [2025-04-26 01:26:25,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.06 | bwd: 5763.45 | bwd_inner: 5657.89 | bwd_allreduce: 105.52 | step: 18.99 18%|█▊ | 7236/41250 [17:28:51<81:55:03, 8.67s/it] {'loss': 0.239, 'grad_norm': 2.5363869667053223, 'learning_rate': 3.782283875895047e-05, 'epoch': 1.75} 18%|█▊ | 7236/41250 [17:28:51<81:55:03, 8.67s/it][2025-04-26 01:26:34,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:26:34,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.66 | bwd_microstep: 5689.04 | bwd_inner_microstep: 5651.74 | bwd_allreduce_microstep: 37.25 | step_microstep: 18.53 [2025-04-26 01:26:34,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.66 | bwd: 5689.05 | bwd_inner: 5651.74 | bwd_allreduce: 37.27 | step: 18.54 18%|█▊ | 7237/41250 [17:28:59<81:42:19, 8.65s/it] {'loss': 0.0765, 'grad_norm': 1.0135115385055542, 'learning_rate': 3.782212620876909e-05, 'epoch': 1.75} 18%|█▊ | 7237/41250 [17:28:59<81:42:19, 8.65s/it][2025-04-26 01:26:43,029] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 01:26:43,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.82 | bwd_microstep: 5693.88 | bwd_inner_microstep: 5681.34 | bwd_allreduce_microstep: 12.49 | step_microstep: 18.63 [2025-04-26 01:26:43,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 
2839.82 | bwd: 5693.89 | bwd_inner: 5681.34 | bwd_allreduce: 12.51 | step: 18.64 18%|█▊ | 7238/41250 [17:29:08<81:36:51, 8.64s/it] {'loss': 0.1315, 'grad_norm': 2.0317726135253906, 'learning_rate': 3.78214135487178e-05, 'epoch': 1.75} 18%|█▊ | 7238/41250 [17:29:08<81:36:51, 8.64s/it][2025-04-26 01:26:51,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:26:51,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.77 | bwd_microstep: 5757.98 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 109.48 | step_microstep: 18.51 [2025-04-26 01:26:51,697] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.77 | bwd: 5757.99 | bwd_inner: 5648.45 | bwd_allreduce: 109.50 | step: 18.51 18%|█▊ | 7239/41250 [17:29:17<81:41:32, 8.65s/it] {'loss': 0.0919, 'grad_norm': 0.5955415964126587, 'learning_rate': 3.782070077880101e-05, 'epoch': 1.75} 18%|█▊ | 7239/41250 [17:29:17<81:41:32, 8.65s/it][2025-04-26 01:27:00,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:27:00,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.75 | bwd_microstep: 5712.98 | bwd_inner_microstep: 5700.54 | bwd_allreduce_microstep: 12.40 | step_microstep: 18.64 [2025-04-26 01:27:00,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.75 | bwd: 5713.00 | bwd_inner: 5700.54 | bwd_allreduce: 12.42 | step: 18.64 18%|█▊ | 7240/41250 [17:29:25<81:39:49, 8.64s/it] {'loss': 0.2338, 'grad_norm': 2.804048776626587, 'learning_rate': 3.7819987899023096e-05, 'epoch': 1.76} 18%|█▊ | 7240/41250 [17:29:25<81:39:49, 8.64s/it][2025-04-26 01:27:08,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:27:08,994] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2823.14 | bwd_microstep: 5754.95 | bwd_inner_microstep: 5646.53 | bwd_allreduce_microstep: 108.38 | step_microstep: 18.42 [2025-04-26 01:27:08,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.14 | bwd: 5754.97 | bwd_inner: 5646.53 | bwd_allreduce: 108.40 | step: 18.42 18%|█▊ | 7241/41250 [17:29:34<81:42:19, 8.65s/it] {'loss': 0.219, 'grad_norm': 2.261017322540283, 'learning_rate': 3.781927490938847e-05, 'epoch': 1.76} 18%|█▊ | 7241/41250 [17:29:34<81:42:19, 8.65s/it][2025-04-26 01:27:17,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:27:17,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.69 | bwd_microstep: 5698.10 | bwd_inner_microstep: 5685.49 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.67 [2025-04-26 01:27:17,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.69 | bwd: 5698.11 | bwd_inner: 5685.49 | bwd_allreduce: 12.59 | step: 18.67 18%|█▊ | 7242/41250 [17:29:42<81:38:12, 8.64s/it] {'loss': 0.2366, 'grad_norm': 3.3679630756378174, 'learning_rate': 3.7818561809901514e-05, 'epoch': 1.76} 18%|█▊ | 7242/41250 [17:29:42<81:38:12, 8.64s/it][2025-04-26 01:27:26,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 01:27:26,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.06 | bwd_microstep: 5902.85 | bwd_inner_microstep: 5666.35 | bwd_allreduce_microstep: 236.46 | step_microstep: 19.11 [2025-04-26 01:27:26,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.06 | bwd: 5902.86 | bwd_inner: 5666.35 | bwd_allreduce: 236.48 | step: 19.11 18%|█▊ | 7243/41250 [17:29:51<82:07:27, 8.69s/it] {'loss': 0.126, 'grad_norm': 2.1744256019592285, 'learning_rate': 3.7817848600566635e-05, 'epoch': 1.76} 18%|█▊ | 7243/41250 [17:29:51<82:07:27, 
8.69s/it][2025-04-26 01:27:35,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-26 01:27:35,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.68 | bwd_microstep: 5753.80 | bwd_inner_microstep: 5682.78 | bwd_allreduce_microstep: 70.97 | step_microstep: 18.57 [2025-04-26 01:27:35,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.68 | bwd: 5753.81 | bwd_inner: 5682.78 | bwd_allreduce: 70.98 | step: 18.57 18%|█▊ | 7244/41250 [17:30:00<82:05:16, 8.69s/it] {'loss': 0.2755, 'grad_norm': 1.421707272529602, 'learning_rate': 3.781713528138822e-05, 'epoch': 1.76} 18%|█▊ | 7244/41250 [17:30:00<82:05:16, 8.69s/it][2025-04-26 01:27:43,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 01:27:43,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.12 | bwd_microstep: 5737.94 | bwd_inner_microstep: 5718.35 | bwd_allreduce_microstep: 19.54 | step_microstep: 19.01 [2025-04-26 01:27:43,801] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.12 | bwd: 5737.95 | bwd_inner: 5718.35 | bwd_allreduce: 19.56 | step: 19.01 18%|█▊ | 7245/41250 [17:30:09<82:04:13, 8.69s/it] {'loss': 0.1794, 'grad_norm': 2.641861915588379, 'learning_rate': 3.781642185237067e-05, 'epoch': 1.76} 18%|█▊ | 7245/41250 [17:30:09<82:04:13, 8.69s/it][2025-04-26 01:27:52,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:27:52,555] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.83 | bwd_microstep: 5784.00 | bwd_inner_microstep: 5771.16 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.81 [2025-04-26 01:27:52,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.83 | bwd: 5784.01 | bwd_inner: 5771.16 | 
bwd_allreduce: 12.82 | step: 18.81 18%|█▊ | 7246/41250 [17:30:17<82:15:25, 8.71s/it] {'loss': 0.0448, 'grad_norm': 1.0508018732070923, 'learning_rate': 3.7815708313518376e-05, 'epoch': 1.76} 18%|█▊ | 7246/41250 [17:30:17<82:15:25, 8.71s/it][2025-04-26 01:28:01,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:28:01,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.11 | bwd_microstep: 5700.50 | bwd_inner_microstep: 5663.77 | bwd_allreduce_microstep: 36.68 | step_microstep: 18.94 [2025-04-26 01:28:01,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.11 | bwd: 5700.52 | bwd_inner: 5663.78 | bwd_allreduce: 36.70 | step: 18.94 18%|█▊ | 7247/41250 [17:30:26<81:58:45, 8.68s/it] {'loss': 0.1394, 'grad_norm': 1.2896174192428589, 'learning_rate': 3.781499466483576e-05, 'epoch': 1.76} 18%|█▊ | 7247/41250 [17:30:26<81:58:45, 8.68s/it][2025-04-26 01:28:09,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 1.08 [2025-04-26 01:28:09,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.26 | bwd_microstep: 5727.19 | bwd_inner_microstep: 5714.18 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.73 [2025-04-26 01:28:09,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.26 | bwd: 5727.20 | bwd_inner: 5714.18 | bwd_allreduce: 12.98 | step: 18.73 18%|█▊ | 7248/41250 [17:30:35<81:55:19, 8.67s/it] {'loss': 0.1044, 'grad_norm': 1.1864213943481445, 'learning_rate': 3.781428090632719e-05, 'epoch': 1.76} 18%|█▊ | 7248/41250 [17:30:35<81:55:19, 8.67s/it][2025-04-26 01:28:18,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:28:18,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.83 | 
bwd_microstep: 5908.80 | bwd_inner_microstep: 5654.85 | bwd_allreduce_microstep: 253.91 | step_microstep: 18.57 [2025-04-26 01:28:18,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.83 | bwd: 5908.81 | bwd_inner: 5654.85 | bwd_allreduce: 253.93 | step: 18.57 18%|█▊ | 7249/41250 [17:30:43<82:19:37, 8.72s/it] {'loss': 0.2809, 'grad_norm': 4.1288862228393555, 'learning_rate': 3.7813567037997095e-05, 'epoch': 1.76} 18%|█▊ | 7249/41250 [17:30:43<82:19:37, 8.72s/it][2025-04-26 01:28:27,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:28:27,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.51 | bwd_microstep: 5764.79 | bwd_inner_microstep: 5697.28 | bwd_allreduce_microstep: 67.47 | step_microstep: 18.63 [2025-04-26 01:28:27,336] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.51 | bwd: 5764.80 | bwd_inner: 5697.28 | bwd_allreduce: 67.49 | step: 18.63 18%|█▊ | 7250/41250 [17:30:52<82:15:00, 8.71s/it] {'loss': 0.0629, 'grad_norm': 0.8651026487350464, 'learning_rate': 3.7812853059849854e-05, 'epoch': 1.76} 18%|█▊ | 7250/41250 [17:30:52<82:15:00, 8.71s/it][2025-04-26 01:28:35,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:28:35,984] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.06 | bwd_microstep: 5709.76 | bwd_inner_microstep: 5697.19 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.60 [2025-04-26 01:28:35,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.06 | bwd: 5709.77 | bwd_inner: 5697.19 | bwd_allreduce: 12.54 | step: 18.60 18%|█▊ | 7251/41250 [17:31:01<82:04:54, 8.69s/it] {'loss': 0.1122, 'grad_norm': 1.4043116569519043, 'learning_rate': 3.781213897188989e-05, 'epoch': 1.76} 18%|█▊ | 7251/41250 [17:31:01<82:04:54, 8.69s/it][2025-04-26 01:28:44,608] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 01:28:44,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.82 | bwd_microstep: 5707.30 | bwd_inner_microstep: 5662.77 | bwd_allreduce_microstep: 44.48 | step_microstep: 19.19 [2025-04-26 01:28:44,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.82 | bwd: 5707.31 | bwd_inner: 5662.77 | bwd_allreduce: 44.49 | step: 19.20 18%|█▊ | 7252/41250 [17:31:09<81:53:19, 8.67s/it] {'loss': 0.0828, 'grad_norm': 0.8603627681732178, 'learning_rate': 3.781142477412158e-05, 'epoch': 1.76} 18%|█▊ | 7252/41250 [17:31:09<81:53:19, 8.67s/it][2025-04-26 01:28:53,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:28:53,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.88 | bwd_microstep: 5777.83 | bwd_inner_microstep: 5701.94 | bwd_allreduce_microstep: 75.86 | step_microstep: 18.39 [2025-04-26 01:28:53,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.88 | bwd: 5777.85 | bwd_inner: 5701.94 | bwd_allreduce: 75.87 | step: 18.40 18%|█▊ | 7253/41250 [17:31:18<82:00:13, 8.68s/it] {'loss': 0.337, 'grad_norm': 3.7154672145843506, 'learning_rate': 3.781071046654934e-05, 'epoch': 1.76} 18%|█▊ | 7253/41250 [17:31:18<82:00:13, 8.68s/it][2025-04-26 01:29:02,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 01:29:02,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.69 | bwd_microstep: 5785.74 | bwd_inner_microstep: 5773.01 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.12 [2025-04-26 01:29:02,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.69 | bwd: 5785.76 | bwd_inner: 5773.01 | bwd_allreduce: 12.71 | step: 19.12 
18%|█▊ | 7254/41250 [17:31:27<82:12:47, 8.71s/it] {'loss': 0.0847, 'grad_norm': 0.6741653680801392, 'learning_rate': 3.780999604917758e-05, 'epoch': 1.76} 18%|█▊ | 7254/41250 [17:31:27<82:12:47, 8.71s/it][2025-04-26 01:29:10,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 01:29:10,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2896.64 | bwd_microstep: 5793.30 | bwd_inner_microstep: 5780.20 | bwd_allreduce_microstep: 13.06 | step_microstep: 19.12 [2025-04-26 01:29:10,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2896.65 | bwd: 5793.32 | bwd_inner: 5780.20 | bwd_allreduce: 13.08 | step: 19.12 18%|█▊ | 7255/41250 [17:31:36<82:24:02, 8.73s/it] {'loss': 0.229, 'grad_norm': 4.042088508605957, 'learning_rate': 3.780928152201069e-05, 'epoch': 1.76} 18%|█▊ | 7255/41250 [17:31:36<82:24:02, 8.73s/it][2025-04-26 01:29:19,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:29:19,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.77 | bwd_microstep: 5772.27 | bwd_inner_microstep: 5654.20 | bwd_allreduce_microstep: 118.03 | step_microstep: 18.80 [2025-04-26 01:29:19,545] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.77 | bwd: 5772.29 | bwd_inner: 5654.20 | bwd_allreduce: 118.04 | step: 18.80 18%|█▊ | 7256/41250 [17:31:44<82:18:08, 8.72s/it] {'loss': 0.115, 'grad_norm': 3.5417463779449463, 'learning_rate': 3.780856688505309e-05, 'epoch': 1.76} 18%|█▊ | 7256/41250 [17:31:44<82:18:08, 8.72s/it][2025-04-26 01:29:28,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.14 | optimizer_step: 1.06 [2025-04-26 01:29:28,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.98 | bwd_microstep: 5694.60 | bwd_inner_microstep: 
5656.58 | bwd_allreduce_microstep: 37.96 | step_microstep: 20.20 [2025-04-26 01:29:28,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.99 | bwd: 5694.62 | bwd_inner: 5656.58 | bwd_allreduce: 37.99 | step: 20.21 18%|█▊ | 7257/41250 [17:31:53<81:59:56, 8.68s/it] {'loss': 0.0751, 'grad_norm': 1.1243584156036377, 'learning_rate': 3.7807852138309174e-05, 'epoch': 1.76} 18%|█▊ | 7257/41250 [17:31:53<81:59:56, 8.68s/it][2025-04-26 01:29:36,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 01:29:36,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.91 | bwd_microstep: 5797.46 | bwd_inner_microstep: 5650.80 | bwd_allreduce_microstep: 146.61 | step_microstep: 18.94 [2025-04-26 01:29:36,863] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.91 | bwd: 5797.47 | bwd_inner: 5650.80 | bwd_allreduce: 146.62 | step: 18.94 18%|█▊ | 7258/41250 [17:32:02<82:03:43, 8.69s/it] {'loss': 0.1328, 'grad_norm': 2.8122713565826416, 'learning_rate': 3.780713728178335e-05, 'epoch': 1.76} 18%|█▊ | 7258/41250 [17:32:02<82:03:43, 8.69s/it][2025-04-26 01:29:45,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:29:45,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.62 | bwd_microstep: 5741.75 | bwd_inner_microstep: 5701.73 | bwd_allreduce_microstep: 39.97 | step_microstep: 18.69 [2025-04-26 01:29:45,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.62 | bwd: 5741.76 | bwd_inner: 5701.73 | bwd_allreduce: 39.99 | step: 18.69 18%|█▊ | 7259/41250 [17:32:10<82:00:29, 8.69s/it] {'loss': 0.2198, 'grad_norm': 2.5759801864624023, 'learning_rate': 3.7806422315480035e-05, 'epoch': 1.76} 18%|█▊ | 7259/41250 [17:32:10<82:00:29, 8.69s/it][2025-04-26 01:29:54,237] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.96 | optimizer_step: 1.05 [2025-04-26 01:29:54,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.50 | bwd_microstep: 5787.21 | bwd_inner_microstep: 5653.17 | bwd_allreduce_microstep: 134.00 | step_microstep: 18.63 [2025-04-26 01:29:54,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.50 | bwd: 5787.22 | bwd_inner: 5653.16 | bwd_allreduce: 134.01 | step: 18.63 18%|█▊ | 7260/41250 [17:32:19<82:03:12, 8.69s/it] {'loss': 0.1059, 'grad_norm': 2.3854966163635254, 'learning_rate': 3.780570723940362e-05, 'epoch': 1.76} 18%|█▊ | 7260/41250 [17:32:19<82:03:12, 8.69s/it][2025-04-26 01:30:02,917] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 1.00 [2025-04-26 01:30:02,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.82 | bwd_microstep: 5753.13 | bwd_inner_microstep: 5705.84 | bwd_allreduce_microstep: 47.25 | step_microstep: 18.54 [2025-04-26 01:30:02,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.82 | bwd: 5753.15 | bwd_inner: 5705.84 | bwd_allreduce: 47.27 | step: 18.54 18%|█▊ | 7261/41250 [17:32:28<82:01:20, 8.69s/it] {'loss': 0.1345, 'grad_norm': 2.104391098022461, 'learning_rate': 3.780499205355852e-05, 'epoch': 1.76} 18%|█▊ | 7261/41250 [17:32:28<82:01:20, 8.69s/it][2025-04-26 01:30:11,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.00 | optimizer_step: 1.14 [2025-04-26 01:30:11,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.18 | bwd_microstep: 5705.30 | bwd_inner_microstep: 5653.88 | bwd_allreduce_microstep: 51.38 | step_microstep: 19.09 [2025-04-26 01:30:11,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.18 | bwd: 5705.31 | bwd_inner: 5653.88 | bwd_allreduce: 51.39 | step: 19.09 18%|█▊ | 7262/41250 [17:32:36<81:49:51, 
8.67s/it] {'loss': 0.1739, 'grad_norm': 1.7092511653900146, 'learning_rate': 3.780427675794916e-05, 'epoch': 1.76} 18%|█▊ | 7262/41250 [17:32:36<81:49:51, 8.67s/it][2025-04-26 01:30:20,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 01:30:20,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.82 | bwd_microstep: 5736.52 | bwd_inner_microstep: 5688.89 | bwd_allreduce_microstep: 47.59 | step_microstep: 18.72 [2025-04-26 01:30:20,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.83 | bwd: 5736.53 | bwd_inner: 5688.89 | bwd_allreduce: 47.61 | step: 18.72 18%|█▊ | 7263/41250 [17:32:45<81:49:44, 8.67s/it] {'loss': 0.3596, 'grad_norm': 4.10505485534668, 'learning_rate': 3.780356135257992e-05, 'epoch': 1.76} 18%|█▊ | 7263/41250 [17:32:45<81:49:44, 8.67s/it][2025-04-26 01:30:28,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:30:28,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.29 | bwd_microstep: 5690.83 | bwd_inner_microstep: 5678.11 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.95 [2025-04-26 01:30:28,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.29 | bwd: 5690.84 | bwd_inner: 5678.11 | bwd_allreduce: 12.68 | step: 18.96 18%|█▊ | 7264/41250 [17:32:54<81:41:20, 8.65s/it] {'loss': 0.0583, 'grad_norm': 1.0936737060546875, 'learning_rate': 3.780284583745524e-05, 'epoch': 1.76} 18%|█▊ | 7264/41250 [17:32:54<81:41:20, 8.65s/it][2025-04-26 01:30:37,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.04 | optimizer_step: 1.09 [2025-04-26 01:30:37,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.59 | bwd_microstep: 5729.80 | bwd_inner_microstep: 5682.50 | bwd_allreduce_microstep: 47.26 | 
step_microstep: 18.72 [2025-04-26 01:30:37,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.59 | bwd: 5729.81 | bwd_inner: 5682.50 | bwd_allreduce: 47.27 | step: 18.72 18%|█▊ | 7265/41250 [17:33:02<81:41:01, 8.65s/it] {'loss': 0.3102, 'grad_norm': 3.594799518585205, 'learning_rate': 3.780213021257951e-05, 'epoch': 1.76} 18%|█▊ | 7265/41250 [17:33:02<81:41:01, 8.65s/it][2025-04-26 01:30:46,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:30:46,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.72 | bwd_microstep: 5698.25 | bwd_inner_microstep: 5638.88 | bwd_allreduce_microstep: 59.32 | step_microstep: 18.69 [2025-04-26 01:30:46,079] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.72 | bwd: 5698.26 | bwd_inner: 5638.88 | bwd_allreduce: 59.34 | step: 18.69 18%|█▊ | 7266/41250 [17:33:11<81:32:05, 8.64s/it] {'loss': 0.0863, 'grad_norm': 1.1935765743255615, 'learning_rate': 3.780141447795715e-05, 'epoch': 1.76} 18%|█▊ | 7266/41250 [17:33:11<81:32:05, 8.64s/it][2025-04-26 01:30:54,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-26 01:30:54,756] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.00 | bwd_microstep: 5772.14 | bwd_inner_microstep: 5633.21 | bwd_allreduce_microstep: 138.89 | step_microstep: 19.01 [2025-04-26 01:30:54,757] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.00 | bwd: 5772.15 | bwd_inner: 5633.21 | bwd_allreduce: 138.90 | step: 19.01 18%|█▊ | 7267/41250 [17:33:20<81:39:15, 8.65s/it] {'loss': 0.1852, 'grad_norm': 1.5076017379760742, 'learning_rate': 3.7800698633592564e-05, 'epoch': 1.76} 18%|█▊ | 7267/41250 [17:33:20<81:39:15, 8.65s/it][2025-04-26 01:31:03,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 1.01 | optimizer_step: 0.94 [2025-04-26 01:31:03,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.28 | bwd_microstep: 5885.70 | bwd_inner_microstep: 5655.62 | bwd_allreduce_microstep: 230.04 | step_microstep: 18.99 [2025-04-26 01:31:03,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.28 | bwd: 5885.72 | bwd_inner: 5655.62 | bwd_allreduce: 230.06 | step: 18.99 18%|█▊ | 7268/41250 [17:33:28<82:05:01, 8.70s/it] {'loss': 0.3025, 'grad_norm': 2.2232553958892822, 'learning_rate': 3.779998267949018e-05, 'epoch': 1.76} 18%|█▊ | 7268/41250 [17:33:28<82:05:01, 8.70s/it][2025-04-26 01:31:12,225] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:31:12,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.33 | bwd_microstep: 5724.64 | bwd_inner_microstep: 5695.80 | bwd_allreduce_microstep: 28.81 | step_microstep: 18.58 [2025-04-26 01:31:12,226] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.34 | bwd: 5724.66 | bwd_inner: 5695.80 | bwd_allreduce: 28.82 | step: 18.58 18%|█▊ | 7269/41250 [17:33:37<81:59:52, 8.69s/it] {'loss': 0.1647, 'grad_norm': 1.7867587804794312, 'learning_rate': 3.779926661565441e-05, 'epoch': 1.76} 18%|█▊ | 7269/41250 [17:33:37<81:59:52, 8.69s/it][2025-04-26 01:31:20,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 01:31:20,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.51 | bwd_microstep: 5715.72 | bwd_inner_microstep: 5678.85 | bwd_allreduce_microstep: 36.83 | step_microstep: 18.81 [2025-04-26 01:31:20,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.51 | bwd: 5715.73 | bwd_inner: 5678.85 | bwd_allreduce: 36.85 | step: 18.81 18%|█▊ | 7270/41250 [17:33:46<81:53:50, 8.68s/it] {'loss': 0.0374, 'grad_norm': 
0.8910180330276489, 'learning_rate': 3.779855044208965e-05, 'epoch': 1.76} 18%|█▊ | 7270/41250 [17:33:46<81:53:50, 8.68s/it][2025-04-26 01:31:29,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:31:29,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2920.97 | bwd_microstep: 5856.39 | bwd_inner_microstep: 5843.60 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.59 [2025-04-26 01:31:29,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2920.97 | bwd: 5856.40 | bwd_inner: 5843.60 | bwd_allreduce: 12.76 | step: 18.59 18%|█▊ | 7271/41250 [17:33:55<82:25:09, 8.73s/it] {'loss': 0.1561, 'grad_norm': 3.9490089416503906, 'learning_rate': 3.779783415880033e-05, 'epoch': 1.76} 18%|█▊ | 7271/41250 [17:33:55<82:25:09, 8.73s/it][2025-04-26 01:31:38,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:31:38,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.10 | bwd_microstep: 5711.67 | bwd_inner_microstep: 5694.28 | bwd_allreduce_microstep: 17.33 | step_microstep: 18.74 [2025-04-26 01:31:38,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.10 | bwd: 5711.69 | bwd_inner: 5694.28 | bwd_allreduce: 17.36 | step: 18.74 18%|█▊ | 7272/41250 [17:34:03<82:11:02, 8.71s/it] {'loss': 0.135, 'grad_norm': 1.6722702980041504, 'learning_rate': 3.7797117765790864e-05, 'epoch': 1.76} 18%|█▊ | 7272/41250 [17:34:03<82:11:02, 8.71s/it][2025-04-26 01:31:47,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.07 | optimizer_step: 1.24 [2025-04-26 01:31:47,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.57 | bwd_microstep: 5713.74 | bwd_inner_microstep: 5647.09 | bwd_allreduce_microstep: 66.59 | step_microstep: 19.79 [2025-04-26 
01:31:47,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.57 | bwd: 5713.75 | bwd_inner: 5647.09 | bwd_allreduce: 66.62 | step: 19.79 18%|█▊ | 7273/41250 [17:34:12<81:58:59, 8.69s/it] {'loss': 0.1678, 'grad_norm': 1.601065993309021, 'learning_rate': 3.779640126306567e-05, 'epoch': 1.76} 18%|█▊ | 7273/41250 [17:34:12<81:58:59, 8.69s/it][2025-04-26 01:31:55,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:31:55,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.73 | bwd_microstep: 5684.89 | bwd_inner_microstep: 5636.65 | bwd_allreduce_microstep: 48.20 | step_microstep: 18.44 [2025-04-26 01:31:55,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.73 | bwd: 5684.90 | bwd_inner: 5636.65 | bwd_allreduce: 48.21 | step: 18.44 18%|█▊ | 7274/41250 [17:34:20<81:42:28, 8.66s/it] {'loss': 0.1844, 'grad_norm': 2.085780382156372, 'learning_rate': 3.779568465062916e-05, 'epoch': 1.76} 18%|█▊ | 7274/41250 [17:34:20<81:42:28, 8.66s/it][2025-04-26 01:32:04,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:32:04,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.09 | bwd_microstep: 5763.40 | bwd_inner_microstep: 5637.39 | bwd_allreduce_microstep: 125.97 | step_microstep: 18.57 [2025-04-26 01:32:04,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.09 | bwd: 5763.42 | bwd_inner: 5637.39 | bwd_allreduce: 125.98 | step: 18.58 18%|█▊ | 7275/41250 [17:34:29<81:44:29, 8.66s/it] {'loss': 0.1443, 'grad_norm': 1.8210422992706299, 'learning_rate': 3.779496792848575e-05, 'epoch': 1.76} 18%|█▊ | 7275/41250 [17:34:29<81:44:29, 8.66s/it][2025-04-26 01:32:12,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-26 01:32:12,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.48 | bwd_microstep: 5765.08 | bwd_inner_microstep: 5649.92 | bwd_allreduce_microstep: 115.12 | step_microstep: 18.85 [2025-04-26 01:32:12,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.48 | bwd: 5765.10 | bwd_inner: 5649.92 | bwd_allreduce: 115.14 | step: 18.86 18%|█▊ | 7276/41250 [17:34:38<81:45:17, 8.66s/it] {'loss': 0.3032, 'grad_norm': 3.192342519760132, 'learning_rate': 3.779425109663987e-05, 'epoch': 1.76} 18%|█▊ | 7276/41250 [17:34:38<81:45:17, 8.66s/it][2025-04-26 01:32:21,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:32:21,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.96 | bwd_microstep: 5769.46 | bwd_inner_microstep: 5653.14 | bwd_allreduce_microstep: 116.27 | step_microstep: 18.88 [2025-04-26 01:32:21,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.96 | bwd: 5769.47 | bwd_inner: 5653.14 | bwd_allreduce: 116.29 | step: 18.88 18%|█▊ | 7277/41250 [17:34:46<81:47:23, 8.67s/it] {'loss': 0.145, 'grad_norm': 3.8613812923431396, 'learning_rate': 3.779353415509593e-05, 'epoch': 1.76} 18%|█▊ | 7277/41250 [17:34:46<81:47:23, 8.67s/it][2025-04-26 01:32:30,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.28 | optimizer_step: 1.04 [2025-04-26 01:32:30,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.63 | bwd_microstep: 5738.07 | bwd_inner_microstep: 5684.19 | bwd_allreduce_microstep: 53.82 | step_microstep: 20.07 [2025-04-26 01:32:30,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.63 | bwd: 5738.09 | bwd_inner: 5684.19 | bwd_allreduce: 53.85 | step: 20.08 18%|█▊ | 7278/41250 [17:34:55<81:46:51, 8.67s/it] {'loss': 0.1187, 'grad_norm': 0.8610243797302246, 'learning_rate': 
3.779281710385836e-05, 'epoch': 1.76} 18%|█▊ | 7278/41250 [17:34:55<81:46:51, 8.67s/it][2025-04-26 01:32:39,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 01:32:39,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2875.67 | bwd_microstep: 5768.67 | bwd_inner_microstep: 5755.81 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.08 [2025-04-26 01:32:39,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2875.67 | bwd: 5768.69 | bwd_inner: 5755.81 | bwd_allreduce: 12.84 | step: 19.08 18%|█▊ | 7279/41250 [17:35:04<81:57:10, 8.68s/it] {'loss': 0.1412, 'grad_norm': 1.2485491037368774, 'learning_rate': 3.779209994293156e-05, 'epoch': 1.76} 18%|█▊ | 7279/41250 [17:35:04<81:57:10, 8.68s/it][2025-04-26 01:32:47,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:32:47,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.96 | bwd_microstep: 5753.30 | bwd_inner_microstep: 5696.83 | bwd_allreduce_microstep: 56.42 | step_microstep: 18.63 [2025-04-26 01:32:47,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.96 | bwd: 5753.31 | bwd_inner: 5696.83 | bwd_allreduce: 56.44 | step: 18.63 18%|█▊ | 7280/41250 [17:35:13<81:57:36, 8.69s/it] {'loss': 0.1645, 'grad_norm': 1.4136515855789185, 'learning_rate': 3.7791382672319964e-05, 'epoch': 1.76} 18%|█▊ | 7280/41250 [17:35:13<81:57:36, 8.69s/it][2025-04-26 01:32:56,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:32:56,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.29 | bwd_microstep: 5707.49 | bwd_inner_microstep: 5656.53 | bwd_allreduce_microstep: 50.91 | step_microstep: 18.58 [2025-04-26 01:32:56,326] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.29 | bwd: 5707.50 | bwd_inner: 5656.53 | bwd_allreduce: 50.93 | step: 18.58 18%|█▊ | 7281/41250 [17:35:21<81:45:00, 8.66s/it] {'loss': 0.1519, 'grad_norm': 1.254064917564392, 'learning_rate': 3.779066529202799e-05, 'epoch': 1.77} 18%|█▊ | 7281/41250 [17:35:21<81:45:00, 8.66s/it][2025-04-26 01:33:05,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 0.89 [2025-04-26 01:33:05,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.72 | bwd_microstep: 5773.44 | bwd_inner_microstep: 5658.95 | bwd_allreduce_microstep: 114.43 | step_microstep: 19.00 [2025-04-26 01:33:05,021] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.72 | bwd: 5773.45 | bwd_inner: 5658.95 | bwd_allreduce: 114.46 | step: 19.00 18%|█▊ | 7282/41250 [17:35:30<81:50:34, 8.67s/it] {'loss': 0.1287, 'grad_norm': 3.004906177520752, 'learning_rate': 3.7789947802060076e-05, 'epoch': 1.77} 18%|█▊ | 7282/41250 [17:35:30<81:50:34, 8.67s/it][2025-04-26 01:33:13,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-26 01:33:13,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.26 | bwd_microstep: 5744.49 | bwd_inner_microstep: 5701.61 | bwd_allreduce_microstep: 42.84 | step_microstep: 19.03 [2025-04-26 01:33:13,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.26 | bwd: 5744.51 | bwd_inner: 5701.61 | bwd_allreduce: 42.86 | step: 19.03 18%|█▊ | 7283/41250 [17:35:39<81:51:49, 8.68s/it] {'loss': 0.0671, 'grad_norm': 1.507530927658081, 'learning_rate': 3.778923020242063e-05, 'epoch': 1.77} 18%|█▊ | 7283/41250 [17:35:39<81:51:49, 8.68s/it][2025-04-26 01:33:22,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 
01:33:22,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.66 | bwd_microstep: 5708.50 | bwd_inner_microstep: 5695.80 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.92 [2025-04-26 01:33:22,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.66 | bwd: 5708.51 | bwd_inner: 5695.80 | bwd_allreduce: 12.67 | step: 18.92 18%|█▊ | 7284/41250 [17:35:47<81:46:47, 8.67s/it] {'loss': 0.313, 'grad_norm': 3.7436351776123047, 'learning_rate': 3.778851249311408e-05, 'epoch': 1.77} 18%|█▊ | 7284/41250 [17:35:47<81:46:47, 8.67s/it][2025-04-26 01:33:31,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:33:31,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.39 | bwd_microstep: 5781.33 | bwd_inner_microstep: 5645.76 | bwd_allreduce_microstep: 135.54 | step_microstep: 18.29 [2025-04-26 01:33:31,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.39 | bwd: 5781.35 | bwd_inner: 5645.76 | bwd_allreduce: 135.55 | step: 18.29 18%|█▊ | 7285/41250 [17:35:56<81:49:42, 8.67s/it] {'loss': 0.1818, 'grad_norm': 3.1978697776794434, 'learning_rate': 3.778779467414484e-05, 'epoch': 1.77} 18%|█▊ | 7285/41250 [17:35:56<81:49:42, 8.67s/it][2025-04-26 01:33:39,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-26 01:33:39,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.73 | bwd_microstep: 5715.96 | bwd_inner_microstep: 5652.04 | bwd_allreduce_microstep: 63.88 | step_microstep: 18.77 [2025-04-26 01:33:39,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.73 | bwd: 5715.98 | bwd_inner: 5652.04 | bwd_allreduce: 63.90 | step: 18.77 18%|█▊ | 7286/41250 [17:36:04<81:40:40, 8.66s/it] {'loss': 0.1507, 'grad_norm': 1.8625836372375488, 'learning_rate': 3.7787076745517353e-05, 
'epoch': 1.77} 18%|█▊ | 7286/41250 [17:36:04<81:40:40, 8.66s/it][2025-04-26 01:33:48,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:33:48,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.81 | bwd_microstep: 5764.13 | bwd_inner_microstep: 5650.87 | bwd_allreduce_microstep: 113.21 | step_microstep: 18.65 [2025-04-26 01:33:48,330] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.82 | bwd: 5764.15 | bwd_inner: 5650.87 | bwd_allreduce: 113.24 | step: 18.65 18%|█▊ | 7287/41250 [17:36:13<81:43:15, 8.66s/it] {'loss': 0.2184, 'grad_norm': 2.2269482612609863, 'learning_rate': 3.778635870723603e-05, 'epoch': 1.77} 18%|█▊ | 7287/41250 [17:36:13<81:43:15, 8.66s/it][2025-04-26 01:33:56,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 1.15 [2025-04-26 01:33:56,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.53 | bwd_microstep: 5703.73 | bwd_inner_microstep: 5691.11 | bwd_allreduce_microstep: 12.57 | step_microstep: 19.45 [2025-04-26 01:33:56,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.53 | bwd: 5703.74 | bwd_inner: 5691.11 | bwd_allreduce: 12.59 | step: 19.45 18%|█▊ | 7288/41250 [17:36:22<81:38:48, 8.65s/it] {'loss': 0.1943, 'grad_norm': 1.0199179649353027, 'learning_rate': 3.7785640559305304e-05, 'epoch': 1.77} 18%|█▊ | 7288/41250 [17:36:22<81:38:48, 8.65s/it][2025-04-26 01:34:05,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:34:05,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.37 | bwd_microstep: 5769.72 | bwd_inner_microstep: 5642.04 | bwd_allreduce_microstep: 127.63 | step_microstep: 18.53 [2025-04-26 01:34:05,649] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2827.37 | bwd: 5769.73 | bwd_inner: 5642.04 | bwd_allreduce: 127.65 | step: 18.54 18%|█▊ | 7289/41250 [17:36:30<81:42:57, 8.66s/it] {'loss': 0.0856, 'grad_norm': 2.1149911880493164, 'learning_rate': 3.778492230172961e-05, 'epoch': 1.77} 18%|█▊ | 7289/41250 [17:36:30<81:42:57, 8.66s/it][2025-04-26 01:34:14,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 01:34:14,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.16 | bwd_microstep: 5774.74 | bwd_inner_microstep: 5694.98 | bwd_allreduce_microstep: 79.70 | step_microstep: 19.06 [2025-04-26 01:34:14,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.16 | bwd: 5774.75 | bwd_inner: 5694.98 | bwd_allreduce: 79.72 | step: 19.07 18%|█▊ | 7290/41250 [17:36:39<81:51:14, 8.68s/it] {'loss': 0.3199, 'grad_norm': 2.277965545654297, 'learning_rate': 3.778420393451336e-05, 'epoch': 1.77} 18%|█▊ | 7290/41250 [17:36:39<81:51:14, 8.68s/it][2025-04-26 01:34:23,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:34:23,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.05 | bwd_microstep: 5752.63 | bwd_inner_microstep: 5701.59 | bwd_allreduce_microstep: 51.00 | step_microstep: 18.56 [2025-04-26 01:34:23,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.05 | bwd: 5752.65 | bwd_inner: 5701.59 | bwd_allreduce: 51.01 | step: 18.57 18%|█▊ | 7291/41250 [17:36:48<81:52:39, 8.68s/it] {'loss': 0.0798, 'grad_norm': 1.5169986486434937, 'learning_rate': 3.7783485457660996e-05, 'epoch': 1.77} 18%|█▊ | 7291/41250 [17:36:48<81:52:39, 8.68s/it][2025-04-26 01:34:31,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:34:31,736] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2849.93 | bwd_microstep: 5754.62 | bwd_inner_microstep: 5703.38 | bwd_allreduce_microstep: 51.20 | step_microstep: 18.82 [2025-04-26 01:34:31,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.94 | bwd: 5754.64 | bwd_inner: 5703.38 | bwd_allreduce: 51.22 | step: 18.82 18%|█▊ | 7292/41250 [17:36:57<81:54:01, 8.68s/it] {'loss': 0.1118, 'grad_norm': 1.0493124723434448, 'learning_rate': 3.778276687117694e-05, 'epoch': 1.77} 18%|█▊ | 7292/41250 [17:36:57<81:54:01, 8.68s/it][2025-04-26 01:34:40,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:34:40,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2879.88 | bwd_microstep: 5781.91 | bwd_inner_microstep: 5769.27 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.65 [2025-04-26 01:34:40,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2879.88 | bwd: 5781.92 | bwd_inner: 5769.27 | bwd_allreduce: 12.60 | step: 18.66 18%|█▊ | 7293/41250 [17:37:05<82:04:21, 8.70s/it] {'loss': 0.3961, 'grad_norm': 3.8468971252441406, 'learning_rate': 3.778204817506562e-05, 'epoch': 1.77} 18%|█▊ | 7293/41250 [17:37:05<82:04:21, 8.70s/it][2025-04-26 01:34:49,178] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:34:49,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.54 | bwd_microstep: 5771.07 | bwd_inner_microstep: 5710.72 | bwd_allreduce_microstep: 60.29 | step_microstep: 18.63 [2025-04-26 01:34:49,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.54 | bwd: 5771.08 | bwd_inner: 5710.72 | bwd_allreduce: 60.31 | step: 18.63 18%|█▊ | 7294/41250 [17:37:14<82:03:56, 8.70s/it] {'loss': 0.0763, 'grad_norm': 1.1422648429870605, 'learning_rate': 3.7781329369331474e-05, 'epoch': 1.77} 18%|█▊ | 7294/41250 
[17:37:14<82:03:56, 8.70s/it][2025-04-26 01:34:57,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.09 | optimizer_step: 1.02 [2025-04-26 01:34:57,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.09 | bwd_microstep: 5732.15 | bwd_inner_microstep: 5718.66 | bwd_allreduce_microstep: 13.43 | step_microstep: 19.55 [2025-04-26 01:34:57,849] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.09 | bwd: 5732.16 | bwd_inner: 5718.66 | bwd_allreduce: 13.46 | step: 19.56 18%|█▊ | 7295/41250 [17:37:23<81:58:24, 8.69s/it] {'loss': 0.1947, 'grad_norm': 2.4322454929351807, 'learning_rate': 3.778061045397893e-05, 'epoch': 1.77} 18%|█▊ | 7295/41250 [17:37:23<81:58:24, 8.69s/it][2025-04-26 01:35:06,546] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:35:06,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.98 | bwd_microstep: 5766.81 | bwd_inner_microstep: 5693.90 | bwd_allreduce_microstep: 72.86 | step_microstep: 18.90 [2025-04-26 01:35:06,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.98 | bwd: 5766.82 | bwd_inner: 5693.90 | bwd_allreduce: 72.88 | step: 18.91 18%|█▊ | 7296/41250 [17:37:31<81:59:33, 8.69s/it] {'loss': 0.0872, 'grad_norm': 1.1095019578933716, 'learning_rate': 3.777989142901242e-05, 'epoch': 1.77} 18%|█▊ | 7296/41250 [17:37:31<81:59:33, 8.69s/it][2025-04-26 01:35:15,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 01:35:15,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.15 | bwd_microstep: 5787.40 | bwd_inner_microstep: 5650.48 | bwd_allreduce_microstep: 136.87 | step_microstep: 18.86 [2025-04-26 01:35:15,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.15 | bwd: 5787.41 | 
bwd_inner: 5650.48 | bwd_allreduce: 136.89 | step: 18.86 18%|█▊ | 7297/41250 [17:37:40<82:00:54, 8.70s/it] {'loss': 0.2587, 'grad_norm': 1.7667063474655151, 'learning_rate': 3.777917229443638e-05, 'epoch': 1.77} 18%|█▊ | 7297/41250 [17:37:40<82:00:54, 8.70s/it][2025-04-26 01:35:23,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:35:23,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.86 | bwd_microstep: 5789.63 | bwd_inner_microstep: 5706.50 | bwd_allreduce_microstep: 83.08 | step_microstep: 18.91 [2025-04-26 01:35:23,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.86 | bwd: 5789.65 | bwd_inner: 5706.50 | bwd_allreduce: 83.10 | step: 18.92 18%|█▊ | 7298/41250 [17:37:49<82:05:20, 8.70s/it] {'loss': 0.1312, 'grad_norm': 1.4023975133895874, 'learning_rate': 3.777845305025523e-05, 'epoch': 1.77} 18%|█▊ | 7298/41250 [17:37:49<82:05:20, 8.70s/it][2025-04-26 01:35:32,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-26 01:35:32,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.24 | bwd_microstep: 5801.28 | bwd_inner_microstep: 5660.35 | bwd_allreduce_microstep: 140.88 | step_microstep: 18.92 [2025-04-26 01:35:32,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.24 | bwd: 5801.30 | bwd_inner: 5660.35 | bwd_allreduce: 140.91 | step: 18.92 18%|█▊ | 7299/41250 [17:37:58<82:06:41, 8.71s/it] {'loss': 0.1804, 'grad_norm': 2.2921979427337646, 'learning_rate': 3.777773369647342e-05, 'epoch': 1.77} 18%|█▊ | 7299/41250 [17:37:58<82:06:41, 8.71s/it][2025-04-26 01:35:41,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.03 | optimizer_step: 1.17 [2025-04-26 01:35:41,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2853.08 | bwd_microstep: 5756.42 | bwd_inner_microstep: 5697.29 | bwd_allreduce_microstep: 59.09 | step_microstep: 19.46 [2025-04-26 01:35:41,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.08 | bwd: 5756.44 | bwd_inner: 5697.29 | bwd_allreduce: 59.11 | step: 19.47 18%|█▊ | 7300/41250 [17:38:06<82:04:14, 8.70s/it] {'loss': 0.2784, 'grad_norm': 3.6411681175231934, 'learning_rate': 3.777701423309538e-05, 'epoch': 1.77} 18%|█▊ | 7300/41250 [17:38:06<82:04:14, 8.70s/it][2025-04-26 01:35:50,062] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:35:50,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.70 | bwd_microstep: 5758.55 | bwd_inner_microstep: 5658.65 | bwd_allreduce_microstep: 99.85 | step_microstep: 18.72 [2025-04-26 01:35:50,063] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.70 | bwd: 5758.57 | bwd_inner: 5658.65 | bwd_allreduce: 99.87 | step: 18.72 18%|█▊ | 7301/41250 [17:38:15<82:01:03, 8.70s/it] {'loss': 0.1648, 'grad_norm': 1.8346571922302246, 'learning_rate': 3.777629466012554e-05, 'epoch': 1.77} 18%|█▊ | 7301/41250 [17:38:15<82:01:03, 8.70s/it][2025-04-26 01:35:58,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 01:35:58,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.00 | bwd_microstep: 5749.05 | bwd_inner_microstep: 5690.27 | bwd_allreduce_microstep: 58.74 | step_microstep: 18.82 [2025-04-26 01:35:58,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.00 | bwd: 5749.07 | bwd_inner: 5690.27 | bwd_allreduce: 58.76 | step: 18.82 18%|█▊ | 7302/41250 [17:38:24<81:57:44, 8.69s/it] {'loss': 0.0679, 'grad_norm': 2.194138288497925, 'learning_rate': 3.777557497756835e-05, 'epoch': 1.77} 18%|█▊ | 7302/41250 [17:38:24<81:57:44, 
8.69s/it][2025-04-26 01:36:07,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 01:36:07,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.63 | bwd_microstep: 5721.61 | bwd_inner_microstep: 5650.76 | bwd_allreduce_microstep: 70.81 | step_microstep: 18.57 [2025-04-26 01:36:07,374] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.62 | bwd: 5721.63 | bwd_inner: 5650.76 | bwd_allreduce: 70.83 | step: 18.57 18%|█▊ | 7303/41250 [17:38:32<81:47:30, 8.67s/it] {'loss': 0.1742, 'grad_norm': 1.5283255577087402, 'learning_rate': 3.7774855185428224e-05, 'epoch': 1.77} 18%|█▊ | 7303/41250 [17:38:32<81:47:30, 8.67s/it][2025-04-26 01:36:16,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:36:16,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.94 | bwd_microstep: 5702.77 | bwd_inner_microstep: 5689.89 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.92 [2025-04-26 01:36:16,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.94 | bwd: 5702.79 | bwd_inner: 5689.89 | bwd_allreduce: 12.85 | step: 18.93 18%|█▊ | 7304/41250 [17:38:41<81:40:53, 8.66s/it] {'loss': 0.1374, 'grad_norm': 2.4164388179779053, 'learning_rate': 3.777413528370962e-05, 'epoch': 1.77} 18%|█▊ | 7304/41250 [17:38:41<81:40:53, 8.66s/it][2025-04-26 01:36:24,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:36:24,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.19 | bwd_microstep: 5767.56 | bwd_inner_microstep: 5657.17 | bwd_allreduce_microstep: 110.34 | step_microstep: 18.82 [2025-04-26 01:36:24,689] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.19 | bwd: 5767.57 | bwd_inner: 5657.17 
| bwd_allreduce: 110.36 | step: 18.82 18%|█▊ | 7305/41250 [17:38:50<81:43:41, 8.67s/it] {'loss': 0.0983, 'grad_norm': 1.153323769569397, 'learning_rate': 3.7773415272416964e-05, 'epoch': 1.77} 18%|█▊ | 7305/41250 [17:38:50<81:43:41, 8.67s/it][2025-04-26 01:36:33,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:36:33,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.88 | bwd_microstep: 5722.64 | bwd_inner_microstep: 5626.98 | bwd_allreduce_microstep: 95.61 | step_microstep: 18.46 [2025-04-26 01:36:33,317] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.88 | bwd: 5722.65 | bwd_inner: 5626.98 | bwd_allreduce: 95.63 | step: 18.46 18%|█▊ | 7306/41250 [17:38:58<81:36:58, 8.66s/it] {'loss': 0.1063, 'grad_norm': 1.5419564247131348, 'learning_rate': 3.7772695151554704e-05, 'epoch': 1.77} 18%|█▊ | 7306/41250 [17:38:58<81:36:58, 8.66s/it][2025-04-26 01:36:41,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-26 01:36:41,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.69 | bwd_microstep: 5710.84 | bwd_inner_microstep: 5697.98 | bwd_allreduce_microstep: 12.81 | step_microstep: 19.08 [2025-04-26 01:36:41,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.69 | bwd: 5710.85 | bwd_inner: 5697.98 | bwd_allreduce: 12.83 | step: 19.08 18%|█▊ | 7307/41250 [17:39:07<81:34:37, 8.65s/it] {'loss': 0.2057, 'grad_norm': 1.3727102279663086, 'learning_rate': 3.777197492112726e-05, 'epoch': 1.77} 18%|█▊ | 7307/41250 [17:39:07<81:34:37, 8.65s/it][2025-04-26 01:36:50,719] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 01:36:50,720] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.90 | 
bwd_microstep: 5785.48 | bwd_inner_microstep: 5772.44 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.86 [2025-04-26 01:36:50,720] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.90 | bwd: 5785.50 | bwd_inner: 5772.44 | bwd_allreduce: 13.01 | step: 18.86 18%|█▊ | 7308/41250 [17:39:16<81:52:36, 8.68s/it] {'loss': 0.0691, 'grad_norm': 1.0433928966522217, 'learning_rate': 3.7771254581139096e-05, 'epoch': 1.77} 18%|█▊ | 7308/41250 [17:39:16<81:52:36, 8.68s/it][2025-04-26 01:36:59,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.06 | optimizer_step: 0.90 [2025-04-26 01:36:59,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.57 | bwd_microstep: 5742.50 | bwd_inner_microstep: 5690.43 | bwd_allreduce_microstep: 52.02 | step_microstep: 19.63 [2025-04-26 01:36:59,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.57 | bwd: 5742.52 | bwd_inner: 5690.43 | bwd_allreduce: 52.04 | step: 19.63 18%|█▊ | 7309/41250 [17:39:24<81:51:30, 8.68s/it] {'loss': 0.0754, 'grad_norm': 0.6334443092346191, 'learning_rate': 3.7770534131594646e-05, 'epoch': 1.77} 18%|█▊ | 7309/41250 [17:39:24<81:51:30, 8.68s/it][2025-04-26 01:37:08,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:37:08,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.54 | bwd_microstep: 5766.14 | bwd_inner_microstep: 5681.97 | bwd_allreduce_microstep: 84.12 | step_microstep: 18.73 [2025-04-26 01:37:08,099] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.54 | bwd: 5766.16 | bwd_inner: 5681.97 | bwd_allreduce: 84.14 | step: 18.73 18%|█▊ | 7310/41250 [17:39:33<81:54:29, 8.69s/it] {'loss': 0.1083, 'grad_norm': 0.8004656434059143, 'learning_rate': 3.776981357249835e-05, 'epoch': 1.77} 18%|█▊ | 7310/41250 [17:39:33<81:54:29, 8.69s/it][2025-04-26 01:37:16,759] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 01:37:16,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.22 | bwd_microstep: 5754.81 | bwd_inner_microstep: 5636.33 | bwd_allreduce_microstep: 118.43 | step_microstep: 18.93 [2025-04-26 01:37:16,759] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.22 | bwd: 5754.82 | bwd_inner: 5636.33 | bwd_allreduce: 118.45 | step: 18.94 18%|█▊ | 7311/41250 [17:39:42<81:49:52, 8.68s/it] {'loss': 0.1543, 'grad_norm': 1.5271371603012085, 'learning_rate': 3.776909290385464e-05, 'epoch': 1.77} 18%|█▊ | 7311/41250 [17:39:42<81:49:52, 8.68s/it][2025-04-26 01:37:25,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 01:37:25,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.80 | bwd_microstep: 5700.34 | bwd_inner_microstep: 5683.29 | bwd_allreduce_microstep: 17.01 | step_microstep: 19.01 [2025-04-26 01:37:25,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.80 | bwd: 5700.35 | bwd_inner: 5683.29 | bwd_allreduce: 17.03 | step: 19.01 18%|█▊ | 7312/41250 [17:39:50<81:40:02, 8.66s/it] {'loss': 0.2067, 'grad_norm': 2.415011405944824, 'learning_rate': 3.776837212566797e-05, 'epoch': 1.77} 18%|█▊ | 7312/41250 [17:39:50<81:40:02, 8.66s/it][2025-04-26 01:37:34,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:37:34,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.85 | bwd_microstep: 5794.03 | bwd_inner_microstep: 5645.73 | bwd_allreduce_microstep: 148.26 | step_microstep: 18.93 [2025-04-26 01:37:34,091] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.85 | bwd: 5794.05 | bwd_inner: 5645.73 | bwd_allreduce: 148.28 | step: 
18.93 18%|█▊ | 7313/41250 [17:39:59<81:47:22, 8.68s/it] {'loss': 0.2313, 'grad_norm': 2.8018784523010254, 'learning_rate': 3.7767651237942777e-05, 'epoch': 1.77} 18%|█▊ | 7313/41250 [17:39:59<81:47:22, 8.68s/it][2025-04-26 01:37:42,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:37:42,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.08 | bwd_microstep: 5698.81 | bwd_inner_microstep: 5686.27 | bwd_allreduce_microstep: 12.49 | step_microstep: 19.03 [2025-04-26 01:37:42,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.08 | bwd: 5698.82 | bwd_inner: 5686.27 | bwd_allreduce: 12.51 | step: 19.03 18%|█▊ | 7314/41250 [17:40:08<81:38:04, 8.66s/it] {'loss': 0.158, 'grad_norm': 1.7513633966445923, 'learning_rate': 3.776693024068351e-05, 'epoch': 1.77} 18%|█▊ | 7314/41250 [17:40:08<81:38:04, 8.66s/it][2025-04-26 01:37:51,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:37:51,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.42 | bwd_microstep: 5690.48 | bwd_inner_microstep: 5677.64 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.65 [2025-04-26 01:37:51,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.42 | bwd: 5690.49 | bwd_inner: 5677.64 | bwd_allreduce: 12.81 | step: 18.66 18%|█▊ | 7315/41250 [17:40:16<81:31:02, 8.65s/it] {'loss': 0.0689, 'grad_norm': 1.3205991983413696, 'learning_rate': 3.776620913389462e-05, 'epoch': 1.77} 18%|█▊ | 7315/41250 [17:40:16<81:31:02, 8.65s/it][2025-04-26 01:38:00,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:38:00,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.85 | bwd_microstep: 5772.68 | 
bwd_inner_microstep: 5644.49 | bwd_allreduce_microstep: 128.13 | step_microstep: 18.59 [2025-04-26 01:38:00,009] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.85 | bwd: 5772.69 | bwd_inner: 5644.49 | bwd_allreduce: 128.15 | step: 18.59 18%|█▊ | 7316/41250 [17:40:25<81:36:01, 8.66s/it] {'loss': 0.0605, 'grad_norm': 0.9576243162155151, 'learning_rate': 3.776548791758054e-05, 'epoch': 1.77} 18%|█▊ | 7316/41250 [17:40:25<81:36:01, 8.66s/it][2025-04-26 01:38:08,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-26 01:38:08,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.28 | bwd_microstep: 5763.12 | bwd_inner_microstep: 5647.19 | bwd_allreduce_microstep: 115.88 | step_microstep: 19.31 [2025-04-26 01:38:08,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.28 | bwd: 5763.13 | bwd_inner: 5647.19 | bwd_allreduce: 115.90 | step: 19.32 18%|█▊ | 7317/41250 [17:40:34<81:37:51, 8.66s/it] {'loss': 0.08, 'grad_norm': 1.4218165874481201, 'learning_rate': 3.776476659174572e-05, 'epoch': 1.77} 18%|█▊ | 7317/41250 [17:40:34<81:37:51, 8.66s/it][2025-04-26 01:38:17,281] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-26 01:38:17,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.35 | bwd_microstep: 5703.04 | bwd_inner_microstep: 5634.01 | bwd_allreduce_microstep: 68.99 | step_microstep: 19.10 [2025-04-26 01:38:17,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.35 | bwd: 5703.06 | bwd_inner: 5634.01 | bwd_allreduce: 69.01 | step: 19.10 18%|█▊ | 7318/41250 [17:40:42<81:28:06, 8.64s/it] {'loss': 0.0605, 'grad_norm': 1.0827914476394653, 'learning_rate': 3.7764045156394614e-05, 'epoch': 1.77} 18%|█▊ | 7318/41250 [17:40:42<81:28:06, 8.64s/it][2025-04-26 01:38:25,879] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-26 01:38:25,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.81 | bwd_microstep: 5688.61 | bwd_inner_microstep: 5635.29 | bwd_allreduce_microstep: 53.27 | step_microstep: 18.82 [2025-04-26 01:38:25,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.81 | bwd: 5688.62 | bwd_inner: 5635.29 | bwd_allreduce: 53.29 | step: 18.83 18%|█▊ | 7319/41250 [17:40:51<81:20:04, 8.63s/it] {'loss': 0.2205, 'grad_norm': 2.593327760696411, 'learning_rate': 3.776332361153166e-05, 'epoch': 1.77} 18%|█▊ | 7319/41250 [17:40:51<81:20:04, 8.63s/it][2025-04-26 01:38:34,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:38:34,657] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.58 | bwd_microstep: 5877.09 | bwd_inner_microstep: 5639.81 | bwd_allreduce_microstep: 237.24 | step_microstep: 18.64 [2025-04-26 01:38:34,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.58 | bwd: 5877.11 | bwd_inner: 5639.81 | bwd_allreduce: 237.26 | step: 18.64 18%|█▊ | 7320/41250 [17:40:59<81:45:10, 8.67s/it] {'loss': 0.0868, 'grad_norm': 0.9245887398719788, 'learning_rate': 3.7762601957161304e-05, 'epoch': 1.77} 18%|█▊ | 7320/41250 [17:40:59<81:45:10, 8.67s/it][2025-04-26 01:38:43,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:38:43,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.79 | bwd_microstep: 5719.66 | bwd_inner_microstep: 5672.91 | bwd_allreduce_microstep: 46.70 | step_microstep: 18.55 [2025-04-26 01:38:43,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.79 | bwd: 5719.67 | bwd_inner: 5672.91 | bwd_allreduce: 46.72 | step: 18.55 
18%|█▊ | 7321/41250 [17:41:08<81:40:56, 8.67s/it] {'loss': 0.0935, 'grad_norm': 1.2016100883483887, 'learning_rate': 3.776188019328801e-05, 'epoch': 1.77} 18%|█▊ | 7321/41250 [17:41:08<81:40:56, 8.67s/it][2025-04-26 01:38:51,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.95 [2025-04-26 01:38:51,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.84 | bwd_microstep: 5764.59 | bwd_inner_microstep: 5638.44 | bwd_allreduce_microstep: 126.10 | step_microstep: 18.79 [2025-04-26 01:38:51,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.84 | bwd: 5764.60 | bwd_inner: 5638.44 | bwd_allreduce: 126.12 | step: 18.79 18%|█▊ | 7322/41250 [17:41:17<81:41:20, 8.67s/it] {'loss': 0.5043, 'grad_norm': 3.09061336517334, 'learning_rate': 3.776115831991622e-05, 'epoch': 1.78} 18%|█▊ | 7322/41250 [17:41:17<81:41:20, 8.67s/it][2025-04-26 01:39:00,742] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.97 [2025-04-26 01:39:00,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.98 | bwd_microstep: 5840.36 | bwd_inner_microstep: 5673.34 | bwd_allreduce_microstep: 166.97 | step_microstep: 18.77 [2025-04-26 01:39:00,743] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.98 | bwd: 5840.37 | bwd_inner: 5673.34 | bwd_allreduce: 166.99 | step: 18.77 18%|█▊ | 7323/41250 [17:41:26<81:57:31, 8.70s/it] {'loss': 0.091, 'grad_norm': 1.1608799695968628, 'learning_rate': 3.7760436337050365e-05, 'epoch': 1.78} 18%|█▊ | 7323/41250 [17:41:26<81:57:31, 8.70s/it][2025-04-26 01:39:09,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:39:09,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.01 | bwd_microstep: 5771.53 | bwd_inner_microstep: 
5678.32 | bwd_allreduce_microstep: 93.16 | step_microstep: 18.56 [2025-04-26 01:39:09,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.01 | bwd: 5771.54 | bwd_inner: 5678.32 | bwd_allreduce: 93.18 | step: 18.56 18%|█▊ | 7324/41250 [17:41:34<81:56:43, 8.70s/it] {'loss': 0.3087, 'grad_norm': 2.04487943649292, 'learning_rate': 3.775971424469493e-05, 'epoch': 1.78} 18%|█▊ | 7324/41250 [17:41:34<81:56:43, 8.70s/it][2025-04-26 01:39:18,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:39:18,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.36 | bwd_microstep: 5741.82 | bwd_inner_microstep: 5701.08 | bwd_allreduce_microstep: 40.69 | step_microstep: 18.98 [2025-04-26 01:39:18,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.36 | bwd: 5741.83 | bwd_inner: 5701.08 | bwd_allreduce: 40.71 | step: 18.98 18%|█▊ | 7325/41250 [17:41:43<81:53:37, 8.69s/it] {'loss': 0.0921, 'grad_norm': 1.4723286628723145, 'learning_rate': 3.7758992042854345e-05, 'epoch': 1.78} 18%|█▊ | 7325/41250 [17:41:43<81:53:37, 8.69s/it][2025-04-26 01:39:26,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:39:26,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.64 | bwd_microstep: 5744.55 | bwd_inner_microstep: 5694.26 | bwd_allreduce_microstep: 50.25 | step_microstep: 18.54 [2025-04-26 01:39:26,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.64 | bwd: 5744.57 | bwd_inner: 5694.26 | bwd_allreduce: 50.26 | step: 18.54 18%|█▊ | 7326/41250 [17:41:52<81:51:03, 8.69s/it] {'loss': 0.0964, 'grad_norm': 1.7531070709228516, 'learning_rate': 3.775826973153307e-05, 'epoch': 1.78} 18%|█▊ | 7326/41250 [17:41:52<81:51:03, 8.69s/it][2025-04-26 01:39:35,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-26 01:39:35,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.36 | bwd_microstep: 5730.49 | bwd_inner_microstep: 5698.29 | bwd_allreduce_microstep: 32.16 | step_microstep: 19.12 [2025-04-26 01:39:35,450] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.36 | bwd: 5730.51 | bwd_inner: 5698.29 | bwd_allreduce: 32.18 | step: 19.12 18%|█▊ | 7327/41250 [17:42:00<81:46:49, 8.68s/it] {'loss': 0.0889, 'grad_norm': 1.5756007432937622, 'learning_rate': 3.775754731073555e-05, 'epoch': 1.78} 18%|█▊ | 7327/41250 [17:42:00<81:46:49, 8.68s/it][2025-04-26 01:39:44,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 01:39:44,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.12 | bwd_microstep: 5719.57 | bwd_inner_microstep: 5706.20 | bwd_allreduce_microstep: 13.31 | step_microstep: 19.43 [2025-04-26 01:39:44,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.12 | bwd: 5719.58 | bwd_inner: 5706.20 | bwd_allreduce: 13.34 | step: 19.43 18%|█▊ | 7328/41250 [17:42:09<81:42:53, 8.67s/it] {'loss': 0.2061, 'grad_norm': 3.030895471572876, 'learning_rate': 3.7756824780466244e-05, 'epoch': 1.78} 18%|█▊ | 7328/41250 [17:42:09<81:42:53, 8.67s/it][2025-04-26 01:39:52,781] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-26 01:39:52,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.56 | bwd_microstep: 5766.88 | bwd_inner_microstep: 5649.94 | bwd_allreduce_microstep: 116.89 | step_microstep: 18.76 [2025-04-26 01:39:52,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.56 | bwd: 5766.89 | bwd_inner: 5649.94 | bwd_allreduce: 116.91 | step: 18.77 18%|█▊ | 7329/41250 [17:42:18<81:43:05, 8.67s/it] 
{'loss': 0.1521, 'grad_norm': 1.7687112092971802, 'learning_rate': 3.7756102140729606e-05, 'epoch': 1.78} 18%|█▊ | 7329/41250 [17:42:18<81:43:05, 8.67s/it][2025-04-26 01:40:01,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:40:01,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.84 | bwd_microstep: 5747.12 | bwd_inner_microstep: 5689.14 | bwd_allreduce_microstep: 57.94 | step_microstep: 18.71 [2025-04-26 01:40:01,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.84 | bwd: 5747.14 | bwd_inner: 5689.14 | bwd_allreduce: 57.96 | step: 18.72 18%|█▊ | 7330/41250 [17:42:26<81:43:06, 8.67s/it] {'loss': 0.1472, 'grad_norm': 2.96606707572937, 'learning_rate': 3.7755379391530094e-05, 'epoch': 1.78} 18%|█▊ | 7330/41250 [17:42:26<81:43:06, 8.67s/it][2025-04-26 01:40:10,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-26 01:40:10,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.76 | bwd_microstep: 5850.96 | bwd_inner_microstep: 5705.62 | bwd_allreduce_microstep: 145.27 | step_microstep: 18.94 [2025-04-26 01:40:10,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.76 | bwd: 5850.98 | bwd_inner: 5705.62 | bwd_allreduce: 145.30 | step: 18.94 18%|█▊ | 7331/41250 [17:42:35<82:03:03, 8.71s/it] {'loss': 0.1273, 'grad_norm': 2.179964542388916, 'learning_rate': 3.775465653287216e-05, 'epoch': 1.78} 18%|█▊ | 7331/41250 [17:42:35<82:03:03, 8.71s/it][2025-04-26 01:40:18,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.17 [2025-04-26 01:40:18,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.06 | bwd_microstep: 5779.55 | bwd_inner_microstep: 5643.37 | bwd_allreduce_microstep: 136.12 | 
step_microstep: 19.44 [2025-04-26 01:40:18,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.06 | bwd: 5779.56 | bwd_inner: 5643.37 | bwd_allreduce: 136.14 | step: 19.44 18%|█▊ | 7332/41250 [17:42:44<81:59:17, 8.70s/it] {'loss': 0.1845, 'grad_norm': 6.729294776916504, 'learning_rate': 3.7753933564760256e-05, 'epoch': 1.78} 18%|█▊ | 7332/41250 [17:42:44<81:59:17, 8.70s/it][2025-04-26 01:40:27,699] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-26 01:40:27,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.06 | bwd_microstep: 5795.49 | bwd_inner_microstep: 5782.23 | bwd_allreduce_microstep: 13.21 | step_microstep: 19.09 [2025-04-26 01:40:27,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.06 | bwd: 5795.50 | bwd_inner: 5782.23 | bwd_allreduce: 13.23 | step: 19.09 18%|█▊ | 7333/41250 [17:42:53<82:10:07, 8.72s/it] {'loss': 0.0952, 'grad_norm': 1.022873044013977, 'learning_rate': 3.775321048719885e-05, 'epoch': 1.78} 18%|█▊ | 7333/41250 [17:42:53<82:10:07, 8.72s/it][2025-04-26 01:40:36,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 01:40:36,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.67 | bwd_microstep: 5810.10 | bwd_inner_microstep: 5655.57 | bwd_allreduce_microstep: 154.48 | step_microstep: 18.73 [2025-04-26 01:40:36,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.67 | bwd: 5810.11 | bwd_inner: 5655.57 | bwd_allreduce: 154.50 | step: 18.73 18%|█▊ | 7334/41250 [17:43:01<82:10:04, 8.72s/it] {'loss': 0.1773, 'grad_norm': 3.7946696281433105, 'learning_rate': 3.7752487300192386e-05, 'epoch': 1.78} 18%|█▊ | 7334/41250 [17:43:01<82:10:04, 8.72s/it][2025-04-26 01:40:45,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | 
optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:40:45,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.43 | bwd_microstep: 5743.66 | bwd_inner_microstep: 5696.31 | bwd_allreduce_microstep: 47.30 | step_microstep: 18.69 [2025-04-26 01:40:45,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.43 | bwd: 5743.67 | bwd_inner: 5696.31 | bwd_allreduce: 47.32 | step: 18.69 18%|█▊ | 7335/41250 [17:43:10<82:02:43, 8.71s/it] {'loss': 0.3864, 'grad_norm': 2.285442352294922, 'learning_rate': 3.775176400374533e-05, 'epoch': 1.78} 18%|█▊ | 7335/41250 [17:43:10<82:02:43, 8.71s/it][2025-04-26 01:40:53,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.11 [2025-04-26 01:40:53,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.78 | bwd_microstep: 5750.75 | bwd_inner_microstep: 5708.18 | bwd_allreduce_microstep: 42.51 | step_microstep: 19.65 [2025-04-26 01:40:53,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.78 | bwd: 5750.76 | bwd_inner: 5708.18 | bwd_allreduce: 42.54 | step: 19.65 18%|█▊ | 7336/41250 [17:43:19<81:59:26, 8.70s/it] {'loss': 0.1649, 'grad_norm': 1.5986312627792358, 'learning_rate': 3.775104059786215e-05, 'epoch': 1.78} 18%|█▊ | 7336/41250 [17:43:19<81:59:26, 8.70s/it][2025-04-26 01:41:02,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:41:02,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.43 | bwd_microstep: 5716.89 | bwd_inner_microstep: 5704.08 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.63 [2025-04-26 01:41:02,445] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.43 | bwd: 5716.90 | bwd_inner: 5704.07 | bwd_allreduce: 12.79 | step: 18.63 18%|█▊ | 7337/41250 [17:43:27<81:50:39, 8.69s/it] {'loss': 0.2217, 'grad_norm': 
2.691789388656616, 'learning_rate': 3.775031708254728e-05, 'epoch': 1.78} 18%|█▊ | 7337/41250 [17:43:27<81:50:39, 8.69s/it][2025-04-26 01:41:11,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.16 | optimizer_step: 1.02 [2025-04-26 01:41:11,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2894.13 | bwd_microstep: 5793.98 | bwd_inner_microstep: 5780.63 | bwd_allreduce_microstep: 13.29 | step_microstep: 19.11 [2025-04-26 01:41:11,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2894.13 | bwd: 5793.99 | bwd_inner: 5780.63 | bwd_allreduce: 13.32 | step: 19.12 18%|█▊ | 7338/41250 [17:43:36<82:04:45, 8.71s/it] {'loss': 0.2367, 'grad_norm': 2.8631021976470947, 'learning_rate': 3.774959345780521e-05, 'epoch': 1.78} 18%|█▊ | 7338/41250 [17:43:36<82:04:45, 8.71s/it][2025-04-26 01:41:19,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 01:41:19,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.96 | bwd_microstep: 5703.35 | bwd_inner_microstep: 5690.30 | bwd_allreduce_microstep: 13.01 | step_microstep: 18.63 [2025-04-26 01:41:19,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.96 | bwd: 5703.36 | bwd_inner: 5690.30 | bwd_allreduce: 13.02 | step: 18.63 18%|█▊ | 7339/41250 [17:43:45<81:50:08, 8.69s/it] {'loss': 0.1032, 'grad_norm': 1.7478529214859009, 'learning_rate': 3.774886972364039e-05, 'epoch': 1.78} 18%|█▊ | 7339/41250 [17:43:45<81:50:08, 8.69s/it][2025-04-26 01:41:28,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:41:28,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.10 | bwd_microstep: 5707.91 | bwd_inner_microstep: 5693.00 | bwd_allreduce_microstep: 14.86 | step_microstep: 18.85 [2025-04-26 
01:41:28,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.10 | bwd: 5707.92 | bwd_inner: 5693.00 | bwd_allreduce: 14.87 | step: 18.85 18%|█▊ | 7340/41250 [17:43:53<81:42:02, 8.67s/it] {'loss': 0.256, 'grad_norm': 1.8641027212142944, 'learning_rate': 3.774814588005727e-05, 'epoch': 1.78} 18%|█▊ | 7340/41250 [17:43:53<81:42:02, 8.67s/it][2025-04-26 01:41:37,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 01:41:37,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.47 | bwd_microstep: 5715.13 | bwd_inner_microstep: 5657.08 | bwd_allreduce_microstep: 58.00 | step_microstep: 18.55 [2025-04-26 01:41:37,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.47 | bwd: 5715.14 | bwd_inner: 5657.08 | bwd_allreduce: 58.02 | step: 18.55 18%|█▊ | 7341/41250 [17:44:02<81:35:15, 8.66s/it] {'loss': 0.1784, 'grad_norm': 1.7995364665985107, 'learning_rate': 3.774742192706032e-05, 'epoch': 1.78} 18%|█▊ | 7341/41250 [17:44:02<81:35:15, 8.66s/it][2025-04-26 01:41:45,753] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.02 [2025-04-26 01:41:45,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.00 | bwd_microstep: 5701.15 | bwd_inner_microstep: 5688.16 | bwd_allreduce_microstep: 12.94 | step_microstep: 19.41 [2025-04-26 01:41:45,754] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.00 | bwd: 5701.17 | bwd_inner: 5688.16 | bwd_allreduce: 12.96 | step: 19.42 18%|█▊ | 7342/41250 [17:44:11<81:30:23, 8.65s/it] {'loss': 0.1308, 'grad_norm': 1.657130479812622, 'learning_rate': 3.774669786465401e-05, 'epoch': 1.78} 18%|█▊ | 7342/41250 [17:44:11<81:30:23, 8.65s/it][2025-04-26 01:41:54,451] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.00 | optimizer_step: 0.89 
[2025-04-26 01:41:54,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.97 | bwd_microstep: 5788.64 | bwd_inner_microstep: 5663.20 | bwd_allreduce_microstep: 125.39 | step_microstep: 18.39 [2025-04-26 01:41:54,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.97 | bwd: 5788.65 | bwd_inner: 5663.20 | bwd_allreduce: 125.41 | step: 18.39 18%|█▊ | 7343/41250 [17:44:19<81:37:45, 8.67s/it] {'loss': 0.1941, 'grad_norm': 3.9189586639404297, 'learning_rate': 3.77459736928428e-05, 'epoch': 1.78} 18%|█▊ | 7343/41250 [17:44:19<81:37:45, 8.67s/it][2025-04-26 01:42:03,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:42:03,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.58 | bwd_microstep: 5793.53 | bwd_inner_microstep: 5653.50 | bwd_allreduce_microstep: 139.99 | step_microstep: 18.61 [2025-04-26 01:42:03,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.58 | bwd: 5793.55 | bwd_inner: 5653.50 | bwd_allreduce: 140.00 | step: 18.61 18%|█▊ | 7344/41250 [17:44:28<81:43:41, 8.68s/it] {'loss': 0.4, 'grad_norm': 2.202507972717285, 'learning_rate': 3.774524941163115e-05, 'epoch': 1.78} 18%|█▊ | 7344/41250 [17:44:28<81:43:41, 8.68s/it][2025-04-26 01:42:11,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 01:42:11,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.06 | bwd_microstep: 5796.71 | bwd_inner_microstep: 5658.22 | bwd_allreduce_microstep: 138.45 | step_microstep: 18.55 [2025-04-26 01:42:11,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.06 | bwd: 5796.73 | bwd_inner: 5658.22 | bwd_allreduce: 138.47 | step: 18.55 18%|█▊ | 7345/41250 [17:44:37<81:47:57, 8.69s/it] {'loss': 0.1259, 'grad_norm': 3.3318843841552734, 'learning_rate': 
3.774452502102353e-05, 'epoch': 1.78} 18%|█▊ | 7345/41250 [17:44:37<81:47:57, 8.69s/it][2025-04-26 01:42:20,536] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:42:20,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.38 | bwd_microstep: 5773.04 | bwd_inner_microstep: 5654.42 | bwd_allreduce_microstep: 118.57 | step_microstep: 18.70 [2025-04-26 01:42:20,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.38 | bwd: 5773.05 | bwd_inner: 5654.42 | bwd_allreduce: 118.59 | step: 18.70 18%|█▊ | 7346/41250 [17:44:45<81:47:07, 8.68s/it] {'loss': 0.1034, 'grad_norm': 1.0434978008270264, 'learning_rate': 3.7743800521024395e-05, 'epoch': 1.78} 18%|█▊ | 7346/41250 [17:44:45<81:47:07, 8.68s/it][2025-04-26 01:42:29,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 1.10 [2025-04-26 01:42:29,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.85 | bwd_microstep: 5769.90 | bwd_inner_microstep: 5703.52 | bwd_allreduce_microstep: 66.33 | step_microstep: 19.35 [2025-04-26 01:42:29,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.85 | bwd: 5769.91 | bwd_inner: 5703.52 | bwd_allreduce: 66.35 | step: 19.35 18%|█▊ | 7347/41250 [17:44:54<81:50:45, 8.69s/it] {'loss': 0.1765, 'grad_norm': 1.9011317491531372, 'learning_rate': 3.774307591163823e-05, 'epoch': 1.78} 18%|█▊ | 7347/41250 [17:44:54<81:50:45, 8.69s/it][2025-04-26 01:42:37,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-26 01:42:37,893] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.31 | bwd_microstep: 5714.25 | bwd_inner_microstep: 5701.37 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.48 [2025-04-26 01:42:37,893] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.31 | bwd: 5714.27 | bwd_inner: 5701.37 | bwd_allreduce: 12.86 | step: 18.48 18%|█▊ | 7348/41250 [17:45:03<81:43:09, 8.68s/it] {'loss': 0.0467, 'grad_norm': 1.5716816186904907, 'learning_rate': 3.774235119286949e-05, 'epoch': 1.78} 18%|█▊ | 7348/41250 [17:45:03<81:43:09, 8.68s/it][2025-04-26 01:42:46,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:42:46,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.16 | bwd_microstep: 5704.88 | bwd_inner_microstep: 5644.68 | bwd_allreduce_microstep: 60.16 | step_microstep: 18.51 [2025-04-26 01:42:46,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.16 | bwd: 5704.89 | bwd_inner: 5644.68 | bwd_allreduce: 60.17 | step: 18.51 18%|█▊ | 7349/41250 [17:45:11<81:32:15, 8.66s/it] {'loss': 0.1703, 'grad_norm': 1.3081060647964478, 'learning_rate': 3.774162636472264e-05, 'epoch': 1.78} 18%|█▊ | 7349/41250 [17:45:11<81:32:15, 8.66s/it][2025-04-26 01:42:55,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 01:42:55,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.23 | bwd_microstep: 5706.60 | bwd_inner_microstep: 5693.91 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.35 [2025-04-26 01:42:55,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.23 | bwd: 5706.61 | bwd_inner: 5693.91 | bwd_allreduce: 12.66 | step: 18.35 18%|█▊ | 7350/41250 [17:45:20<81:28:02, 8.65s/it] {'loss': 0.1894, 'grad_norm': 1.5274369716644287, 'learning_rate': 3.774090142720216e-05, 'epoch': 1.78} 18%|█▊ | 7350/41250 [17:45:20<81:28:02, 8.65s/it][2025-04-26 01:43:03,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 
01:43:03,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.03 | bwd_microstep: 5770.66 | bwd_inner_microstep: 5687.10 | bwd_allreduce_microstep: 83.52 | step_microstep: 18.53 [2025-04-26 01:43:03,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.03 | bwd: 5770.67 | bwd_inner: 5687.10 | bwd_allreduce: 83.54 | step: 18.54 18%|█▊ | 7351/41250 [17:45:29<81:35:25, 8.66s/it] {'loss': 0.0495, 'grad_norm': 0.5996806621551514, 'learning_rate': 3.7740176380312504e-05, 'epoch': 1.78} 18%|█▊ | 7351/41250 [17:45:29<81:35:25, 8.66s/it][2025-04-26 01:43:12,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:43:12,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.92 | bwd_microstep: 5697.76 | bwd_inner_microstep: 5685.14 | bwd_allreduce_microstep: 12.57 | step_microstep: 18.60 [2025-04-26 01:43:12,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.92 | bwd: 5697.77 | bwd_inner: 5685.14 | bwd_allreduce: 12.58 | step: 18.61 18%|█▊ | 7352/41250 [17:45:37<81:28:43, 8.65s/it] {'loss': 0.3292, 'grad_norm': 1.6043404340744019, 'learning_rate': 3.773945122405815e-05, 'epoch': 1.78} 18%|█▊ | 7352/41250 [17:45:37<81:28:43, 8.65s/it][2025-04-26 01:43:21,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 1.10 [2025-04-26 01:43:21,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.25 | bwd_microstep: 5786.84 | bwd_inner_microstep: 5773.78 | bwd_allreduce_microstep: 13.00 | step_microstep: 19.38 [2025-04-26 01:43:21,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.25 | bwd: 5786.85 | bwd_inner: 5773.78 | bwd_allreduce: 13.02 | step: 19.39 18%|█▊ | 7353/41250 [17:45:46<81:46:35, 8.68s/it] {'loss': 0.0612, 'grad_norm': 0.5108978748321533, 'learning_rate': 3.773872595844357e-05, 
'epoch': 1.78} 18%|█▊ | 7353/41250 [17:45:46<81:46:35, 8.68s/it][2025-04-26 01:43:29,866] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-26 01:43:29,866] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.32 | bwd_microstep: 5705.58 | bwd_inner_microstep: 5692.85 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.26 [2025-04-26 01:43:29,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.32 | bwd: 5705.59 | bwd_inner: 5692.85 | bwd_allreduce: 12.70 | step: 19.26 18%|█▊ | 7354/41250 [17:45:55<81:39:45, 8.67s/it] {'loss': 0.2674, 'grad_norm': 4.1430583000183105, 'learning_rate': 3.7738000583473235e-05, 'epoch': 1.78} 18%|█▊ | 7354/41250 [17:45:55<81:39:45, 8.67s/it][2025-04-26 01:43:38,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:43:38,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.24 | bwd_microstep: 5783.62 | bwd_inner_microstep: 5651.76 | bwd_allreduce_microstep: 131.81 | step_microstep: 18.64 [2025-04-26 01:43:38,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.24 | bwd: 5783.64 | bwd_inner: 5651.76 | bwd_allreduce: 131.83 | step: 18.64 18%|█▊ | 7355/41250 [17:46:03<81:43:12, 8.68s/it] {'loss': 0.0699, 'grad_norm': 0.8117111325263977, 'learning_rate': 3.773727509915162e-05, 'epoch': 1.78} 18%|█▊ | 7355/41250 [17:46:03<81:43:12, 8.68s/it][2025-04-26 01:43:47,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-26 01:43:47,192] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.57 | bwd_microstep: 5692.37 | bwd_inner_microstep: 5679.53 | bwd_allreduce_microstep: 12.79 | step_microstep: 19.03 [2025-04-26 01:43:47,192] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2853.57 | bwd: 5692.38 | bwd_inner: 5679.53 | bwd_allreduce: 12.81 | step: 19.03 18%|█▊ | 7356/41250 [17:46:12<81:34:42, 8.66s/it] {'loss': 0.0572, 'grad_norm': 2.8607735633850098, 'learning_rate': 3.773654950548319e-05, 'epoch': 1.78} 18%|█▊ | 7356/41250 [17:46:12<81:34:42, 8.66s/it][2025-04-26 01:43:55,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:43:55,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.35 | bwd_microstep: 5709.25 | bwd_inner_microstep: 5696.65 | bwd_allreduce_microstep: 12.55 | step_microstep: 18.76 [2025-04-26 01:43:55,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.35 | bwd: 5709.27 | bwd_inner: 5696.65 | bwd_allreduce: 12.57 | step: 18.76 18%|█▊ | 7357/41250 [17:46:21<81:32:20, 8.66s/it] {'loss': 0.036, 'grad_norm': 0.3244190812110901, 'learning_rate': 3.7735823802472416e-05, 'epoch': 1.78} 18%|█▊ | 7357/41250 [17:46:21<81:32:20, 8.66s/it][2025-04-26 01:44:04,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:44:04,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.33 | bwd_microstep: 5763.02 | bwd_inner_microstep: 5641.95 | bwd_allreduce_microstep: 121.03 | step_microstep: 18.94 [2025-04-26 01:44:04,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.34 | bwd: 5763.04 | bwd_inner: 5641.95 | bwd_allreduce: 121.05 | step: 18.94 18%|█▊ | 7358/41250 [17:46:29<81:34:00, 8.66s/it] {'loss': 0.0307, 'grad_norm': 0.6103740930557251, 'learning_rate': 3.7735097990123784e-05, 'epoch': 1.78} 18%|█▊ | 7358/41250 [17:46:29<81:34:00, 8.66s/it][2025-04-26 01:44:13,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 01:44:13,199] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.75 | bwd_microstep: 5747.79 | bwd_inner_microstep: 5688.87 | bwd_allreduce_microstep: 58.86 | step_microstep: 19.37 [2025-04-26 01:44:13,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.75 | bwd: 5747.80 | bwd_inner: 5688.87 | bwd_allreduce: 58.89 | step: 19.37 18%|█▊ | 7359/41250 [17:46:38<81:37:11, 8.67s/it] {'loss': 0.1423, 'grad_norm': 0.9627006649971008, 'learning_rate': 3.773437206844175e-05, 'epoch': 1.78} 18%|█▊ | 7359/41250 [17:46:38<81:37:11, 8.67s/it][2025-04-26 01:44:21,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:44:21,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.20 | bwd_microstep: 5692.84 | bwd_inner_microstep: 5680.10 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.76 [2025-04-26 01:44:21,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.20 | bwd: 5692.86 | bwd_inner: 5680.10 | bwd_allreduce: 12.72 | step: 18.76 18%|█▊ | 7360/41250 [17:46:47<81:28:38, 8.66s/it] {'loss': 0.2296, 'grad_norm': 2.1557741165161133, 'learning_rate': 3.773364603743081e-05, 'epoch': 1.78} 18%|█▊ | 7360/41250 [17:46:47<81:28:38, 8.66s/it][2025-04-26 01:44:30,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:44:30,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.20 | bwd_microstep: 5701.07 | bwd_inner_microstep: 5688.35 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.03 [2025-04-26 01:44:30,449] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.20 | bwd: 5701.09 | bwd_inner: 5688.35 | bwd_allreduce: 12.70 | step: 19.03 18%|█▊ | 7361/41250 [17:46:55<81:24:09, 8.65s/it] {'loss': 0.1294, 'grad_norm': 3.329404830932617, 'learning_rate': 3.7732919897095425e-05, 'epoch': 1.78} 18%|█▊ | 
7361/41250 [17:46:55<81:24:09, 8.65s/it][2025-04-26 01:44:39,075] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:44:39,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.09 | bwd_microstep: 5695.59 | bwd_inner_microstep: 5682.46 | bwd_allreduce_microstep: 13.08 | step_microstep: 19.13 [2025-04-26 01:44:39,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.09 | bwd: 5695.61 | bwd_inner: 5682.46 | bwd_allreduce: 13.10 | step: 19.13 18%|█▊ | 7362/41250 [17:47:04<81:20:29, 8.64s/it] {'loss': 0.2198, 'grad_norm': 2.3834402561187744, 'learning_rate': 3.773219364744008e-05, 'epoch': 1.78} 18%|█▊ | 7362/41250 [17:47:04<81:20:29, 8.64s/it][2025-04-26 01:44:47,727] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:44:47,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.49 | bwd_microstep: 5718.88 | bwd_inner_microstep: 5678.92 | bwd_allreduce_microstep: 39.91 | step_microstep: 18.81 [2025-04-26 01:44:47,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.49 | bwd: 5718.89 | bwd_inner: 5678.92 | bwd_allreduce: 39.93 | step: 18.82 18%|█▊ | 7363/41250 [17:47:13<81:22:18, 8.64s/it] {'loss': 0.1845, 'grad_norm': 1.297350525856018, 'learning_rate': 3.773146728846925e-05, 'epoch': 1.78} 18%|█▊ | 7363/41250 [17:47:13<81:22:18, 8.64s/it][2025-04-26 01:44:56,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-26 01:44:56,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.58 | bwd_microstep: 5714.43 | bwd_inner_microstep: 5701.69 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.87 [2025-04-26 01:44:56,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.58 | bwd: 
5714.44 | bwd_inner: 5701.69 | bwd_allreduce: 12.71 | step: 18.87 18%|█▊ | 7364/41250 [17:47:21<81:23:20, 8.65s/it] {'loss': 0.0957, 'grad_norm': 2.115776538848877, 'learning_rate': 3.773074082018741e-05, 'epoch': 1.79} 18%|█▊ | 7364/41250 [17:47:21<81:23:20, 8.65s/it][2025-04-26 01:45:04,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:45:04,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.63 | bwd_microstep: 5701.76 | bwd_inner_microstep: 5653.36 | bwd_allreduce_microstep: 48.35 | step_microstep: 18.59 [2025-04-26 01:45:04,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.63 | bwd: 5701.78 | bwd_inner: 5653.36 | bwd_allreduce: 48.37 | step: 18.59 18%|█▊ | 7365/41250 [17:47:30<81:17:00, 8.64s/it] {'loss': 0.0796, 'grad_norm': 1.1507405042648315, 'learning_rate': 3.7730014242599036e-05, 'epoch': 1.79} 18%|█▊ | 7365/41250 [17:47:30<81:17:00, 8.64s/it][2025-04-26 01:45:13,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.95 [2025-04-26 01:45:13,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.68 | bwd_microstep: 5757.71 | bwd_inner_microstep: 5642.47 | bwd_allreduce_microstep: 115.20 | step_microstep: 18.71 [2025-04-26 01:45:13,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.68 | bwd: 5757.72 | bwd_inner: 5642.47 | bwd_allreduce: 115.21 | step: 18.71 18%|█▊ | 7366/41250 [17:47:38<81:21:46, 8.64s/it] {'loss': 0.1317, 'grad_norm': 2.4989876747131348, 'learning_rate': 3.772928755570862e-05, 'epoch': 1.79} 18%|█▊ | 7366/41250 [17:47:38<81:21:46, 8.64s/it][2025-04-26 01:45:22,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:45:22,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2845.17 | bwd_microstep: 5747.33 | bwd_inner_microstep: 5677.50 | bwd_allreduce_microstep: 69.79 | step_microstep: 18.38 [2025-04-26 01:45:22,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.17 | bwd: 5747.34 | bwd_inner: 5677.50 | bwd_allreduce: 69.81 | step: 18.38 18%|█▊ | 7367/41250 [17:47:47<81:27:26, 8.65s/it] {'loss': 0.1029, 'grad_norm': 2.081782817840576, 'learning_rate': 3.772856075952063e-05, 'epoch': 1.79} 18%|█▊ | 7367/41250 [17:47:47<81:27:26, 8.65s/it][2025-04-26 01:45:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:45:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.97 | bwd_microstep: 5758.58 | bwd_inner_microstep: 5651.09 | bwd_allreduce_microstep: 107.45 | step_microstep: 18.49 [2025-04-26 01:45:31,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.97 | bwd: 5758.59 | bwd_inner: 5651.09 | bwd_allreduce: 107.47 | step: 18.49 18%|█▊ | 7368/41250 [17:47:56<81:30:14, 8.66s/it] {'loss': 0.1357, 'grad_norm': 1.312106966972351, 'learning_rate': 3.772783385403955e-05, 'epoch': 1.79} 18%|█▊ | 7368/41250 [17:47:56<81:30:14, 8.66s/it][2025-04-26 01:45:39,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:45:39,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.91 | bwd_microstep: 5785.69 | bwd_inner_microstep: 5772.87 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.59 [2025-04-26 01:45:39,758] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.91 | bwd: 5785.70 | bwd_inner: 5772.87 | bwd_allreduce: 12.79 | step: 18.59 18%|█▊ | 7369/41250 [17:48:05<81:45:50, 8.69s/it] {'loss': 0.2534, 'grad_norm': 3.2068629264831543, 'learning_rate': 3.772710683926985e-05, 'epoch': 1.79} 18%|█▊ | 7369/41250 [17:48:05<81:45:50, 
8.69s/it][2025-04-26 01:45:48,375] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:45:48,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.43 | bwd_microstep: 5692.00 | bwd_inner_microstep: 5679.19 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.57 [2025-04-26 01:45:48,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.43 | bwd: 5692.01 | bwd_inner: 5679.19 | bwd_allreduce: 12.78 | step: 18.57 18%|█▊ | 7370/41250 [17:48:13<81:33:46, 8.67s/it] {'loss': 0.0711, 'grad_norm': 0.9003294110298157, 'learning_rate': 3.772637971521604e-05, 'epoch': 1.79} 18%|█▊ | 7370/41250 [17:48:13<81:33:46, 8.67s/it][2025-04-26 01:45:57,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:45:57,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.77 | bwd_microstep: 5733.16 | bwd_inner_microstep: 5691.93 | bwd_allreduce_microstep: 41.19 | step_microstep: 18.51 [2025-04-26 01:45:57,037] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.77 | bwd: 5733.17 | bwd_inner: 5691.93 | bwd_allreduce: 41.20 | step: 18.51 18%|█▊ | 7371/41250 [17:48:22<81:32:49, 8.67s/it] {'loss': 0.1231, 'grad_norm': 2.0468478202819824, 'learning_rate': 3.772565248188258e-05, 'epoch': 1.79} 18%|█▊ | 7371/41250 [17:48:22<81:32:49, 8.67s/it][2025-04-26 01:46:05,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.08 | optimizer_step: 1.13 [2025-04-26 01:46:05,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.50 | bwd_microstep: 5710.10 | bwd_inner_microstep: 5697.25 | bwd_allreduce_microstep: 12.79 | step_microstep: 19.79 [2025-04-26 01:46:05,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.50 | bwd: 5710.11 | bwd_inner: 5697.25 | 
bwd_allreduce: 12.82 | step: 19.79 18%|█▊ | 7372/41250 [17:48:31<81:29:40, 8.66s/it] {'loss': 0.1718, 'grad_norm': 1.3112399578094482, 'learning_rate': 3.772492513927396e-05, 'epoch': 1.79} 18%|█▊ | 7372/41250 [17:48:31<81:29:40, 8.66s/it][2025-04-26 01:46:14,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:46:14,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.85 | bwd_microstep: 5767.77 | bwd_inner_microstep: 5689.00 | bwd_allreduce_microstep: 78.72 | step_microstep: 18.35 [2025-04-26 01:46:14,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.85 | bwd: 5767.78 | bwd_inner: 5689.00 | bwd_allreduce: 78.73 | step: 18.35 18%|█▊ | 7373/41250 [17:48:39<81:35:57, 8.67s/it] {'loss': 0.0504, 'grad_norm': 1.0288970470428467, 'learning_rate': 3.772419768739466e-05, 'epoch': 1.79} 18%|█▊ | 7373/41250 [17:48:39<81:35:57, 8.67s/it][2025-04-26 01:46:23,049] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-26 01:46:23,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.95 | bwd_microstep: 5764.36 | bwd_inner_microstep: 5643.28 | bwd_allreduce_microstep: 121.04 | step_microstep: 18.92 [2025-04-26 01:46:23,050] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.95 | bwd: 5764.38 | bwd_inner: 5643.28 | bwd_allreduce: 121.06 | step: 18.92 18%|█▊ | 7374/41250 [17:48:48<81:35:10, 8.67s/it] {'loss': 0.1922, 'grad_norm': 2.811234474182129, 'learning_rate': 3.7723470126249167e-05, 'epoch': 1.79} 18%|█▊ | 7374/41250 [17:48:48<81:35:10, 8.67s/it][2025-04-26 01:46:31,681] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:46:31,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.78 | 
bwd_microstep: 5709.31 | bwd_inner_microstep: 5685.41 | bwd_allreduce_microstep: 23.86 | step_microstep: 18.72 [2025-04-26 01:46:31,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.78 | bwd: 5709.33 | bwd_inner: 5685.40 | bwd_allreduce: 23.88 | step: 18.72 18%|█▊ | 7375/41250 [17:48:57<81:28:39, 8.66s/it] {'loss': 0.167, 'grad_norm': 1.302159309387207, 'learning_rate': 3.772274245584197e-05, 'epoch': 1.79} 18%|█▊ | 7375/41250 [17:48:57<81:28:39, 8.66s/it][2025-04-26 01:46:40,446] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 01:46:40,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2892.39 | bwd_microstep: 5789.60 | bwd_inner_microstep: 5776.93 | bwd_allreduce_microstep: 12.62 | step_microstep: 19.01 [2025-04-26 01:46:40,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2892.39 | bwd: 5789.61 | bwd_inner: 5776.93 | bwd_allreduce: 12.64 | step: 19.01 18%|█▊ | 7376/41250 [17:49:05<81:46:22, 8.69s/it] {'loss': 0.148, 'grad_norm': 2.0100724697113037, 'learning_rate': 3.7722014676177546e-05, 'epoch': 1.79} 18%|█▊ | 7376/41250 [17:49:05<81:46:22, 8.69s/it][2025-04-26 01:46:49,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 01:46:49,131] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.27 | bwd_microstep: 5773.14 | bwd_inner_microstep: 5647.90 | bwd_allreduce_microstep: 125.18 | step_microstep: 18.75 [2025-04-26 01:46:49,131] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.27 | bwd: 5773.15 | bwd_inner: 5647.90 | bwd_allreduce: 125.20 | step: 18.75 18%|█▊ | 7377/41250 [17:49:14<81:45:05, 8.69s/it] {'loss': 0.0312, 'grad_norm': 1.6631090641021729, 'learning_rate': 3.772128678726039e-05, 'epoch': 1.79} 18%|█▊ | 7377/41250 [17:49:14<81:45:05, 8.69s/it][2025-04-26 01:46:57,834] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:46:57,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.28 | bwd_microstep: 5789.92 | bwd_inner_microstep: 5661.35 | bwd_allreduce_microstep: 128.53 | step_microstep: 18.70 [2025-04-26 01:46:57,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.28 | bwd: 5789.94 | bwd_inner: 5661.35 | bwd_allreduce: 128.55 | step: 18.70 18%|█▊ | 7378/41250 [17:49:23<81:47:35, 8.69s/it] {'loss': 0.1148, 'grad_norm': 1.655745506286621, 'learning_rate': 3.772055878909499e-05, 'epoch': 1.79} 18%|█▊ | 7378/41250 [17:49:23<81:47:35, 8.69s/it][2025-04-26 01:47:06,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:47:06,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.42 | bwd_microstep: 5767.09 | bwd_inner_microstep: 5656.71 | bwd_allreduce_microstep: 110.33 | step_microstep: 18.51 [2025-04-26 01:47:06,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.42 | bwd: 5767.11 | bwd_inner: 5656.71 | bwd_allreduce: 110.35 | step: 18.51 18%|█▊ | 7379/41250 [17:49:31<81:46:23, 8.69s/it] {'loss': 0.1655, 'grad_norm': 3.676917552947998, 'learning_rate': 3.771983068168583e-05, 'epoch': 1.79} 18%|█▊ | 7379/41250 [17:49:31<81:46:23, 8.69s/it][2025-04-26 01:47:15,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-26 01:47:15,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.90 | bwd_microstep: 5776.71 | bwd_inner_microstep: 5655.87 | bwd_allreduce_microstep: 120.79 | step_microstep: 19.16 [2025-04-26 01:47:15,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.90 | bwd: 5776.72 | bwd_inner: 5655.87 | bwd_allreduce: 120.81 | step: 19.16 
18%|█▊ | 7380/41250 [17:49:40<81:45:55, 8.69s/it] {'loss': 0.0815, 'grad_norm': 2.4421563148498535, 'learning_rate': 3.771910246503739e-05, 'epoch': 1.79} 18%|█▊ | 7380/41250 [17:49:40<81:45:55, 8.69s/it][2025-04-26 01:47:23,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:47:23,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.67 | bwd_microstep: 5770.68 | bwd_inner_microstep: 5663.07 | bwd_allreduce_microstep: 107.57 | step_microstep: 18.41 [2025-04-26 01:47:23,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.67 | bwd: 5770.69 | bwd_inner: 5663.07 | bwd_allreduce: 107.59 | step: 18.41 18%|█▊ | 7381/41250 [17:49:49<81:44:39, 8.69s/it] {'loss': 0.4239, 'grad_norm': 7.668618679046631, 'learning_rate': 3.771837413915418e-05, 'epoch': 1.79} 18%|█▊ | 7381/41250 [17:49:49<81:44:39, 8.69s/it][2025-04-26 01:47:32,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-26 01:47:32,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.11 | bwd_microstep: 5716.44 | bwd_inner_microstep: 5664.44 | bwd_allreduce_microstep: 51.96 | step_microstep: 18.57 [2025-04-26 01:47:32,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.11 | bwd: 5716.45 | bwd_inner: 5664.44 | bwd_allreduce: 51.98 | step: 18.57 18%|█▊ | 7382/41250 [17:49:57<81:35:03, 8.67s/it] {'loss': 0.1183, 'grad_norm': 1.5632778406143188, 'learning_rate': 3.7717645704040676e-05, 'epoch': 1.79} 18%|█▊ | 7382/41250 [17:49:57<81:35:03, 8.67s/it][2025-04-26 01:47:41,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 01:47:41,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.09 | bwd_microstep: 5759.93 | bwd_inner_microstep: 
5706.46 | bwd_allreduce_microstep: 53.42 | step_microstep: 18.70 [2025-04-26 01:47:41,244] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.09 | bwd: 5759.94 | bwd_inner: 5706.46 | bwd_allreduce: 53.44 | step: 18.70 18%|█▊ | 7383/41250 [17:50:06<81:42:15, 8.69s/it] {'loss': 0.4174, 'grad_norm': 2.186918020248413, 'learning_rate': 3.7716917159701365e-05, 'epoch': 1.79} 18%|█▊ | 7383/41250 [17:50:06<81:42:15, 8.69s/it][2025-04-26 01:47:49,903] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:47:49,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.31 | bwd_microstep: 5716.20 | bwd_inner_microstep: 5703.24 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.41 [2025-04-26 01:47:49,904] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.31 | bwd: 5716.21 | bwd_inner: 5703.24 | bwd_allreduce: 12.93 | step: 18.41 18%|█▊ | 7384/41250 [17:50:15<81:38:17, 8.68s/it] {'loss': 0.2937, 'grad_norm': 2.2689244747161865, 'learning_rate': 3.771618850614075e-05, 'epoch': 1.79} 18%|█▊ | 7384/41250 [17:50:15<81:38:17, 8.68s/it][2025-04-26 01:47:58,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 01:47:58,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.37 | bwd_microstep: 5701.76 | bwd_inner_microstep: 5682.86 | bwd_allreduce_microstep: 18.86 | step_microstep: 19.07 [2025-04-26 01:47:58,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.37 | bwd: 5701.78 | bwd_inner: 5682.86 | bwd_allreduce: 18.87 | step: 19.08 18%|█▊ | 7385/41250 [17:50:23<81:31:06, 8.67s/it] {'loss': 0.2345, 'grad_norm': 3.1348259449005127, 'learning_rate': 3.7715459743363314e-05, 'epoch': 1.79} 18%|█▊ | 7385/41250 [17:50:23<81:31:06, 8.67s/it][2025-04-26 01:48:07,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| optimizer_allgather: 5.48 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 01:48:07,337] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.78 | bwd_microstep: 5868.47 | bwd_inner_microstep: 5687.84 | bwd_allreduce_microstep: 180.58 | step_microstep: 19.11 [2025-04-26 01:48:07,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.79 | bwd: 5868.48 | bwd_inner: 5687.84 | bwd_allreduce: 180.60 | step: 19.11 18%|█▊ | 7386/41250 [17:50:32<81:52:50, 8.70s/it] {'loss': 0.1433, 'grad_norm': 1.9279239177703857, 'learning_rate': 3.771473087137356e-05, 'epoch': 1.79} 18%|█▊ | 7386/41250 [17:50:32<81:52:50, 8.70s/it][2025-04-26 01:48:15,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 01:48:15,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.96 | bwd_microstep: 5693.87 | bwd_inner_microstep: 5661.27 | bwd_allreduce_microstep: 32.55 | step_microstep: 18.75 [2025-04-26 01:48:15,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.96 | bwd: 5693.88 | bwd_inner: 5661.27 | bwd_allreduce: 32.57 | step: 18.76 18%|█▊ | 7387/41250 [17:50:41<81:38:31, 8.68s/it] {'loss': 0.0522, 'grad_norm': 1.8846468925476074, 'learning_rate': 3.771400189017596e-05, 'epoch': 1.79} 18%|█▊ | 7387/41250 [17:50:41<81:38:31, 8.68s/it][2025-04-26 01:48:24,642] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:48:24,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.28 | bwd_microstep: 5758.38 | bwd_inner_microstep: 5661.56 | bwd_allreduce_microstep: 96.78 | step_microstep: 18.56 [2025-04-26 01:48:24,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.28 | bwd: 5758.40 | bwd_inner: 5661.56 | bwd_allreduce: 96.80 | step: 18.57 18%|█▊ | 7388/41250 [17:50:49<81:39:09, 8.68s/it] 
{'loss': 0.0627, 'grad_norm': 0.9422652125358582, 'learning_rate': 3.771327279977504e-05, 'epoch': 1.79} 18%|█▊ | 7388/41250 [17:50:49<81:39:09, 8.68s/it][2025-04-26 01:48:33,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-26 01:48:33,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.09 | bwd_microstep: 5780.09 | bwd_inner_microstep: 5767.36 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.41 [2025-04-26 01:48:33,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.09 | bwd: 5780.11 | bwd_inner: 5767.36 | bwd_allreduce: 12.71 | step: 18.42 18%|█▊ | 7389/41250 [17:50:58<81:50:10, 8.70s/it] {'loss': 0.0773, 'grad_norm': 4.236123085021973, 'learning_rate': 3.7712543600175264e-05, 'epoch': 1.79} 18%|█▊ | 7389/41250 [17:50:58<81:50:10, 8.70s/it][2025-04-26 01:48:42,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:48:42,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.36 | bwd_microstep: 5743.45 | bwd_inner_microstep: 5678.59 | bwd_allreduce_microstep: 64.81 | step_microstep: 18.72 [2025-04-26 01:48:42,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.36 | bwd: 5743.46 | bwd_inner: 5678.59 | bwd_allreduce: 64.83 | step: 18.72 18%|█▊ | 7390/41250 [17:51:07<81:46:52, 8.69s/it] {'loss': 0.1573, 'grad_norm': 2.5160255432128906, 'learning_rate': 3.771181429138115e-05, 'epoch': 1.79} 18%|█▊ | 7390/41250 [17:51:07<81:46:52, 8.69s/it][2025-04-26 01:48:50,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:48:50,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.61 | bwd_microstep: 5752.37 | bwd_inner_microstep: 5650.08 | bwd_allreduce_microstep: 102.24 | 
step_microstep: 18.74 [2025-04-26 01:48:50,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.61 | bwd: 5752.38 | bwd_inner: 5650.08 | bwd_allreduce: 102.26 | step: 18.74 18%|█▊ | 7391/41250 [17:51:16<81:43:23, 8.69s/it] {'loss': 0.0443, 'grad_norm': 1.2889180183410645, 'learning_rate': 3.771108487339718e-05, 'epoch': 1.79} 18%|█▊ | 7391/41250 [17:51:16<81:43:23, 8.69s/it][2025-04-26 01:48:59,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-26 01:48:59,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.78 | bwd_microstep: 5785.73 | bwd_inner_microstep: 5648.33 | bwd_allreduce_microstep: 137.35 | step_microstep: 18.82 [2025-04-26 01:48:59,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.78 | bwd: 5785.74 | bwd_inner: 5648.33 | bwd_allreduce: 137.37 | step: 18.82 18%|█▊ | 7392/41250 [17:51:24<81:44:22, 8.69s/it] {'loss': 0.2543, 'grad_norm': 1.6859601736068726, 'learning_rate': 3.7710355346227856e-05, 'epoch': 1.79} 18%|█▊ | 7392/41250 [17:51:24<81:44:22, 8.69s/it][2025-04-26 01:49:08,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 01:49:08,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.42 | bwd_microstep: 5744.43 | bwd_inner_microstep: 5655.00 | bwd_allreduce_microstep: 89.39 | step_microstep: 18.75 [2025-04-26 01:49:08,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.42 | bwd: 5744.44 | bwd_inner: 5655.00 | bwd_allreduce: 89.41 | step: 18.76 18%|█▊ | 7393/41250 [17:51:33<81:39:21, 8.68s/it] {'loss': 0.1538, 'grad_norm': 1.3703529834747314, 'learning_rate': 3.770962570987767e-05, 'epoch': 1.79} 18%|█▊ | 7393/41250 [17:51:33<81:39:21, 8.68s/it][2025-04-26 01:49:16,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | 
optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-26 01:49:16,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.68 | bwd_microstep: 5692.47 | bwd_inner_microstep: 5653.37 | bwd_allreduce_microstep: 39.05 | step_microstep: 19.06 [2025-04-26 01:49:16,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.68 | bwd: 5692.49 | bwd_inner: 5653.37 | bwd_allreduce: 39.07 | step: 19.06 18%|█▊ | 7394/41250 [17:51:42<81:25:27, 8.66s/it] {'loss': 0.0998, 'grad_norm': 2.7657907009124756, 'learning_rate': 3.770889596435113e-05, 'epoch': 1.79} 18%|█▊ | 7394/41250 [17:51:42<81:25:27, 8.66s/it][2025-04-26 01:49:25,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:49:25,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.40 | bwd_microstep: 5738.37 | bwd_inner_microstep: 5702.03 | bwd_allreduce_microstep: 36.29 | step_microstep: 18.67 [2025-04-26 01:49:25,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.40 | bwd: 5738.38 | bwd_inner: 5702.03 | bwd_allreduce: 36.31 | step: 18.67 18%|█▊ | 7395/41250 [17:51:50<81:27:59, 8.66s/it] {'loss': 0.1032, 'grad_norm': 1.4548929929733276, 'learning_rate': 3.770816610965273e-05, 'epoch': 1.79} 18%|█▊ | 7395/41250 [17:51:50<81:27:59, 8.66s/it][2025-04-26 01:49:34,043] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:49:34,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.20 | bwd_microstep: 5734.53 | bwd_inner_microstep: 5683.61 | bwd_allreduce_microstep: 50.89 | step_microstep: 18.50 [2025-04-26 01:49:34,044] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.20 | bwd: 5734.55 | bwd_inner: 5683.61 | bwd_allreduce: 50.90 | step: 18.50 18%|█▊ | 7396/41250 [17:51:59<81:28:13, 8.66s/it] {'loss': 0.1254, 'grad_norm': 
2.087845802307129, 'learning_rate': 3.770743614578696e-05, 'epoch': 1.79} 18%|█▊ | 7396/41250 [17:51:59<81:28:13, 8.66s/it][2025-04-26 01:49:42,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:49:42,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.58 | bwd_microstep: 5786.21 | bwd_inner_microstep: 5773.22 | bwd_allreduce_microstep: 12.95 | step_microstep: 19.05 [2025-04-26 01:49:42,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.58 | bwd: 5786.23 | bwd_inner: 5773.22 | bwd_allreduce: 12.97 | step: 19.05 18%|█▊ | 7397/41250 [17:52:08<81:44:37, 8.69s/it] {'loss': 0.1191, 'grad_norm': 3.69673490524292, 'learning_rate': 3.770670607275834e-05, 'epoch': 1.79} 18%|█▊ | 7397/41250 [17:52:08<81:44:37, 8.69s/it][2025-04-26 01:49:51,412] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:49:51,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.08 | bwd_microstep: 5703.44 | bwd_inner_microstep: 5636.77 | bwd_allreduce_microstep: 66.62 | step_microstep: 18.77 [2025-04-26 01:49:51,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.08 | bwd: 5703.45 | bwd_inner: 5636.77 | bwd_allreduce: 66.64 | step: 18.77 18%|█▊ | 7398/41250 [17:52:16<81:30:19, 8.67s/it] {'loss': 0.0996, 'grad_norm': 1.4589790105819702, 'learning_rate': 3.770597589057136e-05, 'epoch': 1.79} 18%|█▊ | 7398/41250 [17:52:16<81:30:19, 8.67s/it][2025-04-26 01:50:00,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 01:50:00,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.25 | bwd_microstep: 5824.12 | bwd_inner_microstep: 5763.10 | bwd_allreduce_microstep: 60.97 | step_microstep: 19.18 [2025-04-26 
01:50:00,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.25 | bwd: 5824.13 | bwd_inner: 5763.10 | bwd_allreduce: 60.99 | step: 19.18 18%|█▊ | 7399/41250 [17:52:25<81:50:29, 8.70s/it] {'loss': 0.1226, 'grad_norm': 1.7006642818450928, 'learning_rate': 3.770524559923052e-05, 'epoch': 1.79} 18%|█▊ | 7399/41250 [17:52:25<81:50:29, 8.70s/it][2025-04-26 01:50:08,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:50:08,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.88 | bwd_microstep: 5745.42 | bwd_inner_microstep: 5685.19 | bwd_allreduce_microstep: 60.19 | step_microstep: 18.56 [2025-04-26 01:50:08,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.88 | bwd: 5745.44 | bwd_inner: 5685.19 | bwd_allreduce: 60.20 | step: 18.57 18%|█▊ | 7400/41250 [17:52:34<81:46:14, 8.70s/it] {'loss': 0.079, 'grad_norm': 1.8377211093902588, 'learning_rate': 3.7704515198740315e-05, 'epoch': 1.79} 18%|█▊ | 7400/41250 [17:52:34<81:46:14, 8.70s/it][2025-04-26 01:50:17,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 0.92 [2025-04-26 01:50:17,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.78 | bwd_microstep: 5773.18 | bwd_inner_microstep: 5631.90 | bwd_allreduce_microstep: 141.22 | step_microstep: 19.10 [2025-04-26 01:50:17,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.78 | bwd: 5773.19 | bwd_inner: 5631.90 | bwd_allreduce: 141.24 | step: 19.11 18%|█▊ | 7401/41250 [17:52:42<81:43:03, 8.69s/it] {'loss': 0.0674, 'grad_norm': 1.033522367477417, 'learning_rate': 3.7703784689105266e-05, 'epoch': 1.79} 18%|█▊ | 7401/41250 [17:52:42<81:43:03, 8.69s/it][2025-04-26 01:50:26,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 
1.06 [2025-04-26 01:50:26,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.48 | bwd_microstep: 5688.48 | bwd_inner_microstep: 5675.92 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.99 [2025-04-26 01:50:26,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.48 | bwd: 5688.50 | bwd_inner: 5675.92 | bwd_allreduce: 12.54 | step: 18.99 18%|█▊ | 7402/41250 [17:52:51<81:30:59, 8.67s/it] {'loss': 0.0638, 'grad_norm': 1.4359359741210938, 'learning_rate': 3.770305407032986e-05, 'epoch': 1.79} 18%|█▊ | 7402/41250 [17:52:51<81:30:59, 8.67s/it][2025-04-26 01:50:34,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:50:34,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.75 | bwd_microstep: 5693.44 | bwd_inner_microstep: 5680.54 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.56 [2025-04-26 01:50:34,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.75 | bwd: 5693.46 | bwd_inner: 5680.54 | bwd_allreduce: 12.87 | step: 18.56 18%|█▊ | 7403/41250 [17:53:00<81:26:02, 8.66s/it] {'loss': 0.1391, 'grad_norm': 2.3913750648498535, 'learning_rate': 3.7702323342418615e-05, 'epoch': 1.79} 18%|█▊ | 7403/41250 [17:53:00<81:26:02, 8.66s/it][2025-04-26 01:50:43,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.07 | optimizer_step: 1.02 [2025-04-26 01:50:43,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.09 | bwd_microstep: 5741.20 | bwd_inner_microstep: 5688.99 | bwd_allreduce_microstep: 52.16 | step_microstep: 19.63 [2025-04-26 01:50:43,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.09 | bwd: 5741.21 | bwd_inner: 5688.99 | bwd_allreduce: 52.18 | step: 19.64 18%|█▊ | 7404/41250 [17:53:08<81:27:48, 8.66s/it] {'loss': 0.2496, 'grad_norm': 2.9151949882507324, 'learning_rate': 
3.7701592505376024e-05, 'epoch': 1.79} 18%|█▊ | 7404/41250 [17:53:08<81:27:48, 8.66s/it][2025-04-26 01:50:52,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-26 01:50:52,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.12 | bwd_microstep: 5714.63 | bwd_inner_microstep: 5701.81 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.61 [2025-04-26 01:50:52,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.12 | bwd: 5714.64 | bwd_inner: 5701.81 | bwd_allreduce: 12.79 | step: 18.61 18%|█▊ | 7405/41250 [17:53:17<81:26:48, 8.66s/it] {'loss': 0.1861, 'grad_norm': 1.5835340023040771, 'learning_rate': 3.7700861559206595e-05, 'epoch': 1.8} 18%|█▊ | 7405/41250 [17:53:17<81:26:48, 8.66s/it][2025-04-26 01:51:00,814] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 1.18 | optimizer_step: 1.10 [2025-04-26 01:51:00,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.76 | bwd_microstep: 5747.24 | bwd_inner_microstep: 5651.24 | bwd_allreduce_microstep: 95.95 | step_microstep: 19.71 [2025-04-26 01:51:00,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.76 | bwd: 5747.25 | bwd_inner: 5651.24 | bwd_allreduce: 95.97 | step: 19.71 18%|█▊ | 7406/41250 [17:53:26<81:26:03, 8.66s/it] {'loss': 0.1746, 'grad_norm': 2.3170368671417236, 'learning_rate': 3.770013050391484e-05, 'epoch': 1.8} 18%|█▊ | 7406/41250 [17:53:26<81:26:03, 8.66s/it][2025-04-26 01:51:09,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 01:51:09,416] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.31 | bwd_microstep: 5695.84 | bwd_inner_microstep: 5641.21 | bwd_allreduce_microstep: 54.58 | step_microstep: 18.75 [2025-04-26 01:51:09,416] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.31 | bwd: 5695.85 | bwd_inner: 5641.21 | bwd_allreduce: 54.60 | step: 18.76 18%|█▊ | 7407/41250 [17:53:34<81:15:34, 8.64s/it] {'loss': 0.0581, 'grad_norm': 2.089327573776245, 'learning_rate': 3.7699399339505257e-05, 'epoch': 1.8} 18%|█▊ | 7407/41250 [17:53:34<81:15:34, 8.64s/it][2025-04-26 01:51:18,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:51:18,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.11 | bwd_microstep: 5733.87 | bwd_inner_microstep: 5687.89 | bwd_allreduce_microstep: 45.93 | step_microstep: 18.64 [2025-04-26 01:51:18,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.11 | bwd: 5733.89 | bwd_inner: 5687.89 | bwd_allreduce: 45.95 | step: 18.64 18%|█▊ | 7408/41250 [17:53:43<81:20:20, 8.65s/it] {'loss': 0.0261, 'grad_norm': 0.3304009437561035, 'learning_rate': 3.769866806598236e-05, 'epoch': 1.8} 18%|█▊ | 7408/41250 [17:53:43<81:20:20, 8.65s/it][2025-04-26 01:51:26,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 01:51:26,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.62 | bwd_microstep: 5693.72 | bwd_inner_microstep: 5642.42 | bwd_allreduce_microstep: 51.26 | step_microstep: 18.82 [2025-04-26 01:51:26,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.62 | bwd: 5693.73 | bwd_inner: 5642.42 | bwd_allreduce: 51.27 | step: 18.82 18%|█▊ | 7409/41250 [17:53:52<81:11:49, 8.64s/it] {'loss': 0.2178, 'grad_norm': 2.3558743000030518, 'learning_rate': 3.7697936683350656e-05, 'epoch': 1.8} 18%|█▊ | 7409/41250 [17:53:52<81:11:49, 8.64s/it][2025-04-26 01:51:35,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 
01:51:35,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.90 | bwd_microstep: 5697.75 | bwd_inner_microstep: 5644.39 | bwd_allreduce_microstep: 53.31 | step_microstep: 19.24 [2025-04-26 01:51:35,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.90 | bwd: 5697.77 | bwd_inner: 5644.39 | bwd_allreduce: 53.33 | step: 19.24 18%|█▊ | 7410/41250 [17:54:00<81:06:53, 8.63s/it] {'loss': 0.3323, 'grad_norm': 4.563774585723877, 'learning_rate': 3.769720519161466e-05, 'epoch': 1.8} 18%|█▊ | 7410/41250 [17:54:00<81:06:53, 8.63s/it][2025-04-26 01:51:43,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 01:51:43,967] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.59 | bwd_microstep: 5735.19 | bwd_inner_microstep: 5693.25 | bwd_allreduce_microstep: 41.88 | step_microstep: 18.65 [2025-04-26 01:51:43,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.59 | bwd: 5735.20 | bwd_inner: 5693.25 | bwd_allreduce: 41.90 | step: 18.65 18%|█▊ | 7411/41250 [17:54:09<81:13:22, 8.64s/it] {'loss': 0.1739, 'grad_norm': 2.0742664337158203, 'learning_rate': 3.769647359077886e-05, 'epoch': 1.8} 18%|█▊ | 7411/41250 [17:54:09<81:13:22, 8.64s/it][2025-04-26 01:51:52,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.00 | optimizer_step: 1.16 [2025-04-26 01:51:52,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.75 | bwd_microstep: 5696.66 | bwd_inner_microstep: 5683.88 | bwd_allreduce_microstep: 12.73 | step_microstep: 19.09 [2025-04-26 01:51:52,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.75 | bwd: 5696.67 | bwd_inner: 5683.88 | bwd_allreduce: 12.75 | step: 19.09 18%|█▊ | 7412/41250 [17:54:17<81:10:47, 8.64s/it] {'loss': 0.0888, 'grad_norm': 1.2489888668060303, 'learning_rate': 3.7695741880847795e-05, 
'epoch': 1.8} 18%|█▊ | 7412/41250 [17:54:17<81:10:47, 8.64s/it][2025-04-26 01:52:01,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 01:52:01,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.80 | bwd_microstep: 5771.21 | bwd_inner_microstep: 5685.88 | bwd_allreduce_microstep: 85.28 | step_microstep: 18.80 [2025-04-26 01:52:01,295] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.80 | bwd: 5771.22 | bwd_inner: 5685.88 | bwd_allreduce: 85.30 | step: 18.80 18%|█▊ | 7413/41250 [17:54:26<81:21:02, 8.66s/it] {'loss': 0.1522, 'grad_norm': 4.788893222808838, 'learning_rate': 3.769501006182595e-05, 'epoch': 1.8} 18%|█▊ | 7413/41250 [17:54:26<81:21:02, 8.66s/it][2025-04-26 01:52:09,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:52:09,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.87 | bwd_microstep: 5704.41 | bwd_inner_microstep: 5691.56 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.93 [2025-04-26 01:52:09,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.87 | bwd: 5704.43 | bwd_inner: 5691.56 | bwd_allreduce: 12.83 | step: 18.94 18%|█▊ | 7414/41250 [17:54:35<81:16:57, 8.65s/it] {'loss': 0.0891, 'grad_norm': 1.6845567226409912, 'learning_rate': 3.7694278133717855e-05, 'epoch': 1.8} 18%|█▊ | 7414/41250 [17:54:35<81:16:57, 8.65s/it][2025-04-26 01:52:18,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-26 01:52:18,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.41 | bwd_microstep: 5776.44 | bwd_inner_microstep: 5687.45 | bwd_allreduce_microstep: 88.94 | step_microstep: 18.96 [2025-04-26 01:52:18,632] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd: 2846.41 | bwd: 5776.46 | bwd_inner: 5687.45 | bwd_allreduce: 88.96 | step: 18.96 18%|█▊ | 7415/41250 [17:54:43<81:26:51, 8.67s/it] {'loss': 0.1571, 'grad_norm': 3.158069372177124, 'learning_rate': 3.769354609652802e-05, 'epoch': 1.8} 18%|█▊ | 7415/41250 [17:54:43<81:26:51, 8.67s/it][2025-04-26 01:52:27,443] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.21 | optimizer_step: 1.05 [2025-04-26 01:52:27,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.32 | bwd_microstep: 5885.42 | bwd_inner_microstep: 5688.76 | bwd_allreduce_microstep: 196.60 | step_microstep: 19.71 [2025-04-26 01:52:27,444] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.32 | bwd: 5885.44 | bwd_inner: 5688.76 | bwd_allreduce: 196.63 | step: 19.71 18%|█▊ | 7416/41250 [17:54:52<81:51:10, 8.71s/it] {'loss': 0.4212, 'grad_norm': 3.3703525066375732, 'learning_rate': 3.769281395026095e-05, 'epoch': 1.8} 18%|█▊ | 7416/41250 [17:54:52<81:51:10, 8.71s/it][2025-04-26 01:52:36,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:52:36,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.05 | bwd_microstep: 5880.05 | bwd_inner_microstep: 5699.56 | bwd_allreduce_microstep: 180.45 | step_microstep: 18.83 [2025-04-26 01:52:36,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.05 | bwd: 5880.06 | bwd_inner: 5699.55 | bwd_allreduce: 180.47 | step: 18.83 18%|█▊ | 7417/41250 [17:55:01<82:07:39, 8.74s/it] {'loss': 0.1158, 'grad_norm': 3.208064079284668, 'learning_rate': 3.769208169492116e-05, 'epoch': 1.8} 18%|█▊ | 7417/41250 [17:55:01<82:07:39, 8.74s/it][2025-04-26 01:52:44,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-26 01:52:44,930] [INFO] [logging.py:128:log_dist] [Rank 0] 
time (ms) | fwd_microstep: 2842.05 | bwd_microstep: 5753.46 | bwd_inner_microstep: 5683.24 | bwd_allreduce_microstep: 70.17 | step_microstep: 18.91 [2025-04-26 01:52:44,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.05 | bwd: 5753.48 | bwd_inner: 5683.24 | bwd_allreduce: 70.19 | step: 18.91 18%|█▊ | 7418/41250 [17:55:10<81:57:17, 8.72s/it] {'loss': 0.0781, 'grad_norm': 1.9144803285598755, 'learning_rate': 3.769134933051317e-05, 'epoch': 1.8} 18%|█▊ | 7418/41250 [17:55:10<81:57:17, 8.72s/it][2025-04-26 01:52:53,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.96 | optimizer_step: 1.07 [2025-04-26 01:52:53,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.74 | bwd_microstep: 5771.15 | bwd_inner_microstep: 5697.94 | bwd_allreduce_microstep: 73.16 | step_microstep: 18.63 [2025-04-26 01:52:53,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.74 | bwd: 5771.16 | bwd_inner: 5697.94 | bwd_allreduce: 73.18 | step: 18.63 18%|█▊ | 7419/41250 [17:55:18<81:53:39, 8.71s/it] {'loss': 0.118, 'grad_norm': 1.6239794492721558, 'learning_rate': 3.769061685704148e-05, 'epoch': 1.8} 18%|█▊ | 7419/41250 [17:55:18<81:53:39, 8.71s/it][2025-04-26 01:53:02,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 01:53:02,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.85 | bwd_microstep: 5791.39 | bwd_inner_microstep: 5778.63 | bwd_allreduce_microstep: 12.70 | step_microstep: 19.18 [2025-04-26 01:53:02,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.85 | bwd: 5791.40 | bwd_inner: 5778.63 | bwd_allreduce: 12.72 | step: 19.19 18%|█▊ | 7420/41250 [17:55:27<82:01:37, 8.73s/it] {'loss': 0.1363, 'grad_norm': 1.567510724067688, 'learning_rate': 3.768988427451063e-05, 'epoch': 1.8} 18%|█▊ | 7420/41250 [17:55:27<82:01:37, 
8.73s/it][2025-04-26 01:53:11,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 01:53:11,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.98 | bwd_microstep: 5741.20 | bwd_inner_microstep: 5684.55 | bwd_allreduce_microstep: 56.61 | step_microstep: 18.40 [2025-04-26 01:53:11,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.98 | bwd: 5741.22 | bwd_inner: 5684.55 | bwd_allreduce: 56.63 | step: 18.41 18%|█▊ | 7421/41250 [17:55:36<81:52:36, 8.71s/it] {'loss': 0.109, 'grad_norm': 2.178358554840088, 'learning_rate': 3.768915158292512e-05, 'epoch': 1.8} 18%|█▊ | 7421/41250 [17:55:36<81:52:36, 8.71s/it][2025-04-26 01:53:19,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 01:53:19,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.87 | bwd_microstep: 5770.56 | bwd_inner_microstep: 5652.47 | bwd_allreduce_microstep: 118.05 | step_microstep: 18.53 [2025-04-26 01:53:19,755] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.87 | bwd: 5770.58 | bwd_inner: 5652.47 | bwd_allreduce: 118.07 | step: 18.53 18%|█▊ | 7422/41250 [17:55:45<81:47:46, 8.70s/it] {'loss': 0.198, 'grad_norm': 5.534649848937988, 'learning_rate': 3.768841878228946e-05, 'epoch': 1.8} 18%|█▊ | 7422/41250 [17:55:45<81:47:46, 8.70s/it][2025-04-26 01:53:28,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.98 [2025-04-26 01:53:28,400] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.78 | bwd_microstep: 5701.16 | bwd_inner_microstep: 5688.36 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.74 [2025-04-26 01:53:28,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.78 | bwd: 5701.17 | bwd_inner: 5688.36 | 
bwd_allreduce: 12.77 | step: 18.75 18%|█▊ | 7423/41250 [17:55:53<81:37:47, 8.69s/it] {'loss': 0.088, 'grad_norm': 3.2948663234710693, 'learning_rate': 3.7687685872608186e-05, 'epoch': 1.8} 18%|█▊ | 7423/41250 [17:55:53<81:37:47, 8.69s/it][2025-04-26 01:53:37,069] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-26 01:53:37,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.05 | bwd_microstep: 5721.94 | bwd_inner_microstep: 5709.04 | bwd_allreduce_microstep: 12.85 | step_microstep: 19.25 [2025-04-26 01:53:37,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.05 | bwd: 5721.96 | bwd_inner: 5709.04 | bwd_allreduce: 12.87 | step: 19.26 18%|█▊ | 7424/41250 [17:56:02<81:35:10, 8.68s/it] {'loss': 0.0564, 'grad_norm': 0.843117892742157, 'learning_rate': 3.7686952853885805e-05, 'epoch': 1.8} 18%|█▊ | 7424/41250 [17:56:02<81:35:10, 8.68s/it][2025-04-26 01:53:45,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.07 | optimizer_step: 0.95 [2025-04-26 01:53:45,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.20 | bwd_microstep: 5867.15 | bwd_inner_microstep: 5721.91 | bwd_allreduce_microstep: 145.20 | step_microstep: 19.40 [2025-04-26 01:53:45,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.20 | bwd: 5867.17 | bwd_inner: 5721.91 | bwd_allreduce: 145.22 | step: 19.41 18%|█▊ | 7425/41250 [17:56:11<81:57:26, 8.72s/it] {'loss': 0.2325, 'grad_norm': 1.6116573810577393, 'learning_rate': 3.7686219726126845e-05, 'epoch': 1.8} 18%|█▊ | 7425/41250 [17:56:11<81:57:26, 8.72s/it][2025-04-26 01:53:54,582] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 1.10 [2025-04-26 01:53:54,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.55 | 
bwd_microstep: 5748.23 | bwd_inner_microstep: 5705.28 | bwd_allreduce_microstep: 42.91 | step_microstep: 19.46 [2025-04-26 01:53:54,583] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.55 | bwd: 5748.25 | bwd_inner: 5705.28 | bwd_allreduce: 42.93 | step: 19.46 18%|█▊ | 7426/41250 [17:56:19<81:52:31, 8.71s/it] {'loss': 0.2976, 'grad_norm': 2.386112928390503, 'learning_rate': 3.768548648933581e-05, 'epoch': 1.8} 18%|█▊ | 7426/41250 [17:56:19<81:52:31, 8.71s/it][2025-04-26 01:54:03,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.98 | optimizer_step: 0.94 [2025-04-26 01:54:03,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.09 | bwd_microstep: 5774.41 | bwd_inner_microstep: 5697.94 | bwd_allreduce_microstep: 76.43 | step_microstep: 18.36 [2025-04-26 01:54:03,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.09 | bwd: 5774.43 | bwd_inner: 5697.94 | bwd_allreduce: 76.45 | step: 18.36 18%|█▊ | 7427/41250 [17:56:28<81:52:44, 8.71s/it] {'loss': 0.3487, 'grad_norm': 2.1889545917510986, 'learning_rate': 3.768475314351723e-05, 'epoch': 1.8} 18%|█▊ | 7427/41250 [17:56:28<81:52:44, 8.71s/it][2025-04-26 01:54:11,979] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:54:11,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.95 | bwd_microstep: 5749.35 | bwd_inner_microstep: 5691.45 | bwd_allreduce_microstep: 57.84 | step_microstep: 18.46 [2025-04-26 01:54:11,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.95 | bwd: 5749.36 | bwd_inner: 5691.45 | bwd_allreduce: 57.87 | step: 18.47 18%|█▊ | 7428/41250 [17:56:37<81:46:58, 8.70s/it] {'loss': 0.1473, 'grad_norm': 1.504954218864441, 'learning_rate': 3.768401968867563e-05, 'epoch': 1.8} 18%|█▊ | 7428/41250 [17:56:37<81:46:58, 8.70s/it][2025-04-26 01:54:20,831] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-26 01:54:20,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.19 | bwd_microstep: 5927.49 | bwd_inner_microstep: 5663.83 | bwd_allreduce_microstep: 263.61 | step_microstep: 18.83 [2025-04-26 01:54:20,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.19 | bwd: 5927.50 | bwd_inner: 5663.83 | bwd_allreduce: 263.63 | step: 18.83 18%|█▊ | 7429/41250 [17:56:46<82:11:25, 8.75s/it] {'loss': 0.042, 'grad_norm': 0.8735190033912659, 'learning_rate': 3.768328612481553e-05, 'epoch': 1.8} 18%|█▊ | 7429/41250 [17:56:46<82:11:25, 8.75s/it][2025-04-26 01:54:29,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:54:29,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.38 | bwd_microstep: 5724.61 | bwd_inner_microstep: 5711.74 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.86 [2025-04-26 01:54:29,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.38 | bwd: 5724.62 | bwd_inner: 5711.74 | bwd_allreduce: 12.84 | step: 18.86 18%|█▊ | 7430/41250 [17:56:54<81:58:06, 8.73s/it] {'loss': 0.2052, 'grad_norm': 2.856370449066162, 'learning_rate': 3.768255245194144e-05, 'epoch': 1.8} 18%|█▊ | 7430/41250 [17:56:54<81:58:06, 8.73s/it][2025-04-26 01:54:38,184] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 01:54:38,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.71 | bwd_microstep: 5740.64 | bwd_inner_microstep: 5713.96 | bwd_allreduce_microstep: 26.64 | step_microstep: 19.03 [2025-04-26 01:54:38,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.71 | bwd: 5740.66 | bwd_inner: 5713.96 | bwd_allreduce: 26.66 | step: 19.03 18%|█▊ | 
7431/41250 [17:57:03<81:50:46, 8.71s/it] {'loss': 0.0765, 'grad_norm': 3.1838340759277344, 'learning_rate': 3.76818186700579e-05, 'epoch': 1.8} 18%|█▊ | 7431/41250 [17:57:03<81:50:46, 8.71s/it][2025-04-26 01:54:46,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 01:54:46,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.44 | bwd_microstep: 5783.13 | bwd_inner_microstep: 5647.00 | bwd_allreduce_microstep: 136.07 | step_microstep: 18.65 [2025-04-26 01:54:46,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.44 | bwd: 5783.14 | bwd_inner: 5647.00 | bwd_allreduce: 136.09 | step: 18.65 18%|█▊ | 7432/41250 [17:57:12<81:47:46, 8.71s/it] {'loss': 0.2333, 'grad_norm': 1.3065623044967651, 'learning_rate': 3.7681084779169423e-05, 'epoch': 1.8} 18%|█▊ | 7432/41250 [17:57:12<81:47:46, 8.71s/it][2025-04-26 01:54:55,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.41 | optimizer_gradients: 1.21 | optimizer_step: 0.98 [2025-04-26 01:54:55,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.51 | bwd_microstep: 5756.28 | bwd_inner_microstep: 5654.32 | bwd_allreduce_microstep: 101.91 | step_microstep: 20.44 [2025-04-26 01:54:55,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.51 | bwd: 5756.30 | bwd_inner: 5654.32 | bwd_allreduce: 101.93 | step: 20.44 18%|█▊ | 7433/41250 [17:57:20<81:41:01, 8.70s/it] {'loss': 0.4076, 'grad_norm': 2.529862403869629, 'learning_rate': 3.768035077928053e-05, 'epoch': 1.8} 18%|█▊ | 7433/41250 [17:57:20<81:41:01, 8.70s/it][2025-04-26 01:55:04,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 01:55:04,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.14 | bwd_microstep: 5781.35 | bwd_inner_microstep: 5633.85 | 
bwd_allreduce_microstep: 147.45 | step_microstep: 18.49 [2025-04-26 01:55:04,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.14 | bwd: 5781.36 | bwd_inner: 5633.85 | bwd_allreduce: 147.47 | step: 18.49 18%|█▊ | 7434/41250 [17:57:29<81:40:18, 8.69s/it] {'loss': 0.1003, 'grad_norm': 1.1132413148880005, 'learning_rate': 3.767961667039576e-05, 'epoch': 1.8} 18%|█▊ | 7434/41250 [17:57:29<81:40:18, 8.69s/it][2025-04-26 01:55:12,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 01:55:12,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.92 | bwd_microstep: 5686.12 | bwd_inner_microstep: 5651.23 | bwd_allreduce_microstep: 34.85 | step_microstep: 18.60 [2025-04-26 01:55:12,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.92 | bwd: 5686.13 | bwd_inner: 5651.22 | bwd_allreduce: 34.87 | step: 18.60 18%|█▊ | 7435/41250 [17:57:38<81:26:12, 8.67s/it] {'loss': 0.2952, 'grad_norm': 2.0555734634399414, 'learning_rate': 3.767888245251963e-05, 'epoch': 1.8} 18%|█▊ | 7435/41250 [17:57:38<81:26:12, 8.67s/it][2025-04-26 01:55:21,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:55:21,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.05 | bwd_microstep: 5775.38 | bwd_inner_microstep: 5642.73 | bwd_allreduce_microstep: 132.61 | step_microstep: 18.35 [2025-04-26 01:55:21,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.05 | bwd: 5775.40 | bwd_inner: 5642.73 | bwd_allreduce: 132.62 | step: 18.35 18%|█▊ | 7436/41250 [17:57:46<81:28:31, 8.67s/it] {'loss': 0.0498, 'grad_norm': 1.3266514539718628, 'learning_rate': 3.767814812565666e-05, 'epoch': 1.8} 18%|█▊ | 7436/41250 [17:57:46<81:28:31, 8.67s/it][2025-04-26 01:55:30,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.43 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:55:30,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.79 | bwd_microstep: 5741.23 | bwd_inner_microstep: 5675.51 | bwd_allreduce_microstep: 65.68 | step_microstep: 18.52 [2025-04-26 01:55:30,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.79 | bwd: 5741.24 | bwd_inner: 5675.51 | bwd_allreduce: 65.70 | step: 18.52 18%|█▊ | 7437/41250 [17:57:55<81:28:31, 8.67s/it] {'loss': 0.0491, 'grad_norm': 1.0898289680480957, 'learning_rate': 3.7677413689811396e-05, 'epoch': 1.8} 18%|█▊ | 7437/41250 [17:57:55<81:28:31, 8.67s/it][2025-04-26 01:55:38,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:55:38,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.64 | bwd_microstep: 5696.69 | bwd_inner_microstep: 5683.82 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.71 [2025-04-26 01:55:38,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.64 | bwd: 5696.70 | bwd_inner: 5683.82 | bwd_allreduce: 12.84 | step: 18.71 18%|█▊ | 7438/41250 [17:58:04<81:21:22, 8.66s/it] {'loss': 0.2041, 'grad_norm': 1.4907597303390503, 'learning_rate': 3.7676679144988344e-05, 'epoch': 1.8} 18%|█▊ | 7438/41250 [17:58:04<81:21:22, 8.66s/it][2025-04-26 01:55:47,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.02 | optimizer_step: 1.19 [2025-04-26 01:55:47,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.12 | bwd_microstep: 5740.72 | bwd_inner_microstep: 5684.18 | bwd_allreduce_microstep: 56.49 | step_microstep: 19.26 [2025-04-26 01:55:47,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.12 | bwd: 5740.73 | bwd_inner: 5684.18 | bwd_allreduce: 56.50 | step: 19.26 18%|█▊ | 7439/41250 [17:58:12<81:22:59, 8.67s/it] 
{'loss': 0.0135, 'grad_norm': 0.22451573610305786, 'learning_rate': 3.7675944491192045e-05, 'epoch': 1.8} 18%|█▊ | 7439/41250 [17:58:12<81:22:59, 8.67s/it][2025-04-26 01:55:56,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:55:56,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.61 | bwd_microstep: 5679.37 | bwd_inner_microstep: 5645.97 | bwd_allreduce_microstep: 33.35 | step_microstep: 18.59 [2025-04-26 01:55:56,113] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.61 | bwd: 5679.39 | bwd_inner: 5645.97 | bwd_allreduce: 33.37 | step: 18.60 18%|█▊ | 7440/41250 [17:58:21<81:10:40, 8.64s/it] {'loss': 0.0565, 'grad_norm': 1.480241298675537, 'learning_rate': 3.767520972842702e-05, 'epoch': 1.8} 18%|█▊ | 7440/41250 [17:58:21<81:10:40, 8.64s/it][2025-04-26 01:56:04,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 01:56:04,788] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.77 | bwd_microstep: 5734.89 | bwd_inner_microstep: 5692.04 | bwd_allreduce_microstep: 42.80 | step_microstep: 19.05 [2025-04-26 01:56:04,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.77 | bwd: 5734.91 | bwd_inner: 5692.04 | bwd_allreduce: 42.82 | step: 19.05 18%|█▊ | 7441/41250 [17:58:30<81:16:43, 8.65s/it] {'loss': 0.1748, 'grad_norm': 4.267389297485352, 'learning_rate': 3.767447485669781e-05, 'epoch': 1.8} 18%|█▊ | 7441/41250 [17:58:30<81:16:43, 8.65s/it][2025-04-26 01:56:13,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 01:56:13,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.31 | bwd_microstep: 5685.74 | bwd_inner_microstep: 5654.65 | bwd_allreduce_microstep: 31.04 | 
step_microstep: 18.80 [2025-04-26 01:56:13,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.31 | bwd: 5685.75 | bwd_inner: 5654.65 | bwd_allreduce: 31.06 | step: 18.80 18%|█▊ | 7442/41250 [17:58:38<81:07:12, 8.64s/it] {'loss': 0.2145, 'grad_norm': 5.184484481811523, 'learning_rate': 3.767373987600893e-05, 'epoch': 1.8} 18%|█▊ | 7442/41250 [17:58:38<81:07:12, 8.64s/it][2025-04-26 01:56:21,982] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 01:56:21,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.49 | bwd_microstep: 5672.84 | bwd_inner_microstep: 5645.85 | bwd_allreduce_microstep: 26.95 | step_microstep: 18.94 [2025-04-26 01:56:21,983] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.49 | bwd: 5672.85 | bwd_inner: 5645.85 | bwd_allreduce: 26.97 | step: 18.94 18%|█▊ | 7443/41250 [17:58:47<80:58:57, 8.62s/it] {'loss': 0.1462, 'grad_norm': 1.0679914951324463, 'learning_rate': 3.767300478636493e-05, 'epoch': 1.8} 18%|█▊ | 7443/41250 [17:58:47<80:58:57, 8.62s/it][2025-04-26 01:56:30,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 01:56:30,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.98 | bwd_microstep: 5689.55 | bwd_inner_microstep: 5639.79 | bwd_allreduce_microstep: 49.71 | step_microstep: 18.67 [2025-04-26 01:56:30,579] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.98 | bwd: 5689.56 | bwd_inner: 5639.79 | bwd_allreduce: 49.73 | step: 18.67 18%|█▊ | 7444/41250 [17:58:55<80:54:56, 8.62s/it] {'loss': 0.0989, 'grad_norm': 2.021881103515625, 'learning_rate': 3.767226958777033e-05, 'epoch': 1.8} 18%|█▊ | 7444/41250 [17:58:55<80:54:56, 8.62s/it][2025-04-26 01:56:39,233] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | 
optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-26 01:56:39,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.04 | bwd_microstep: 5741.22 | bwd_inner_microstep: 5638.66 | bwd_allreduce_microstep: 102.52 | step_microstep: 18.97 [2025-04-26 01:56:39,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.04 | bwd: 5741.23 | bwd_inner: 5638.66 | bwd_allreduce: 102.53 | step: 18.97 18%|█▊ | 7445/41250 [17:59:04<81:00:31, 8.63s/it] {'loss': 0.0376, 'grad_norm': 0.952304482460022, 'learning_rate': 3.767153428022966e-05, 'epoch': 1.8} 18%|█▊ | 7445/41250 [17:59:04<81:00:31, 8.63s/it][2025-04-26 01:56:47,962] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 01:56:47,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.72 | bwd_microstep: 5759.81 | bwd_inner_microstep: 5747.11 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.65 [2025-04-26 01:56:47,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.72 | bwd: 5759.82 | bwd_inner: 5747.11 | bwd_allreduce: 12.68 | step: 18.65 18%|█▊ | 7446/41250 [17:59:13<81:17:51, 8.66s/it] {'loss': 0.1302, 'grad_norm': 2.1875956058502197, 'learning_rate': 3.767079886374746e-05, 'epoch': 1.81} 18%|█▊ | 7446/41250 [17:59:13<81:17:51, 8.66s/it][2025-04-26 01:56:56,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 01:56:56,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.21 | bwd_microstep: 5770.04 | bwd_inner_microstep: 5644.55 | bwd_allreduce_microstep: 125.45 | step_microstep: 18.20 [2025-04-26 01:56:56,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.21 | bwd: 5770.05 | bwd_inner: 5644.55 | bwd_allreduce: 125.47 | step: 18.20 18%|█▊ | 7447/41250 [17:59:21<81:22:46, 8.67s/it] {'loss': 0.0633, 'grad_norm': 
1.2054131031036377, 'learning_rate': 3.767006333832826e-05, 'epoch': 1.81} 18%|█▊ | 7447/41250 [17:59:21<81:22:46, 8.67s/it][2025-04-26 01:57:05,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:57:05,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.42 | bwd_microstep: 5730.41 | bwd_inner_microstep: 5685.50 | bwd_allreduce_microstep: 44.87 | step_microstep: 18.41 [2025-04-26 01:57:05,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.42 | bwd: 5730.42 | bwd_inner: 5685.50 | bwd_allreduce: 44.88 | step: 18.42 18%|█▊ | 7448/41250 [17:59:30<81:22:11, 8.67s/it] {'loss': 0.1239, 'grad_norm': 1.577268123626709, 'learning_rate': 3.766932770397659e-05, 'epoch': 1.81} 18%|█▊ | 7448/41250 [17:59:30<81:22:11, 8.67s/it][2025-04-26 01:57:13,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 1.04 [2025-04-26 01:57:13,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.39 | bwd_microstep: 5741.94 | bwd_inner_microstep: 5683.15 | bwd_allreduce_microstep: 58.74 | step_microstep: 19.37 [2025-04-26 01:57:13,997] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.39 | bwd: 5741.95 | bwd_inner: 5683.15 | bwd_allreduce: 58.76 | step: 19.37 18%|█▊ | 7449/41250 [17:59:39<81:24:53, 8.67s/it] {'loss': 0.0828, 'grad_norm': 1.3735843896865845, 'learning_rate': 3.7668591960697e-05, 'epoch': 1.81} 18%|█▊ | 7449/41250 [17:59:39<81:24:53, 8.67s/it][2025-04-26 01:57:22,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:57:22,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.65 | bwd_microstep: 5754.40 | bwd_inner_microstep: 5644.53 | bwd_allreduce_microstep: 109.83 | step_microstep: 18.96 [2025-04-26 
01:57:22,661] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.65 | bwd: 5754.41 | bwd_inner: 5644.53 | bwd_allreduce: 109.85 | step: 18.96 18%|█▊ | 7450/41250 [17:59:47<81:23:23, 8.67s/it] {'loss': 0.2395, 'grad_norm': 2.9982752799987793, 'learning_rate': 3.766785610849401e-05, 'epoch': 1.81} 18%|█▊ | 7450/41250 [17:59:47<81:23:23, 8.67s/it][2025-04-26 01:57:31,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:57:31,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.32 | bwd_microstep: 5689.47 | bwd_inner_microstep: 5676.55 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.43 [2025-04-26 01:57:31,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.32 | bwd: 5689.48 | bwd_inner: 5676.55 | bwd_allreduce: 12.89 | step: 18.43 18%|█▊ | 7451/41250 [17:59:56<81:16:34, 8.66s/it] {'loss': 0.0412, 'grad_norm': 1.1723353862762451, 'learning_rate': 3.7667120147372164e-05, 'epoch': 1.81} 18%|█▊ | 7451/41250 [17:59:56<81:16:34, 8.66s/it][2025-04-26 01:57:39,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:57:39,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.58 | bwd_microstep: 5731.99 | bwd_inner_microstep: 5676.85 | bwd_allreduce_microstep: 55.10 | step_microstep: 18.55 [2025-04-26 01:57:39,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.58 | bwd: 5732.00 | bwd_inner: 5676.85 | bwd_allreduce: 55.12 | step: 18.56 18%|█▊ | 7452/41250 [18:00:05<81:17:03, 8.66s/it] {'loss': 0.2928, 'grad_norm': 2.85420823097229, 'learning_rate': 3.7666384077336e-05, 'epoch': 1.81} 18%|█▊ | 7452/41250 [18:00:05<81:17:03, 8.66s/it][2025-04-26 01:57:48,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.90 
[2025-04-26 01:57:48,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.07 | bwd_microstep: 5756.34 | bwd_inner_microstep: 5638.44 | bwd_allreduce_microstep: 117.86 | step_microstep: 18.14 [2025-04-26 01:57:48,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.07 | bwd: 5756.35 | bwd_inner: 5638.44 | bwd_allreduce: 117.87 | step: 18.15 18%|█▊ | 7453/41250 [18:00:13<81:17:46, 8.66s/it] {'loss': 0.334, 'grad_norm': 4.1131134033203125, 'learning_rate': 3.766564789839004e-05, 'epoch': 1.81} 18%|█▊ | 7453/41250 [18:00:13<81:17:46, 8.66s/it][2025-04-26 01:57:57,368] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:57:57,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.30 | bwd_microstep: 5781.80 | bwd_inner_microstep: 5769.08 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.42 [2025-04-26 01:57:57,369] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.30 | bwd: 5781.82 | bwd_inner: 5769.08 | bwd_allreduce: 12.69 | step: 18.43 18%|█▊ | 7454/41250 [18:00:22<81:33:38, 8.69s/it] {'loss': 0.2356, 'grad_norm': 4.665336608886719, 'learning_rate': 3.7664911610538844e-05, 'epoch': 1.81} 18%|█▊ | 7454/41250 [18:00:22<81:33:38, 8.69s/it][2025-04-26 01:58:06,025] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-26 01:58:06,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.79 | bwd_microstep: 5724.71 | bwd_inner_microstep: 5654.30 | bwd_allreduce_microstep: 70.36 | step_microstep: 18.83 [2025-04-26 01:58:06,026] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.79 | bwd: 5724.72 | bwd_inner: 5654.30 | bwd_allreduce: 70.38 | step: 18.84 18%|█▊ | 7455/41250 [18:00:31<81:28:24, 8.68s/it] {'loss': 0.1563, 'grad_norm': 3.345400333404541, 'learning_rate': 
3.766417521378695e-05, 'epoch': 1.81} 18%|█▊ | 7455/41250 [18:00:31<81:28:24, 8.68s/it][2025-04-26 01:58:14,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:58:14,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5714.54 | bwd_inner_microstep: 5701.82 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.51 [2025-04-26 01:58:14,679] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5714.56 | bwd_inner: 5701.82 | bwd_allreduce: 12.70 | step: 18.52 18%|█▊ | 7456/41250 [18:00:40<81:23:36, 8.67s/it] {'loss': 0.2765, 'grad_norm': 5.379055976867676, 'learning_rate': 3.7663438708138876e-05, 'epoch': 1.81} 18%|█▊ | 7456/41250 [18:00:40<81:23:36, 8.67s/it][2025-04-26 01:58:23,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 01:58:23,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.92 | bwd_microstep: 5757.90 | bwd_inner_microstep: 5656.51 | bwd_allreduce_microstep: 101.35 | step_microstep: 18.70 [2025-04-26 01:58:23,354] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.92 | bwd: 5757.92 | bwd_inner: 5656.51 | bwd_allreduce: 101.37 | step: 18.70 18%|█▊ | 7457/41250 [18:00:48<81:24:16, 8.67s/it] {'loss': 0.1627, 'grad_norm': 3.537233591079712, 'learning_rate': 3.766270209359919e-05, 'epoch': 1.81} 18%|█▊ | 7457/41250 [18:00:48<81:24:16, 8.67s/it][2025-04-26 01:58:32,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-26 01:58:32,002] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.55 | bwd_microstep: 5718.90 | bwd_inner_microstep: 5706.18 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.76 [2025-04-26 01:58:32,002] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.55 | bwd: 5718.91 | bwd_inner: 5706.18 | bwd_allreduce: 12.68 | step: 18.77 18%|█▊ | 7458/41250 [18:00:57<81:20:28, 8.67s/it] {'loss': 0.1732, 'grad_norm': 1.5534859895706177, 'learning_rate': 3.766196537017241e-05, 'epoch': 1.81} 18%|█▊ | 7458/41250 [18:00:57<81:20:28, 8.67s/it][2025-04-26 01:58:40,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 01:58:40,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.73 | bwd_microstep: 5694.40 | bwd_inner_microstep: 5650.51 | bwd_allreduce_microstep: 43.84 | step_microstep: 18.61 [2025-04-26 01:58:40,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.73 | bwd: 5694.41 | bwd_inner: 5650.51 | bwd_allreduce: 43.86 | step: 18.61 18%|█▊ | 7459/41250 [18:01:05<81:10:11, 8.65s/it] {'loss': 0.0286, 'grad_norm': 0.3765430152416229, 'learning_rate': 3.766122853786309e-05, 'epoch': 1.81} 18%|█▊ | 7459/41250 [18:01:05<81:10:11, 8.65s/it][2025-04-26 01:58:49,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.06 | optimizer_step: 1.01 [2025-04-26 01:58:49,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.41 | bwd_microstep: 5755.96 | bwd_inner_microstep: 5657.91 | bwd_allreduce_microstep: 98.00 | step_microstep: 19.44 [2025-04-26 01:58:49,291] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.41 | bwd: 5755.97 | bwd_inner: 5657.91 | bwd_allreduce: 98.03 | step: 19.44 18%|█▊ | 7460/41250 [18:01:14<81:15:51, 8.66s/it] {'loss': 0.0608, 'grad_norm': 1.0050153732299805, 'learning_rate': 3.766049159667577e-05, 'epoch': 1.81} 18%|█▊ | 7460/41250 [18:01:14<81:15:51, 8.66s/it][2025-04-26 01:58:57,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 
01:58:57,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.73 | bwd_microstep: 5772.57 | bwd_inner_microstep: 5699.26 | bwd_allreduce_microstep: 73.26 | step_microstep: 18.69 [2025-04-26 01:58:57,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.73 | bwd: 5772.58 | bwd_inner: 5699.26 | bwd_allreduce: 73.28 | step: 18.70 18%|█▊ | 7461/41250 [18:01:23<81:24:13, 8.67s/it] {'loss': 0.3186, 'grad_norm': 2.7789130210876465, 'learning_rate': 3.765975454661499e-05, 'epoch': 1.81} 18%|█▊ | 7461/41250 [18:01:23<81:24:13, 8.67s/it][2025-04-26 01:59:06,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 0.92 [2025-04-26 01:59:06,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.95 | bwd_microstep: 5845.07 | bwd_inner_microstep: 5697.22 | bwd_allreduce_microstep: 147.80 | step_microstep: 18.53 [2025-04-26 01:59:06,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.95 | bwd: 5845.08 | bwd_inner: 5697.22 | bwd_allreduce: 147.82 | step: 18.54 18%|█▊ | 7462/41250 [18:01:32<81:44:00, 8.71s/it] {'loss': 0.1193, 'grad_norm': 1.936635136604309, 'learning_rate': 3.76590173876853e-05, 'epoch': 1.81} 18%|█▊ | 7462/41250 [18:01:32<81:44:00, 8.71s/it][2025-04-26 01:59:15,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:59:15,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.88 | bwd_microstep: 5705.13 | bwd_inner_microstep: 5662.54 | bwd_allreduce_microstep: 42.55 | step_microstep: 18.50 [2025-04-26 01:59:15,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.88 | bwd: 5705.14 | bwd_inner: 5662.54 | bwd_allreduce: 42.56 | step: 18.51 18%|█▊ | 7463/41250 [18:01:40<81:28:02, 8.68s/it] {'loss': 0.0213, 'grad_norm': 0.6008156538009644, 'learning_rate': 3.765828011989124e-05, 
'epoch': 1.81} 18%|█▊ | 7463/41250 [18:01:40<81:28:02, 8.68s/it][2025-04-26 01:59:24,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:59:24,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2890.97 | bwd_microstep: 5777.53 | bwd_inner_microstep: 5764.62 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.78 [2025-04-26 01:59:24,158] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2890.97 | bwd: 5777.54 | bwd_inner: 5764.62 | bwd_allreduce: 12.88 | step: 18.78 18%|█▊ | 7464/41250 [18:01:49<81:40:47, 8.70s/it] {'loss': 0.2459, 'grad_norm': 2.7734556198120117, 'learning_rate': 3.765754274323736e-05, 'epoch': 1.81} 18%|█▊ | 7464/41250 [18:01:49<81:40:47, 8.70s/it][2025-04-26 01:59:32,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 01:59:32,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.00 | bwd_microstep: 5740.97 | bwd_inner_microstep: 5697.36 | bwd_allreduce_microstep: 43.56 | step_microstep: 18.95 [2025-04-26 01:59:32,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.01 | bwd: 5740.98 | bwd_inner: 5697.36 | bwd_allreduce: 43.58 | step: 18.95 18%|█▊ | 7465/41250 [18:01:58<81:37:12, 8.70s/it] {'loss': 0.1708, 'grad_norm': 3.4889042377471924, 'learning_rate': 3.76568052577282e-05, 'epoch': 1.81} 18%|█▊ | 7465/41250 [18:01:58<81:37:12, 8.70s/it][2025-04-26 01:59:41,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 01:59:41,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.81 | bwd_microstep: 5768.85 | bwd_inner_microstep: 5691.72 | bwd_allreduce_microstep: 77.09 | step_microstep: 18.67 [2025-04-26 01:59:41,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2851.81 | bwd: 5768.86 | bwd_inner: 5691.72 | bwd_allreduce: 77.10 | step: 18.67 18%|█▊ | 7466/41250 [18:02:06<81:38:15, 8.70s/it] {'loss': 0.1809, 'grad_norm': 1.9101375341415405, 'learning_rate': 3.765606766336831e-05, 'epoch': 1.81} 18%|█▊ | 7466/41250 [18:02:06<81:38:15, 8.70s/it][2025-04-26 01:59:50,245] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 01:59:50,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.76 | bwd_microstep: 5779.02 | bwd_inner_microstep: 5649.88 | bwd_allreduce_microstep: 129.10 | step_microstep: 18.72 [2025-04-26 01:59:50,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.76 | bwd: 5779.04 | bwd_inner: 5649.88 | bwd_allreduce: 129.12 | step: 18.72 18%|█▊ | 7467/41250 [18:02:15<81:37:35, 8.70s/it] {'loss': 0.0635, 'grad_norm': 0.8923555016517639, 'learning_rate': 3.765532996016223e-05, 'epoch': 1.81} 18%|█▊ | 7467/41250 [18:02:15<81:37:35, 8.70s/it][2025-04-26 01:59:58,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 01:59:58,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.06 | bwd_microstep: 5755.10 | bwd_inner_microstep: 5708.07 | bwd_allreduce_microstep: 46.99 | step_microstep: 18.42 [2025-04-26 01:59:58,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.06 | bwd: 5755.11 | bwd_inner: 5708.07 | bwd_allreduce: 47.01 | step: 18.42 18%|█▊ | 7468/41250 [18:02:24<81:36:31, 8.70s/it] {'loss': 0.1249, 'grad_norm': 3.6642723083496094, 'learning_rate': 3.7654592148114514e-05, 'epoch': 1.81} 18%|█▊ | 7468/41250 [18:02:24<81:36:31, 8.70s/it][2025-04-26 02:00:07,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:00:07,576] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2833.31 | bwd_microstep: 5719.28 | bwd_inner_microstep: 5653.09 | bwd_allreduce_microstep: 66.15 | step_microstep: 18.23 [2025-04-26 02:00:07,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.31 | bwd: 5719.29 | bwd_inner: 5653.09 | bwd_allreduce: 66.16 | step: 18.24 18%|█▊ | 7469/41250 [18:02:32<81:26:23, 8.68s/it] {'loss': 0.2461, 'grad_norm': 5.337395668029785, 'learning_rate': 3.765385422722972e-05, 'epoch': 1.81} 18%|█▊ | 7469/41250 [18:02:32<81:26:23, 8.68s/it][2025-04-26 02:00:16,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:00:16,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.98 | bwd_microstep: 5784.12 | bwd_inner_microstep: 5646.40 | bwd_allreduce_microstep: 137.68 | step_microstep: 18.46 [2025-04-26 02:00:16,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.98 | bwd: 5784.13 | bwd_inner: 5646.40 | bwd_allreduce: 137.69 | step: 18.46 18%|█▊ | 7470/41250 [18:02:41<81:29:57, 8.69s/it] {'loss': 0.1361, 'grad_norm': 2.614384889602661, 'learning_rate': 3.765311619751238e-05, 'epoch': 1.81} 18%|█▊ | 7470/41250 [18:02:41<81:29:57, 8.69s/it][2025-04-26 02:00:24,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:00:24,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.35 | bwd_microstep: 5712.44 | bwd_inner_microstep: 5699.59 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.49 [2025-04-26 02:00:24,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.35 | bwd: 5712.45 | bwd_inner: 5699.59 | bwd_allreduce: 12.81 | step: 18.49 18%|█▊ | 7471/41250 [18:02:50<81:23:58, 8.68s/it] {'loss': 0.1514, 'grad_norm': 3.079850435256958, 'learning_rate': 3.7652378058967044e-05, 'epoch': 1.81} 18%|█▊ | 7471/41250 
[18:02:50<81:23:58, 8.68s/it][2025-04-26 02:00:33,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:00:33,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.28 | bwd_microstep: 5795.04 | bwd_inner_microstep: 5647.63 | bwd_allreduce_microstep: 147.35 | step_microstep: 18.64 [2025-04-26 02:00:33,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.28 | bwd: 5795.05 | bwd_inner: 5647.63 | bwd_allreduce: 147.37 | step: 18.65 18%|█▊ | 7472/41250 [18:02:58<81:28:50, 8.68s/it] {'loss': 0.2868, 'grad_norm': 4.501125335693359, 'learning_rate': 3.7651639811598276e-05, 'epoch': 1.81} 18%|█▊ | 7472/41250 [18:02:58<81:28:50, 8.68s/it][2025-04-26 02:00:42,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.01 | optimizer_step: 1.16 [2025-04-26 02:00:42,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.07 | bwd_microstep: 5704.49 | bwd_inner_microstep: 5648.00 | bwd_allreduce_microstep: 56.44 | step_microstep: 18.93 [2025-04-26 02:00:42,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.07 | bwd: 5704.50 | bwd_inner: 5648.00 | bwd_allreduce: 56.46 | step: 18.93 18%|█▊ | 7473/41250 [18:03:07<81:18:15, 8.67s/it] {'loss': 0.1571, 'grad_norm': 1.509244441986084, 'learning_rate': 3.765090145541062e-05, 'epoch': 1.81} 18%|█▊ | 7473/41250 [18:03:07<81:18:15, 8.67s/it][2025-04-26 02:00:50,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 02:00:50,934] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.08 | bwd_microstep: 5743.44 | bwd_inner_microstep: 5697.21 | bwd_allreduce_microstep: 46.19 | step_microstep: 18.52 [2025-04-26 02:00:50,934] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.08 | bwd: 5743.46 | 
bwd_inner: 5697.21 | bwd_allreduce: 46.21 | step: 18.53 18%|█▊ | 7474/41250 [18:03:16<81:20:23, 8.67s/it] {'loss': 0.319, 'grad_norm': 2.422044515609741, 'learning_rate': 3.7650162990408625e-05, 'epoch': 1.81} 18%|█▊ | 7474/41250 [18:03:16<81:20:23, 8.67s/it][2025-04-26 02:00:59,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 02:00:59,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2893.78 | bwd_microstep: 5790.36 | bwd_inner_microstep: 5777.68 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.86 [2025-04-26 02:00:59,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2893.78 | bwd: 5790.38 | bwd_inner: 5777.68 | bwd_allreduce: 12.66 | step: 18.86 18%|█▊ | 7475/41250 [18:03:25<81:37:16, 8.70s/it] {'loss': 0.1583, 'grad_norm': 1.5754706859588623, 'learning_rate': 3.7649424416596856e-05, 'epoch': 1.81} 18%|█▊ | 7475/41250 [18:03:25<81:37:16, 8.70s/it][2025-04-26 02:01:08,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:01:08,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.72 | bwd_microstep: 5710.33 | bwd_inner_microstep: 5642.64 | bwd_allreduce_microstep: 67.64 | step_microstep: 18.74 [2025-04-26 02:01:08,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.72 | bwd: 5710.34 | bwd_inner: 5642.64 | bwd_allreduce: 67.66 | step: 18.75 18%|█▊ | 7476/41250 [18:03:33<81:25:28, 8.68s/it] {'loss': 0.1422, 'grad_norm': 2.3914895057678223, 'learning_rate': 3.7648685733979856e-05, 'epoch': 1.81} 18%|█▊ | 7476/41250 [18:03:33<81:25:28, 8.68s/it][2025-04-26 02:01:17,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:01:17,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2841.14 | bwd_microstep: 5740.86 | bwd_inner_microstep: 5654.83 | bwd_allreduce_microstep: 85.99 | step_microstep: 18.60 [2025-04-26 02:01:17,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.14 | bwd: 5740.87 | bwd_inner: 5654.83 | bwd_allreduce: 86.00 | step: 18.60 18%|█▊ | 7477/41250 [18:03:42<81:24:03, 8.68s/it] {'loss': 0.2204, 'grad_norm': 2.188889503479004, 'learning_rate': 3.764794694256217e-05, 'epoch': 1.81} 18%|█▊ | 7477/41250 [18:03:42<81:24:03, 8.68s/it][2025-04-26 02:01:25,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:01:25,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.40 | bwd_microstep: 5702.31 | bwd_inner_microstep: 5638.66 | bwd_allreduce_microstep: 63.60 | step_microstep: 18.55 [2025-04-26 02:01:25,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.40 | bwd: 5702.33 | bwd_inner: 5638.66 | bwd_allreduce: 63.62 | step: 18.56 18%|█▊ | 7478/41250 [18:03:50<81:13:11, 8.66s/it] {'loss': 0.0752, 'grad_norm': 1.2003509998321533, 'learning_rate': 3.764720804234837e-05, 'epoch': 1.81} 18%|█▊ | 7478/41250 [18:03:50<81:13:11, 8.66s/it][2025-04-26 02:01:34,299] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.14 | optimizer_step: 1.11 [2025-04-26 02:01:34,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.84 | bwd_microstep: 5771.96 | bwd_inner_microstep: 5645.98 | bwd_allreduce_microstep: 125.93 | step_microstep: 19.07 [2025-04-26 02:01:34,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.84 | bwd: 5771.97 | bwd_inner: 5645.98 | bwd_allreduce: 125.95 | step: 19.07 18%|█▊ | 7479/41250 [18:03:59<81:16:43, 8.66s/it] {'loss': 0.0722, 'grad_norm': 1.2572109699249268, 'learning_rate': 3.7646469033343e-05, 'epoch': 1.81} 18%|█▊ | 7479/41250 [18:03:59<81:16:43, 8.66s/it][2025-04-26 
02:01:42,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-26 02:01:42,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.55 | bwd_microstep: 5691.72 | bwd_inner_microstep: 5678.93 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.80 [2025-04-26 02:01:42,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.55 | bwd: 5691.73 | bwd_inner: 5678.93 | bwd_allreduce: 12.76 | step: 18.81 18%|█▊ | 7480/41250 [18:04:08<81:09:50, 8.65s/it] {'loss': 0.0241, 'grad_norm': 0.5272358059883118, 'learning_rate': 3.764572991555062e-05, 'epoch': 1.81} 18%|█▊ | 7480/41250 [18:04:08<81:09:50, 8.65s/it][2025-04-26 02:01:51,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:01:51,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.50 | bwd_microstep: 5744.65 | bwd_inner_microstep: 5674.82 | bwd_allreduce_microstep: 69.79 | step_microstep: 18.50 [2025-04-26 02:01:51,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.50 | bwd: 5744.66 | bwd_inner: 5674.82 | bwd_allreduce: 69.80 | step: 18.51 18%|█▊ | 7481/41250 [18:04:16<81:14:17, 8.66s/it] {'loss': 0.2061, 'grad_norm': 2.286813259124756, 'learning_rate': 3.764499068897578e-05, 'epoch': 1.81} 18%|█▊ | 7481/41250 [18:04:16<81:14:17, 8.66s/it][2025-04-26 02:02:00,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:02:00,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.42 | bwd_microstep: 5736.45 | bwd_inner_microstep: 5652.14 | bwd_allreduce_microstep: 84.25 | step_microstep: 18.53 [2025-04-26 02:02:00,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.42 | bwd: 5736.46 | bwd_inner: 5652.14 | bwd_allreduce: 84.27 | 
step: 18.53 18%|█▊ | 7482/41250 [18:04:25<81:15:58, 8.66s/it] {'loss': 0.0178, 'grad_norm': 0.5058453679084778, 'learning_rate': 3.764425135362305e-05, 'epoch': 1.81} 18%|█▊ | 7482/41250 [18:04:25<81:15:58, 8.66s/it][2025-04-26 02:02:08,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 02:02:08,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.12 | bwd_microstep: 5734.00 | bwd_inner_microstep: 5680.68 | bwd_allreduce_microstep: 53.28 | step_microstep: 19.22 [2025-04-26 02:02:08,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.12 | bwd: 5734.02 | bwd_inner: 5680.68 | bwd_allreduce: 53.30 | step: 19.22 18%|█▊ | 7483/41250 [18:04:34<81:17:12, 8.67s/it] {'loss': 0.0817, 'grad_norm': 0.9391086101531982, 'learning_rate': 3.764351190949698e-05, 'epoch': 1.81} 18%|█▊ | 7483/41250 [18:04:34<81:17:12, 8.67s/it][2025-04-26 02:02:17,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-26 02:02:17,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.36 | bwd_microstep: 5695.80 | bwd_inner_microstep: 5640.87 | bwd_allreduce_microstep: 54.89 | step_microstep: 18.86 [2025-04-26 02:02:17,553] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.36 | bwd: 5695.81 | bwd_inner: 5640.87 | bwd_allreduce: 54.91 | step: 18.87 18%|█▊ | 7484/41250 [18:04:42<81:06:53, 8.65s/it] {'loss': 0.1093, 'grad_norm': 2.344161033630371, 'learning_rate': 3.764277235660213e-05, 'epoch': 1.81} 18%|█▊ | 7484/41250 [18:04:42<81:06:53, 8.65s/it][2025-04-26 02:02:26,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 02:02:26,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.93 | bwd_microstep: 5698.39 | 
bwd_inner_microstep: 5650.92 | bwd_allreduce_microstep: 47.42 | step_microstep: 19.00 [2025-04-26 02:02:26,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.93 | bwd: 5698.40 | bwd_inner: 5650.92 | bwd_allreduce: 47.44 | step: 19.00 18%|█▊ | 7485/41250 [18:04:51<81:00:31, 8.64s/it] {'loss': 0.0477, 'grad_norm': 0.6886035203933716, 'learning_rate': 3.7642032694943056e-05, 'epoch': 1.81} 18%|█▊ | 7485/41250 [18:04:51<81:00:31, 8.64s/it][2025-04-26 02:02:34,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:02:34,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.66 | bwd_microstep: 5706.41 | bwd_inner_microstep: 5693.71 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.65 [2025-04-26 02:02:34,800] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.66 | bwd: 5706.42 | bwd_inner: 5693.71 | bwd_allreduce: 12.68 | step: 18.65 18%|█▊ | 7486/41250 [18:05:00<81:00:09, 8.64s/it] {'loss': 0.2092, 'grad_norm': 1.884619116783142, 'learning_rate': 3.7641292924524325e-05, 'epoch': 1.81} 18%|█▊ | 7486/41250 [18:05:00<81:00:09, 8.64s/it][2025-04-26 02:02:43,402] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:02:43,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.55 | bwd_microstep: 5694.64 | bwd_inner_microstep: 5645.00 | bwd_allreduce_microstep: 49.59 | step_microstep: 18.49 [2025-04-26 02:02:43,403] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.55 | bwd: 5694.65 | bwd_inner: 5645.00 | bwd_allreduce: 49.61 | step: 18.49 18%|█▊ | 7487/41250 [18:05:08<80:54:10, 8.63s/it] {'loss': 0.1278, 'grad_norm': 2.3430521488189697, 'learning_rate': 3.764055304535049e-05, 'epoch': 1.82} 18%|█▊ | 7487/41250 [18:05:08<80:54:10, 8.63s/it][2025-04-26 02:02:52,094] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-26 02:02:52,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.93 | bwd_microstep: 5776.06 | bwd_inner_microstep: 5636.12 | bwd_allreduce_microstep: 139.89 | step_microstep: 19.32 [2025-04-26 02:02:52,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.93 | bwd: 5776.07 | bwd_inner: 5636.12 | bwd_allreduce: 139.91 | step: 19.33 18%|█▊ | 7488/41250 [18:05:17<81:05:18, 8.65s/it] {'loss': 0.1071, 'grad_norm': 2.871640682220459, 'learning_rate': 3.763981305742611e-05, 'epoch': 1.82} 18%|█▊ | 7488/41250 [18:05:17<81:05:18, 8.65s/it][2025-04-26 02:03:00,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.10 [2025-04-26 02:03:00,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.06 | bwd_microstep: 5763.40 | bwd_inner_microstep: 5750.55 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.20 [2025-04-26 02:03:00,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.06 | bwd: 5763.41 | bwd_inner: 5750.55 | bwd_allreduce: 12.82 | step: 19.20 18%|█▊ | 7489/41250 [18:05:26<81:20:09, 8.67s/it] {'loss': 0.2241, 'grad_norm': 2.106090545654297, 'learning_rate': 3.763907296075576e-05, 'epoch': 1.82} 18%|█▊ | 7489/41250 [18:05:26<81:20:09, 8.67s/it][2025-04-26 02:03:09,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-26 02:03:09,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.91 | bwd_microstep: 5758.55 | bwd_inner_microstep: 5635.48 | bwd_allreduce_microstep: 123.03 | step_microstep: 18.78 [2025-04-26 02:03:09,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.91 | bwd: 5758.57 | bwd_inner: 5635.48 | bwd_allreduce: 123.05 | step: 18.79 
18%|█▊ | 7490/41250 [18:05:34<81:19:40, 8.67s/it] {'loss': 0.2293, 'grad_norm': 2.2617416381835938, 'learning_rate': 3.763833275534399e-05, 'epoch': 1.82} 18%|█▊ | 7490/41250 [18:05:34<81:19:40, 8.67s/it][2025-04-26 02:03:18,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:03:18,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.49 | bwd_microstep: 5695.90 | bwd_inner_microstep: 5683.21 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.39 [2025-04-26 02:03:18,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.49 | bwd: 5695.92 | bwd_inner: 5683.21 | bwd_allreduce: 12.67 | step: 18.40 18%|█▊ | 7491/41250 [18:05:43<81:11:19, 8.66s/it] {'loss': 0.1269, 'grad_norm': 1.2592639923095703, 'learning_rate': 3.763759244119537e-05, 'epoch': 1.82} 18%|█▊ | 7491/41250 [18:05:43<81:11:19, 8.66s/it][2025-04-26 02:03:26,789] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:03:26,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.80 | bwd_microstep: 5734.89 | bwd_inner_microstep: 5697.79 | bwd_allreduce_microstep: 37.06 | step_microstep: 18.80 [2025-04-26 02:03:26,790] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.80 | bwd: 5734.91 | bwd_inner: 5697.79 | bwd_allreduce: 37.07 | step: 18.80 18%|█▊ | 7492/41250 [18:05:52<81:12:06, 8.66s/it] {'loss': 0.1429, 'grad_norm': 2.4689817428588867, 'learning_rate': 3.7636852018314466e-05, 'epoch': 1.82} 18%|█▊ | 7492/41250 [18:05:52<81:12:06, 8.66s/it][2025-04-26 02:03:35,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 02:03:35,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.17 | bwd_microstep: 5776.87 | bwd_inner_microstep: 
5650.73 | bwd_allreduce_microstep: 126.08 | step_microstep: 18.70 [2025-04-26 02:03:35,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.17 | bwd: 5776.89 | bwd_inner: 5650.73 | bwd_allreduce: 126.11 | step: 18.71 18%|█▊ | 7493/41250 [18:06:00<81:17:54, 8.67s/it] {'loss': 0.2665, 'grad_norm': 2.4802443981170654, 'learning_rate': 3.763611148670583e-05, 'epoch': 1.82} 18%|█▊ | 7493/41250 [18:06:00<81:17:54, 8.67s/it][2025-04-26 02:03:44,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:03:44,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.90 | bwd_microstep: 5709.96 | bwd_inner_microstep: 5697.23 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.51 [2025-04-26 02:03:44,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.91 | bwd: 5709.97 | bwd_inner: 5697.23 | bwd_allreduce: 12.70 | step: 18.51 18%|█▊ | 7494/41250 [18:06:09<81:12:19, 8.66s/it] {'loss': 0.0613, 'grad_norm': 0.8474976420402527, 'learning_rate': 3.763537084637404e-05, 'epoch': 1.82} 18%|█▊ | 7494/41250 [18:06:09<81:12:19, 8.66s/it][2025-04-26 02:03:52,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.94 [2025-04-26 02:03:52,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.06 | bwd_microstep: 5694.27 | bwd_inner_microstep: 5636.91 | bwd_allreduce_microstep: 57.31 | step_microstep: 18.83 [2025-04-26 02:03:52,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.06 | bwd: 5694.29 | bwd_inner: 5636.91 | bwd_allreduce: 57.33 | step: 18.83 18%|█▊ | 7495/41250 [18:06:18<81:02:11, 8.64s/it] {'loss': 0.2036, 'grad_norm': 1.6514898538589478, 'learning_rate': 3.7634630097323655e-05, 'epoch': 1.82} 18%|█▊ | 7495/41250 [18:06:18<81:02:11, 8.64s/it][2025-04-26 02:04:01,352] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 02:04:01,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.30 | bwd_microstep: 5692.97 | bwd_inner_microstep: 5680.19 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.86 [2025-04-26 02:04:01,353] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.30 | bwd: 5692.99 | bwd_inner: 5680.19 | bwd_allreduce: 12.76 | step: 18.86 18%|█▊ | 7496/41250 [18:06:26<81:00:00, 8.64s/it] {'loss': 0.0577, 'grad_norm': 1.2662631273269653, 'learning_rate': 3.763388923955924e-05, 'epoch': 1.82} 18%|█▊ | 7496/41250 [18:06:26<81:00:00, 8.64s/it][2025-04-26 02:04:10,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:04:10,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2934.04 | bwd_microstep: 5883.38 | bwd_inner_microstep: 5870.49 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.10 [2025-04-26 02:04:10,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2934.04 | bwd: 5883.39 | bwd_inner: 5870.49 | bwd_allreduce: 12.86 | step: 19.10 18%|█▊ | 7497/41250 [18:06:35<81:43:48, 8.72s/it] {'loss': 0.0983, 'grad_norm': 1.668526291847229, 'learning_rate': 3.7633148273085376e-05, 'epoch': 1.82} 18%|█▊ | 7497/41250 [18:06:35<81:43:48, 8.72s/it][2025-04-26 02:04:18,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-26 02:04:18,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.71 | bwd_microstep: 5727.34 | bwd_inner_microstep: 5700.48 | bwd_allreduce_microstep: 26.81 | step_microstep: 19.64 [2025-04-26 02:04:18,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.71 | bwd: 5727.36 | bwd_inner: 5700.48 | bwd_allreduce: 26.83 | step: 19.64 18%|█▊ | 7498/41250 [18:06:44<81:35:07, 
8.70s/it] {'loss': 0.3048, 'grad_norm': 6.614038944244385, 'learning_rate': 3.763240719790662e-05, 'epoch': 1.82} 18%|█▊ | 7498/41250 [18:06:44<81:35:07, 8.70s/it][2025-04-26 02:04:27,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.98 | optimizer_step: 1.07 [2025-04-26 02:04:27,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.03 | bwd_microstep: 5704.42 | bwd_inner_microstep: 5636.32 | bwd_allreduce_microstep: 68.05 | step_microstep: 18.49 [2025-04-26 02:04:27,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.03 | bwd: 5704.44 | bwd_inner: 5636.32 | bwd_allreduce: 68.07 | step: 18.49 18%|█▊ | 7499/41250 [18:06:52<81:19:35, 8.67s/it] {'loss': 0.6008, 'grad_norm': 7.514025688171387, 'learning_rate': 3.763166601402754e-05, 'epoch': 1.82} 18%|█▊ | 7499/41250 [18:06:52<81:19:35, 8.67s/it][2025-04-26 02:04:36,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.07 | optimizer_step: 1.11 [2025-04-26 02:04:36,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.73 | bwd_microstep: 5793.89 | bwd_inner_microstep: 5639.69 | bwd_allreduce_microstep: 154.15 | step_microstep: 19.50 [2025-04-26 02:04:36,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.74 | bwd: 5793.90 | bwd_inner: 5639.69 | bwd_allreduce: 154.17 | step: 19.50 18%|█▊ | 7500/41250 [18:07:01<81:22:28, 8.68s/it] {'loss': 0.2154, 'grad_norm': 1.7492165565490723, 'learning_rate': 3.76309247214527e-05, 'epoch': 1.82} 18%|█▊ | 7500/41250 [18:07:01<81:22:28, 8.68s/it][2025-04-26 02:04:44,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:04:44,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.04 | bwd_microstep: 5715.54 | bwd_inner_microstep: 5635.89 | bwd_allreduce_microstep: 79.60 | 
step_microstep: 18.52 [2025-04-26 02:04:44,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.04 | bwd: 5715.56 | bwd_inner: 5635.89 | bwd_allreduce: 79.62 | step: 18.52 18%|█▊ | 7501/41250 [18:07:10<81:11:17, 8.66s/it] {'loss': 0.2032, 'grad_norm': 2.730041027069092, 'learning_rate': 3.7630183320186685e-05, 'epoch': 1.82} 18%|█▊ | 7501/41250 [18:07:10<81:11:17, 8.66s/it][2025-04-26 02:04:53,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.15 | optimizer_step: 0.92 [2025-04-26 02:04:53,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.55 | bwd_microstep: 5883.95 | bwd_inner_microstep: 5698.41 | bwd_allreduce_microstep: 185.49 | step_microstep: 19.17 [2025-04-26 02:04:53,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.55 | bwd: 5883.97 | bwd_inner: 5698.41 | bwd_allreduce: 185.51 | step: 19.17 18%|█▊ | 7502/41250 [18:07:18<81:37:46, 8.71s/it] {'loss': 0.1061, 'grad_norm': 4.395662307739258, 'learning_rate': 3.7629441810234056e-05, 'epoch': 1.82} 18%|█▊ | 7502/41250 [18:07:18<81:37:46, 8.71s/it][2025-04-26 02:05:02,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-26 02:05:02,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.42 | bwd_microstep: 5713.56 | bwd_inner_microstep: 5700.83 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.81 [2025-04-26 02:05:02,298] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.42 | bwd: 5713.57 | bwd_inner: 5700.83 | bwd_allreduce: 12.70 | step: 18.82 18%|█▊ | 7503/41250 [18:07:27<81:26:30, 8.69s/it] {'loss': 0.0405, 'grad_norm': 0.7355008721351624, 'learning_rate': 3.762870019159938e-05, 'epoch': 1.82} 18%|█▊ | 7503/41250 [18:07:27<81:26:30, 8.69s/it][2025-04-26 02:05:10,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:05:10,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.22 | bwd_microstep: 5742.14 | bwd_inner_microstep: 5707.68 | bwd_allreduce_microstep: 34.42 | step_microstep: 18.52 [2025-04-26 02:05:10,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.22 | bwd: 5742.16 | bwd_inner: 5707.68 | bwd_allreduce: 34.43 | step: 18.53 18%|█▊ | 7504/41250 [18:07:36<81:25:01, 8.69s/it] {'loss': 0.1065, 'grad_norm': 1.8430851697921753, 'learning_rate': 3.7627958464287244e-05, 'epoch': 1.82} 18%|█▊ | 7504/41250 [18:07:36<81:25:01, 8.69s/it][2025-04-26 02:05:19,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.04 | optimizer_step: 1.18 [2025-04-26 02:05:19,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.66 | bwd_microstep: 5789.60 | bwd_inner_microstep: 5776.74 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.82 [2025-04-26 02:05:19,735] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.66 | bwd: 5789.61 | bwd_inner: 5776.74 | bwd_allreduce: 12.82 | step: 19.83 18%|█▊ | 7505/41250 [18:07:45<81:37:25, 8.71s/it] {'loss': 0.2175, 'grad_norm': 3.384611129760742, 'learning_rate': 3.762721662830221e-05, 'epoch': 1.82} 18%|█▊ | 7505/41250 [18:07:45<81:37:25, 8.71s/it][2025-04-26 02:05:28,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:05:28,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.15 | bwd_microstep: 5744.27 | bwd_inner_microstep: 5704.71 | bwd_allreduce_microstep: 39.51 | step_microstep: 18.46 [2025-04-26 02:05:28,415] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.15 | bwd: 5744.28 | bwd_inner: 5704.71 | bwd_allreduce: 39.53 | step: 18.46 18%|█▊ | 7506/41250 [18:07:53<81:32:04, 8.70s/it] {'loss': 0.0709, 'grad_norm': 
0.849391520023346, 'learning_rate': 3.762647468364885e-05, 'epoch': 1.82} 18%|█▊ | 7506/41250 [18:07:53<81:32:04, 8.70s/it][2025-04-26 02:05:37,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 02:05:37,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.79 | bwd_microstep: 5779.87 | bwd_inner_microstep: 5663.83 | bwd_allreduce_microstep: 116.00 | step_microstep: 19.05 [2025-04-26 02:05:37,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.79 | bwd: 5779.89 | bwd_inner: 5663.83 | bwd_allreduce: 116.02 | step: 19.05 18%|█▊ | 7507/41250 [18:08:02<81:30:44, 8.70s/it] {'loss': 0.3405, 'grad_norm': 3.033501625061035, 'learning_rate': 3.762573263033174e-05, 'epoch': 1.82} 18%|█▊ | 7507/41250 [18:08:02<81:30:44, 8.70s/it][2025-04-26 02:05:45,723] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:05:45,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.16 | bwd_microstep: 5692.92 | bwd_inner_microstep: 5676.90 | bwd_allreduce_microstep: 15.97 | step_microstep: 18.87 [2025-04-26 02:05:45,724] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.16 | bwd: 5692.93 | bwd_inner: 5676.90 | bwd_allreduce: 15.99 | step: 18.87 18%|█▊ | 7508/41250 [18:08:11<81:17:19, 8.67s/it] {'loss': 0.1302, 'grad_norm': 1.5089983940124512, 'learning_rate': 3.762499046835545e-05, 'epoch': 1.82} 18%|█▊ | 7508/41250 [18:08:11<81:17:19, 8.67s/it][2025-04-26 02:05:54,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:05:54,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.70 | bwd_microstep: 5698.69 | bwd_inner_microstep: 5669.13 | bwd_allreduce_microstep: 29.51 | step_microstep: 18.51 [2025-04-26 
02:05:54,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.70 | bwd: 5698.70 | bwd_inner: 5669.13 | bwd_allreduce: 29.53 | step: 18.52 18%|█▊ | 7509/41250 [18:08:19<81:07:46, 8.66s/it] {'loss': 0.0654, 'grad_norm': 1.0168042182922363, 'learning_rate': 3.7624248197724564e-05, 'epoch': 1.82} 18%|█▊ | 7509/41250 [18:08:19<81:07:46, 8.66s/it][2025-04-26 02:06:03,045] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-26 02:06:03,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.63 | bwd_microstep: 5794.14 | bwd_inner_microstep: 5658.93 | bwd_allreduce_microstep: 135.16 | step_microstep: 18.89 [2025-04-26 02:06:03,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.63 | bwd: 5794.15 | bwd_inner: 5658.93 | bwd_allreduce: 135.18 | step: 18.90 18%|█▊ | 7510/41250 [18:08:28<81:16:23, 8.67s/it] {'loss': 0.2024, 'grad_norm': 4.107924938201904, 'learning_rate': 3.762350581844366e-05, 'epoch': 1.82} 18%|█▊ | 7510/41250 [18:08:28<81:16:23, 8.67s/it][2025-04-26 02:06:11,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.02 | optimizer_step: 1.12 [2025-04-26 02:06:11,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.14 | bwd_microstep: 5698.45 | bwd_inner_microstep: 5685.64 | bwd_allreduce_microstep: 12.76 | step_microstep: 19.57 [2025-04-26 02:06:11,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.14 | bwd: 5698.46 | bwd_inner: 5685.64 | bwd_allreduce: 12.78 | step: 19.58 18%|█▊ | 7511/41250 [18:08:36<81:08:40, 8.66s/it] {'loss': 0.046, 'grad_norm': 1.3850518465042114, 'learning_rate': 3.7622763330517305e-05, 'epoch': 1.82} 18%|█▊ | 7511/41250 [18:08:37<81:08:40, 8.66s/it][2025-04-26 02:06:20,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.01 | optimizer_step: 
0.90 [2025-04-26 02:06:20,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.67 | bwd_microstep: 5716.47 | bwd_inner_microstep: 5703.67 | bwd_allreduce_microstep: 12.76 | step_microstep: 19.15 [2025-04-26 02:06:20,329] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.67 | bwd: 5716.49 | bwd_inner: 5703.67 | bwd_allreduce: 12.78 | step: 19.16 18%|█▊ | 7512/41250 [18:08:45<81:07:57, 8.66s/it] {'loss': 0.1051, 'grad_norm': 1.4816441535949707, 'learning_rate': 3.762202073395008e-05, 'epoch': 1.82} 18%|█▊ | 7512/41250 [18:08:45<81:07:57, 8.66s/it][2025-04-26 02:06:28,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:06:28,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.14 | bwd_microstep: 5702.62 | bwd_inner_microstep: 5661.73 | bwd_allreduce_microstep: 40.85 | step_microstep: 18.84 [2025-04-26 02:06:28,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.14 | bwd: 5702.63 | bwd_inner: 5661.73 | bwd_allreduce: 40.86 | step: 18.85 18%|█▊ | 7513/41250 [18:08:54<81:00:26, 8.64s/it] {'loss': 0.1256, 'grad_norm': 2.8173983097076416, 'learning_rate': 3.762127802874656e-05, 'epoch': 1.82} 18%|█▊ | 7513/41250 [18:08:54<81:00:26, 8.64s/it][2025-04-26 02:06:37,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:06:37,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.81 | bwd_microstep: 5717.45 | bwd_inner_microstep: 5704.41 | bwd_allreduce_microstep: 13.00 | step_microstep: 18.65 [2025-04-26 02:06:37,593] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.81 | bwd: 5717.46 | bwd_inner: 5704.41 | bwd_allreduce: 13.01 | step: 18.66 18%|█▊ | 7514/41250 [18:09:02<81:01:09, 8.65s/it] {'loss': 0.0747, 'grad_norm': 1.5200659036636353, 'learning_rate': 
3.762053521491133e-05, 'epoch': 1.82} 18%|█▊ | 7514/41250 [18:09:02<81:01:09, 8.65s/it][2025-04-26 02:06:46,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 02:06:46,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.75 | bwd_microstep: 5704.23 | bwd_inner_microstep: 5673.11 | bwd_allreduce_microstep: 31.08 | step_microstep: 18.90 [2025-04-26 02:06:46,213] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.75 | bwd: 5704.25 | bwd_inner: 5673.11 | bwd_allreduce: 31.10 | step: 18.91 18%|█▊ | 7515/41250 [18:09:11<80:56:50, 8.64s/it] {'loss': 0.0837, 'grad_norm': 2.528031587600708, 'learning_rate': 3.761979229244897e-05, 'epoch': 1.82} 18%|█▊ | 7515/41250 [18:09:11<80:56:50, 8.64s/it][2025-04-26 02:06:54,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 02:06:54,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.80 | bwd_microstep: 5723.33 | bwd_inner_microstep: 5659.38 | bwd_allreduce_microstep: 63.91 | step_microstep: 18.88 [2025-04-26 02:06:54,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.80 | bwd: 5723.35 | bwd_inner: 5659.38 | bwd_allreduce: 63.93 | step: 18.88 18%|█▊ | 7516/41250 [18:09:20<80:55:51, 8.64s/it] {'loss': 0.0317, 'grad_norm': 0.40961843729019165, 'learning_rate': 3.761904926136405e-05, 'epoch': 1.82} 18%|█▊ | 7516/41250 [18:09:20<80:55:51, 8.64s/it][2025-04-26 02:07:03,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:07:03,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.42 | bwd_microstep: 5693.62 | bwd_inner_microstep: 5654.37 | bwd_allreduce_microstep: 39.20 | step_microstep: 18.50 [2025-04-26 02:07:03,459] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.42 | bwd: 5693.63 | bwd_inner: 5654.37 | bwd_allreduce: 39.22 | step: 18.50 18%|█▊ | 7517/41250 [18:09:28<80:51:23, 8.63s/it] {'loss': 0.1587, 'grad_norm': 1.9332960844039917, 'learning_rate': 3.761830612166116e-05, 'epoch': 1.82} 18%|█▊ | 7517/41250 [18:09:28<80:51:23, 8.63s/it][2025-04-26 02:07:12,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:07:12,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.94 | bwd_microstep: 5726.25 | bwd_inner_microstep: 5652.61 | bwd_allreduce_microstep: 73.59 | step_microstep: 18.75 [2025-04-26 02:07:12,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.94 | bwd: 5726.26 | bwd_inner: 5652.61 | bwd_allreduce: 73.61 | step: 18.75 18%|█▊ | 7518/41250 [18:09:37<80:52:41, 8.63s/it] {'loss': 0.0313, 'grad_norm': 0.5472142100334167, 'learning_rate': 3.761756287334488e-05, 'epoch': 1.82} 18%|█▊ | 7518/41250 [18:09:37<80:52:41, 8.63s/it][2025-04-26 02:07:20,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.34 | optimizer_step: 1.04 [2025-04-26 02:07:20,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.00 | bwd_microstep: 5711.17 | bwd_inner_microstep: 5697.11 | bwd_allreduce_microstep: 14.01 | step_microstep: 20.16 [2025-04-26 02:07:20,741] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.00 | bwd: 5711.19 | bwd_inner: 5697.10 | bwd_allreduce: 14.03 | step: 20.17 18%|█▊ | 7519/41250 [18:09:46<80:54:48, 8.64s/it] {'loss': 0.0406, 'grad_norm': 0.6092974543571472, 'learning_rate': 3.761681951641979e-05, 'epoch': 1.82} 18%|█▊ | 7519/41250 [18:09:46<80:54:48, 8.64s/it][2025-04-26 02:07:29,348] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-26 
02:07:29,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.56 | bwd_microstep: 5698.79 | bwd_inner_microstep: 5651.64 | bwd_allreduce_microstep: 47.11 | step_microstep: 18.68 [2025-04-26 02:07:29,349] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.56 | bwd: 5698.80 | bwd_inner: 5651.64 | bwd_allreduce: 47.13 | step: 18.68 18%|█▊ | 7520/41250 [18:09:54<80:49:58, 8.63s/it] {'loss': 0.2416, 'grad_norm': 2.741051197052002, 'learning_rate': 3.7616076050890466e-05, 'epoch': 1.82} 18%|█▊ | 7520/41250 [18:09:54<80:49:58, 8.63s/it][2025-04-26 02:07:38,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 02:07:38,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.22 | bwd_microstep: 5907.10 | bwd_inner_microstep: 5661.21 | bwd_allreduce_microstep: 245.84 | step_microstep: 19.09 [2025-04-26 02:07:38,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.22 | bwd: 5907.11 | bwd_inner: 5661.21 | bwd_allreduce: 245.86 | step: 19.09 18%|█▊ | 7521/41250 [18:10:03<81:21:51, 8.68s/it] {'loss': 0.2391, 'grad_norm': 2.327169418334961, 'learning_rate': 3.76153324767615e-05, 'epoch': 1.82} 18%|█▊ | 7521/41250 [18:10:03<81:21:51, 8.68s/it][2025-04-26 02:07:46,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:07:46,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.65 | bwd_microstep: 5771.91 | bwd_inner_microstep: 5651.38 | bwd_allreduce_microstep: 120.48 | step_microstep: 18.63 [2025-04-26 02:07:46,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.65 | bwd: 5771.92 | bwd_inner: 5651.38 | bwd_allreduce: 120.50 | step: 18.63 18%|█▊ | 7522/41250 [18:10:12<81:21:04, 8.68s/it] {'loss': 0.2171, 'grad_norm': 4.368993759155273, 'learning_rate': 3.7614588794037475e-05, 
'epoch': 1.82} 18%|█▊ | 7522/41250 [18:10:12<81:21:04, 8.68s/it][2025-04-26 02:07:55,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 02:07:55,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.98 | bwd_microstep: 5715.99 | bwd_inner_microstep: 5659.11 | bwd_allreduce_microstep: 56.83 | step_microstep: 17.99 [2025-04-26 02:07:55,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.98 | bwd: 5716.00 | bwd_inner: 5659.11 | bwd_allreduce: 56.85 | step: 17.99 18%|█▊ | 7523/41250 [18:10:20<81:11:14, 8.67s/it] {'loss': 0.0293, 'grad_norm': 0.5347867608070374, 'learning_rate': 3.761384500272298e-05, 'epoch': 1.82} 18%|█▊ | 7523/41250 [18:10:20<81:11:14, 8.67s/it][2025-04-26 02:08:04,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:08:04,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.36 | bwd_microstep: 5748.73 | bwd_inner_microstep: 5651.17 | bwd_allreduce_microstep: 97.51 | step_microstep: 18.57 [2025-04-26 02:08:04,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.36 | bwd: 5748.74 | bwd_inner: 5651.17 | bwd_allreduce: 97.53 | step: 18.57 18%|█▊ | 7524/41250 [18:10:29<81:10:15, 8.66s/it] {'loss': 0.2122, 'grad_norm': 1.7018033266067505, 'learning_rate': 3.7613101102822593e-05, 'epoch': 1.82} 18%|█▊ | 7524/41250 [18:10:29<81:10:15, 8.66s/it][2025-04-26 02:08:12,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:08:12,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.88 | bwd_microstep: 5707.36 | bwd_inner_microstep: 5653.05 | bwd_allreduce_microstep: 54.26 | step_microstep: 18.36 [2025-04-26 02:08:12,748] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2825.88 | bwd: 5707.37 | bwd_inner: 5653.05 | bwd_allreduce: 54.28 | step: 18.36 18%|█▊ | 7525/41250 [18:10:38<81:02:06, 8.65s/it] {'loss': 0.3519, 'grad_norm': 2.5958192348480225, 'learning_rate': 3.76123570943409e-05, 'epoch': 1.82} 18%|█▊ | 7525/41250 [18:10:38<81:02:06, 8.65s/it][2025-04-26 02:08:21,363] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 02:08:21,364] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.83 | bwd_microstep: 5687.39 | bwd_inner_microstep: 5674.56 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.48 [2025-04-26 02:08:21,364] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.83 | bwd: 5687.40 | bwd_inner: 5674.56 | bwd_allreduce: 12.80 | step: 18.48 18%|█▊ | 7526/41250 [18:10:46<80:55:43, 8.64s/it] {'loss': 0.0563, 'grad_norm': 1.2919237613677979, 'learning_rate': 3.761161297728249e-05, 'epoch': 1.82} 18%|█▊ | 7526/41250 [18:10:46<80:55:43, 8.64s/it][2025-04-26 02:08:29,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.13 | optimizer_step: 1.03 [2025-04-26 02:08:29,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.20 | bwd_microstep: 5705.51 | bwd_inner_microstep: 5647.49 | bwd_allreduce_microstep: 57.96 | step_microstep: 19.81 [2025-04-26 02:08:29,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.20 | bwd: 5705.53 | bwd_inner: 5647.49 | bwd_allreduce: 57.99 | step: 19.81 18%|█▊ | 7527/41250 [18:10:55<80:51:32, 8.63s/it] {'loss': 0.0449, 'grad_norm': 0.8010697960853577, 'learning_rate': 3.7610868751651943e-05, 'epoch': 1.82} 18%|█▊ | 7527/41250 [18:10:55<80:51:32, 8.63s/it][2025-04-26 02:08:38,652] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:08:38,653] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2820.28 | bwd_microstep: 5773.93 | bwd_inner_microstep: 5633.97 | bwd_allreduce_microstep: 139.91 | step_microstep: 18.56 [2025-04-26 02:08:38,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.28 | bwd: 5773.94 | bwd_inner: 5633.97 | bwd_allreduce: 139.93 | step: 18.57 18%|█▊ | 7528/41250 [18:11:03<80:58:29, 8.64s/it] {'loss': 0.1934, 'grad_norm': 2.9174437522888184, 'learning_rate': 3.7610124417453866e-05, 'epoch': 1.82} 18%|█▊ | 7528/41250 [18:11:03<80:58:29, 8.64s/it][2025-04-26 02:08:47,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:08:47,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.68 | bwd_microstep: 5767.33 | bwd_inner_microstep: 5657.19 | bwd_allreduce_microstep: 110.10 | step_microstep: 18.54 [2025-04-26 02:08:47,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.68 | bwd: 5767.35 | bwd_inner: 5657.19 | bwd_allreduce: 110.12 | step: 18.54 18%|█▊ | 7529/41250 [18:11:12<81:04:10, 8.65s/it] {'loss': 0.1825, 'grad_norm': 3.4741053581237793, 'learning_rate': 3.760937997469283e-05, 'epoch': 1.83} 18%|█▊ | 7529/41250 [18:11:12<81:04:10, 8.65s/it][2025-04-26 02:08:56,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 02:08:56,011] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.62 | bwd_microstep: 5750.06 | bwd_inner_microstep: 5694.18 | bwd_allreduce_microstep: 55.84 | step_microstep: 18.94 [2025-04-26 02:08:56,012] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.62 | bwd: 5750.07 | bwd_inner: 5694.18 | bwd_allreduce: 55.86 | step: 18.95 18%|█▊ | 7530/41250 [18:11:21<81:08:15, 8.66s/it] {'loss': 0.3051, 'grad_norm': 20.875995635986328, 'learning_rate': 3.760863542337343e-05, 'epoch': 1.83} 18%|█▊ | 7530/41250 
[18:11:21<81:08:15, 8.66s/it][2025-04-26 02:09:04,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.00 [2025-04-26 02:09:04,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.48 | bwd_microstep: 5726.30 | bwd_inner_microstep: 5707.64 | bwd_allreduce_microstep: 18.62 | step_microstep: 18.84 [2025-04-26 02:09:04,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.48 | bwd: 5726.32 | bwd_inner: 5707.64 | bwd_allreduce: 18.63 | step: 18.85 18%|█▊ | 7531/41250 [18:11:29<81:08:09, 8.66s/it] {'loss': 0.1623, 'grad_norm': 1.7427184581756592, 'learning_rate': 3.7607890763500254e-05, 'epoch': 1.83} 18%|█▊ | 7531/41250 [18:11:30<81:08:09, 8.66s/it][2025-04-26 02:09:13,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:09:13,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.05 | bwd_microstep: 5763.41 | bwd_inner_microstep: 5646.58 | bwd_allreduce_microstep: 116.78 | step_microstep: 18.78 [2025-04-26 02:09:13,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5763.42 | bwd_inner: 5646.58 | bwd_allreduce: 116.80 | step: 18.78 18%|█▊ | 7532/41250 [18:11:38<81:09:44, 8.67s/it] {'loss': 0.0901, 'grad_norm': 1.1502394676208496, 'learning_rate': 3.76071459950779e-05, 'epoch': 1.83} 18%|█▊ | 7532/41250 [18:11:38<81:09:44, 8.67s/it][2025-04-26 02:09:22,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 1.01 [2025-04-26 02:09:22,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.81 | bwd_microstep: 5768.38 | bwd_inner_microstep: 5645.34 | bwd_allreduce_microstep: 122.99 | step_microstep: 19.40 [2025-04-26 02:09:22,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.81 | bwd: 5768.40 
| bwd_inner: 5645.34 | bwd_allreduce: 123.02 | step: 19.40 18%|█▊ | 7533/41250 [18:11:47<81:11:09, 8.67s/it] {'loss': 0.1555, 'grad_norm': 1.8360531330108643, 'learning_rate': 3.760640111811095e-05, 'epoch': 1.83} 18%|█▊ | 7533/41250 [18:11:47<81:11:09, 8.67s/it][2025-04-26 02:09:30,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:09:30,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.63 | bwd_microstep: 5741.12 | bwd_inner_microstep: 5704.00 | bwd_allreduce_microstep: 37.08 | step_microstep: 19.00 [2025-04-26 02:09:30,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.63 | bwd: 5741.13 | bwd_inner: 5704.00 | bwd_allreduce: 37.09 | step: 19.00 18%|█▊ | 7534/41250 [18:11:56<81:11:52, 8.67s/it] {'loss': 0.0808, 'grad_norm': 2.7879014015197754, 'learning_rate': 3.7605656132603995e-05, 'epoch': 1.83} 18%|█▊ | 7534/41250 [18:11:56<81:11:52, 8.67s/it][2025-04-26 02:09:39,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 02:09:39,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.17 | bwd_microstep: 5747.41 | bwd_inner_microstep: 5647.59 | bwd_allreduce_microstep: 99.78 | step_microstep: 18.85 [2025-04-26 02:09:39,346] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.17 | bwd: 5747.43 | bwd_inner: 5647.59 | bwd_allreduce: 99.80 | step: 18.85 18%|█▊ | 7535/41250 [18:12:04<81:08:52, 8.66s/it] {'loss': 0.294, 'grad_norm': 2.2633609771728516, 'learning_rate': 3.760491103856164e-05, 'epoch': 1.83} 18%|█▊ | 7535/41250 [18:12:04<81:08:52, 8.66s/it][2025-04-26 02:09:48,082] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 02:09:48,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2881.12 | bwd_microstep: 5771.37 | bwd_inner_microstep: 5758.48 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.16 [2025-04-26 02:09:48,083] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.12 | bwd: 5771.38 | bwd_inner: 5758.48 | bwd_allreduce: 12.86 | step: 19.16 18%|█▊ | 7536/41250 [18:12:13<81:20:45, 8.69s/it] {'loss': 0.2447, 'grad_norm': 3.160491704940796, 'learning_rate': 3.760416583598848e-05, 'epoch': 1.83} 18%|█▊ | 7536/41250 [18:12:13<81:20:45, 8.69s/it][2025-04-26 02:09:56,709] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-26 02:09:56,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.41 | bwd_microstep: 5701.70 | bwd_inner_microstep: 5688.73 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.08 [2025-04-26 02:09:56,710] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.41 | bwd: 5701.71 | bwd_inner: 5688.73 | bwd_allreduce: 12.94 | step: 19.08 18%|█▊ | 7537/41250 [18:12:22<81:10:38, 8.67s/it] {'loss': 0.1337, 'grad_norm': 1.925520658493042, 'learning_rate': 3.760342052488908e-05, 'epoch': 1.83} 18%|█▊ | 7537/41250 [18:12:22<81:10:38, 8.67s/it][2025-04-26 02:10:05,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:10:05,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.99 | bwd_microstep: 5694.83 | bwd_inner_microstep: 5635.53 | bwd_allreduce_microstep: 59.26 | step_microstep: 18.55 [2025-04-26 02:10:05,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.99 | bwd: 5694.85 | bwd_inner: 5635.53 | bwd_allreduce: 59.27 | step: 18.56 18%|█▊ | 7538/41250 [18:12:30<80:58:18, 8.65s/it] {'loss': 0.102, 'grad_norm': 1.2089189291000366, 'learning_rate': 3.7602675105268065e-05, 'epoch': 1.83} 18%|█▊ | 7538/41250 [18:12:30<80:58:18, 
8.65s/it][2025-04-26 02:10:13,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:10:13,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.24 | bwd_microstep: 5721.95 | bwd_inner_microstep: 5708.98 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.49 [2025-04-26 02:10:13,976] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.24 | bwd: 5721.96 | bwd_inner: 5708.98 | bwd_allreduce: 12.94 | step: 18.49 18%|█▊ | 7539/41250 [18:12:39<81:01:52, 8.65s/it] {'loss': 0.0682, 'grad_norm': 1.2000114917755127, 'learning_rate': 3.760192957713003e-05, 'epoch': 1.83} 18%|█▊ | 7539/41250 [18:12:39<81:01:52, 8.65s/it][2025-04-26 02:10:22,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.29 | optimizer_step: 1.07 [2025-04-26 02:10:22,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.38 | bwd_microstep: 5702.96 | bwd_inner_microstep: 5688.96 | bwd_allreduce_microstep: 13.94 | step_microstep: 20.20 [2025-04-26 02:10:22,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.38 | bwd: 5702.98 | bwd_inner: 5688.96 | bwd_allreduce: 13.97 | step: 20.20 18%|█▊ | 7540/41250 [18:12:47<80:58:53, 8.65s/it] {'loss': 0.2364, 'grad_norm': 2.463449239730835, 'learning_rate': 3.760118394047955e-05, 'epoch': 1.83} 18%|█▊ | 7540/41250 [18:12:47<80:58:53, 8.65s/it][2025-04-26 02:10:31,310] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 1.09 [2025-04-26 02:10:31,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.17 | bwd_microstep: 5782.33 | bwd_inner_microstep: 5641.56 | bwd_allreduce_microstep: 140.73 | step_microstep: 18.72 [2025-04-26 02:10:31,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.17 | bwd: 5782.35 | bwd_inner: 5641.56 | 
bwd_allreduce: 140.74 | step: 18.72 18%|█▊ | 7541/41250 [18:12:56<81:07:13, 8.66s/it] {'loss': 0.0635, 'grad_norm': 2.370535373687744, 'learning_rate': 3.760043819532123e-05, 'epoch': 1.83} 18%|█▊ | 7541/41250 [18:12:56<81:07:13, 8.66s/it][2025-04-26 02:10:39,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 02:10:39,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.46 | bwd_microstep: 5781.13 | bwd_inner_microstep: 5641.50 | bwd_allreduce_microstep: 139.58 | step_microstep: 19.03 [2025-04-26 02:10:39,996] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.46 | bwd: 5781.15 | bwd_inner: 5641.50 | bwd_allreduce: 139.60 | step: 19.03 18%|█▊ | 7542/41250 [18:13:05<81:10:44, 8.67s/it] {'loss': 0.094, 'grad_norm': 1.5204317569732666, 'learning_rate': 3.7599692341659674e-05, 'epoch': 1.83} 18%|█▊ | 7542/41250 [18:13:05<81:10:44, 8.67s/it][2025-04-26 02:10:48,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:10:48,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.91 | bwd_microstep: 5758.87 | bwd_inner_microstep: 5654.41 | bwd_allreduce_microstep: 104.42 | step_microstep: 18.50 [2025-04-26 02:10:48,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.91 | bwd: 5758.88 | bwd_inner: 5654.41 | bwd_allreduce: 104.44 | step: 18.50 18%|█▊ | 7543/41250 [18:13:13<81:10:43, 8.67s/it] {'loss': 0.1662, 'grad_norm': 1.3656655550003052, 'learning_rate': 3.759894637949948e-05, 'epoch': 1.83} 18%|█▊ | 7543/41250 [18:13:13<81:10:43, 8.67s/it][2025-04-26 02:10:57,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:10:57,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.13 | 
bwd_microstep: 5730.65 | bwd_inner_microstep: 5695.95 | bwd_allreduce_microstep: 34.64 | step_microstep: 19.17 [2025-04-26 02:10:57,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.13 | bwd: 5730.66 | bwd_inner: 5695.95 | bwd_allreduce: 34.66 | step: 19.18 18%|█▊ | 7544/41250 [18:13:22<81:09:49, 8.67s/it] {'loss': 0.0693, 'grad_norm': 1.5447204113006592, 'learning_rate': 3.759820030884524e-05, 'epoch': 1.83} 18%|█▊ | 7544/41250 [18:13:22<81:09:49, 8.67s/it][2025-04-26 02:11:05,994] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 02:11:05,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.06 | bwd_microstep: 5752.37 | bwd_inner_microstep: 5654.01 | bwd_allreduce_microstep: 98.31 | step_microstep: 18.62 [2025-04-26 02:11:05,995] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.06 | bwd: 5752.39 | bwd_inner: 5654.01 | bwd_allreduce: 98.33 | step: 18.63 18%|█▊ | 7545/41250 [18:13:31<81:09:10, 8.67s/it] {'loss': 0.0857, 'grad_norm': 2.573286771774292, 'learning_rate': 3.7597454129701555e-05, 'epoch': 1.83} 18%|█▊ | 7545/41250 [18:13:31<81:09:10, 8.67s/it][2025-04-26 02:11:14,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 02:11:14,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.44 | bwd_microstep: 5680.11 | bwd_inner_microstep: 5644.13 | bwd_allreduce_microstep: 35.94 | step_microstep: 18.50 [2025-04-26 02:11:14,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.44 | bwd: 5680.13 | bwd_inner: 5644.13 | bwd_allreduce: 35.96 | step: 18.50 18%|█▊ | 7546/41250 [18:13:39<80:57:20, 8.65s/it] {'loss': 0.1877, 'grad_norm': 1.8049148321151733, 'learning_rate': 3.759670784207303e-05, 'epoch': 1.83} 18%|█▊ | 7546/41250 [18:13:39<80:57:20, 8.65s/it][2025-04-26 02:11:23,222] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:11:23,222] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.69 | bwd_microstep: 5696.01 | bwd_inner_microstep: 5647.84 | bwd_allreduce_microstep: 48.13 | step_microstep: 18.56 [2025-04-26 02:11:23,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.69 | bwd: 5696.02 | bwd_inner: 5647.84 | bwd_allreduce: 48.14 | step: 18.57 18%|█▊ | 7547/41250 [18:13:48<80:53:49, 8.64s/it] {'loss': 0.0927, 'grad_norm': 1.6757495403289795, 'learning_rate': 3.759596144596426e-05, 'epoch': 1.83} 18%|█▊ | 7547/41250 [18:13:48<80:53:49, 8.64s/it][2025-04-26 02:11:31,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 02:11:31,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.77 | bwd_microstep: 5715.25 | bwd_inner_microstep: 5702.01 | bwd_allreduce_microstep: 13.19 | step_microstep: 18.82 [2025-04-26 02:11:31,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.77 | bwd: 5715.26 | bwd_inner: 5702.01 | bwd_allreduce: 13.21 | step: 18.82 18%|█▊ | 7548/41250 [18:13:57<80:56:02, 8.65s/it] {'loss': 0.0847, 'grad_norm': 0.9795426726341248, 'learning_rate': 3.7595214941379855e-05, 'epoch': 1.83} 18%|█▊ | 7548/41250 [18:13:57<80:56:02, 8.65s/it][2025-04-26 02:11:40,569] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:11:40,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.75 | bwd_microstep: 5742.98 | bwd_inner_microstep: 5709.78 | bwd_allreduce_microstep: 33.16 | step_microstep: 18.43 [2025-04-26 02:11:40,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2866.75 | bwd: 5742.99 | bwd_inner: 5709.78 | bwd_allreduce: 33.18 | step: 18.44 18%|█▊ 
| 7549/41250 [18:14:05<81:03:36, 8.66s/it] {'loss': 0.0676, 'grad_norm': 1.749991774559021, 'learning_rate': 3.7594468328324404e-05, 'epoch': 1.83} 18%|█▊ | 7549/41250 [18:14:05<81:03:36, 8.66s/it][2025-04-26 02:11:49,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:11:49,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.69 | bwd_microstep: 5755.86 | bwd_inner_microstep: 5650.09 | bwd_allreduce_microstep: 105.73 | step_microstep: 18.58 [2025-04-26 02:11:49,241] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.69 | bwd: 5755.87 | bwd_inner: 5650.09 | bwd_allreduce: 105.75 | step: 18.58 18%|█▊ | 7550/41250 [18:14:14<81:05:33, 8.66s/it] {'loss': 0.0341, 'grad_norm': 0.6774425506591797, 'learning_rate': 3.7593721606802514e-05, 'epoch': 1.83} 18%|█▊ | 7550/41250 [18:14:14<81:05:33, 8.66s/it][2025-04-26 02:11:57,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:11:57,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.03 | bwd_microstep: 5717.09 | bwd_inner_microstep: 5704.18 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.73 [2025-04-26 02:11:57,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.03 | bwd: 5717.10 | bwd_inner: 5704.18 | bwd_allreduce: 12.88 | step: 18.73 18%|█▊ | 7551/41250 [18:14:23<81:04:43, 8.66s/it] {'loss': 0.0834, 'grad_norm': 2.7581377029418945, 'learning_rate': 3.7592974776818794e-05, 'epoch': 1.83} 18%|█▊ | 7551/41250 [18:14:23<81:04:43, 8.66s/it][2025-04-26 02:12:06,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 02:12:06,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.47 | bwd_microstep: 5745.39 | bwd_inner_microstep: 
5690.83 | bwd_allreduce_microstep: 54.51 | step_microstep: 18.26 [2025-04-26 02:12:06,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.47 | bwd: 5745.40 | bwd_inner: 5690.83 | bwd_allreduce: 54.53 | step: 18.26 18%|█▊ | 7552/41250 [18:14:31<81:07:03, 8.67s/it] {'loss': 0.0812, 'grad_norm': 2.8921658992767334, 'learning_rate': 3.7592227838377846e-05, 'epoch': 1.83} 18%|█▊ | 7552/41250 [18:14:31<81:07:03, 8.67s/it][2025-04-26 02:12:15,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.15 [2025-04-26 02:12:15,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5702.30 | bwd_inner_microstep: 5689.48 | bwd_allreduce_microstep: 12.77 | step_microstep: 19.14 [2025-04-26 02:12:15,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5702.32 | bwd_inner: 5689.48 | bwd_allreduce: 12.79 | step: 19.14 18%|█▊ | 7553/41250 [18:14:40<81:03:16, 8.66s/it] {'loss': 0.114, 'grad_norm': 2.1499183177948, 'learning_rate': 3.7591480791484275e-05, 'epoch': 1.83} 18%|█▊ | 7553/41250 [18:14:40<81:03:16, 8.66s/it][2025-04-26 02:12:23,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:12:23,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.03 | bwd_microstep: 5736.14 | bwd_inner_microstep: 5692.95 | bwd_allreduce_microstep: 43.14 | step_microstep: 18.32 [2025-04-26 02:12:23,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.03 | bwd: 5736.15 | bwd_inner: 5692.95 | bwd_allreduce: 43.16 | step: 18.32 18%|█▊ | 7554/41250 [18:14:49<81:06:30, 8.67s/it] {'loss': 0.1946, 'grad_norm': 1.5708030462265015, 'learning_rate': 3.7590733636142685e-05, 'epoch': 1.83} 18%|█▊ | 7554/41250 [18:14:49<81:06:30, 8.67s/it][2025-04-26 02:12:32,576] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 02:12:32,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.83 | bwd_microstep: 5748.49 | bwd_inner_microstep: 5698.74 | bwd_allreduce_microstep: 49.71 | step_microstep: 19.04 [2025-04-26 02:12:32,577] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.83 | bwd: 5748.50 | bwd_inner: 5698.74 | bwd_allreduce: 49.73 | step: 19.04 18%|█▊ | 7555/41250 [18:14:57<81:08:40, 8.67s/it] {'loss': 0.1457, 'grad_norm': 1.55513596534729, 'learning_rate': 3.7589986372357675e-05, 'epoch': 1.83} 18%|█▊ | 7555/41250 [18:14:57<81:08:40, 8.67s/it][2025-04-26 02:12:41,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 1.08 [2025-04-26 02:12:41,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.85 | bwd_microstep: 5692.50 | bwd_inner_microstep: 5679.78 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.71 [2025-04-26 02:12:41,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.85 | bwd: 5692.52 | bwd_inner: 5679.77 | bwd_allreduce: 12.70 | step: 18.72 18%|█▊ | 7556/41250 [18:15:06<81:02:18, 8.66s/it] {'loss': 0.0264, 'grad_norm': 0.9886921644210815, 'learning_rate': 3.758923900013387e-05, 'epoch': 1.83} 18%|█▊ | 7556/41250 [18:15:06<81:02:18, 8.66s/it][2025-04-26 02:12:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:12:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.37 | bwd_microstep: 5693.91 | bwd_inner_microstep: 5681.03 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.68 [2025-04-26 02:12:49,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.37 | bwd: 5693.92 | bwd_inner: 5681.03 | bwd_allreduce: 12.85 | step: 18.68 18%|█▊ | 7557/41250 [18:15:15<80:57:12, 8.65s/it] 
{'loss': 0.2332, 'grad_norm': 3.1951100826263428, 'learning_rate': 3.758849151947585e-05, 'epoch': 1.83} 18%|█▊ | 7557/41250 [18:15:15<80:57:12, 8.65s/it][2025-04-26 02:12:58,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:12:58,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.92 | bwd_microstep: 5878.38 | bwd_inner_microstep: 5707.24 | bwd_allreduce_microstep: 171.10 | step_microstep: 18.57 [2025-04-26 02:12:58,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.92 | bwd: 5878.40 | bwd_inner: 5707.24 | bwd_allreduce: 171.12 | step: 18.58 18%|█▊ | 7558/41250 [18:15:23<81:24:09, 8.70s/it] {'loss': 0.2614, 'grad_norm': 4.771475791931152, 'learning_rate': 3.758774393038826e-05, 'epoch': 1.83} 18%|█▊ | 7558/41250 [18:15:23<81:24:09, 8.70s/it][2025-04-26 02:13:07,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.95 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:13:07,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.94 | bwd_microstep: 5703.24 | bwd_inner_microstep: 5690.47 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.02 [2025-04-26 02:13:07,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.94 | bwd: 5703.25 | bwd_inner: 5690.47 | bwd_allreduce: 12.74 | step: 18.03 18%|█▊ | 7559/41250 [18:15:32<81:13:44, 8.68s/it] {'loss': 0.1365, 'grad_norm': 2.290580987930298, 'learning_rate': 3.758699623287567e-05, 'epoch': 1.83} 18%|█▊ | 7559/41250 [18:15:32<81:13:44, 8.68s/it][2025-04-26 02:13:15,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-26 02:13:15,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.53 | bwd_microstep: 5701.51 | bwd_inner_microstep: 5648.57 | bwd_allreduce_microstep: 52.89 | 
step_microstep: 18.44 [2025-04-26 02:13:15,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.53 | bwd: 5701.52 | bwd_inner: 5648.57 | bwd_allreduce: 52.91 | step: 18.44 18%|█▊ | 7560/41250 [18:15:41<81:02:29, 8.66s/it] {'loss': 0.0399, 'grad_norm': 0.7614302039146423, 'learning_rate': 3.7586248426942716e-05, 'epoch': 1.83} 18%|█▊ | 7560/41250 [18:15:41<81:02:29, 8.66s/it][2025-04-26 02:13:24,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:13:24,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.57 | bwd_microstep: 5775.86 | bwd_inner_microstep: 5644.34 | bwd_allreduce_microstep: 131.47 | step_microstep: 18.44 [2025-04-26 02:13:24,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.57 | bwd: 5775.87 | bwd_inner: 5644.34 | bwd_allreduce: 131.49 | step: 18.44 18%|█▊ | 7561/41250 [18:15:49<81:08:22, 8.67s/it] {'loss': 0.0368, 'grad_norm': 0.5080866813659668, 'learning_rate': 3.758550051259399e-05, 'epoch': 1.83} 18%|█▊ | 7561/41250 [18:15:49<81:08:22, 8.67s/it][2025-04-26 02:13:33,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.06 [2025-04-26 02:13:33,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.65 | bwd_microstep: 5753.32 | bwd_inner_microstep: 5693.50 | bwd_allreduce_microstep: 59.78 | step_microstep: 18.58 [2025-04-26 02:13:33,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.65 | bwd: 5753.33 | bwd_inner: 5693.50 | bwd_allreduce: 59.79 | step: 18.59 18%|█▊ | 7562/41250 [18:15:58<81:11:26, 8.68s/it] {'loss': 0.1859, 'grad_norm': 3.7991206645965576, 'learning_rate': 3.758475248983413e-05, 'epoch': 1.83} 18%|█▊ | 7562/41250 [18:15:58<81:11:26, 8.68s/it][2025-04-26 02:13:41,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.88 | 
optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:13:41,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.97 | bwd_microstep: 5776.51 | bwd_inner_microstep: 5657.20 | bwd_allreduce_microstep: 119.26 | step_microstep: 18.51 [2025-04-26 02:13:41,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.97 | bwd: 5776.53 | bwd_inner: 5657.20 | bwd_allreduce: 119.28 | step: 18.52 18%|█▊ | 7563/41250 [18:16:07<81:13:17, 8.68s/it] {'loss': 0.2768, 'grad_norm': 2.052696466445923, 'learning_rate': 3.7584004358667715e-05, 'epoch': 1.83} 18%|█▊ | 7563/41250 [18:16:07<81:13:17, 8.68s/it][2025-04-26 02:13:50,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:13:50,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.05 | bwd_microstep: 5706.29 | bwd_inner_microstep: 5693.61 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.52 [2025-04-26 02:13:50,613] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.05 | bwd: 5706.31 | bwd_inner: 5693.61 | bwd_allreduce: 12.65 | step: 18.52 18%|█▊ | 7564/41250 [18:16:15<81:06:55, 8.67s/it] {'loss': 0.0528, 'grad_norm': 1.0667699575424194, 'learning_rate': 3.7583256119099374e-05, 'epoch': 1.83} 18%|█▊ | 7564/41250 [18:16:15<81:06:55, 8.67s/it][2025-04-26 02:13:59,398] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.30 | optimizer_step: 1.04 [2025-04-26 02:13:59,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.28 | bwd_microstep: 5868.83 | bwd_inner_microstep: 5641.16 | bwd_allreduce_microstep: 227.61 | step_microstep: 20.00 [2025-04-26 02:13:59,399] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.28 | bwd: 5868.85 | bwd_inner: 5641.16 | bwd_allreduce: 227.64 | step: 20.00 18%|█▊ | 7565/41250 [18:16:24<81:25:51, 8.70s/it] {'loss': 0.3019, 
'grad_norm': 2.719003677368164, 'learning_rate': 3.758250777113373e-05, 'epoch': 1.83} 18%|█▊ | 7565/41250 [18:16:24<81:25:51, 8.70s/it][2025-04-26 02:14:08,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:14:08,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.54 | bwd_microstep: 5770.65 | bwd_inner_microstep: 5640.22 | bwd_allreduce_microstep: 130.38 | step_microstep: 18.78 [2025-04-26 02:14:08,072] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.54 | bwd: 5770.67 | bwd_inner: 5640.22 | bwd_allreduce: 130.40 | step: 18.78 18%|█▊ | 7566/41250 [18:16:33<81:20:32, 8.69s/it] {'loss': 0.0386, 'grad_norm': 0.7239093780517578, 'learning_rate': 3.758175931477537e-05, 'epoch': 1.83} 18%|█▊ | 7566/41250 [18:16:33<81:20:32, 8.69s/it][2025-04-26 02:14:16,733] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 02:14:16,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.24 | bwd_microstep: 5729.51 | bwd_inner_microstep: 5707.95 | bwd_allreduce_microstep: 21.50 | step_microstep: 19.11 [2025-04-26 02:14:16,734] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.24 | bwd: 5729.52 | bwd_inner: 5707.95 | bwd_allreduce: 21.52 | step: 19.12 18%|█▊ | 7567/41250 [18:16:42<81:15:09, 8.68s/it] {'loss': 0.2428, 'grad_norm': 4.6000895500183105, 'learning_rate': 3.7581010750028924e-05, 'epoch': 1.83} 18%|█▊ | 7567/41250 [18:16:42<81:15:09, 8.68s/it][2025-04-26 02:14:25,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 02:14:25,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.04 | bwd_microstep: 5735.05 | bwd_inner_microstep: 5686.46 | bwd_allreduce_microstep: 48.54 | step_microstep: 19.26 
[2025-04-26 02:14:25,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.04 | bwd: 5735.07 | bwd_inner: 5686.46 | bwd_allreduce: 48.56 | step: 19.26 18%|█▊ | 7568/41250 [18:16:50<81:12:51, 8.68s/it] {'loss': 0.0977, 'grad_norm': 3.9959800243377686, 'learning_rate': 3.7580262076899004e-05, 'epoch': 1.83} 18%|█▊ | 7568/41250 [18:16:50<81:12:51, 8.68s/it][2025-04-26 02:14:34,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:14:34,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.26 | bwd_microstep: 5886.03 | bwd_inner_microstep: 5636.30 | bwd_allreduce_microstep: 249.68 | step_microstep: 18.95 [2025-04-26 02:14:34,207] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.26 | bwd: 5886.05 | bwd_inner: 5636.30 | bwd_allreduce: 249.70 | step: 18.95 18%|█▊ | 7569/41250 [18:16:59<81:33:07, 8.72s/it] {'loss': 0.0998, 'grad_norm': 2.1275668144226074, 'learning_rate': 3.7579513295390235e-05, 'epoch': 1.83} 18%|█▊ | 7569/41250 [18:16:59<81:33:07, 8.72s/it][2025-04-26 02:14:42,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:14:42,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.01 | bwd_microstep: 5688.47 | bwd_inner_microstep: 5675.70 | bwd_allreduce_microstep: 12.73 | step_microstep: 19.03 [2025-04-26 02:14:42,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.01 | bwd: 5688.48 | bwd_inner: 5675.70 | bwd_allreduce: 12.75 | step: 19.03 18%|█▊ | 7570/41250 [18:17:08<81:16:23, 8.69s/it] {'loss': 0.3138, 'grad_norm': 2.061302661895752, 'learning_rate': 3.757876440550721e-05, 'epoch': 1.84} 18%|█▊ | 7570/41250 [18:17:08<81:16:23, 8.69s/it][2025-04-26 02:14:51,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | 
optimizer_step: 0.90 [2025-04-26 02:14:51,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.37 | bwd_microstep: 5670.39 | bwd_inner_microstep: 5637.40 | bwd_allreduce_microstep: 32.95 | step_microstep: 18.89 [2025-04-26 02:14:51,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.37 | bwd: 5670.40 | bwd_inner: 5637.40 | bwd_allreduce: 32.97 | step: 18.90 18%|█▊ | 7571/41250 [18:17:16<80:58:05, 8.65s/it] {'loss': 0.1456, 'grad_norm': 1.5665549039840698, 'learning_rate': 3.757801540725457e-05, 'epoch': 1.84} 18%|█▊ | 7571/41250 [18:17:16<80:58:05, 8.65s/it][2025-04-26 02:15:00,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-26 02:15:00,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.16 | bwd_microstep: 5704.57 | bwd_inner_microstep: 5691.54 | bwd_allreduce_microstep: 12.98 | step_microstep: 18.81 [2025-04-26 02:15:00,040] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.16 | bwd: 5704.58 | bwd_inner: 5691.54 | bwd_allreduce: 13.00 | step: 18.81 18%|█▊ | 7572/41250 [18:17:25<80:55:10, 8.65s/it] {'loss': 0.1198, 'grad_norm': 1.8899986743927002, 'learning_rate': 3.757726630063692e-05, 'epoch': 1.84} 18%|█▊ | 7572/41250 [18:17:25<80:55:10, 8.65s/it][2025-04-26 02:15:08,700] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.14 | optimizer_step: 1.03 [2025-04-26 02:15:08,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.94 | bwd_microstep: 5746.17 | bwd_inner_microstep: 5652.27 | bwd_allreduce_microstep: 93.79 | step_microstep: 20.58 [2025-04-26 02:15:08,701] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.94 | bwd: 5746.21 | bwd_inner: 5652.27 | bwd_allreduce: 93.85 | step: 20.57 18%|█▊ | 7573/41250 [18:17:34<80:56:24, 8.65s/it] {'loss': 0.169, 'grad_norm': 1.8660835027694702, 
'learning_rate': 3.757651708565888e-05, 'epoch': 1.84} 18%|█▊ | 7573/41250 [18:17:34<80:56:24, 8.65s/it][2025-04-26 02:15:17,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 02:15:17,284] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.28 | bwd_microstep: 5675.13 | bwd_inner_microstep: 5651.19 | bwd_allreduce_microstep: 23.89 | step_microstep: 18.87 [2025-04-26 02:15:17,285] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.28 | bwd: 5675.14 | bwd_inner: 5651.19 | bwd_allreduce: 23.91 | step: 18.87 18%|█▊ | 7574/41250 [18:17:42<80:44:44, 8.63s/it] {'loss': 0.0398, 'grad_norm': 0.4117273986339569, 'learning_rate': 3.7575767762325075e-05, 'epoch': 1.84} 18%|█▊ | 7574/41250 [18:17:42<80:44:44, 8.63s/it][2025-04-26 02:15:25,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-26 02:15:25,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.05 | bwd_microstep: 5744.44 | bwd_inner_microstep: 5680.41 | bwd_allreduce_microstep: 63.98 | step_microstep: 18.54 [2025-04-26 02:15:25,960] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.05 | bwd: 5744.45 | bwd_inner: 5680.41 | bwd_allreduce: 64.00 | step: 18.54 18%|█▊ | 7575/41250 [18:17:51<80:51:52, 8.64s/it] {'loss': 0.0216, 'grad_norm': 0.23894910514354706, 'learning_rate': 3.757501833064011e-05, 'epoch': 1.84} 18%|█▊ | 7575/41250 [18:17:51<80:51:52, 8.64s/it][2025-04-26 02:15:34,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-26 02:15:34,600] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.65 | bwd_microstep: 5742.20 | bwd_inner_microstep: 5643.00 | bwd_allreduce_microstep: 99.15 | step_microstep: 18.70 [2025-04-26 02:15:34,601] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.65 | bwd: 5742.22 | bwd_inner: 5643.00 | bwd_allreduce: 99.17 | step: 18.70 18%|█▊ | 7576/41250 [18:17:59<80:50:58, 8.64s/it] {'loss': 0.4465, 'grad_norm': 3.761159658432007, 'learning_rate': 3.7574268790608616e-05, 'epoch': 1.84} 18%|█▊ | 7576/41250 [18:17:59<80:50:58, 8.64s/it][2025-04-26 02:15:43,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:15:43,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.66 | bwd_microstep: 5774.48 | bwd_inner_microstep: 5761.76 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.77 [2025-04-26 02:15:43,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.66 | bwd: 5774.49 | bwd_inner: 5761.76 | bwd_allreduce: 12.69 | step: 18.77 18%|█▊ | 7577/41250 [18:18:08<81:06:48, 8.67s/it] {'loss': 0.0689, 'grad_norm': 1.0331701040267944, 'learning_rate': 3.7573519142235216e-05, 'epoch': 1.84} 18%|█▊ | 7577/41250 [18:18:08<81:06:48, 8.67s/it][2025-04-26 02:15:51,991] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:15:51,992] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.57 | bwd_microstep: 5731.56 | bwd_inner_microstep: 5672.76 | bwd_allreduce_microstep: 58.76 | step_microstep: 19.11 [2025-04-26 02:15:51,992] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.57 | bwd: 5731.58 | bwd_inner: 5672.76 | bwd_allreduce: 58.77 | step: 19.11 18%|█▊ | 7578/41250 [18:18:17<81:03:39, 8.67s/it] {'loss': 0.3518, 'grad_norm': 3.225346088409424, 'learning_rate': 3.757276938552452e-05, 'epoch': 1.84} 18%|█▊ | 7578/41250 [18:18:17<81:03:39, 8.67s/it][2025-04-26 02:16:00,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 
02:16:00,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.93 | bwd_microstep: 5699.77 | bwd_inner_microstep: 5686.98 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.56 [2025-04-26 02:16:00,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.93 | bwd: 5699.78 | bwd_inner: 5686.98 | bwd_allreduce: 12.75 | step: 18.57 18%|█▊ | 7579/41250 [18:18:25<80:56:37, 8.65s/it] {'loss': 0.0522, 'grad_norm': 0.781453013420105, 'learning_rate': 3.7572019520481164e-05, 'epoch': 1.84} 18%|█▊ | 7579/41250 [18:18:25<80:56:37, 8.65s/it][2025-04-26 02:16:09,257] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.03 | optimizer_step: 1.20 [2025-04-26 02:16:09,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.24 | bwd_microstep: 5705.06 | bwd_inner_microstep: 5692.21 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.59 [2025-04-26 02:16:09,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.24 | bwd: 5705.07 | bwd_inner: 5692.21 | bwd_allreduce: 12.82 | step: 19.60 18%|█▊ | 7580/41250 [18:18:34<80:54:08, 8.65s/it] {'loss': 0.0771, 'grad_norm': 0.8485404253005981, 'learning_rate': 3.757126954710976e-05, 'epoch': 1.84} 18%|█▊ | 7580/41250 [18:18:34<80:54:08, 8.65s/it][2025-04-26 02:16:17,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 6.72 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:16:17,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.85 | bwd_microstep: 5763.86 | bwd_inner_microstep: 5648.45 | bwd_allreduce_microstep: 115.35 | step_microstep: 20.22 [2025-04-26 02:16:17,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.85 | bwd: 5763.87 | bwd_inner: 5648.45 | bwd_allreduce: 115.38 | step: 20.22 18%|█▊ | 7581/41250 [18:18:43<80:57:40, 8.66s/it] {'loss': 0.0461, 'grad_norm': 0.8674254417419434, 'learning_rate': 3.757051946541494e-05, 
'epoch': 1.84} 18%|█▊ | 7581/41250 [18:18:43<80:57:40, 8.66s/it][2025-04-26 02:16:26,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-26 02:16:26,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.39 | bwd_microstep: 5757.59 | bwd_inner_microstep: 5641.70 | bwd_allreduce_microstep: 115.84 | step_microstep: 18.99 [2025-04-26 02:16:26,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.39 | bwd: 5757.61 | bwd_inner: 5641.70 | bwd_allreduce: 115.87 | step: 18.99 18%|█▊ | 7582/41250 [18:18:51<80:58:02, 8.66s/it] {'loss': 0.0609, 'grad_norm': 1.9724520444869995, 'learning_rate': 3.756976927540132e-05, 'epoch': 1.84} 18%|█▊ | 7582/41250 [18:18:51<80:58:02, 8.66s/it][2025-04-26 02:16:35,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:16:35,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.38 | bwd_microstep: 5701.08 | bwd_inner_microstep: 5640.21 | bwd_allreduce_microstep: 60.82 | step_microstep: 18.77 [2025-04-26 02:16:35,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.38 | bwd: 5701.09 | bwd_inner: 5640.21 | bwd_allreduce: 60.84 | step: 18.78 18%|█▊ | 7583/41250 [18:19:00<80:51:34, 8.65s/it] {'loss': 0.2518, 'grad_norm': 7.268094062805176, 'learning_rate': 3.7569018977073526e-05, 'epoch': 1.84} 18%|█▊ | 7583/41250 [18:19:00<80:51:34, 8.65s/it][2025-04-26 02:16:43,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 02:16:43,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.40 | bwd_microstep: 5700.89 | bwd_inner_microstep: 5649.17 | bwd_allreduce_microstep: 51.67 | step_microstep: 19.17 [2025-04-26 02:16:43,819] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2825.40 | bwd: 5700.91 | bwd_inner: 5649.17 | bwd_allreduce: 51.69 | step: 19.17 18%|█▊ | 7584/41250 [18:19:09<80:45:21, 8.64s/it] {'loss': 0.1126, 'grad_norm': 1.9192017316818237, 'learning_rate': 3.756826857043619e-05, 'epoch': 1.84} 18%|█▊ | 7584/41250 [18:19:09<80:45:21, 8.64s/it][2025-04-26 02:16:52,627] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.16 | optimizer_step: 0.92 [2025-04-26 02:16:52,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.38 | bwd_microstep: 5875.36 | bwd_inner_microstep: 5701.97 | bwd_allreduce_microstep: 173.34 | step_microstep: 18.97 [2025-04-26 02:16:52,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.38 | bwd: 5875.38 | bwd_inner: 5701.97 | bwd_allreduce: 173.36 | step: 18.97 18%|█▊ | 7585/41250 [18:19:17<81:14:13, 8.69s/it] {'loss': 0.2392, 'grad_norm': 2.6329243183135986, 'learning_rate': 3.756751805549393e-05, 'epoch': 1.84} 18%|█▊ | 7585/41250 [18:19:17<81:14:13, 8.69s/it][2025-04-26 02:17:01,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:17:01,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.11 | bwd_microstep: 5775.85 | bwd_inner_microstep: 5651.34 | bwd_allreduce_microstep: 124.47 | step_microstep: 18.69 [2025-04-26 02:17:01,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.11 | bwd: 5775.87 | bwd_inner: 5651.34 | bwd_allreduce: 124.49 | step: 18.69 18%|█▊ | 7586/41250 [18:19:26<81:13:51, 8.69s/it] {'loss': 0.0769, 'grad_norm': 1.593944787979126, 'learning_rate': 3.756676743225137e-05, 'epoch': 1.84} 18%|█▊ | 7586/41250 [18:19:26<81:13:51, 8.69s/it][2025-04-26 02:17:09,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.97 | optimizer_step: 1.07 [2025-04-26 02:17:09,987] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.09 | bwd_microstep: 5747.92 | bwd_inner_microstep: 5692.26 | bwd_allreduce_microstep: 55.62 | step_microstep: 18.55 [2025-04-26 02:17:09,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.09 | bwd: 5747.93 | bwd_inner: 5692.26 | bwd_allreduce: 55.64 | step: 18.55 18%|█▊ | 7587/41250 [18:19:35<81:11:25, 8.68s/it] {'loss': 0.1591, 'grad_norm': 2.7274296283721924, 'learning_rate': 3.756601670071316e-05, 'epoch': 1.84} 18%|█▊ | 7587/41250 [18:19:35<81:11:25, 8.68s/it][2025-04-26 02:17:18,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 02:17:18,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.15 | bwd_microstep: 5759.69 | bwd_inner_microstep: 5644.09 | bwd_allreduce_microstep: 115.55 | step_microstep: 18.73 [2025-04-26 02:17:18,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.15 | bwd: 5759.70 | bwd_inner: 5644.09 | bwd_allreduce: 115.57 | step: 18.74 18%|█▊ | 7588/41250 [18:19:43<81:08:01, 8.68s/it] {'loss': 0.3605, 'grad_norm': 1.8611782789230347, 'learning_rate': 3.75652658608839e-05, 'epoch': 1.84} 18%|█▊ | 7588/41250 [18:19:43<81:08:01, 8.68s/it][2025-04-26 02:17:27,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-26 02:17:27,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.96 | bwd_microstep: 5695.92 | bwd_inner_microstep: 5683.14 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.50 [2025-04-26 02:17:27,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.96 | bwd: 5695.93 | bwd_inner: 5683.14 | bwd_allreduce: 12.74 | step: 18.50 18%|█▊ | 7589/41250 [18:19:52<80:59:00, 8.66s/it] {'loss': 0.0514, 'grad_norm': 2.0539681911468506, 'learning_rate': 3.7564514912768234e-05, 'epoch': 1.84} 18%|█▊ 
| 7589/41250 [18:19:52<80:59:00, 8.66s/it][2025-04-26 02:17:35,968] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:17:35,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.48 | bwd_microstep: 5793.42 | bwd_inner_microstep: 5636.23 | bwd_allreduce_microstep: 157.15 | step_microstep: 18.82 [2025-04-26 02:17:35,969] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.48 | bwd: 5793.43 | bwd_inner: 5636.23 | bwd_allreduce: 157.16 | step: 18.83 18%|█▊ | 7590/41250 [18:20:01<81:04:32, 8.67s/it] {'loss': 0.1861, 'grad_norm': 2.259657382965088, 'learning_rate': 3.7563763856370794e-05, 'epoch': 1.84} 18%|█▊ | 7590/41250 [18:20:01<81:04:32, 8.67s/it][2025-04-26 02:17:44,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:17:44,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.37 | bwd_microstep: 5716.42 | bwd_inner_microstep: 5647.02 | bwd_allreduce_microstep: 69.35 | step_microstep: 18.81 [2025-04-26 02:17:44,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.37 | bwd: 5716.43 | bwd_inner: 5647.02 | bwd_allreduce: 69.37 | step: 18.81 18%|█▊ | 7591/41250 [18:20:09<80:56:11, 8.66s/it] {'loss': 0.2403, 'grad_norm': 3.9679040908813477, 'learning_rate': 3.75630126916962e-05, 'epoch': 1.84} 18%|█▊ | 7591/41250 [18:20:09<80:56:11, 8.66s/it][2025-04-26 02:17:53,286] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.35 | optimizer_step: 1.04 [2025-04-26 02:17:53,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.39 | bwd_microstep: 5768.00 | bwd_inner_microstep: 5680.48 | bwd_allreduce_microstep: 87.46 | step_microstep: 20.23 [2025-04-26 02:17:53,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.38 | 
bwd: 5768.02 | bwd_inner: 5680.48 | bwd_allreduce: 87.49 | step: 20.23 18%|█▊ | 7592/41250 [18:20:18<81:02:39, 8.67s/it] {'loss': 0.2724, 'grad_norm': 2.867647886276245, 'learning_rate': 3.7562261418749085e-05, 'epoch': 1.84} 18%|█▊ | 7592/41250 [18:20:18<81:02:39, 8.67s/it][2025-04-26 02:18:01,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:18:01,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.44 | bwd_microstep: 5750.42 | bwd_inner_microstep: 5699.56 | bwd_allreduce_microstep: 50.82 | step_microstep: 18.51 [2025-04-26 02:18:01,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.44 | bwd: 5750.44 | bwd_inner: 5699.56 | bwd_allreduce: 50.83 | step: 18.52 18%|█▊ | 7593/41250 [18:20:27<81:05:43, 8.67s/it] {'loss': 0.191, 'grad_norm': 3.083501100540161, 'learning_rate': 3.756151003753409e-05, 'epoch': 1.84} 18%|█▊ | 7593/41250 [18:20:27<81:05:43, 8.67s/it][2025-04-26 02:18:10,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:18:10,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.44 | bwd_microstep: 5765.53 | bwd_inner_microstep: 5712.92 | bwd_allreduce_microstep: 52.57 | step_microstep: 18.58 [2025-04-26 02:18:10,675] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.44 | bwd: 5765.55 | bwd_inner: 5712.92 | bwd_allreduce: 52.59 | step: 18.58 18%|█▊ | 7594/41250 [18:20:35<81:10:00, 8.68s/it] {'loss': 0.1617, 'grad_norm': 2.97817325592041, 'learning_rate': 3.756075854805583e-05, 'epoch': 1.84} 18%|█▊ | 7594/41250 [18:20:36<81:10:00, 8.68s/it][2025-04-26 02:18:19,292] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:18:19,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2824.97 | bwd_microstep: 5709.40 | bwd_inner_microstep: 5655.10 | bwd_allreduce_microstep: 54.26 | step_microstep: 18.45 [2025-04-26 02:18:19,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.97 | bwd: 5709.41 | bwd_inner: 5655.10 | bwd_allreduce: 54.27 | step: 18.45 18%|█▊ | 7595/41250 [18:20:44<80:58:59, 8.66s/it] {'loss': 0.0608, 'grad_norm': 1.6126221418380737, 'learning_rate': 3.756000695031895e-05, 'epoch': 1.84} 18%|█▊ | 7595/41250 [18:20:44<80:58:59, 8.66s/it][2025-04-26 02:18:27,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.02 | optimizer_step: 1.07 [2025-04-26 02:18:27,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.94 | bwd_microstep: 5790.06 | bwd_inner_microstep: 5666.27 | bwd_allreduce_microstep: 123.74 | step_microstep: 18.92 [2025-04-26 02:18:28,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.94 | bwd: 5790.08 | bwd_inner: 5666.27 | bwd_allreduce: 123.77 | step: 18.93 18%|█▊ | 7596/41250 [18:20:53<81:06:13, 8.68s/it] {'loss': 0.2918, 'grad_norm': 2.6556835174560547, 'learning_rate': 3.755925524432808e-05, 'epoch': 1.84} 18%|█▊ | 7596/41250 [18:20:53<81:06:13, 8.68s/it][2025-04-26 02:18:36,702] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 02:18:36,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.55 | bwd_microstep: 5791.77 | bwd_inner_microstep: 5661.69 | bwd_allreduce_microstep: 130.03 | step_microstep: 18.57 [2025-04-26 02:18:36,703] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.55 | bwd: 5791.78 | bwd_inner: 5661.69 | bwd_allreduce: 130.05 | step: 18.57 18%|█▊ | 7597/41250 [18:21:02<81:10:48, 8.68s/it] {'loss': 0.4433, 'grad_norm': 3.6208910942077637, 'learning_rate': 3.755850343008786e-05, 'epoch': 1.84} 18%|█▊ | 7597/41250 [18:21:02<81:10:48, 
8.68s/it][2025-04-26 02:18:45,351] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 02:18:45,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.87 | bwd_microstep: 5713.54 | bwd_inner_microstep: 5700.80 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.83 [2025-04-26 02:18:45,352] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.87 | bwd: 5713.55 | bwd_inner: 5700.80 | bwd_allreduce: 12.71 | step: 18.83 18%|█▊ | 7598/41250 [18:21:10<81:04:42, 8.67s/it] {'loss': 0.2061, 'grad_norm': 2.2734203338623047, 'learning_rate': 3.755775150760292e-05, 'epoch': 1.84} 18%|█▊ | 7598/41250 [18:21:10<81:04:42, 8.67s/it][2025-04-26 02:18:54,055] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-26 02:18:54,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.23 | bwd_microstep: 5776.85 | bwd_inner_microstep: 5694.51 | bwd_allreduce_microstep: 82.29 | step_microstep: 19.02 [2025-04-26 02:18:54,056] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.23 | bwd: 5776.86 | bwd_inner: 5694.51 | bwd_allreduce: 82.31 | step: 19.03 18%|█▊ | 7599/41250 [18:21:19<81:09:44, 8.68s/it] {'loss': 0.2081, 'grad_norm': 2.4323604106903076, 'learning_rate': 3.755699947687789e-05, 'epoch': 1.84} 18%|█▊ | 7599/41250 [18:21:19<81:09:44, 8.68s/it][2025-04-26 02:19:02,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:19:02,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.58 | bwd_microstep: 5789.33 | bwd_inner_microstep: 5776.44 | bwd_allreduce_microstep: 12.84 | step_microstep: 19.13 [2025-04-26 02:19:02,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.58 | bwd: 5789.35 | bwd_inner: 5776.44 | 
bwd_allreduce: 12.86 | step: 19.13 18%|█▊ | 7600/41250 [18:21:28<81:22:31, 8.71s/it] {'loss': 0.2407, 'grad_norm': 1.9750858545303345, 'learning_rate': 3.755624733791742e-05, 'epoch': 1.84} 18%|█▊ | 7600/41250 [18:21:28<81:22:31, 8.71s/it][2025-04-26 02:19:11,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 1.05 [2025-04-26 02:19:11,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.53 | bwd_microstep: 5727.37 | bwd_inner_microstep: 5697.16 | bwd_allreduce_microstep: 30.16 | step_microstep: 19.25 [2025-04-26 02:19:11,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.53 | bwd: 5727.38 | bwd_inner: 5697.16 | bwd_allreduce: 30.18 | step: 19.25 18%|█▊ | 7601/41250 [18:21:36<81:16:21, 8.70s/it] {'loss': 0.1515, 'grad_norm': 4.214780807495117, 'learning_rate': 3.755549509072613e-05, 'epoch': 1.84} 18%|█▊ | 7601/41250 [18:21:36<81:16:21, 8.70s/it][2025-04-26 02:19:20,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:19:20,160] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.00 | bwd_microstep: 5760.74 | bwd_inner_microstep: 5662.49 | bwd_allreduce_microstep: 98.20 | step_microstep: 19.01 [2025-04-26 02:19:20,161] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.00 | bwd: 5760.76 | bwd_inner: 5662.49 | bwd_allreduce: 98.22 | step: 19.01 18%|█▊ | 7602/41250 [18:21:45<81:12:48, 8.69s/it] {'loss': 0.1106, 'grad_norm': 2.0817456245422363, 'learning_rate': 3.7554742735308674e-05, 'epoch': 1.84} 18%|█▊ | 7602/41250 [18:21:45<81:12:48, 8.69s/it][2025-04-26 02:19:28,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:19:28,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.86 | 
bwd_microstep: 5718.57 | bwd_inner_microstep: 5705.69 | bwd_allreduce_microstep: 12.84 | step_microstep: 18.54 [2025-04-26 02:19:28,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.86 | bwd: 5718.59 | bwd_inner: 5705.69 | bwd_allreduce: 12.86 | step: 18.54 18%|█▊ | 7603/41250 [18:21:54<81:06:23, 8.68s/it] {'loss': 0.1154, 'grad_norm': 1.665610671043396, 'learning_rate': 3.7553990271669676e-05, 'epoch': 1.84} 18%|█▊ | 7603/41250 [18:21:54<81:06:23, 8.68s/it][2025-04-26 02:19:37,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:19:37,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.48 | bwd_microstep: 5763.19 | bwd_inner_microstep: 5691.42 | bwd_allreduce_microstep: 71.73 | step_microstep: 18.69 [2025-04-26 02:19:37,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.48 | bwd: 5763.21 | bwd_inner: 5691.42 | bwd_allreduce: 71.75 | step: 18.69 18%|█▊ | 7604/41250 [18:22:02<81:09:53, 8.68s/it] {'loss': 0.185, 'grad_norm': 6.374874591827393, 'learning_rate': 3.7553237699813775e-05, 'epoch': 1.84} 18%|█▊ | 7604/41250 [18:22:02<81:09:53, 8.68s/it][2025-04-26 02:19:46,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:19:46,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.65 | bwd_microstep: 5780.20 | bwd_inner_microstep: 5641.18 | bwd_allreduce_microstep: 138.96 | step_microstep: 18.80 [2025-04-26 02:19:46,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.65 | bwd: 5780.21 | bwd_inner: 5641.18 | bwd_allreduce: 138.98 | step: 18.80 18%|█▊ | 7605/41250 [18:22:11<81:10:18, 8.69s/it] {'loss': 0.0443, 'grad_norm': 1.8573896884918213, 'learning_rate': 3.755248501974563e-05, 'epoch': 1.84} 18%|█▊ | 7605/41250 [18:22:11<81:10:18, 8.69s/it][2025-04-26 02:19:54,810] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.03 | optimizer_step: 1.17 [2025-04-26 02:19:54,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.06 | bwd_microstep: 5697.98 | bwd_inner_microstep: 5651.57 | bwd_allreduce_microstep: 46.36 | step_microstep: 19.57 [2025-04-26 02:19:54,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.06 | bwd: 5697.99 | bwd_inner: 5651.57 | bwd_allreduce: 46.38 | step: 19.58 18%|█▊ | 7606/41250 [18:22:20<80:57:41, 8.66s/it] {'loss': 0.1435, 'grad_norm': 1.563520073890686, 'learning_rate': 3.7551732231469855e-05, 'epoch': 1.84} 18%|█▊ | 7606/41250 [18:22:20<80:57:41, 8.66s/it][2025-04-26 02:20:03,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:20:03,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.66 | bwd_microstep: 5779.74 | bwd_inner_microstep: 5652.58 | bwd_allreduce_microstep: 127.12 | step_microstep: 18.24 [2025-04-26 02:20:03,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.66 | bwd: 5779.75 | bwd_inner: 5652.58 | bwd_allreduce: 127.13 | step: 18.25 18%|█▊ | 7607/41250 [18:22:28<81:01:58, 8.67s/it] {'loss': 0.287, 'grad_norm': 2.5027899742126465, 'learning_rate': 3.755097933499111e-05, 'epoch': 1.84} 18%|█▊ | 7607/41250 [18:22:28<81:01:58, 8.67s/it][2025-04-26 02:20:12,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.26 | optimizer_step: 0.90 [2025-04-26 02:20:12,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.34 | bwd_microstep: 5786.09 | bwd_inner_microstep: 5641.49 | bwd_allreduce_microstep: 144.54 | step_microstep: 19.39 [2025-04-26 02:20:12,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.34 | bwd: 5786.10 | bwd_inner: 5641.49 | bwd_allreduce: 144.56 | step: 
19.39 18%|█▊ | 7608/41250 [18:22:37<81:05:05, 8.68s/it] {'loss': 0.0925, 'grad_norm': 1.2791459560394287, 'learning_rate': 3.7550226330314026e-05, 'epoch': 1.84} 18%|█▊ | 7608/41250 [18:22:37<81:05:05, 8.68s/it][2025-04-26 02:20:20,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:20:20,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.17 | bwd_microstep: 5708.05 | bwd_inner_microstep: 5695.31 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.69 [2025-04-26 02:20:20,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.17 | bwd: 5708.06 | bwd_inner: 5695.31 | bwd_allreduce: 12.71 | step: 18.70 18%|█▊ | 7609/41250 [18:22:46<81:01:18, 8.67s/it] {'loss': 0.1007, 'grad_norm': 2.0111193656921387, 'learning_rate': 3.754947321744325e-05, 'epoch': 1.84} 18%|█▊ | 7609/41250 [18:22:46<81:01:18, 8.67s/it][2025-04-26 02:20:29,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:20:29,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.17 | bwd_microstep: 5736.88 | bwd_inner_microstep: 5649.03 | bwd_allreduce_microstep: 87.80 | step_microstep: 18.77 [2025-04-26 02:20:29,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.17 | bwd: 5736.89 | bwd_inner: 5649.03 | bwd_allreduce: 87.82 | step: 18.78 18%|█▊ | 7610/41250 [18:22:54<80:59:18, 8.67s/it] {'loss': 0.122, 'grad_norm': 1.58474600315094, 'learning_rate': 3.754871999638342e-05, 'epoch': 1.84} 18%|█▊ | 7610/41250 [18:22:54<80:59:18, 8.67s/it][2025-04-26 02:20:38,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 02:20:38,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.91 | bwd_microstep: 5693.06 | 
bwd_inner_microstep: 5647.37 | bwd_allreduce_microstep: 45.64 | step_microstep: 18.47 [2025-04-26 02:20:38,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.92 | bwd: 5693.07 | bwd_inner: 5647.37 | bwd_allreduce: 45.66 | step: 18.48 18%|█▊ | 7611/41250 [18:23:03<80:48:39, 8.65s/it] {'loss': 0.0842, 'grad_norm': 1.2612029314041138, 'learning_rate': 3.754796666713919e-05, 'epoch': 1.85} 18%|█▊ | 7611/41250 [18:23:03<80:48:39, 8.65s/it][2025-04-26 02:20:46,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.00 | optimizer_step: 1.11 [2025-04-26 02:20:46,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.73 | bwd_microstep: 5742.47 | bwd_inner_microstep: 5638.21 | bwd_allreduce_microstep: 104.22 | step_microstep: 18.82 [2025-04-26 02:20:46,772] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.73 | bwd: 5742.49 | bwd_inner: 5638.21 | bwd_allreduce: 104.24 | step: 18.82 18%|█▊ | 7612/41250 [18:23:12<80:51:07, 8.65s/it] {'loss': 0.2759, 'grad_norm': 1.4539010524749756, 'learning_rate': 3.7547213229715194e-05, 'epoch': 1.85} 18%|█▊ | 7612/41250 [18:23:12<80:51:07, 8.65s/it][2025-04-26 02:20:55,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:20:55,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.02 | bwd_microstep: 5762.22 | bwd_inner_microstep: 5641.00 | bwd_allreduce_microstep: 121.18 | step_microstep: 18.62 [2025-04-26 02:20:55,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.02 | bwd: 5762.23 | bwd_inner: 5641.00 | bwd_allreduce: 121.19 | step: 18.62 18%|█▊ | 7613/41250 [18:23:20<80:55:39, 8.66s/it] {'loss': 0.1239, 'grad_norm': 1.8454614877700806, 'learning_rate': 3.754645968411608e-05, 'epoch': 1.85} 18%|█▊ | 7613/41250 [18:23:20<80:55:39, 8.66s/it][2025-04-26 02:21:04,103] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-26 02:21:04,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.74 | bwd_microstep: 5736.98 | bwd_inner_microstep: 5648.00 | bwd_allreduce_microstep: 88.92 | step_microstep: 18.75 [2025-04-26 02:21:04,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.74 | bwd: 5736.99 | bwd_inner: 5648.00 | bwd_allreduce: 88.94 | step: 18.75 18%|█▊ | 7614/41250 [18:23:29<80:53:49, 8.66s/it] {'loss': 0.1087, 'grad_norm': 1.0840935707092285, 'learning_rate': 3.754570603034649e-05, 'epoch': 1.85} 18%|█▊ | 7614/41250 [18:23:29<80:53:49, 8.66s/it][2025-04-26 02:21:12,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:21:12,776] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.59 | bwd_microstep: 5736.24 | bwd_inner_microstep: 5693.29 | bwd_allreduce_microstep: 42.91 | step_microstep: 18.77 [2025-04-26 02:21:12,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.59 | bwd: 5736.25 | bwd_inner: 5693.29 | bwd_allreduce: 42.92 | step: 18.77 18%|█▊ | 7615/41250 [18:23:38<80:56:02, 8.66s/it] {'loss': 0.0745, 'grad_norm': 1.1489921808242798, 'learning_rate': 3.7544952268411076e-05, 'epoch': 1.85} 18%|█▊ | 7615/41250 [18:23:38<80:56:02, 8.66s/it][2025-04-26 02:21:21,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:21:21,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.26 | bwd_microstep: 5780.23 | bwd_inner_microstep: 5650.21 | bwd_allreduce_microstep: 129.97 | step_microstep: 18.51 [2025-04-26 02:21:21,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.26 | bwd: 5780.24 | bwd_inner: 5650.21 | bwd_allreduce: 129.99 | step: 18.51 
18%|█▊ | 7616/41250 [18:23:46<81:01:07, 8.67s/it] {'loss': 0.3268, 'grad_norm': 3.1761162281036377, 'learning_rate': 3.754419839831448e-05, 'epoch': 1.85} 18%|█▊ | 7616/41250 [18:23:46<81:01:07, 8.67s/it][2025-04-26 02:21:30,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:21:30,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.57 | bwd_microstep: 5688.64 | bwd_inner_microstep: 5675.73 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.62 [2025-04-26 02:21:30,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.57 | bwd: 5688.65 | bwd_inner: 5675.73 | bwd_allreduce: 12.88 | step: 18.62 18%|█▊ | 7617/41250 [18:23:55<80:52:15, 8.66s/it] {'loss': 0.2272, 'grad_norm': 1.3058613538742065, 'learning_rate': 3.7543444420061356e-05, 'epoch': 1.85} 18%|█▊ | 7617/41250 [18:23:55<80:52:15, 8.66s/it][2025-04-26 02:21:38,842] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:21:38,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.97 | bwd_microstep: 5784.33 | bwd_inner_microstep: 5771.56 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.74 [2025-04-26 02:21:38,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.97 | bwd: 5784.35 | bwd_inner: 5771.56 | bwd_allreduce: 12.74 | step: 18.74 18%|█▊ | 7618/41250 [18:24:04<81:08:12, 8.68s/it] {'loss': 0.1255, 'grad_norm': 2.39825439453125, 'learning_rate': 3.754269033365634e-05, 'epoch': 1.85} 18%|█▊ | 7618/41250 [18:24:04<81:08:12, 8.68s/it][2025-04-26 02:21:47,453] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:21:47,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.83 | bwd_microstep: 5689.37 | bwd_inner_microstep: 
5676.66 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.55 [2025-04-26 02:21:47,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.83 | bwd: 5689.38 | bwd_inner: 5676.66 | bwd_allreduce: 12.68 | step: 18.56 18%|█▊ | 7619/41250 [18:24:12<80:55:55, 8.66s/it] {'loss': 0.1318, 'grad_norm': 1.5689363479614258, 'learning_rate': 3.75419361391041e-05, 'epoch': 1.85} 18%|█▊ | 7619/41250 [18:24:12<80:55:55, 8.66s/it][2025-04-26 02:21:56,118] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-26 02:21:56,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.84 | bwd_microstep: 5734.50 | bwd_inner_microstep: 5694.57 | bwd_allreduce_microstep: 39.89 | step_microstep: 19.00 [2025-04-26 02:21:56,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.84 | bwd: 5734.51 | bwd_inner: 5694.57 | bwd_allreduce: 39.91 | step: 19.00 18%|█▊ | 7620/41250 [18:24:21<80:55:39, 8.66s/it] {'loss': 0.2556, 'grad_norm': 2.171955108642578, 'learning_rate': 3.7541181836409264e-05, 'epoch': 1.85} 18%|█▊ | 7620/41250 [18:24:21<80:55:39, 8.66s/it][2025-04-26 02:22:04,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.21 | optimizer_step: 0.90 [2025-04-26 02:22:04,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.31 | bwd_microstep: 5737.07 | bwd_inner_microstep: 5679.61 | bwd_allreduce_microstep: 57.41 | step_microstep: 19.13 [2025-04-26 02:22:04,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.31 | bwd: 5737.08 | bwd_inner: 5679.61 | bwd_allreduce: 57.43 | step: 19.13 18%|█▊ | 7621/41250 [18:24:30<80:55:34, 8.66s/it] {'loss': 0.1978, 'grad_norm': 4.497612476348877, 'learning_rate': 3.7540427425576496e-05, 'epoch': 1.85} 18%|█▊ | 7621/41250 [18:24:30<80:55:34, 8.66s/it][2025-04-26 02:22:13,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-26 02:22:13,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.07 | bwd_microstep: 5679.95 | bwd_inner_microstep: 5647.70 | bwd_allreduce_microstep: 32.20 | step_microstep: 18.95 [2025-04-26 02:22:13,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.07 | bwd: 5679.96 | bwd_inner: 5647.70 | bwd_allreduce: 32.22 | step: 18.95 18%|█▊ | 7622/41250 [18:24:38<80:44:21, 8.64s/it] {'loss': 0.1214, 'grad_norm': 1.4717488288879395, 'learning_rate': 3.7539672906610445e-05, 'epoch': 1.85} 18%|█▊ | 7622/41250 [18:24:38<80:44:21, 8.64s/it][2025-04-26 02:22:21,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.99 | optimizer_step: 1.01 [2025-04-26 02:22:21,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.53 | bwd_microstep: 5671.91 | bwd_inner_microstep: 5659.22 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.86 [2025-04-26 02:22:21,975] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.53 | bwd: 5671.93 | bwd_inner: 5659.22 | bwd_allreduce: 12.66 | step: 18.87 18%|█▊ | 7623/41250 [18:24:47<80:36:08, 8.63s/it] {'loss': 0.1352, 'grad_norm': 2.7683913707733154, 'learning_rate': 3.753891827951575e-05, 'epoch': 1.85} 18%|█▊ | 7623/41250 [18:24:47<80:36:08, 8.63s/it][2025-04-26 02:22:30,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:22:30,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.30 | bwd_microstep: 5708.26 | bwd_inner_microstep: 5641.01 | bwd_allreduce_microstep: 67.20 | step_microstep: 18.98 [2025-04-26 02:22:30,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.30 | bwd: 5708.28 | bwd_inner: 5641.01 | bwd_allreduce: 67.22 | step: 18.98 18%|█▊ | 7624/41250 [18:24:55<80:33:08, 8.62s/it] 
{'loss': 0.2531, 'grad_norm': 2.100295305252075, 'learning_rate': 3.753816354429709e-05, 'epoch': 1.85} 18%|█▊ | 7624/41250 [18:24:55<80:33:08, 8.62s/it][2025-04-26 02:22:39,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:22:39,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.41 | bwd_microstep: 5726.01 | bwd_inner_microstep: 5705.25 | bwd_allreduce_microstep: 20.71 | step_microstep: 18.57 [2025-04-26 02:22:39,251] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.41 | bwd: 5726.02 | bwd_inner: 5705.25 | bwd_allreduce: 20.73 | step: 18.57 18%|█▊ | 7625/41250 [18:25:04<80:39:48, 8.64s/it] {'loss': 0.0833, 'grad_norm': 0.753722071647644, 'learning_rate': 3.753740870095909e-05, 'epoch': 1.85} 18%|█▊ | 7625/41250 [18:25:04<80:39:48, 8.64s/it][2025-04-26 02:22:47,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:22:47,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.24 | bwd_microstep: 5725.85 | bwd_inner_microstep: 5701.14 | bwd_allreduce_microstep: 24.65 | step_microstep: 18.99 [2025-04-26 02:22:47,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.24 | bwd: 5725.86 | bwd_inner: 5701.14 | bwd_allreduce: 24.67 | step: 19.00 18%|█▊ | 7626/41250 [18:25:13<80:43:33, 8.64s/it] {'loss': 0.1402, 'grad_norm': 0.950484037399292, 'learning_rate': 3.7536653749506415e-05, 'epoch': 1.85} 18%|█▊ | 7626/41250 [18:25:13<80:43:33, 8.64s/it][2025-04-26 02:22:56,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 02:22:56,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.29 | bwd_microstep: 5781.72 | bwd_inner_microstep: 5636.45 | bwd_allreduce_microstep: 145.23 | 
step_microstep: 18.93 [2025-04-26 02:22:56,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.29 | bwd: 5781.73 | bwd_inner: 5636.45 | bwd_allreduce: 145.24 | step: 18.93 18%|█▊ | 7627/41250 [18:25:21<80:50:25, 8.66s/it] {'loss': 0.1869, 'grad_norm': 3.1996192932128906, 'learning_rate': 3.753589868994372e-05, 'epoch': 1.85} 18%|█▊ | 7627/41250 [18:25:21<80:50:25, 8.66s/it][2025-04-26 02:23:05,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-26 02:23:05,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.96 | bwd_microstep: 5699.26 | bwd_inner_microstep: 5658.97 | bwd_allreduce_microstep: 40.24 | step_microstep: 19.16 [2025-04-26 02:23:05,205] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.96 | bwd: 5699.27 | bwd_inner: 5658.97 | bwd_allreduce: 40.26 | step: 19.16 18%|█▊ | 7628/41250 [18:25:30<80:42:34, 8.64s/it] {'loss': 0.058, 'grad_norm': 0.9418116211891174, 'learning_rate': 3.753514352227567e-05, 'epoch': 1.85} 18%|█▊ | 7628/41250 [18:25:30<80:42:34, 8.64s/it][2025-04-26 02:23:13,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:23:13,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.14 | bwd_microstep: 5764.06 | bwd_inner_microstep: 5639.62 | bwd_allreduce_microstep: 124.39 | step_microstep: 18.62 [2025-04-26 02:23:13,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.14 | bwd: 5764.07 | bwd_inner: 5639.62 | bwd_allreduce: 124.41 | step: 18.62 18%|█▊ | 7629/41250 [18:25:39<80:47:18, 8.65s/it] {'loss': 0.1719, 'grad_norm': 1.4452967643737793, 'learning_rate': 3.7534388246506894e-05, 'epoch': 1.85} 18%|█▊ | 7629/41250 [18:25:39<80:47:18, 8.65s/it][2025-04-26 02:23:22,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | 
optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:23:22,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.63 | bwd_microstep: 5715.91 | bwd_inner_microstep: 5646.50 | bwd_allreduce_microstep: 69.37 | step_microstep: 18.64 [2025-04-26 02:23:22,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.63 | bwd: 5715.92 | bwd_inner: 5646.50 | bwd_allreduce: 69.38 | step: 18.64 18%|█▊ | 7630/41250 [18:25:47<80:42:32, 8.64s/it] {'loss': 0.0904, 'grad_norm': 1.7073543071746826, 'learning_rate': 3.7533632862642074e-05, 'epoch': 1.85} 18%|█▊ | 7630/41250 [18:25:47<80:42:32, 8.64s/it][2025-04-26 02:23:31,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:23:31,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5772.12 | bwd_inner_microstep: 5698.56 | bwd_allreduce_microstep: 73.52 | step_microstep: 18.53 [2025-04-26 02:23:31,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.78 | bwd: 5772.14 | bwd_inner: 5698.56 | bwd_allreduce: 73.53 | step: 18.53 18%|█▊ | 7631/41250 [18:25:56<80:52:34, 8.66s/it] {'loss': 0.0628, 'grad_norm': 0.6892262697219849, 'learning_rate': 3.753287737068585e-05, 'epoch': 1.85} 18%|█▊ | 7631/41250 [18:25:56<80:52:34, 8.66s/it][2025-04-26 02:23:39,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:23:39,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.07 | bwd_microstep: 5702.86 | bwd_inner_microstep: 5690.11 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.86 [2025-04-26 02:23:39,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.07 | bwd: 5702.88 | bwd_inner: 5690.11 | bwd_allreduce: 12.72 | step: 18.87 19%|█▊ | 7632/41250 [18:26:05<80:47:47, 8.65s/it] {'loss': 0.0259, 'grad_norm': 
0.5222805738449097, 'learning_rate': 3.7532121770642885e-05, 'epoch': 1.85} 19%|█▊ | 7632/41250 [18:26:05<80:47:47, 8.65s/it][2025-04-26 02:23:48,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:23:48,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.34 | bwd_microstep: 5700.71 | bwd_inner_microstep: 5687.85 | bwd_allreduce_microstep: 12.81 | step_microstep: 18.79 [2025-04-26 02:23:48,464] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.34 | bwd: 5700.72 | bwd_inner: 5687.85 | bwd_allreduce: 12.83 | step: 18.80 19%|█▊ | 7633/41250 [18:26:13<80:43:49, 8.65s/it] {'loss': 0.1322, 'grad_norm': 1.1384373903274536, 'learning_rate': 3.753136606251784e-05, 'epoch': 1.85} 19%|█▊ | 7633/41250 [18:26:13<80:43:49, 8.65s/it][2025-04-26 02:23:57,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:23:57,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.95 | bwd_microstep: 5891.10 | bwd_inner_microstep: 5653.51 | bwd_allreduce_microstep: 237.55 | step_microstep: 18.41 [2025-04-26 02:23:57,262] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.95 | bwd: 5891.12 | bwd_inner: 5653.51 | bwd_allreduce: 237.57 | step: 18.41 19%|█▊ | 7634/41250 [18:26:22<81:09:23, 8.69s/it] {'loss': 0.1727, 'grad_norm': 1.893848180770874, 'learning_rate': 3.753061024631537e-05, 'epoch': 1.85} 19%|█▊ | 7634/41250 [18:26:22<81:09:23, 8.69s/it][2025-04-26 02:24:05,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:24:05,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.66 | bwd_microstep: 5748.00 | bwd_inner_microstep: 5705.98 | bwd_allreduce_microstep: 41.96 | step_microstep: 18.58 [2025-04-26 
02:24:05,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.66 | bwd: 5748.01 | bwd_inner: 5705.98 | bwd_allreduce: 41.98 | step: 18.58 19%|█▊ | 7635/41250 [18:26:31<81:07:24, 8.69s/it] {'loss': 0.1576, 'grad_norm': 1.3118184804916382, 'learning_rate': 3.752985432204013e-05, 'epoch': 1.85} 19%|█▊ | 7635/41250 [18:26:31<81:07:24, 8.69s/it][2025-04-26 02:24:14,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 02:24:14,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.79 | bwd_microstep: 5741.05 | bwd_inner_microstep: 5699.80 | bwd_allreduce_microstep: 41.21 | step_microstep: 19.16 [2025-04-26 02:24:14,618] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.79 | bwd: 5741.06 | bwd_inner: 5699.79 | bwd_allreduce: 41.23 | step: 19.16 19%|█▊ | 7636/41250 [18:26:39<81:05:14, 8.68s/it] {'loss': 0.1211, 'grad_norm': 2.1110076904296875, 'learning_rate': 3.7529098289696795e-05, 'epoch': 1.85} 19%|█▊ | 7636/41250 [18:26:39<81:05:14, 8.68s/it][2025-04-26 02:24:23,252] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-26 02:24:23,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.36 | bwd_microstep: 5706.24 | bwd_inner_microstep: 5693.53 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.05 [2025-04-26 02:24:23,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.36 | bwd: 5706.26 | bwd_inner: 5693.52 | bwd_allreduce: 12.69 | step: 19.05 19%|█▊ | 7637/41250 [18:26:48<80:56:44, 8.67s/it] {'loss': 0.1776, 'grad_norm': 1.0904046297073364, 'learning_rate': 3.752834214929002e-05, 'epoch': 1.85} 19%|█▊ | 7637/41250 [18:26:48<80:56:44, 8.67s/it][2025-04-26 02:24:32,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 
0.90 [2025-04-26 02:24:32,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.46 | bwd_microstep: 5887.05 | bwd_inner_microstep: 5700.00 | bwd_allreduce_microstep: 187.01 | step_microstep: 18.83 [2025-04-26 02:24:32,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.46 | bwd: 5887.06 | bwd_inner: 5700.00 | bwd_allreduce: 187.02 | step: 18.83 19%|█▊ | 7638/41250 [18:26:57<81:22:38, 8.72s/it] {'loss': 0.3663, 'grad_norm': 1.5378687381744385, 'learning_rate': 3.7527585900824454e-05, 'epoch': 1.85} 19%|█▊ | 7638/41250 [18:26:57<81:22:38, 8.72s/it][2025-04-26 02:24:40,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-26 02:24:40,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.73 | bwd_microstep: 5782.97 | bwd_inner_microstep: 5770.26 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.56 [2025-04-26 02:24:40,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.73 | bwd: 5782.98 | bwd_inner: 5770.26 | bwd_allreduce: 12.68 | step: 18.56 19%|█▊ | 7639/41250 [18:27:06<81:28:14, 8.73s/it] {'loss': 0.4649, 'grad_norm': 4.887169361114502, 'learning_rate': 3.7526829544304775e-05, 'epoch': 1.85} 19%|█▊ | 7639/41250 [18:27:06<81:28:14, 8.73s/it][2025-04-26 02:24:49,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:24:49,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.74 | bwd_microstep: 5761.58 | bwd_inner_microstep: 5701.76 | bwd_allreduce_microstep: 59.78 | step_microstep: 18.84 [2025-04-26 02:24:49,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.74 | bwd: 5761.59 | bwd_inner: 5701.76 | bwd_allreduce: 59.79 | step: 18.84 19%|█▊ | 7640/41250 [18:27:14<81:22:01, 8.72s/it] {'loss': 0.1411, 'grad_norm': 1.0651686191558838, 'learning_rate': 
3.7526073079735635e-05, 'epoch': 1.85} 19%|█▊ | 7640/41250 [18:27:14<81:22:01, 8.72s/it][2025-04-26 02:24:58,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.99 | optimizer_step: 0.96 [2025-04-26 02:24:58,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.14 | bwd_microstep: 5726.29 | bwd_inner_microstep: 5650.48 | bwd_allreduce_microstep: 75.77 | step_microstep: 18.45 [2025-04-26 02:24:58,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.14 | bwd: 5726.30 | bwd_inner: 5650.48 | bwd_allreduce: 75.79 | step: 18.46 19%|█▊ | 7641/41250 [18:27:23<81:09:07, 8.69s/it] {'loss': 0.0804, 'grad_norm': 1.0299708843231201, 'learning_rate': 3.7525316507121704e-05, 'epoch': 1.85} 19%|█▊ | 7641/41250 [18:27:23<81:09:07, 8.69s/it][2025-04-26 02:25:06,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.09 [2025-04-26 02:25:06,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.31 | bwd_microstep: 5763.30 | bwd_inner_microstep: 5698.84 | bwd_allreduce_microstep: 64.42 | step_microstep: 18.72 [2025-04-26 02:25:06,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.31 | bwd: 5763.31 | bwd_inner: 5698.84 | bwd_allreduce: 64.43 | step: 18.72 19%|█▊ | 7642/41250 [18:27:32<81:09:36, 8.69s/it] {'loss': 0.1698, 'grad_norm': 1.9403884410858154, 'learning_rate': 3.7524559826467635e-05, 'epoch': 1.85} 19%|█▊ | 7642/41250 [18:27:32<81:09:36, 8.69s/it][2025-04-26 02:25:15,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:25:15,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.60 | bwd_microstep: 5708.24 | bwd_inner_microstep: 5694.11 | bwd_allreduce_microstep: 14.09 | step_microstep: 18.81 [2025-04-26 02:25:15,488] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.60 | bwd: 5708.25 | bwd_inner: 5694.11 | bwd_allreduce: 14.10 | step: 18.81 19%|█▊ | 7643/41250 [18:27:40<80:59:24, 8.68s/it] {'loss': 0.0829, 'grad_norm': 1.7872580289840698, 'learning_rate': 3.752380303777812e-05, 'epoch': 1.85} 19%|█▊ | 7643/41250 [18:27:40<80:59:24, 8.68s/it][2025-04-26 02:25:24,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 02:25:24,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.49 | bwd_microstep: 5756.62 | bwd_inner_microstep: 5681.52 | bwd_allreduce_microstep: 75.04 | step_microstep: 18.69 [2025-04-26 02:25:24,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.49 | bwd: 5756.63 | bwd_inner: 5681.52 | bwd_allreduce: 75.06 | step: 18.69 19%|█▊ | 7644/41250 [18:27:49<81:00:58, 8.68s/it] {'loss': 0.2119, 'grad_norm': 1.5244262218475342, 'learning_rate': 3.7523046141057785e-05, 'epoch': 1.85} 19%|█▊ | 7644/41250 [18:27:49<81:00:58, 8.68s/it][2025-04-26 02:25:32,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-26 02:25:32,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.28 | bwd_microstep: 5808.36 | bwd_inner_microstep: 5653.98 | bwd_allreduce_microstep: 154.33 | step_microstep: 19.31 [2025-04-26 02:25:32,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.28 | bwd: 5808.38 | bwd_inner: 5653.98 | bwd_allreduce: 154.36 | step: 19.31 19%|█▊ | 7645/41250 [18:27:58<81:07:20, 8.69s/it] {'loss': 0.131, 'grad_norm': 0.8781213760375977, 'learning_rate': 3.752228913631133e-05, 'epoch': 1.85} 19%|█▊ | 7645/41250 [18:27:58<81:07:20, 8.69s/it][2025-04-26 02:25:41,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.00 | optimizer_step: 1.09 [2025-04-26 
02:25:41,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.06 | bwd_microstep: 5700.46 | bwd_inner_microstep: 5686.07 | bwd_allreduce_microstep: 14.35 | step_microstep: 18.95 [2025-04-26 02:25:41,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.06 | bwd: 5700.47 | bwd_inner: 5686.07 | bwd_allreduce: 14.36 | step: 18.96 19%|█▊ | 7646/41250 [18:28:06<80:57:11, 8.67s/it] {'loss': 0.1452, 'grad_norm': 1.959054708480835, 'learning_rate': 3.752153202354341e-05, 'epoch': 1.85} 19%|█▊ | 7646/41250 [18:28:06<80:57:11, 8.67s/it][2025-04-26 02:25:50,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 02:25:50,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.10 | bwd_microstep: 5716.11 | bwd_inner_microstep: 5703.19 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.90 [2025-04-26 02:25:50,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.10 | bwd: 5716.12 | bwd_inner: 5703.19 | bwd_allreduce: 12.89 | step: 18.90 19%|█▊ | 7647/41250 [18:28:15<80:53:32, 8.67s/it] {'loss': 0.1216, 'grad_norm': 1.2207261323928833, 'learning_rate': 3.752077480275869e-05, 'epoch': 1.85} 19%|█▊ | 7647/41250 [18:28:15<80:53:32, 8.67s/it][2025-04-26 02:25:58,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:25:58,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.65 | bwd_microstep: 5751.71 | bwd_inner_microstep: 5687.02 | bwd_allreduce_microstep: 64.64 | step_microstep: 18.85 [2025-04-26 02:25:58,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.65 | bwd: 5751.72 | bwd_inner: 5687.02 | bwd_allreduce: 64.66 | step: 18.85 19%|█▊ | 7648/41250 [18:28:24<80:55:44, 8.67s/it] {'loss': 0.0617, 'grad_norm': 0.7868779301643372, 'learning_rate': 3.752001747396183e-05, 
'epoch': 1.85} 19%|█▊ | 7648/41250 [18:28:24<80:55:44, 8.67s/it][2025-04-26 02:26:07,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:26:07,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.75 | bwd_microstep: 5771.38 | bwd_inner_microstep: 5636.75 | bwd_allreduce_microstep: 134.58 | step_microstep: 18.85 [2025-04-26 02:26:07,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.75 | bwd: 5771.39 | bwd_inner: 5636.75 | bwd_allreduce: 134.60 | step: 18.85 19%|█▊ | 7649/41250 [18:28:32<80:57:04, 8.67s/it] {'loss': 0.1133, 'grad_norm': 3.545379400253296, 'learning_rate': 3.7519260037157515e-05, 'epoch': 1.85} 19%|█▊ | 7649/41250 [18:28:32<80:57:04, 8.67s/it][2025-04-26 02:26:16,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:26:16,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.79 | bwd_microstep: 5740.33 | bwd_inner_microstep: 5696.51 | bwd_allreduce_microstep: 43.77 | step_microstep: 18.61 [2025-04-26 02:26:16,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.79 | bwd: 5740.34 | bwd_inner: 5696.51 | bwd_allreduce: 43.79 | step: 18.61 19%|█▊ | 7650/41250 [18:28:41<80:56:19, 8.67s/it] {'loss': 0.122, 'grad_norm': 1.6988922357559204, 'learning_rate': 3.751850249235041e-05, 'epoch': 1.85} 19%|█▊ | 7650/41250 [18:28:41<80:56:19, 8.67s/it][2025-04-26 02:26:25,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:26:25,008] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.12 | bwd_microstep: 5869.10 | bwd_inner_microstep: 5703.00 | bwd_allreduce_microstep: 166.06 | step_microstep: 18.93 [2025-04-26 02:26:25,008] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2853.12 | bwd: 5869.11 | bwd_inner: 5703.00 | bwd_allreduce: 166.07 | step: 18.93 19%|█▊ | 7651/41250 [18:28:50<81:18:42, 8.71s/it] {'loss': 0.0897, 'grad_norm': 1.3728922605514526, 'learning_rate': 3.7517744839545185e-05, 'epoch': 1.85} 19%|█▊ | 7651/41250 [18:28:50<81:18:42, 8.71s/it][2025-04-26 02:26:33,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:26:33,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.73 | bwd_microstep: 5699.61 | bwd_inner_microstep: 5643.53 | bwd_allreduce_microstep: 56.03 | step_microstep: 18.55 [2025-04-26 02:26:33,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.73 | bwd: 5699.62 | bwd_inner: 5643.53 | bwd_allreduce: 56.05 | step: 18.55 19%|█▊ | 7652/41250 [18:28:58<81:02:06, 8.68s/it] {'loss': 0.206, 'grad_norm': 1.9299432039260864, 'learning_rate': 3.75169870787465e-05, 'epoch': 1.86} 19%|█▊ | 7652/41250 [18:28:58<81:02:06, 8.68s/it][2025-04-26 02:26:42,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:26:42,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.58 | bwd_microstep: 5713.38 | bwd_inner_microstep: 5648.70 | bwd_allreduce_microstep: 64.63 | step_microstep: 18.35 [2025-04-26 02:26:42,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.58 | bwd: 5713.39 | bwd_inner: 5648.70 | bwd_allreduce: 64.65 | step: 18.36 19%|█▊ | 7653/41250 [18:29:07<80:50:43, 8.66s/it] {'loss': 0.2037, 'grad_norm': 2.5926334857940674, 'learning_rate': 3.751622920995904e-05, 'epoch': 1.86} 19%|█▊ | 7653/41250 [18:29:07<80:50:43, 8.66s/it][2025-04-26 02:26:50,935] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.00 | optimizer_step: 1.04 [2025-04-26 02:26:50,936] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2835.12 | bwd_microstep: 5782.30 | bwd_inner_microstep: 5647.98 | bwd_allreduce_microstep: 134.28 | step_microstep: 18.60 [2025-04-26 02:26:50,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.12 | bwd: 5782.32 | bwd_inner: 5647.98 | bwd_allreduce: 134.29 | step: 18.60 19%|█▊ | 7654/41250 [18:29:16<80:56:23, 8.67s/it] {'loss': 0.0691, 'grad_norm': 2.1579720973968506, 'learning_rate': 3.751547123318748e-05, 'epoch': 1.86} 19%|█▊ | 7654/41250 [18:29:16<80:56:23, 8.67s/it][2025-04-26 02:26:59,886] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:26:59,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.81 | bwd_microstep: 6040.77 | bwd_inner_microstep: 5643.32 | bwd_allreduce_microstep: 397.40 | step_microstep: 18.72 [2025-04-26 02:26:59,887] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.81 | bwd: 6040.78 | bwd_inner: 5643.32 | bwd_allreduce: 397.42 | step: 18.73 19%|█▊ | 7655/41250 [18:29:25<81:43:05, 8.76s/it] {'loss': 0.1316, 'grad_norm': 0.8739269971847534, 'learning_rate': 3.751471314843647e-05, 'epoch': 1.86} 19%|█▊ | 7655/41250 [18:29:25<81:43:05, 8.76s/it][2025-04-26 02:27:08,542] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 1.10 [2025-04-26 02:27:08,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.87 | bwd_microstep: 5723.41 | bwd_inner_microstep: 5686.71 | bwd_allreduce_microstep: 36.65 | step_microstep: 19.55 [2025-04-26 02:27:08,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.87 | bwd: 5723.42 | bwd_inner: 5686.71 | bwd_allreduce: 36.67 | step: 19.55 19%|█▊ | 7656/41250 [18:29:33<81:25:51, 8.73s/it] {'loss': 0.0738, 'grad_norm': 0.8915503025054932, 'learning_rate': 3.7513954955710704e-05, 'epoch': 1.86} 19%|█▊ | 7656/41250 
[18:29:33<81:25:51, 8.73s/it][2025-04-26 02:27:17,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:27:17,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.62 | bwd_microstep: 5749.96 | bwd_inner_microstep: 5686.31 | bwd_allreduce_microstep: 63.61 | step_microstep: 18.40 [2025-04-26 02:27:17,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.62 | bwd: 5749.98 | bwd_inner: 5686.31 | bwd_allreduce: 63.63 | step: 18.40 19%|█▊ | 7657/41250 [18:29:42<81:16:50, 8.71s/it] {'loss': 0.1046, 'grad_norm': 1.8434231281280518, 'learning_rate': 3.7513196655014855e-05, 'epoch': 1.86} 19%|█▊ | 7657/41250 [18:29:42<81:16:50, 8.71s/it][2025-04-26 02:27:25,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:27:25,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2814.12 | bwd_microstep: 5791.21 | bwd_inner_microstep: 5626.84 | bwd_allreduce_microstep: 164.32 | step_microstep: 18.66 [2025-04-26 02:27:25,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2814.12 | bwd: 5791.22 | bwd_inner: 5626.84 | bwd_allreduce: 164.34 | step: 18.66 19%|█▊ | 7658/41250 [18:29:51<81:13:22, 8.70s/it] {'loss': 0.0387, 'grad_norm': 0.42392656207084656, 'learning_rate': 3.751243824635359e-05, 'epoch': 1.86} 19%|█▊ | 7658/41250 [18:29:51<81:13:22, 8.70s/it][2025-04-26 02:27:34,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.07 | optimizer_step: 0.98 [2025-04-26 02:27:34,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.68 | bwd_microstep: 5753.70 | bwd_inner_microstep: 5623.90 | bwd_allreduce_microstep: 129.75 | step_microstep: 19.38 [2025-04-26 02:27:34,562] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.68 | bwd: 
5753.72 | bwd_inner: 5623.90 | bwd_allreduce: 129.77 | step: 19.38 19%|█▊ | 7659/41250 [18:29:59<81:05:05, 8.69s/it] {'loss': 0.1022, 'grad_norm': 1.484567642211914, 'learning_rate': 3.751167972973159e-05, 'epoch': 1.86} 19%|█▊ | 7659/41250 [18:29:59<81:05:05, 8.69s/it][2025-04-26 02:27:43,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:27:43,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.62 | bwd_microstep: 5685.79 | bwd_inner_microstep: 5673.07 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.61 [2025-04-26 02:27:43,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.62 | bwd: 5685.80 | bwd_inner: 5673.07 | bwd_allreduce: 12.69 | step: 18.61 19%|█▊ | 7660/41250 [18:30:08<80:50:55, 8.66s/it] {'loss': 0.0247, 'grad_norm': 0.455024778842926, 'learning_rate': 3.751092110515353e-05, 'epoch': 1.86} 19%|█▊ | 7660/41250 [18:30:08<80:50:55, 8.66s/it][2025-04-26 02:27:51,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.92 [2025-04-26 02:27:51,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.60 | bwd_microstep: 5762.79 | bwd_inner_microstep: 5642.21 | bwd_allreduce_microstep: 120.54 | step_microstep: 18.72 [2025-04-26 02:27:51,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.60 | bwd: 5762.81 | bwd_inner: 5642.21 | bwd_allreduce: 120.55 | step: 18.73 19%|█▊ | 7661/41250 [18:30:17<80:51:27, 8.67s/it] {'loss': 0.2828, 'grad_norm': 2.263742685317993, 'learning_rate': 3.751016237262407e-05, 'epoch': 1.86} 19%|█▊ | 7661/41250 [18:30:17<80:51:27, 8.67s/it][2025-04-26 02:28:00,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.91 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:28:00,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2820.81 | bwd_microstep: 5708.63 | bwd_inner_microstep: 5633.98 | bwd_allreduce_microstep: 74.61 | step_microstep: 18.10 [2025-04-26 02:28:00,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.81 | bwd: 5708.65 | bwd_inner: 5633.98 | bwd_allreduce: 74.63 | step: 18.10 19%|█▊ | 7662/41250 [18:30:25<80:42:46, 8.65s/it] {'loss': 0.1655, 'grad_norm': 2.7516086101531982, 'learning_rate': 3.750940353214792e-05, 'epoch': 1.86} 19%|█▊ | 7662/41250 [18:30:25<80:42:46, 8.65s/it][2025-04-26 02:28:09,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:28:09,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.71 | bwd_microstep: 5770.39 | bwd_inner_microstep: 5633.63 | bwd_allreduce_microstep: 136.72 | step_microstep: 18.07 [2025-04-26 02:28:09,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.71 | bwd: 5770.40 | bwd_inner: 5633.63 | bwd_allreduce: 136.73 | step: 18.07 19%|█▊ | 7663/41250 [18:30:34<80:46:35, 8.66s/it] {'loss': 0.0316, 'grad_norm': 0.6385459899902344, 'learning_rate': 3.7508644583729736e-05, 'epoch': 1.86} 19%|█▊ | 7663/41250 [18:30:34<80:46:35, 8.66s/it][2025-04-26 02:28:17,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.31 | optimizer_step: 1.06 [2025-04-26 02:28:17,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.94 | bwd_microstep: 5762.81 | bwd_inner_microstep: 5638.11 | bwd_allreduce_microstep: 124.64 | step_microstep: 20.19 [2025-04-26 02:28:17,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.94 | bwd: 5762.83 | bwd_inner: 5638.11 | bwd_allreduce: 124.68 | step: 20.19 19%|█▊ | 7664/41250 [18:30:43<80:47:43, 8.66s/it] {'loss': 0.11, 'grad_norm': 1.639635682106018, 'learning_rate': 3.7507885527374205e-05, 'epoch': 1.86} 19%|█▊ | 7664/41250 [18:30:43<80:47:43, 
8.66s/it][2025-04-26 02:28:26,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-26 02:28:26,470] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.57 | bwd_microstep: 5746.33 | bwd_inner_microstep: 5681.41 | bwd_allreduce_microstep: 64.87 | step_microstep: 18.65 [2025-04-26 02:28:26,471] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.57 | bwd: 5746.34 | bwd_inner: 5681.41 | bwd_allreduce: 64.89 | step: 18.65 19%|█▊ | 7665/41250 [18:30:51<80:50:18, 8.67s/it] {'loss': 0.2756, 'grad_norm': 4.591912746429443, 'learning_rate': 3.7507126363086e-05, 'epoch': 1.86} 19%|█▊ | 7665/41250 [18:30:51<80:50:18, 8.67s/it][2025-04-26 02:28:35,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 02:28:35,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.83 | bwd_microstep: 5783.86 | bwd_inner_microstep: 5638.10 | bwd_allreduce_microstep: 145.72 | step_microstep: 18.93 [2025-04-26 02:28:35,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.83 | bwd: 5783.88 | bwd_inner: 5638.10 | bwd_allreduce: 145.73 | step: 18.94 19%|█▊ | 7666/41250 [18:31:00<80:53:27, 8.67s/it] {'loss': 0.0526, 'grad_norm': 0.7211490273475647, 'learning_rate': 3.75063670908698e-05, 'epoch': 1.86} 19%|█▊ | 7666/41250 [18:31:00<80:53:27, 8.67s/it][2025-04-26 02:28:43,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 02:28:43,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.40 | bwd_microstep: 5698.69 | bwd_inner_microstep: 5685.71 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.08 [2025-04-26 02:28:43,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.40 | bwd: 5698.70 | bwd_inner: 5685.71 | 
bwd_allreduce: 12.94 | step: 19.08 19%|█▊ | 7667/41250 [18:31:09<80:46:26, 8.66s/it] {'loss': 0.0893, 'grad_norm': 1.3749464750289917, 'learning_rate': 3.75056077107303e-05, 'epoch': 1.86} 19%|█▊ | 7667/41250 [18:31:09<80:46:26, 8.66s/it][2025-04-26 02:28:52,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:28:52,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2876.73 | bwd_microstep: 5787.66 | bwd_inner_microstep: 5774.81 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.85 [2025-04-26 02:28:52,533] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2876.73 | bwd: 5787.67 | bwd_inner: 5774.81 | bwd_allreduce: 12.82 | step: 18.86 19%|█▊ | 7668/41250 [18:31:17<81:01:12, 8.69s/it] {'loss': 0.3992, 'grad_norm': 2.4727728366851807, 'learning_rate': 3.750484822267217e-05, 'epoch': 1.86} 19%|█▊ | 7668/41250 [18:31:17<81:01:12, 8.69s/it][2025-04-26 02:29:01,162] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.94 [2025-04-26 02:29:01,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.60 | bwd_microstep: 5698.41 | bwd_inner_microstep: 5685.75 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.71 [2025-04-26 02:29:01,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.60 | bwd: 5698.42 | bwd_inner: 5685.75 | bwd_allreduce: 12.63 | step: 18.71 19%|█▊ | 7669/41250 [18:31:26<80:51:41, 8.67s/it] {'loss': 0.1805, 'grad_norm': 3.095423460006714, 'learning_rate': 3.750408862670009e-05, 'epoch': 1.86} 19%|█▊ | 7669/41250 [18:31:26<80:51:41, 8.67s/it][2025-04-26 02:29:09,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:29:09,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.13 | bwd_microstep: 
5730.41 | bwd_inner_microstep: 5695.34 | bwd_allreduce_microstep: 35.03 | step_microstep: 18.72 [2025-04-26 02:29:09,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.13 | bwd: 5730.43 | bwd_inner: 5695.34 | bwd_allreduce: 35.04 | step: 18.73 19%|█▊ | 7670/41250 [18:31:35<80:51:18, 8.67s/it] {'loss': 0.0541, 'grad_norm': 0.7256878614425659, 'learning_rate': 3.750332892281875e-05, 'epoch': 1.86} 19%|█▊ | 7670/41250 [18:31:35<80:51:18, 8.67s/it][2025-04-26 02:29:18,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:29:18,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.20 | bwd_microstep: 5752.40 | bwd_inner_microstep: 5697.56 | bwd_allreduce_microstep: 54.80 | step_microstep: 18.94 [2025-04-26 02:29:18,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.20 | bwd: 5752.42 | bwd_inner: 5697.56 | bwd_allreduce: 54.81 | step: 18.94 19%|█▊ | 7671/41250 [18:31:43<80:54:07, 8.67s/it] {'loss': 0.2093, 'grad_norm': 2.390334129333496, 'learning_rate': 3.7502569111032824e-05, 'epoch': 1.86} 19%|█▊ | 7671/41250 [18:31:43<80:54:07, 8.67s/it][2025-04-26 02:29:27,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:29:27,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.19 | bwd_microstep: 5783.20 | bwd_inner_microstep: 5643.44 | bwd_allreduce_microstep: 139.71 | step_microstep: 18.71 [2025-04-26 02:29:27,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.19 | bwd: 5783.22 | bwd_inner: 5643.44 | bwd_allreduce: 139.73 | step: 18.71 19%|█▊ | 7672/41250 [18:31:52<80:55:51, 8.68s/it] {'loss': 0.0258, 'grad_norm': 0.33774808049201965, 'learning_rate': 3.7501809191347e-05, 'epoch': 1.86} 19%|█▊ | 7672/41250 [18:31:52<80:55:51, 8.68s/it][2025-04-26 02:29:35,836] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 02:29:35,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.83 | bwd_microstep: 5710.07 | bwd_inner_microstep: 5697.23 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.74 [2025-04-26 02:29:35,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.83 | bwd: 5710.08 | bwd_inner: 5697.23 | bwd_allreduce: 12.82 | step: 18.75 19%|█▊ | 7673/41250 [18:32:01<80:48:53, 8.66s/it] {'loss': 0.2221, 'grad_norm': 3.8780195713043213, 'learning_rate': 3.750104916376597e-05, 'epoch': 1.86} 19%|█▊ | 7673/41250 [18:32:01<80:48:53, 8.66s/it][2025-04-26 02:29:44,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 02:29:44,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.48 | bwd_microstep: 5781.75 | bwd_inner_microstep: 5648.52 | bwd_allreduce_microstep: 133.19 | step_microstep: 18.64 [2025-04-26 02:29:44,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.48 | bwd: 5781.76 | bwd_inner: 5648.51 | bwd_allreduce: 133.21 | step: 18.65 19%|█▊ | 7674/41250 [18:32:09<80:52:36, 8.67s/it] {'loss': 0.3575, 'grad_norm': 2.562004804611206, 'learning_rate': 3.750028902829441e-05, 'epoch': 1.86} 19%|█▊ | 7674/41250 [18:32:09<80:52:36, 8.67s/it][2025-04-26 02:29:53,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 02:29:53,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.69 | bwd_microstep: 5759.93 | bwd_inner_microstep: 5637.80 | bwd_allreduce_microstep: 122.08 | step_microstep: 18.98 [2025-04-26 02:29:53,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.69 | bwd: 5759.94 | bwd_inner: 5637.80 | bwd_allreduce: 122.10 | step: 18.99 
19%|█▊ | 7675/41250 [18:32:18<80:52:37, 8.67s/it] {'loss': 0.0333, 'grad_norm': 0.7206865549087524, 'learning_rate': 3.7499528784937015e-05, 'epoch': 1.86} 19%|█▊ | 7675/41250 [18:32:18<80:52:37, 8.67s/it][2025-04-26 02:30:02,030] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:30:02,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.87 | bwd_microstep: 5912.37 | bwd_inner_microstep: 5655.60 | bwd_allreduce_microstep: 256.72 | step_microstep: 18.62 [2025-04-26 02:30:02,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.87 | bwd: 5912.38 | bwd_inner: 5655.60 | bwd_allreduce: 256.74 | step: 18.63 19%|█▊ | 7676/41250 [18:32:27<81:19:54, 8.72s/it] {'loss': 0.2032, 'grad_norm': 2.556124687194824, 'learning_rate': 3.7498768433698474e-05, 'epoch': 1.86} 19%|█▊ | 7676/41250 [18:32:27<81:19:54, 8.72s/it][2025-04-26 02:30:10,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:30:10,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.87 | bwd_microstep: 5788.12 | bwd_inner_microstep: 5660.45 | bwd_allreduce_microstep: 127.62 | step_microstep: 18.48 [2025-04-26 02:30:10,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.87 | bwd: 5788.13 | bwd_inner: 5660.45 | bwd_allreduce: 127.64 | step: 18.49 19%|█▊ | 7677/41250 [18:32:36<81:15:58, 8.71s/it] {'loss': 0.0435, 'grad_norm': 0.7778078317642212, 'learning_rate': 3.7498007974583456e-05, 'epoch': 1.86} 19%|█▊ | 7677/41250 [18:32:36<81:15:58, 8.71s/it][2025-04-26 02:30:19,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:30:19,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.74 | bwd_microstep: 5689.59 | 
bwd_inner_microstep: 5656.70 | bwd_allreduce_microstep: 32.85 | step_microstep: 18.61 [2025-04-26 02:30:19,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.74 | bwd: 5689.60 | bwd_inner: 5656.70 | bwd_allreduce: 32.86 | step: 18.62 19%|█▊ | 7678/41250 [18:32:44<80:57:24, 8.68s/it] {'loss': 0.0392, 'grad_norm': 0.602817952632904, 'learning_rate': 3.7497247407596665e-05, 'epoch': 1.86} 19%|█▊ | 7678/41250 [18:32:44<80:57:24, 8.68s/it][2025-04-26 02:30:27,962] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 02:30:27,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.52 | bwd_microstep: 5713.53 | bwd_inner_microstep: 5650.92 | bwd_allreduce_microstep: 62.57 | step_microstep: 18.99 [2025-04-26 02:30:27,963] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.52 | bwd: 5713.54 | bwd_inner: 5650.92 | bwd_allreduce: 62.58 | step: 19.00 19%|█▊ | 7679/41250 [18:32:53<80:48:21, 8.67s/it] {'loss': 0.421, 'grad_norm': 3.8015167713165283, 'learning_rate': 3.749648673274278e-05, 'epoch': 1.86} 19%|█▊ | 7679/41250 [18:32:53<80:48:21, 8.67s/it][2025-04-26 02:30:36,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:30:36,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.66 | bwd_microstep: 5720.94 | bwd_inner_microstep: 5657.08 | bwd_allreduce_microstep: 63.82 | step_microstep: 18.78 [2025-04-26 02:30:36,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.66 | bwd: 5720.96 | bwd_inner: 5657.08 | bwd_allreduce: 63.84 | step: 18.78 19%|█▊ | 7680/41250 [18:33:01<80:43:21, 8.66s/it] {'loss': 0.0656, 'grad_norm': 1.1059954166412354, 'learning_rate': 3.7495725950026505e-05, 'epoch': 1.86} 19%|█▊ | 7680/41250 [18:33:01<80:43:21, 8.66s/it][2025-04-26 02:30:45,231] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-26 02:30:45,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.69 | bwd_microstep: 5728.74 | bwd_inner_microstep: 5650.34 | bwd_allreduce_microstep: 78.35 | step_microstep: 18.26 [2025-04-26 02:30:45,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.69 | bwd: 5728.75 | bwd_inner: 5650.34 | bwd_allreduce: 78.37 | step: 18.26 19%|█▊ | 7681/41250 [18:33:10<80:39:18, 8.65s/it] {'loss': 0.1871, 'grad_norm': 3.0474889278411865, 'learning_rate': 3.7494965059452514e-05, 'epoch': 1.86} 19%|█▊ | 7681/41250 [18:33:10<80:39:18, 8.65s/it][2025-04-26 02:30:53,873] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:30:53,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.44 | bwd_microstep: 5709.85 | bwd_inner_microstep: 5689.94 | bwd_allreduce_microstep: 19.88 | step_microstep: 18.62 [2025-04-26 02:30:53,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.44 | bwd: 5709.87 | bwd_inner: 5689.94 | bwd_allreduce: 19.89 | step: 18.62 19%|█▊ | 7682/41250 [18:33:19<80:37:43, 8.65s/it] {'loss': 0.0652, 'grad_norm': 1.4344886541366577, 'learning_rate': 3.749420406102551e-05, 'epoch': 1.86} 19%|█▊ | 7682/41250 [18:33:19<80:37:43, 8.65s/it][2025-04-26 02:31:02,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:31:02,514] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.68 | bwd_microstep: 5700.95 | bwd_inner_microstep: 5688.03 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.71 [2025-04-26 02:31:02,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.68 | bwd: 5700.96 | bwd_inner: 5688.03 | bwd_allreduce: 12.88 | step: 18.71 19%|█▊ 
| 7683/41250 [18:33:27<80:36:47, 8.65s/it] {'loss': 0.0669, 'grad_norm': 3.083219051361084, 'learning_rate': 3.749344295475017e-05, 'epoch': 1.86} 19%|█▊ | 7683/41250 [18:33:27<80:36:47, 8.65s/it][2025-04-26 02:31:11,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:31:11,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.43 | bwd_microstep: 5715.04 | bwd_inner_microstep: 5644.63 | bwd_allreduce_microstep: 70.37 | step_microstep: 18.84 [2025-04-26 02:31:11,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.43 | bwd: 5715.05 | bwd_inner: 5644.62 | bwd_allreduce: 70.39 | step: 18.84 19%|█▊ | 7684/41250 [18:33:36<80:34:45, 8.64s/it] {'loss': 0.1314, 'grad_norm': 2.912235736846924, 'learning_rate': 3.7492681740631205e-05, 'epoch': 1.86} 19%|█▊ | 7684/41250 [18:33:36<80:34:45, 8.64s/it][2025-04-26 02:31:19,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:31:19,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.23 | bwd_microstep: 5721.82 | bwd_inner_microstep: 5650.36 | bwd_allreduce_microstep: 71.42 | step_microstep: 18.30 [2025-04-26 02:31:19,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.23 | bwd: 5721.83 | bwd_inner: 5650.36 | bwd_allreduce: 71.44 | step: 18.30 19%|█▊ | 7685/41250 [18:33:45<80:34:40, 8.64s/it] {'loss': 0.3555, 'grad_norm': 2.9040586948394775, 'learning_rate': 3.749192041867329e-05, 'epoch': 1.86} 19%|█▊ | 7685/41250 [18:33:45<80:34:40, 8.64s/it][2025-04-26 02:31:28,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:31:28,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.67 | bwd_microstep: 5779.90 | bwd_inner_microstep: 5659.93 | 
bwd_allreduce_microstep: 119.92 | step_microstep: 18.76 [2025-04-26 02:31:28,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.67 | bwd: 5779.92 | bwd_inner: 5659.93 | bwd_allreduce: 119.94 | step: 18.76 19%|█▊ | 7686/41250 [18:33:53<80:44:03, 8.66s/it] {'loss': 0.1843, 'grad_norm': 3.2544314861297607, 'learning_rate': 3.7491158988881126e-05, 'epoch': 1.86} 19%|█▊ | 7686/41250 [18:33:53<80:44:03, 8.66s/it][2025-04-26 02:31:37,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.03 | optimizer_step: 0.95 [2025-04-26 02:31:37,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.60 | bwd_microstep: 6045.28 | bwd_inner_microstep: 5645.03 | bwd_allreduce_microstep: 400.21 | step_microstep: 18.66 [2025-04-26 02:31:37,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.60 | bwd: 6045.29 | bwd_inner: 5645.03 | bwd_allreduce: 400.22 | step: 18.67 19%|█▊ | 7687/41250 [18:34:02<81:35:49, 8.75s/it] {'loss': 0.0712, 'grad_norm': 1.4922674894332886, 'learning_rate': 3.749039745125941e-05, 'epoch': 1.86} 19%|█▊ | 7687/41250 [18:34:02<81:35:49, 8.75s/it][2025-04-26 02:31:46,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.24 | optimizer_step: 0.94 [2025-04-26 02:31:46,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.38 | bwd_microstep: 5737.98 | bwd_inner_microstep: 5701.29 | bwd_allreduce_microstep: 36.63 | step_microstep: 19.59 [2025-04-26 02:31:46,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.38 | bwd: 5737.99 | bwd_inner: 5701.29 | bwd_allreduce: 36.66 | step: 19.59 19%|█▊ | 7688/41250 [18:34:11<81:23:34, 8.73s/it] {'loss': 0.0661, 'grad_norm': 0.758581280708313, 'learning_rate': 3.748963580581284e-05, 'epoch': 1.86} 19%|█▊ | 7688/41250 [18:34:11<81:23:34, 8.73s/it][2025-04-26 02:31:54,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:31:54,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.08 | bwd_microstep: 5749.14 | bwd_inner_microstep: 5706.66 | bwd_allreduce_microstep: 42.43 | step_microstep: 18.60 [2025-04-26 02:31:54,832] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.08 | bwd: 5749.15 | bwd_inner: 5706.66 | bwd_allreduce: 42.45 | step: 18.60 19%|█▊ | 7689/41250 [18:34:20<81:17:06, 8.72s/it] {'loss': 0.2012, 'grad_norm': 2.6692521572113037, 'learning_rate': 3.74888740525461e-05, 'epoch': 1.86} 19%|█▊ | 7689/41250 [18:34:20<81:17:06, 8.72s/it][2025-04-26 02:32:03,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.17 | optimizer_step: 0.92 [2025-04-26 02:32:03,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.93 | bwd_microstep: 5770.36 | bwd_inner_microstep: 5644.72 | bwd_allreduce_microstep: 125.59 | step_microstep: 18.83 [2025-04-26 02:32:03,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.93 | bwd: 5770.38 | bwd_inner: 5644.72 | bwd_allreduce: 125.61 | step: 18.83 19%|█▊ | 7690/41250 [18:34:28<81:09:42, 8.71s/it] {'loss': 0.1616, 'grad_norm': 3.037764310836792, 'learning_rate': 3.748811219146388e-05, 'epoch': 1.86} 19%|█▊ | 7690/41250 [18:34:28<81:09:42, 8.71s/it][2025-04-26 02:32:12,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.02 | optimizer_step: 1.18 [2025-04-26 02:32:12,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.90 | bwd_microstep: 5758.28 | bwd_inner_microstep: 5685.71 | bwd_allreduce_microstep: 72.51 | step_microstep: 19.27 [2025-04-26 02:32:12,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.90 | bwd: 5758.29 | bwd_inner: 5685.71 | bwd_allreduce: 72.54 | step: 19.27 19%|█▊ | 7691/41250 [18:34:37<81:08:01, 8.70s/it] 
{'loss': 0.0552, 'grad_norm': 0.9870890378952026, 'learning_rate': 3.74873502225709e-05, 'epoch': 1.86} 19%|█▊ | 7691/41250 [18:34:37<81:08:01, 8.70s/it][2025-04-26 02:32:20,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:32:20,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.94 | bwd_microstep: 5679.35 | bwd_inner_microstep: 5652.81 | bwd_allreduce_microstep: 26.49 | step_microstep: 18.65 [2025-04-26 02:32:20,819] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.94 | bwd: 5679.36 | bwd_inner: 5652.81 | bwd_allreduce: 26.51 | step: 18.66 19%|█▊ | 7692/41250 [18:34:46<80:52:37, 8.68s/it] {'loss': 0.1078, 'grad_norm': 1.3682721853256226, 'learning_rate': 3.748658814587184e-05, 'epoch': 1.86} 19%|█▊ | 7692/41250 [18:34:46<80:52:37, 8.68s/it][2025-04-26 02:32:29,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 02:32:29,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.11 | bwd_microstep: 5760.62 | bwd_inner_microstep: 5648.58 | bwd_allreduce_microstep: 111.99 | step_microstep: 19.14 [2025-04-26 02:32:29,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.11 | bwd: 5760.63 | bwd_inner: 5648.58 | bwd_allreduce: 112.01 | step: 19.14 19%|█▊ | 7693/41250 [18:34:54<80:52:14, 8.68s/it] {'loss': 0.0302, 'grad_norm': 0.996963381767273, 'learning_rate': 3.748582596137142e-05, 'epoch': 1.86} 19%|█▊ | 7693/41250 [18:34:54<80:52:14, 8.68s/it][2025-04-26 02:32:38,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.98 | optimizer_step: 1.00 [2025-04-26 02:32:38,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.70 | bwd_microstep: 5749.61 | bwd_inner_microstep: 5685.30 | bwd_allreduce_microstep: 64.27 | 
step_microstep: 18.75 [2025-04-26 02:32:38,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.70 | bwd: 5749.63 | bwd_inner: 5685.30 | bwd_allreduce: 64.28 | step: 18.75 19%|█▊ | 7694/41250 [18:35:03<80:54:41, 8.68s/it] {'loss': 0.2224, 'grad_norm': 1.9239269495010376, 'learning_rate': 3.748506366907431e-05, 'epoch': 1.87} 19%|█▊ | 7694/41250 [18:35:03<80:54:41, 8.68s/it][2025-04-26 02:32:46,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 02:32:46,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.38 | bwd_microstep: 5706.21 | bwd_inner_microstep: 5654.20 | bwd_allreduce_microstep: 51.96 | step_microstep: 19.13 [2025-04-26 02:32:46,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.38 | bwd: 5706.22 | bwd_inner: 5654.20 | bwd_allreduce: 51.98 | step: 19.13 19%|█▊ | 7695/41250 [18:35:12<80:44:59, 8.66s/it] {'loss': 0.046, 'grad_norm': 0.8932245969772339, 'learning_rate': 3.748430126898522e-05, 'epoch': 1.87} 19%|█▊ | 7695/41250 [18:35:12<80:44:59, 8.66s/it][2025-04-26 02:32:55,428] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-26 02:32:55,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.42 | bwd_microstep: 5706.42 | bwd_inner_microstep: 5653.27 | bwd_allreduce_microstep: 53.10 | step_microstep: 18.99 [2025-04-26 02:32:55,429] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.42 | bwd: 5706.44 | bwd_inner: 5653.27 | bwd_allreduce: 53.13 | step: 19.00 19%|█▊ | 7696/41250 [18:35:20<80:37:32, 8.65s/it] {'loss': 0.1958, 'grad_norm': 2.9292080402374268, 'learning_rate': 3.748353876110885e-05, 'epoch': 1.87} 19%|█▊ | 7696/41250 [18:35:20<80:37:32, 8.65s/it][2025-04-26 02:33:04,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | 
optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:33:04,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.13 | bwd_microstep: 5750.20 | bwd_inner_microstep: 5654.38 | bwd_allreduce_microstep: 95.77 | step_microstep: 18.83 [2025-04-26 02:33:04,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.13 | bwd: 5750.21 | bwd_inner: 5654.38 | bwd_allreduce: 95.79 | step: 18.84 19%|█▊ | 7697/41250 [18:35:29<80:39:37, 8.65s/it] {'loss': 0.1058, 'grad_norm': 2.1087124347686768, 'learning_rate': 3.74827761454499e-05, 'epoch': 1.87} 19%|█▊ | 7697/41250 [18:35:29<80:39:37, 8.65s/it][2025-04-26 02:33:12,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.07 | optimizer_step: 1.19 [2025-04-26 02:33:12,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.60 | bwd_microstep: 5695.41 | bwd_inner_microstep: 5640.94 | bwd_allreduce_microstep: 54.42 | step_microstep: 19.72 [2025-04-26 02:33:12,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.60 | bwd: 5695.43 | bwd_inner: 5640.94 | bwd_allreduce: 54.45 | step: 19.72 19%|█▊ | 7698/41250 [18:35:38<80:31:16, 8.64s/it] {'loss': 0.1211, 'grad_norm': 1.7929749488830566, 'learning_rate': 3.7482013422013085e-05, 'epoch': 1.87} 19%|█▊ | 7698/41250 [18:35:38<80:31:16, 8.64s/it][2025-04-26 02:33:21,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:33:21,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.13 | bwd_microstep: 5736.61 | bwd_inner_microstep: 5689.52 | bwd_allreduce_microstep: 47.05 | step_microstep: 18.75 [2025-04-26 02:33:21,373] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.13 | bwd: 5736.62 | bwd_inner: 5689.52 | bwd_allreduce: 47.07 | step: 18.75 19%|█▊ | 7699/41250 [18:35:46<80:37:15, 8.65s/it] {'loss': 0.0561, 'grad_norm': 
1.2657146453857422, 'learning_rate': 3.748125059080309e-05, 'epoch': 1.87} 19%|█▊ | 7699/41250 [18:35:46<80:37:15, 8.65s/it][2025-04-26 02:33:30,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 02:33:30,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2926.43 | bwd_microstep: 5875.53 | bwd_inner_microstep: 5862.33 | bwd_allreduce_microstep: 13.16 | step_microstep: 18.75 [2025-04-26 02:33:30,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2926.43 | bwd: 5875.54 | bwd_inner: 5862.32 | bwd_allreduce: 13.17 | step: 18.75 19%|█▊ | 7700/41250 [18:35:55<81:16:50, 8.72s/it] {'loss': 0.1733, 'grad_norm': 1.549142837524414, 'learning_rate': 3.748048765182462e-05, 'epoch': 1.87} 19%|█▊ | 7700/41250 [18:35:55<81:16:50, 8.72s/it][2025-04-26 02:33:38,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 02:33:38,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.47 | bwd_microstep: 5761.70 | bwd_inner_microstep: 5647.97 | bwd_allreduce_microstep: 113.68 | step_microstep: 18.64 [2025-04-26 02:33:38,931] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.47 | bwd: 5761.71 | bwd_inner: 5647.97 | bwd_allreduce: 113.70 | step: 18.64 19%|█▊ | 7701/41250 [18:36:04<81:08:09, 8.71s/it] {'loss': 0.3092, 'grad_norm': 3.81469464302063, 'learning_rate': 3.747972460508239e-05, 'epoch': 1.87} 19%|█▊ | 7701/41250 [18:36:04<81:08:09, 8.71s/it][2025-04-26 02:33:47,605] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:33:47,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.79 | bwd_microstep: 5773.31 | bwd_inner_microstep: 5647.05 | bwd_allreduce_microstep: 126.21 | step_microstep: 19.01 [2025-04-26 
02:33:47,606] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.79 | bwd: 5773.33 | bwd_inner: 5647.05 | bwd_allreduce: 126.23 | step: 19.01 19%|█▊ | 7702/41250 [18:36:12<81:03:22, 8.70s/it] {'loss': 0.0334, 'grad_norm': 0.7407433390617371, 'learning_rate': 3.7478961450581085e-05, 'epoch': 1.87} 19%|█▊ | 7702/41250 [18:36:12<81:03:22, 8.70s/it][2025-04-26 02:33:56,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.24 | optimizer_step: 0.99 [2025-04-26 02:33:56,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.50 | bwd_microstep: 5694.37 | bwd_inner_microstep: 5681.47 | bwd_allreduce_microstep: 12.84 | step_microstep: 20.08 [2025-04-26 02:33:56,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.50 | bwd: 5694.38 | bwd_inner: 5681.47 | bwd_allreduce: 12.87 | step: 20.08 19%|█▊ | 7703/41250 [18:36:21<80:50:22, 8.68s/it] {'loss': 0.0385, 'grad_norm': 1.1069931983947754, 'learning_rate': 3.7478198188325434e-05, 'epoch': 1.87} 19%|█▊ | 7703/41250 [18:36:21<80:50:22, 8.68s/it][2025-04-26 02:34:05,147] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.98 [2025-04-26 02:34:05,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.27 | bwd_microstep: 5993.44 | bwd_inner_microstep: 5673.87 | bwd_allreduce_microstep: 319.52 | step_microstep: 19.24 [2025-04-26 02:34:05,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.27 | bwd: 5993.45 | bwd_inner: 5673.87 | bwd_allreduce: 319.54 | step: 19.24 19%|█▊ | 7704/41250 [18:36:30<81:30:57, 8.75s/it] {'loss': 0.1098, 'grad_norm': 1.918540120124817, 'learning_rate': 3.747743481832012e-05, 'epoch': 1.87} 19%|█▊ | 7704/41250 [18:36:30<81:30:57, 8.75s/it][2025-04-26 02:34:13,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.98 | optimizer_step: 
0.91 [2025-04-26 02:34:13,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.65 | bwd_microstep: 5725.10 | bwd_inner_microstep: 5675.32 | bwd_allreduce_microstep: 49.74 | step_microstep: 18.76 [2025-04-26 02:34:13,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.65 | bwd: 5725.12 | bwd_inner: 5675.32 | bwd_allreduce: 49.76 | step: 18.76 19%|█▊ | 7705/41250 [18:36:39<81:14:03, 8.72s/it] {'loss': 0.0832, 'grad_norm': 1.9458351135253906, 'learning_rate': 3.747667134056986e-05, 'epoch': 1.87} 19%|█▊ | 7705/41250 [18:36:39<81:14:03, 8.72s/it][2025-04-26 02:34:22,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:34:22,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.03 | bwd_microstep: 5761.99 | bwd_inner_microstep: 5625.36 | bwd_allreduce_microstep: 136.59 | step_microstep: 19.01 [2025-04-26 02:34:22,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.03 | bwd: 5762.01 | bwd_inner: 5625.36 | bwd_allreduce: 136.61 | step: 19.02 19%|█▊ | 7706/41250 [18:36:47<81:04:01, 8.70s/it] {'loss': 0.2426, 'grad_norm': 3.396961212158203, 'learning_rate': 3.747590775507936e-05, 'epoch': 1.87} 19%|█▊ | 7706/41250 [18:36:47<81:04:01, 8.70s/it][2025-04-26 02:34:31,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.07 | optimizer_step: 1.14 [2025-04-26 02:34:31,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2815.46 | bwd_microstep: 5769.96 | bwd_inner_microstep: 5625.46 | bwd_allreduce_microstep: 144.45 | step_microstep: 19.34 [2025-04-26 02:34:31,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2815.46 | bwd: 5769.97 | bwd_inner: 5625.46 | bwd_allreduce: 144.47 | step: 19.34 19%|█▊ | 7707/41250 [18:36:56<80:59:28, 8.69s/it] {'loss': 0.1087, 'grad_norm': 3.031385898590088, 'learning_rate': 
3.7475144061853324e-05, 'epoch': 1.87} 19%|█▊ | 7707/41250 [18:36:56<80:59:28, 8.69s/it][2025-04-26 02:34:39,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:34:39,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.96 | bwd_microstep: 5737.84 | bwd_inner_microstep: 5686.34 | bwd_allreduce_microstep: 51.45 | step_microstep: 18.79 [2025-04-26 02:34:39,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.96 | bwd: 5737.85 | bwd_inner: 5686.34 | bwd_allreduce: 51.47 | step: 18.79 19%|█▊ | 7708/41250 [18:37:05<80:54:25, 8.68s/it] {'loss': 0.0503, 'grad_norm': 1.226146936416626, 'learning_rate': 3.747438026089646e-05, 'epoch': 1.87} 19%|█▊ | 7708/41250 [18:37:05<80:54:25, 8.68s/it][2025-04-26 02:34:48,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:34:48,404] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.68 | bwd_microstep: 5690.24 | bwd_inner_microstep: 5677.57 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.64 [2025-04-26 02:34:48,405] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.68 | bwd: 5690.26 | bwd_inner: 5677.57 | bwd_allreduce: 12.64 | step: 18.64 19%|█▊ | 7709/41250 [18:37:13<80:41:56, 8.66s/it] {'loss': 0.1261, 'grad_norm': 2.6045901775360107, 'learning_rate': 3.747361635221348e-05, 'epoch': 1.87} 19%|█▊ | 7709/41250 [18:37:13<80:41:56, 8.66s/it][2025-04-26 02:34:57,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:34:57,060] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2814.05 | bwd_microstep: 5757.44 | bwd_inner_microstep: 5643.34 | bwd_allreduce_microstep: 114.06 | step_microstep: 18.68 [2025-04-26 02:34:57,060] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2814.06 | bwd: 5757.45 | bwd_inner: 5643.34 | bwd_allreduce: 114.07 | step: 18.68 19%|█▊ | 7710/41250 [18:37:22<80:40:43, 8.66s/it] {'loss': 0.0169, 'grad_norm': 0.6618806719779968, 'learning_rate': 3.747285233580909e-05, 'epoch': 1.87} 19%|█▊ | 7710/41250 [18:37:22<80:40:43, 8.66s/it][2025-04-26 02:35:05,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:35:05,711] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.28 | bwd_microstep: 5749.11 | bwd_inner_microstep: 5638.22 | bwd_allreduce_microstep: 110.84 | step_microstep: 18.35 [2025-04-26 02:35:05,712] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.28 | bwd: 5749.13 | bwd_inner: 5638.22 | bwd_allreduce: 110.86 | step: 18.36 19%|█▊ | 7711/41250 [18:37:31<80:39:57, 8.66s/it] {'loss': 0.0951, 'grad_norm': 1.6398183107376099, 'learning_rate': 3.747208821168801e-05, 'epoch': 1.87} 19%|█▊ | 7711/41250 [18:37:31<80:39:57, 8.66s/it][2025-04-26 02:35:14,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 1.14 [2025-04-26 02:35:14,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.24 | bwd_microstep: 5690.26 | bwd_inner_microstep: 5677.36 | bwd_allreduce_microstep: 12.85 | step_microstep: 19.28 [2025-04-26 02:35:14,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.24 | bwd: 5690.27 | bwd_inner: 5677.36 | bwd_allreduce: 12.87 | step: 19.29 19%|█▊ | 7712/41250 [18:37:39<80:32:43, 8.65s/it] {'loss': 0.0765, 'grad_norm': 1.0906096696853638, 'learning_rate': 3.747132397985494e-05, 'epoch': 1.87} 19%|█▊ | 7712/41250 [18:37:39<80:32:43, 8.65s/it][2025-04-26 02:35:23,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 
02:35:23,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.77 | bwd_microstep: 5760.96 | bwd_inner_microstep: 5649.57 | bwd_allreduce_microstep: 111.35 | step_microstep: 18.70 [2025-04-26 02:35:23,004] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.77 | bwd: 5760.97 | bwd_inner: 5649.57 | bwd_allreduce: 111.37 | step: 18.70 19%|█▊ | 7713/41250 [18:37:48<80:37:14, 8.65s/it] {'loss': 0.2141, 'grad_norm': 1.4056813716888428, 'learning_rate': 3.747055964031459e-05, 'epoch': 1.87} 19%|█▊ | 7713/41250 [18:37:48<80:37:14, 8.65s/it][2025-04-26 02:35:31,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:35:31,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.47 | bwd_microstep: 5723.32 | bwd_inner_microstep: 5700.54 | bwd_allreduce_microstep: 22.73 | step_microstep: 18.52 [2025-04-26 02:35:31,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.47 | bwd: 5723.33 | bwd_inner: 5700.54 | bwd_allreduce: 22.75 | step: 18.52 19%|█▊ | 7714/41250 [18:37:56<80:38:26, 8.66s/it] {'loss': 0.2178, 'grad_norm': 2.927652597427368, 'learning_rate': 3.746979519307168e-05, 'epoch': 1.87} 19%|█▊ | 7714/41250 [18:37:56<80:38:26, 8.66s/it][2025-04-26 02:35:40,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:35:40,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.70 | bwd_microstep: 5711.37 | bwd_inner_microstep: 5698.63 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.71 [2025-04-26 02:35:40,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.70 | bwd: 5711.38 | bwd_inner: 5698.63 | bwd_allreduce: 12.71 | step: 18.71 19%|█▊ | 7715/41250 [18:38:05<80:35:26, 8.65s/it] {'loss': 0.258, 'grad_norm': 3.302464723587036, 'learning_rate': 3.746903063813092e-05, 
'epoch': 1.87} 19%|█▊ | 7715/41250 [18:38:05<80:35:26, 8.65s/it][2025-04-26 02:35:48,908] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.14 | optimizer_step: 1.07 [2025-04-26 02:35:48,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.68 | bwd_microstep: 5689.85 | bwd_inner_microstep: 5649.93 | bwd_allreduce_microstep: 39.86 | step_microstep: 20.41 [2025-04-26 02:35:48,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.69 | bwd: 5689.87 | bwd_inner: 5649.93 | bwd_allreduce: 39.89 | step: 20.41 19%|█▊ | 7716/41250 [18:38:14<80:27:19, 8.64s/it] {'loss': 0.0984, 'grad_norm': 3.551663398742676, 'learning_rate': 3.746826597549701e-05, 'epoch': 1.87} 19%|█▊ | 7716/41250 [18:38:14<80:27:19, 8.64s/it][2025-04-26 02:35:57,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.25 | optimizer_step: 0.92 [2025-04-26 02:35:57,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.51 | bwd_microstep: 5772.38 | bwd_inner_microstep: 5647.55 | bwd_allreduce_microstep: 124.78 | step_microstep: 19.84 [2025-04-26 02:35:57,598] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.51 | bwd: 5772.39 | bwd_inner: 5647.55 | bwd_allreduce: 124.80 | step: 19.84 19%|█▊ | 7717/41250 [18:38:22<80:35:36, 8.65s/it] {'loss': 0.1338, 'grad_norm': 1.8323569297790527, 'learning_rate': 3.7467501205174686e-05, 'epoch': 1.87} 19%|█▊ | 7717/41250 [18:38:22<80:35:36, 8.65s/it][2025-04-26 02:36:06,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 02:36:06,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.60 | bwd_microstep: 5991.39 | bwd_inner_microstep: 5700.39 | bwd_allreduce_microstep: 290.95 | step_microstep: 18.79 [2025-04-26 02:36:06,525] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2851.60 | bwd: 5991.40 | bwd_inner: 5700.39 | bwd_allreduce: 290.97 | step: 18.80 19%|█▊ | 7718/41250 [18:38:31<81:21:34, 8.73s/it] {'loss': 0.3587, 'grad_norm': 4.722893238067627, 'learning_rate': 3.7466736327168655e-05, 'epoch': 1.87} 19%|█▊ | 7718/41250 [18:38:31<81:21:34, 8.73s/it][2025-04-26 02:36:15,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:36:15,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.88 | bwd_microstep: 5694.76 | bwd_inner_microstep: 5651.50 | bwd_allreduce_microstep: 43.20 | step_microstep: 18.79 [2025-04-26 02:36:15,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.88 | bwd: 5694.77 | bwd_inner: 5651.50 | bwd_allreduce: 43.22 | step: 18.79 19%|█▊ | 7719/41250 [18:38:40<80:59:48, 8.70s/it] {'loss': 0.0825, 'grad_norm': 2.390090227127075, 'learning_rate': 3.746597134148363e-05, 'epoch': 1.87} 19%|█▊ | 7719/41250 [18:38:40<80:59:48, 8.70s/it][2025-04-26 02:36:23,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.04 | optimizer_step: 1.01 [2025-04-26 02:36:23,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.55 | bwd_microstep: 5702.30 | bwd_inner_microstep: 5648.51 | bwd_allreduce_microstep: 53.74 | step_microstep: 19.08 [2025-04-26 02:36:23,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.55 | bwd: 5702.31 | bwd_inner: 5648.51 | bwd_allreduce: 53.76 | step: 19.09 19%|█▊ | 7720/41250 [18:38:49<80:45:03, 8.67s/it] {'loss': 0.1356, 'grad_norm': 2.168428421020508, 'learning_rate': 3.746520624812432e-05, 'epoch': 1.87} 19%|█▊ | 7720/41250 [18:38:49<80:45:03, 8.67s/it][2025-04-26 02:36:32,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:36:32,394] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2857.38 | bwd_microstep: 5710.50 | bwd_inner_microstep: 5697.70 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.30 [2025-04-26 02:36:32,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.38 | bwd: 5710.52 | bwd_inner: 5697.70 | bwd_allreduce: 12.78 | step: 18.30 19%|█▊ | 7721/41250 [18:38:57<80:42:07, 8.66s/it] {'loss': 0.1054, 'grad_norm': 1.9271867275238037, 'learning_rate': 3.746444104709545e-05, 'epoch': 1.87} 19%|█▊ | 7721/41250 [18:38:57<80:42:07, 8.66s/it][2025-04-26 02:36:41,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.99 [2025-04-26 02:36:41,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.48 | bwd_microstep: 5695.85 | bwd_inner_microstep: 5647.97 | bwd_allreduce_microstep: 47.83 | step_microstep: 18.55 [2025-04-26 02:36:41,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.48 | bwd: 5695.86 | bwd_inner: 5647.97 | bwd_allreduce: 47.85 | step: 18.55 19%|█▊ | 7722/41250 [18:39:06<80:32:11, 8.65s/it] {'loss': 0.1601, 'grad_norm': 2.4365813732147217, 'learning_rate': 3.746367573840174e-05, 'epoch': 1.87} 19%|█▊ | 7722/41250 [18:39:06<80:32:11, 8.65s/it][2025-04-26 02:36:50,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-26 02:36:50,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.43 | bwd_microstep: 6128.81 | bwd_inner_microstep: 5650.61 | bwd_allreduce_microstep: 478.15 | step_microstep: 18.82 [2025-04-26 02:36:50,046] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.43 | bwd: 6128.82 | bwd_inner: 5650.61 | bwd_allreduce: 478.17 | step: 18.82 19%|█▊ | 7723/41250 [18:39:15<81:38:48, 8.77s/it] {'loss': 0.1994, 'grad_norm': 1.3863154649734497, 'learning_rate': 3.746291032204791e-05, 'epoch': 1.87} 19%|█▊ | 7723/41250 
[18:39:15<81:38:48, 8.77s/it][2025-04-26 02:36:58,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.11 [2025-04-26 02:36:58,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2928.60 | bwd_microstep: 5869.61 | bwd_inner_microstep: 5856.85 | bwd_allreduce_microstep: 12.71 | step_microstep: 19.55 [2025-04-26 02:36:58,928] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2928.60 | bwd: 5869.62 | bwd_inner: 5856.85 | bwd_allreduce: 12.73 | step: 19.55 19%|█▊ | 7724/41250 [18:39:24<81:58:44, 8.80s/it] {'loss': 0.2763, 'grad_norm': 5.018056869506836, 'learning_rate': 3.746214479803865e-05, 'epoch': 1.87} 19%|█▊ | 7724/41250 [18:39:24<81:58:44, 8.80s/it][2025-04-26 02:37:07,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 02:37:07,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.79 | bwd_microstep: 5700.42 | bwd_inner_microstep: 5687.68 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.98 [2025-04-26 02:37:07,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.79 | bwd: 5700.43 | bwd_inner: 5687.68 | bwd_allreduce: 12.71 | step: 18.98 19%|█▊ | 7725/41250 [18:39:32<81:29:15, 8.75s/it] {'loss': 0.1161, 'grad_norm': 2.091240167617798, 'learning_rate': 3.746137916637873e-05, 'epoch': 1.87} 19%|█▊ | 7725/41250 [18:39:32<81:29:15, 8.75s/it][2025-04-26 02:37:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:37:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.99 | bwd_microstep: 5742.93 | bwd_inner_microstep: 5712.55 | bwd_allreduce_microstep: 30.33 | step_microstep: 18.68 [2025-04-26 02:37:16,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.99 | bwd: 5742.94 | 
bwd_inner: 5712.55 | bwd_allreduce: 30.35 | step: 18.68 19%|█▊ | 7726/41250 [18:39:41<81:17:35, 8.73s/it] {'loss': 0.1574, 'grad_norm': 2.935206413269043, 'learning_rate': 3.746061342707283e-05, 'epoch': 1.87} 19%|█▊ | 7726/41250 [18:39:41<81:17:35, 8.73s/it][2025-04-26 02:37:24,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:37:24,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.18 | bwd_microstep: 5764.71 | bwd_inner_microstep: 5703.95 | bwd_allreduce_microstep: 60.71 | step_microstep: 18.75 [2025-04-26 02:37:24,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.18 | bwd: 5764.72 | bwd_inner: 5703.95 | bwd_allreduce: 60.73 | step: 18.75 19%|█▊ | 7727/41250 [18:39:50<81:11:48, 8.72s/it] {'loss': 0.0929, 'grad_norm': 1.96268892288208, 'learning_rate': 3.7459847580125675e-05, 'epoch': 1.87} 19%|█▊ | 7727/41250 [18:39:50<81:11:48, 8.72s/it][2025-04-26 02:37:33,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:37:33,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.46 | bwd_microstep: 5768.00 | bwd_inner_microstep: 5658.94 | bwd_allreduce_microstep: 109.02 | step_microstep: 18.81 [2025-04-26 02:37:33,626] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.46 | bwd: 5768.02 | bwd_inner: 5658.94 | bwd_allreduce: 109.04 | step: 18.81 19%|█▊ | 7728/41250 [18:39:58<81:06:14, 8.71s/it] {'loss': 0.0352, 'grad_norm': 0.9511499404907227, 'learning_rate': 3.7459081625542e-05, 'epoch': 1.87} 19%|█▊ | 7728/41250 [18:39:58<81:06:14, 8.71s/it][2025-04-26 02:37:42,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:37:42,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2852.02 | bwd_microstep: 5719.02 | bwd_inner_microstep: 5695.29 | bwd_allreduce_microstep: 23.67 | step_microstep: 18.64 [2025-04-26 02:37:42,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.02 | bwd: 5719.04 | bwd_inner: 5695.29 | bwd_allreduce: 23.70 | step: 18.64 19%|█▊ | 7729/41250 [18:40:07<80:56:51, 8.69s/it] {'loss': 0.3016, 'grad_norm': 5.410097599029541, 'learning_rate': 3.745831556332652e-05, 'epoch': 1.87} 19%|█▊ | 7729/41250 [18:40:07<80:56:51, 8.69s/it][2025-04-26 02:37:50,909] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.04 [2025-04-26 02:37:50,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.89 | bwd_microstep: 5720.19 | bwd_inner_microstep: 5639.81 | bwd_allreduce_microstep: 80.33 | step_microstep: 19.12 [2025-04-26 02:37:50,910] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.89 | bwd: 5720.21 | bwd_inner: 5639.81 | bwd_allreduce: 80.35 | step: 19.13 19%|█▊ | 7730/41250 [18:40:16<80:46:02, 8.67s/it] {'loss': 0.1663, 'grad_norm': 3.6293585300445557, 'learning_rate': 3.7457549393483964e-05, 'epoch': 1.87} 19%|█▊ | 7730/41250 [18:40:16<80:46:02, 8.67s/it][2025-04-26 02:37:59,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 02:37:59,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.25 | bwd_microstep: 5698.73 | bwd_inner_microstep: 5686.15 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.35 [2025-04-26 02:37:59,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.25 | bwd: 5698.74 | bwd_inner: 5686.15 | bwd_allreduce: 12.55 | step: 18.36 19%|█▊ | 7731/41250 [18:40:24<80:39:41, 8.66s/it] {'loss': 0.1276, 'grad_norm': 3.848191976547241, 'learning_rate': 3.745678311601904e-05, 'epoch': 1.87} 19%|█▊ | 7731/41250 [18:40:24<80:39:41, 8.66s/it][2025-04-26 
02:38:08,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:38:08,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.53 | bwd_microstep: 5693.53 | bwd_inner_microstep: 5655.97 | bwd_allreduce_microstep: 37.52 | step_microstep: 18.50 [2025-04-26 02:38:08,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.53 | bwd: 5693.54 | bwd_inner: 5655.97 | bwd_allreduce: 37.53 | step: 18.50 19%|█▊ | 7732/41250 [18:40:33<80:32:12, 8.65s/it] {'loss': 0.1364, 'grad_norm': 3.614423990249634, 'learning_rate': 3.7456016730936496e-05, 'epoch': 1.87} 19%|█▊ | 7732/41250 [18:40:33<80:32:12, 8.65s/it][2025-04-26 02:38:16,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 1.07 [2025-04-26 02:38:16,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.70 | bwd_microstep: 5714.95 | bwd_inner_microstep: 5701.99 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.68 [2025-04-26 02:38:16,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.70 | bwd: 5714.96 | bwd_inner: 5701.99 | bwd_allreduce: 12.94 | step: 18.68 19%|█▊ | 7733/41250 [18:40:42<80:32:06, 8.65s/it] {'loss': 0.1696, 'grad_norm': 1.0674803256988525, 'learning_rate': 3.745525023824103e-05, 'epoch': 1.87} 19%|█▊ | 7733/41250 [18:40:42<80:32:06, 8.65s/it][2025-04-26 02:38:25,508] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:38:25,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.48 | bwd_microstep: 5754.70 | bwd_inner_microstep: 5693.60 | bwd_allreduce_microstep: 61.05 | step_microstep: 18.13 [2025-04-26 02:38:25,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.48 | bwd: 5754.71 | bwd_inner: 5693.60 | bwd_allreduce: 61.07 
| step: 18.13 19%|█▊ | 7734/41250 [18:40:50<80:38:55, 8.66s/it] {'loss': 0.487, 'grad_norm': 5.128145217895508, 'learning_rate': 3.745448363793738e-05, 'epoch': 1.87} 19%|█▊ | 7734/41250 [18:40:50<80:38:55, 8.66s/it][2025-04-26 02:38:34,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:38:34,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.60 | bwd_microstep: 5766.03 | bwd_inner_microstep: 5693.87 | bwd_allreduce_microstep: 72.11 | step_microstep: 18.61 [2025-04-26 02:38:34,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.60 | bwd: 5766.04 | bwd_inner: 5693.87 | bwd_allreduce: 72.13 | step: 18.61 19%|█▉ | 7735/41250 [18:40:59<80:46:13, 8.68s/it] {'loss': 0.2899, 'grad_norm': 3.028493881225586, 'learning_rate': 3.745371693003027e-05, 'epoch': 1.88} 19%|█▉ | 7735/41250 [18:40:59<80:46:13, 8.68s/it][2025-04-26 02:38:42,918] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-26 02:38:42,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.37 | bwd_microstep: 5782.44 | bwd_inner_microstep: 5650.15 | bwd_allreduce_microstep: 132.25 | step_microstep: 18.57 [2025-04-26 02:38:42,919] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.37 | bwd: 5782.45 | bwd_inner: 5650.15 | bwd_allreduce: 132.27 | step: 18.57 19%|█▉ | 7736/41250 [18:41:08<80:50:39, 8.68s/it] {'loss': 0.0762, 'grad_norm': 1.360615611076355, 'learning_rate': 3.745295011452444e-05, 'epoch': 1.88} 19%|█▉ | 7736/41250 [18:41:08<80:50:39, 8.68s/it][2025-04-26 02:38:51,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:38:51,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.15 | bwd_microstep: 5788.43 | 
bwd_inner_microstep: 5656.26 | bwd_allreduce_microstep: 132.12 | step_microstep: 18.53 [2025-04-26 02:38:51,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.15 | bwd: 5788.44 | bwd_inner: 5656.26 | bwd_allreduce: 132.14 | step: 18.55 19%|█▉ | 7737/41250 [18:41:16<80:52:49, 8.69s/it] {'loss': 0.1247, 'grad_norm': 2.236340045928955, 'learning_rate': 3.7452183191424595e-05, 'epoch': 1.88} 19%|█▉ | 7737/41250 [18:41:16<80:52:49, 8.69s/it][2025-04-26 02:39:00,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:39:00,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.96 | bwd_microstep: 5706.21 | bwd_inner_microstep: 5650.56 | bwd_allreduce_microstep: 55.60 | step_microstep: 18.27 [2025-04-26 02:39:00,229] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.96 | bwd: 5706.22 | bwd_inner: 5650.56 | bwd_allreduce: 55.62 | step: 18.27 19%|█▉ | 7738/41250 [18:41:25<80:40:00, 8.67s/it] {'loss': 0.1147, 'grad_norm': 1.095980167388916, 'learning_rate': 3.7451416160735475e-05, 'epoch': 1.88} 19%|█▉ | 7738/41250 [18:41:25<80:40:00, 8.67s/it][2025-04-26 02:39:08,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:39:08,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.30 | bwd_microstep: 5750.83 | bwd_inner_microstep: 5646.11 | bwd_allreduce_microstep: 104.68 | step_microstep: 18.40 [2025-04-26 02:39:08,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.30 | bwd: 5750.85 | bwd_inner: 5646.11 | bwd_allreduce: 104.70 | step: 18.40 19%|█▉ | 7739/41250 [18:41:34<80:40:38, 8.67s/it] {'loss': 0.0916, 'grad_norm': 1.4935111999511719, 'learning_rate': 3.745064902246181e-05, 'epoch': 1.88} 19%|█▉ | 7739/41250 [18:41:34<80:40:38, 8.67s/it][2025-04-26 02:39:17,585] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 02:39:17,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.36 | bwd_microstep: 5754.92 | bwd_inner_microstep: 5680.19 | bwd_allreduce_microstep: 74.68 | step_microstep: 18.73 [2025-04-26 02:39:17,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.36 | bwd: 5754.93 | bwd_inner: 5680.19 | bwd_allreduce: 74.70 | step: 18.73 19%|█▉ | 7740/41250 [18:41:42<80:43:48, 8.67s/it] {'loss': 0.1043, 'grad_norm': 1.8758310079574585, 'learning_rate': 3.744988177660832e-05, 'epoch': 1.88} 19%|█▉ | 7740/41250 [18:41:42<80:43:48, 8.67s/it][2025-04-26 02:39:26,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:39:26,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.77 | bwd_microstep: 5767.60 | bwd_inner_microstep: 5687.84 | bwd_allreduce_microstep: 79.72 | step_microstep: 18.75 [2025-04-26 02:39:26,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.77 | bwd: 5767.61 | bwd_inner: 5687.84 | bwd_allreduce: 79.73 | step: 18.75 19%|█▉ | 7741/41250 [18:41:51<80:48:51, 8.68s/it] {'loss': 0.3375, 'grad_norm': 3.9358344078063965, 'learning_rate': 3.7449114423179746e-05, 'epoch': 1.88} 19%|█▉ | 7741/41250 [18:41:51<80:48:51, 8.68s/it][2025-04-26 02:39:34,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:39:34,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.02 | bwd_microstep: 5694.48 | bwd_inner_microstep: 5634.07 | bwd_allreduce_microstep: 60.36 | step_microstep: 18.83 [2025-04-26 02:39:34,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.02 | bwd: 5694.49 | bwd_inner: 5634.07 | bwd_allreduce: 60.38 | step: 18.83 19%|█▉ 
| 7742/41250 [18:42:00<80:36:17, 8.66s/it] {'loss': 0.2485, 'grad_norm': 3.315063953399658, 'learning_rate': 3.744834696218082e-05, 'epoch': 1.88} 19%|█▉ | 7742/41250 [18:42:00<80:36:17, 8.66s/it][2025-04-26 02:39:43,585] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:39:43,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.75 | bwd_microstep: 5764.87 | bwd_inner_microstep: 5689.74 | bwd_allreduce_microstep: 75.08 | step_microstep: 18.70 [2025-04-26 02:39:43,586] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.75 | bwd: 5764.88 | bwd_inner: 5689.74 | bwd_allreduce: 75.10 | step: 18.70 19%|█▉ | 7743/41250 [18:42:08<80:40:48, 8.67s/it] {'loss': 0.2208, 'grad_norm': 1.9611586332321167, 'learning_rate': 3.744757939361626e-05, 'epoch': 1.88} 19%|█▉ | 7743/41250 [18:42:08<80:40:48, 8.67s/it][2025-04-26 02:39:52,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:39:52,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.37 | bwd_microstep: 5751.21 | bwd_inner_microstep: 5681.56 | bwd_allreduce_microstep: 69.61 | step_microstep: 18.73 [2025-04-26 02:39:52,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.37 | bwd: 5751.22 | bwd_inner: 5681.56 | bwd_allreduce: 69.62 | step: 18.74 19%|█▉ | 7744/41250 [18:42:17<80:41:45, 8.67s/it] {'loss': 0.1017, 'grad_norm': 1.7522038221359253, 'learning_rate': 3.7446811717490803e-05, 'epoch': 1.88} 19%|█▉ | 7744/41250 [18:42:17<80:41:45, 8.67s/it][2025-04-26 02:40:00,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.94 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:40:00,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.79 | bwd_microstep: 5701.90 | bwd_inner_microstep: 5688.06 
| bwd_allreduce_microstep: 13.79 | step_microstep: 18.18 [2025-04-26 02:40:00,885] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.79 | bwd: 5701.91 | bwd_inner: 5688.06 | bwd_allreduce: 13.81 | step: 18.18 19%|█▉ | 7745/41250 [18:42:26<80:33:57, 8.66s/it] {'loss': 0.0677, 'grad_norm': 1.4166680574417114, 'learning_rate': 3.744604393380918e-05, 'epoch': 1.88} 19%|█▉ | 7745/41250 [18:42:26<80:33:57, 8.66s/it][2025-04-26 02:40:09,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.29 | optimizer_step: 1.04 [2025-04-26 02:40:09,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.72 | bwd_microstep: 5700.87 | bwd_inner_microstep: 5639.78 | bwd_allreduce_microstep: 61.03 | step_microstep: 19.72 [2025-04-26 02:40:09,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.72 | bwd: 5700.89 | bwd_inner: 5639.78 | bwd_allreduce: 61.06 | step: 19.72 19%|█▉ | 7746/41250 [18:42:34<80:25:00, 8.64s/it] {'loss': 0.0466, 'grad_norm': 0.739491879940033, 'learning_rate': 3.744527604257613e-05, 'epoch': 1.88} 19%|█▉ | 7746/41250 [18:42:34<80:25:00, 8.64s/it][2025-04-26 02:40:18,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.22 | optimizer_step: 0.92 [2025-04-26 02:40:18,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.92 | bwd_microstep: 5705.42 | bwd_inner_microstep: 5692.23 | bwd_allreduce_microstep: 13.14 | step_microstep: 19.26 [2025-04-26 02:40:18,121] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.92 | bwd: 5705.44 | bwd_inner: 5692.23 | bwd_allreduce: 13.16 | step: 19.26 19%|█▉ | 7747/41250 [18:42:43<80:23:19, 8.64s/it] {'loss': 0.1143, 'grad_norm': 1.1823534965515137, 'learning_rate': 3.744450804379639e-05, 'epoch': 1.88} 19%|█▉ | 7747/41250 [18:42:43<80:23:19, 8.64s/it][2025-04-26 02:40:26,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.37 | optimizer_gradients: 0.98 | optimizer_step: 1.02 [2025-04-26 02:40:26,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.01 | bwd_microstep: 5699.22 | bwd_inner_microstep: 5686.49 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.76 [2025-04-26 02:40:26,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.01 | bwd: 5699.23 | bwd_inner: 5686.49 | bwd_allreduce: 12.70 | step: 18.76 19%|█▉ | 7748/41250 [18:42:52<80:23:41, 8.64s/it] {'loss': 0.1864, 'grad_norm': 2.524021625518799, 'learning_rate': 3.744373993747469e-05, 'epoch': 1.88} 19%|█▉ | 7748/41250 [18:42:52<80:23:41, 8.64s/it][2025-04-26 02:40:35,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.97 | optimizer_step: 1.05 [2025-04-26 02:40:35,377] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.57 | bwd_microstep: 5710.10 | bwd_inner_microstep: 5640.84 | bwd_allreduce_microstep: 69.22 | step_microstep: 18.36 [2025-04-26 02:40:35,378] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.57 | bwd: 5710.11 | bwd_inner: 5640.84 | bwd_allreduce: 69.23 | step: 18.37 19%|█▉ | 7749/41250 [18:43:00<80:19:37, 8.63s/it] {'loss': 0.0758, 'grad_norm': 2.739722728729248, 'learning_rate': 3.7442971723615755e-05, 'epoch': 1.88} 19%|█▉ | 7749/41250 [18:43:00<80:19:37, 8.63s/it][2025-04-26 02:40:43,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.08 | optimizer_step: 0.97 [2025-04-26 02:40:43,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.12 | bwd_microstep: 5694.44 | bwd_inner_microstep: 5681.25 | bwd_allreduce_microstep: 13.15 | step_microstep: 18.94 [2025-04-26 02:40:43,998] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.12 | bwd: 5694.46 | bwd_inner: 5681.25 | bwd_allreduce: 13.17 | step: 18.94 19%|█▉ | 7750/41250 [18:43:09<80:17:33, 8.63s/it] 
{'loss': 0.0639, 'grad_norm': 0.6125897169113159, 'learning_rate': 3.744220340222434e-05, 'epoch': 1.88} 19%|█▉ | 7750/41250 [18:43:09<80:17:33, 8.63s/it][2025-04-26 02:40:52,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:40:52,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.93 | bwd_microstep: 5747.79 | bwd_inner_microstep: 5638.78 | bwd_allreduce_microstep: 108.97 | step_microstep: 18.93 [2025-04-26 02:40:52,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.93 | bwd: 5747.81 | bwd_inner: 5638.78 | bwd_allreduce: 108.99 | step: 18.93 19%|█▉ | 7751/41250 [18:43:17<80:22:03, 8.64s/it] {'loss': 0.1628, 'grad_norm': 1.77158522605896, 'learning_rate': 3.7441434973305167e-05, 'epoch': 1.88} 19%|█▉ | 7751/41250 [18:43:17<80:22:03, 8.64s/it][2025-04-26 02:41:01,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.97 | optimizer_step: 1.01 [2025-04-26 02:41:01,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.80 | bwd_microstep: 5731.42 | bwd_inner_microstep: 5690.66 | bwd_allreduce_microstep: 40.72 | step_microstep: 18.51 [2025-04-26 02:41:01,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.80 | bwd: 5731.44 | bwd_inner: 5690.66 | bwd_allreduce: 40.73 | step: 18.51 19%|█▉ | 7752/41250 [18:43:26<80:26:05, 8.64s/it] {'loss': 0.3101, 'grad_norm': 2.897172212600708, 'learning_rate': 3.744066643686298e-05, 'epoch': 1.88} 19%|█▉ | 7752/41250 [18:43:26<80:26:05, 8.64s/it][2025-04-26 02:41:09,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-26 02:41:09,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.01 | bwd_microstep: 5711.48 | bwd_inner_microstep: 5698.61 | bwd_allreduce_microstep: 12.82 | 
step_microstep: 18.48 [2025-04-26 02:41:09,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.01 | bwd: 5711.49 | bwd_inner: 5698.61 | bwd_allreduce: 12.84 | step: 18.49 19%|█▉ | 7753/41250 [18:43:35<80:24:34, 8.64s/it] {'loss': 0.2519, 'grad_norm': 5.349159240722656, 'learning_rate': 3.7439897792902506e-05, 'epoch': 1.88} 19%|█▉ | 7753/41250 [18:43:35<80:24:34, 8.64s/it][2025-04-26 02:41:18,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:41:18,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.05 | bwd_microstep: 5715.00 | bwd_inner_microstep: 5634.19 | bwd_allreduce_microstep: 80.76 | step_microstep: 18.36 [2025-04-26 02:41:18,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.05 | bwd: 5715.01 | bwd_inner: 5634.19 | bwd_allreduce: 80.78 | step: 18.37 19%|█▉ | 7754/41250 [18:43:43<80:19:26, 8.63s/it] {'loss': 0.2562, 'grad_norm': 3.2081940174102783, 'learning_rate': 3.74391290414285e-05, 'epoch': 1.88} 19%|█▉ | 7754/41250 [18:43:43<80:19:26, 8.63s/it][2025-04-26 02:41:27,456] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.12 [2025-04-26 02:41:27,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2931.68 | bwd_microstep: 5874.55 | bwd_inner_microstep: 5861.72 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.18 [2025-04-26 02:41:27,457] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2931.68 | bwd: 5874.56 | bwd_inner: 5861.72 | bwd_allreduce: 12.79 | step: 19.18 19%|█▉ | 7755/41250 [18:43:52<81:02:49, 8.71s/it] {'loss': 0.1139, 'grad_norm': 1.2012370824813843, 'learning_rate': 3.7438360182445694e-05, 'epoch': 1.88} 19%|█▉ | 7755/41250 [18:43:52<81:02:49, 8.71s/it][2025-04-26 02:41:36,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | 
optimizer_gradients: 1.00 | optimizer_step: 1.09 [2025-04-26 02:41:36,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.16 | bwd_microstep: 5733.65 | bwd_inner_microstep: 5690.21 | bwd_allreduce_microstep: 43.39 | step_microstep: 18.85 [2025-04-26 02:41:36,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.16 | bwd: 5733.66 | bwd_inner: 5690.21 | bwd_allreduce: 43.41 | step: 18.85 19%|█▉ | 7756/41250 [18:44:01<80:55:19, 8.70s/it] {'loss': 0.1681, 'grad_norm': 2.0036449432373047, 'learning_rate': 3.7437591215958827e-05, 'epoch': 1.88} 19%|█▉ | 7756/41250 [18:44:01<80:55:19, 8.70s/it][2025-04-26 02:41:44,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:41:44,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.12 | bwd_microstep: 5731.60 | bwd_inner_microstep: 5638.90 | bwd_allreduce_microstep: 92.65 | step_microstep: 18.39 [2025-04-26 02:41:44,766] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.12 | bwd: 5731.61 | bwd_inner: 5638.90 | bwd_allreduce: 92.66 | step: 18.39 19%|█▉ | 7757/41250 [18:44:10<80:45:56, 8.68s/it] {'loss': 0.1886, 'grad_norm': 2.0013418197631836, 'learning_rate': 3.7436822141972636e-05, 'epoch': 1.88} 19%|█▉ | 7757/41250 [18:44:10<80:45:56, 8.68s/it][2025-04-26 02:41:53,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:41:53,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.12 | bwd_microstep: 5880.68 | bwd_inner_microstep: 5693.77 | bwd_allreduce_microstep: 186.87 | step_microstep: 18.77 [2025-04-26 02:41:53,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.12 | bwd: 5880.70 | bwd_inner: 5693.77 | bwd_allreduce: 186.89 | step: 18.77 19%|█▉ | 7758/41250 [18:44:18<81:07:08, 8.72s/it] {'loss': 0.2069, 
'grad_norm': 1.6084426641464233, 'learning_rate': 3.7436052960491866e-05, 'epoch': 1.88} 19%|█▉ | 7758/41250 [18:44:18<81:07:08, 8.72s/it][2025-04-26 02:42:02,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:42:02,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.70 | bwd_microstep: 5758.88 | bwd_inner_microstep: 5646.03 | bwd_allreduce_microstep: 112.81 | step_microstep: 18.84 [2025-04-26 02:42:02,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.70 | bwd: 5758.90 | bwd_inner: 5646.03 | bwd_allreduce: 112.83 | step: 18.84 19%|█▉ | 7759/41250 [18:44:27<80:59:40, 8.71s/it] {'loss': 0.1465, 'grad_norm': 0.8897202014923096, 'learning_rate': 3.743528367152127e-05, 'epoch': 1.88} 19%|█▉ | 7759/41250 [18:44:27<80:59:40, 8.71s/it][2025-04-26 02:42:10,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.01 | optimizer_step: 1.07 [2025-04-26 02:42:10,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.12 | bwd_microstep: 5764.44 | bwd_inner_microstep: 5688.98 | bwd_allreduce_microstep: 75.41 | step_microstep: 18.70 [2025-04-26 02:42:10,943] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.12 | bwd: 5764.45 | bwd_inner: 5688.98 | bwd_allreduce: 75.43 | step: 18.70 19%|█▉ | 7760/41250 [18:44:36<80:57:17, 8.70s/it] {'loss': 0.0783, 'grad_norm': 3.063530206680298, 'learning_rate': 3.743451427506557e-05, 'epoch': 1.88} 19%|█▉ | 7760/41250 [18:44:36<80:57:17, 8.70s/it][2025-04-26 02:42:19,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:42:19,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.66 | bwd_microstep: 5688.98 | bwd_inner_microstep: 5676.27 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.65 
[2025-04-26 02:42:19,565] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.66 | bwd: 5689.00 | bwd_inner: 5676.27 | bwd_allreduce: 12.68 | step: 18.65 19%|█▉ | 7761/41250 [18:44:44<80:43:31, 8.68s/it] {'loss': 0.0376, 'grad_norm': 0.6029969453811646, 'learning_rate': 3.743374477112952e-05, 'epoch': 1.88} 19%|█▉ | 7761/41250 [18:44:44<80:43:31, 8.68s/it][2025-04-26 02:42:28,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:42:28,186] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.75 | bwd_microstep: 5688.75 | bwd_inner_microstep: 5671.28 | bwd_allreduce_microstep: 17.42 | step_microstep: 18.46 [2025-04-26 02:42:28,186] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.75 | bwd: 5688.76 | bwd_inner: 5671.28 | bwd_allreduce: 17.44 | step: 18.46 19%|█▉ | 7762/41250 [18:44:53<80:33:51, 8.66s/it] {'loss': 0.0412, 'grad_norm': 0.6859224438667297, 'learning_rate': 3.7432975159717865e-05, 'epoch': 1.88} 19%|█▉ | 7762/41250 [18:44:53<80:33:51, 8.66s/it][2025-04-26 02:42:36,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:42:36,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.29 | bwd_microstep: 5697.93 | bwd_inner_microstep: 5685.11 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.57 [2025-04-26 02:42:36,814] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.29 | bwd: 5697.95 | bwd_inner: 5685.11 | bwd_allreduce: 12.80 | step: 18.57 19%|█▉ | 7763/41250 [18:45:02<80:28:12, 8.65s/it] {'loss': 0.2672, 'grad_norm': 1.2640830278396606, 'learning_rate': 3.743220544083534e-05, 'epoch': 1.88} 19%|█▉ | 7763/41250 [18:45:02<80:28:12, 8.65s/it][2025-04-26 02:42:45,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.97 | 
optimizer_step: 0.89 [2025-04-26 02:42:45,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.95 | bwd_microstep: 5700.77 | bwd_inner_microstep: 5657.79 | bwd_allreduce_microstep: 42.93 | step_microstep: 18.75 [2025-04-26 02:42:45,426] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.95 | bwd: 5700.79 | bwd_inner: 5657.79 | bwd_allreduce: 42.95 | step: 18.76 19%|█▉ | 7764/41250 [18:45:10<80:22:02, 8.64s/it] {'loss': 0.1822, 'grad_norm': 1.919074296951294, 'learning_rate': 3.7431435614486704e-05, 'epoch': 1.88} 19%|█▉ | 7764/41250 [18:45:10<80:22:02, 8.64s/it][2025-04-26 02:42:54,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.97 | optimizer_step: 0.93 [2025-04-26 02:42:54,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.20 | bwd_microstep: 5764.03 | bwd_inner_microstep: 5686.39 | bwd_allreduce_microstep: 77.60 | step_microstep: 18.76 [2025-04-26 02:42:54,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.20 | bwd: 5764.05 | bwd_inner: 5686.39 | bwd_allreduce: 77.62 | step: 18.76 19%|█▉ | 7765/41250 [18:45:19<80:31:52, 8.66s/it] {'loss': 0.3198, 'grad_norm': 1.7258522510528564, 'learning_rate': 3.743066568067669e-05, 'epoch': 1.88} 19%|█▉ | 7765/41250 [18:45:19<80:31:52, 8.66s/it][2025-04-26 02:43:02,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-26 02:43:02,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.63 | bwd_microstep: 5738.82 | bwd_inner_microstep: 5654.91 | bwd_allreduce_microstep: 83.87 | step_microstep: 18.81 [2025-04-26 02:43:02,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.63 | bwd: 5738.84 | bwd_inner: 5654.91 | bwd_allreduce: 83.89 | step: 18.81 19%|█▉ | 7766/41250 [18:45:28<80:31:28, 8.66s/it] {'loss': 0.2154, 'grad_norm': 4.403244972229004, 
'learning_rate': 3.742989563941006e-05, 'epoch': 1.88} 19%|█▉ | 7766/41250 [18:45:28<80:31:28, 8.66s/it][2025-04-26 02:43:11,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:43:11,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.25 | bwd_microstep: 5716.37 | bwd_inner_microstep: 5659.27 | bwd_allreduce_microstep: 57.05 | step_microstep: 18.74 [2025-04-26 02:43:11,419] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.25 | bwd: 5716.38 | bwd_inner: 5659.27 | bwd_allreduce: 57.07 | step: 18.74 19%|█▉ | 7767/41250 [18:45:36<80:27:35, 8.65s/it] {'loss': 0.1059, 'grad_norm': 1.6129070520401, 'learning_rate': 3.7429125490691545e-05, 'epoch': 1.88} 19%|█▉ | 7767/41250 [18:45:36<80:27:35, 8.65s/it][2025-04-26 02:43:20,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:43:20,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2943.04 | bwd_microstep: 5886.66 | bwd_inner_microstep: 5873.94 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.65 [2025-04-26 02:43:20,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2943.04 | bwd: 5886.67 | bwd_inner: 5873.94 | bwd_allreduce: 12.69 | step: 18.66 19%|█▉ | 7768/41250 [18:45:45<81:11:43, 8.73s/it] {'loss': 0.1047, 'grad_norm': 2.182828903198242, 'learning_rate': 3.7428355234525906e-05, 'epoch': 1.88} 19%|█▉ | 7768/41250 [18:45:45<81:11:43, 8.73s/it][2025-04-26 02:43:28,965] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:43:28,966] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.05 | bwd_microstep: 5699.41 | bwd_inner_microstep: 5686.59 | bwd_allreduce_microstep: 12.78 | step_microstep: 18.52 [2025-04-26 02:43:28,966] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.05 | bwd: 5699.42 | bwd_inner: 5686.59 | bwd_allreduce: 12.79 | step: 18.53 19%|█▉ | 7769/41250 [18:45:54<80:54:52, 8.70s/it] {'loss': 0.0586, 'grad_norm': 1.2691185474395752, 'learning_rate': 3.742758487091788e-05, 'epoch': 1.88} 19%|█▉ | 7769/41250 [18:45:54<80:54:52, 8.70s/it][2025-04-26 02:43:37,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 1.07 [2025-04-26 02:43:37,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.06 | bwd_microstep: 5782.96 | bwd_inner_microstep: 5658.09 | bwd_allreduce_microstep: 124.83 | step_microstep: 18.47 [2025-04-26 02:43:37,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.06 | bwd: 5782.97 | bwd_inner: 5658.09 | bwd_allreduce: 124.84 | step: 18.47 19%|█▉ | 7770/41250 [18:46:02<80:54:22, 8.70s/it] {'loss': 0.1038, 'grad_norm': 2.182033061981201, 'learning_rate': 3.742681439987221e-05, 'epoch': 1.88} 19%|█▉ | 7770/41250 [18:46:02<80:54:22, 8.70s/it][2025-04-26 02:43:46,370] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:43:46,370] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.97 | bwd_microstep: 5767.70 | bwd_inner_microstep: 5703.41 | bwd_allreduce_microstep: 64.24 | step_microstep: 18.21 [2025-04-26 02:43:46,371] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.97 | bwd: 5767.71 | bwd_inner: 5703.41 | bwd_allreduce: 64.26 | step: 18.21 19%|█▉ | 7771/41250 [18:46:11<80:55:30, 8.70s/it] {'loss': 0.2939, 'grad_norm': 2.3294641971588135, 'learning_rate': 3.7426043821393675e-05, 'epoch': 1.88} 19%|█▉ | 7771/41250 [18:46:11<80:55:30, 8.70s/it][2025-04-26 02:43:55,022] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.95 [2025-04-26 
02:43:55,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.33 | bwd_microstep: 5714.04 | bwd_inner_microstep: 5701.27 | bwd_allreduce_microstep: 12.72 | step_microstep: 19.14 [2025-04-26 02:43:55,023] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.33 | bwd: 5714.05 | bwd_inner: 5701.27 | bwd_allreduce: 12.74 | step: 19.14 19%|█▉ | 7772/41250 [18:46:20<80:47:06, 8.69s/it] {'loss': 0.0616, 'grad_norm': 0.6529540419578552, 'learning_rate': 3.7425273135486994e-05, 'epoch': 1.88} 19%|█▉ | 7772/41250 [18:46:20<80:47:06, 8.69s/it][2025-04-26 02:44:03,676] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:44:03,677] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.63 | bwd_microstep: 5722.65 | bwd_inner_microstep: 5710.20 | bwd_allreduce_microstep: 12.41 | step_microstep: 17.91 [2025-04-26 02:44:03,677] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.63 | bwd: 5722.66 | bwd_inner: 5710.20 | bwd_allreduce: 12.42 | step: 17.91 19%|█▉ | 7773/41250 [18:46:28<80:41:16, 8.68s/it] {'loss': 0.0804, 'grad_norm': 2.4052631855010986, 'learning_rate': 3.742450234215694e-05, 'epoch': 1.88} 19%|█▉ | 7773/41250 [18:46:29<80:41:16, 8.68s/it][2025-04-26 02:44:12,346] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-26 02:44:12,346] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.90 | bwd_microstep: 5731.64 | bwd_inner_microstep: 5719.06 | bwd_allreduce_microstep: 12.54 | step_microstep: 18.32 [2025-04-26 02:44:12,347] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.90 | bwd: 5731.65 | bwd_inner: 5719.06 | bwd_allreduce: 12.55 | step: 18.32 19%|█▉ | 7774/41250 [18:46:37<80:39:57, 8.67s/it] {'loss': 0.5138, 'grad_norm': 3.0079946517944336, 'learning_rate': 3.742373144140825e-05, 
'epoch': 1.88} 19%|█▉ | 7774/41250 [18:46:37<80:39:57, 8.67s/it][2025-04-26 02:44:21,027] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:44:21,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.41 | bwd_microstep: 5758.89 | bwd_inner_microstep: 5650.78 | bwd_allreduce_microstep: 108.06 | step_microstep: 18.68 [2025-04-26 02:44:21,028] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.41 | bwd: 5758.90 | bwd_inner: 5650.78 | bwd_allreduce: 108.08 | step: 18.68 19%|█▉ | 7775/41250 [18:46:46<80:41:22, 8.68s/it] {'loss': 0.0713, 'grad_norm': 0.8787352442741394, 'learning_rate': 3.742296043324569e-05, 'epoch': 1.88} 19%|█▉ | 7775/41250 [18:46:46<80:41:22, 8.68s/it][2025-04-26 02:44:29,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:44:29,670] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.49 | bwd_microstep: 5707.56 | bwd_inner_microstep: 5694.96 | bwd_allreduce_microstep: 12.56 | step_microstep: 18.20 [2025-04-26 02:44:29,671] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.49 | bwd: 5707.58 | bwd_inner: 5694.96 | bwd_allreduce: 12.57 | step: 18.21 19%|█▉ | 7776/41250 [18:46:54<80:34:53, 8.67s/it] {'loss': 0.3549, 'grad_norm': 2.1003475189208984, 'learning_rate': 3.742218931767401e-05, 'epoch': 1.89} 19%|█▉ | 7776/41250 [18:46:54<80:34:53, 8.67s/it][2025-04-26 02:44:38,301] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:44:38,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.88 | bwd_microstep: 5703.98 | bwd_inner_microstep: 5691.06 | bwd_allreduce_microstep: 12.89 | step_microstep: 18.18 [2025-04-26 02:44:38,302] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2843.88 | bwd: 5704.00 | bwd_inner: 5691.06 | bwd_allreduce: 12.90 | step: 18.18 19%|█▉ | 7777/41250 [18:47:03<80:28:54, 8.66s/it] {'loss': 0.0688, 'grad_norm': 2.0911495685577393, 'learning_rate': 3.742141809469795e-05, 'epoch': 1.89} 19%|█▉ | 7777/41250 [18:47:03<80:28:54, 8.66s/it][2025-04-26 02:44:46,936] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:44:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.32 | bwd_microstep: 5723.89 | bwd_inner_microstep: 5659.93 | bwd_allreduce_microstep: 63.92 | step_microstep: 18.23 [2025-04-26 02:44:46,937] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.32 | bwd: 5723.91 | bwd_inner: 5659.93 | bwd_allreduce: 63.94 | step: 18.24 19%|█▉ | 7778/41250 [18:47:12<80:25:19, 8.65s/it] {'loss': 0.0443, 'grad_norm': 0.387297123670578, 'learning_rate': 3.742064676432227e-05, 'epoch': 1.89} 19%|█▉ | 7778/41250 [18:47:12<80:25:19, 8.65s/it][2025-04-26 02:44:55,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 02:44:55,655] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.59 | bwd_microstep: 5793.30 | bwd_inner_microstep: 5658.34 | bwd_allreduce_microstep: 134.91 | step_microstep: 18.90 [2025-04-26 02:44:55,656] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.59 | bwd: 5793.31 | bwd_inner: 5658.34 | bwd_allreduce: 134.93 | step: 18.91 19%|█▉ | 7779/41250 [18:47:20<80:36:44, 8.67s/it] {'loss': 0.2009, 'grad_norm': 1.6343481540679932, 'learning_rate': 3.7419875326551736e-05, 'epoch': 1.89} 19%|█▉ | 7779/41250 [18:47:20<80:36:44, 8.67s/it][2025-04-26 02:45:04,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 02:45:04,482] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.87 | bwd_microstep: 5913.27 | bwd_inner_microstep: 5643.56 | bwd_allreduce_microstep: 269.66 | step_microstep: 18.61 [2025-04-26 02:45:04,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.87 | bwd: 5913.29 | bwd_inner: 5643.56 | bwd_allreduce: 269.68 | step: 18.61 19%|█▉ | 7780/41250 [18:47:29<81:02:44, 8.72s/it] {'loss': 0.0812, 'grad_norm': 0.655626654624939, 'learning_rate': 3.7419103781391096e-05, 'epoch': 1.89} 19%|█▉ | 7780/41250 [18:47:29<81:02:44, 8.72s/it][2025-04-26 02:45:13,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 02:45:13,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.46 | bwd_microstep: 5768.05 | bwd_inner_microstep: 5699.37 | bwd_allreduce_microstep: 68.64 | step_microstep: 18.23 [2025-04-26 02:45:13,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.46 | bwd: 5768.06 | bwd_inner: 5699.37 | bwd_allreduce: 68.66 | step: 18.23 19%|█▉ | 7781/41250 [18:47:38<81:00:18, 8.71s/it] {'loss': 0.4055, 'grad_norm': 2.9023361206054688, 'learning_rate': 3.741833212884511e-05, 'epoch': 1.89} 19%|█▉ | 7781/41250 [18:47:38<81:00:18, 8.71s/it][2025-04-26 02:45:21,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 02:45:21,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.73 | bwd_microstep: 5696.25 | bwd_inner_microstep: 5654.83 | bwd_allreduce_microstep: 41.38 | step_microstep: 18.60 [2025-04-26 02:45:21,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.73 | bwd: 5696.26 | bwd_inner: 5654.83 | bwd_allreduce: 41.40 | step: 18.60 19%|█▉ | 7782/41250 [18:47:47<80:43:41, 8.68s/it] {'loss': 0.2632, 'grad_norm': 2.7136268615722656, 'learning_rate': 3.741756036891853e-05, 'epoch': 1.89} 19%|█▉ 
| 7782/41250 [18:47:47<80:43:41, 8.68s/it][2025-04-26 02:45:30,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.01 | optimizer_step: 0.91 [2025-04-26 02:45:30,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.41 | bwd_microstep: 5782.06 | bwd_inner_microstep: 5650.55 | bwd_allreduce_microstep: 131.47 | step_microstep: 18.41 [2025-04-26 02:45:30,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.41 | bwd: 5782.08 | bwd_inner: 5650.55 | bwd_allreduce: 131.49 | step: 18.43 19%|█▉ | 7783/41250 [18:47:55<80:45:48, 8.69s/it] {'loss': 0.1007, 'grad_norm': 2.297952890396118, 'learning_rate': 3.7416788501616114e-05, 'epoch': 1.89} 19%|█▉ | 7783/41250 [18:47:55<80:45:48, 8.69s/it][2025-04-26 02:45:39,190] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 02:45:39,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.82 | bwd_microstep: 5776.18 | bwd_inner_microstep: 5646.33 | bwd_allreduce_microstep: 129.80 | step_microstep: 19.36 [2025-04-26 02:45:39,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.82 | bwd: 5776.20 | bwd_inner: 5646.33 | bwd_allreduce: 129.82 | step: 19.37 19%|█▉ | 7784/41250 [18:48:04<80:46:50, 8.69s/it] {'loss': 0.1819, 'grad_norm': 1.7519114017486572, 'learning_rate': 3.7416016526942625e-05, 'epoch': 1.89} 19%|█▉ | 7784/41250 [18:48:04<80:46:50, 8.69s/it][2025-04-26 02:45:47,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:45:47,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.73 | bwd_microstep: 5701.10 | bwd_inner_microstep: 5688.30 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.99 [2025-04-26 02:45:47,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.73 
| bwd: 5701.12 | bwd_inner: 5688.29 | bwd_allreduce: 12.78 | step: 18.99 19%|█▉ | 7785/41250 [18:48:13<80:37:32, 8.67s/it] {'loss': 0.0582, 'grad_norm': 1.085708737373352, 'learning_rate': 3.7415244444902816e-05, 'epoch': 1.89} 19%|█▉ | 7785/41250 [18:48:13<80:37:32, 8.67s/it][2025-04-26 02:45:56,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 02:45:56,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.23 | bwd_microstep: 5775.64 | bwd_inner_microstep: 5636.21 | bwd_allreduce_microstep: 139.38 | step_microstep: 18.83 [2025-04-26 02:45:56,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.23 | bwd: 5775.65 | bwd_inner: 5636.21 | bwd_allreduce: 139.40 | step: 18.83 19%|█▉ | 7786/41250 [18:48:21<80:39:08, 8.68s/it] {'loss': 0.0403, 'grad_norm': 1.038636565208435, 'learning_rate': 3.741447225550146e-05, 'epoch': 1.89} 19%|█▉ | 7786/41250 [18:48:21<80:39:08, 8.68s/it][2025-04-26 02:46:05,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.97 | optimizer_step: 0.92 [2025-04-26 02:46:05,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.02 | bwd_microstep: 5775.46 | bwd_inner_microstep: 5644.43 | bwd_allreduce_microstep: 130.99 | step_microstep: 18.71 [2025-04-26 02:46:05,197] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.02 | bwd: 5775.47 | bwd_inner: 5644.43 | bwd_allreduce: 131.00 | step: 18.71 19%|█▉ | 7787/41250 [18:48:30<80:40:43, 8.68s/it] {'loss': 0.2458, 'grad_norm': 2.2757866382598877, 'learning_rate': 3.74136999587433e-05, 'epoch': 1.89} 19%|█▉ | 7787/41250 [18:48:30<80:40:43, 8.68s/it][2025-04-26 02:46:13,876] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:46:13,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd_microstep: 2848.64 | bwd_microstep: 5742.10 | bwd_inner_microstep: 5684.35 | bwd_allreduce_microstep: 57.70 | step_microstep: 18.57 [2025-04-26 02:46:13,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.64 | bwd: 5742.12 | bwd_inner: 5684.35 | bwd_allreduce: 57.72 | step: 18.57 19%|█▉ | 7788/41250 [18:48:39<80:40:27, 8.68s/it] {'loss': 0.1962, 'grad_norm': 1.4725723266601562, 'learning_rate': 3.7412927554633104e-05, 'epoch': 1.89} 19%|█▉ | 7788/41250 [18:48:39<80:40:27, 8.68s/it][2025-04-26 02:46:22,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 02:46:22,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.93 | bwd_microstep: 5704.14 | bwd_inner_microstep: 5649.53 | bwd_allreduce_microstep: 54.56 | step_microstep: 17.78 [2025-04-26 02:46:22,495] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.93 | bwd: 5704.15 | bwd_inner: 5649.53 | bwd_allreduce: 54.58 | step: 17.78 19%|█▉ | 7789/41250 [18:48:47<80:30:05, 8.66s/it] {'loss': 0.0485, 'grad_norm': 0.7967332601547241, 'learning_rate': 3.741215504317564e-05, 'epoch': 1.89} 19%|█▉ | 7789/41250 [18:48:47<80:30:05, 8.66s/it][2025-04-26 02:46:31,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:46:31,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.17 | bwd_microstep: 5685.41 | bwd_inner_microstep: 5672.52 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.74 [2025-04-26 02:46:31,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.17 | bwd: 5685.43 | bwd_inner: 5672.52 | bwd_allreduce: 12.87 | step: 18.74 19%|█▉ | 7790/41250 [18:48:56<80:23:31, 8.65s/it] {'loss': 0.3384, 'grad_norm': 1.982440710067749, 'learning_rate': 3.741138242437566e-05, 'epoch': 1.89} 19%|█▉ | 7790/41250 [18:48:56<80:23:31, 
8.65s/it][2025-04-26 02:46:39,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:46:39,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.15 | bwd_microstep: 5742.85 | bwd_inner_microstep: 5645.75 | bwd_allreduce_microstep: 97.06 | step_microstep: 18.57 [2025-04-26 02:46:39,782] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.15 | bwd: 5742.86 | bwd_inner: 5645.75 | bwd_allreduce: 97.07 | step: 18.57 19%|█▉ | 7791/41250 [18:49:05<80:26:00, 8.65s/it] {'loss': 0.2416, 'grad_norm': 3.9538021087646484, 'learning_rate': 3.741060969823793e-05, 'epoch': 1.89} 19%|█▉ | 7791/41250 [18:49:05<80:26:00, 8.65s/it][2025-04-26 02:46:48,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.19 | optimizer_step: 0.90 [2025-04-26 02:46:48,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2889.66 | bwd_microstep: 5766.02 | bwd_inner_microstep: 5753.44 | bwd_allreduce_microstep: 12.53 | step_microstep: 18.93 [2025-04-26 02:46:48,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2889.66 | bwd: 5766.03 | bwd_inner: 5753.44 | bwd_allreduce: 12.55 | step: 18.93 19%|█▉ | 7792/41250 [18:49:13<80:40:00, 8.68s/it] {'loss': 0.1008, 'grad_norm': 1.2669600248336792, 'learning_rate': 3.740983686476722e-05, 'epoch': 1.89} 19%|█▉ | 7792/41250 [18:49:13<80:40:00, 8.68s/it][2025-04-26 02:46:57,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.82 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 02:46:57,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.86 | bwd_microstep: 5717.88 | bwd_inner_microstep: 5681.89 | bwd_allreduce_microstep: 35.95 | step_microstep: 17.63 [2025-04-26 02:46:57,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.86 | bwd: 5717.90 | bwd_inner: 5681.89 | 
bwd_allreduce: 35.97 | step: 17.63 19%|█▉ | 7793/41250 [18:49:22<80:35:17, 8.67s/it] {'loss': 0.1365, 'grad_norm': 1.1043176651000977, 'learning_rate': 3.740906392396829e-05, 'epoch': 1.89} 19%|█▉ | 7793/41250 [18:49:22<80:35:17, 8.67s/it][2025-04-26 02:47:05,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-26 02:47:05,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.53 | bwd_microstep: 5750.19 | bwd_inner_microstep: 5642.58 | bwd_allreduce_microstep: 107.57 | step_microstep: 18.95 [2025-04-26 02:47:05,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.53 | bwd: 5750.21 | bwd_inner: 5642.58 | bwd_allreduce: 107.58 | step: 18.95 19%|█▉ | 7794/41250 [18:49:31<80:34:22, 8.67s/it] {'loss': 0.155, 'grad_norm': 3.0944344997406006, 'learning_rate': 3.740829087584591e-05, 'epoch': 1.89} 19%|█▉ | 7794/41250 [18:49:31<80:34:22, 8.67s/it][2025-04-26 02:47:14,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:47:14,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.65 | bwd_microstep: 5715.21 | bwd_inner_microstep: 5702.43 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.91 [2025-04-26 02:47:14,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.65 | bwd: 5715.22 | bwd_inner: 5702.43 | bwd_allreduce: 12.75 | step: 18.92 19%|█▉ | 7795/41250 [18:49:39<80:30:17, 8.66s/it] {'loss': 0.1256, 'grad_norm': 2.738616704940796, 'learning_rate': 3.740751772040483e-05, 'epoch': 1.89} 19%|█▉ | 7795/41250 [18:49:39<80:30:17, 8.66s/it][2025-04-26 02:47:23,258] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 02:47:23,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.66 | 
bwd_microstep: 5871.29 | bwd_inner_microstep: 5636.90 | bwd_allreduce_microstep: 234.34 | step_microstep: 18.48 [2025-04-26 02:47:23,259] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.66 | bwd: 5871.30 | bwd_inner: 5636.90 | bwd_allreduce: 234.36 | step: 18.48 19%|█▉ | 7796/41250 [18:49:48<80:48:51, 8.70s/it] {'loss': 0.2403, 'grad_norm': 2.2365972995758057, 'learning_rate': 3.740674445764984e-05, 'epoch': 1.89} 19%|█▉ | 7796/41250 [18:49:48<80:48:51, 8.70s/it][2025-04-26 02:47:32,031] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-26 02:47:32,032] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.07 | bwd_microstep: 5857.95 | bwd_inner_microstep: 5644.23 | bwd_allreduce_microstep: 213.67 | step_microstep: 18.68 [2025-04-26 02:47:32,032] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.07 | bwd: 5857.96 | bwd_inner: 5644.23 | bwd_allreduce: 213.69 | step: 18.68 19%|█▉ | 7797/41250 [18:49:57<81:01:08, 8.72s/it] {'loss': 0.154, 'grad_norm': 1.478623390197754, 'learning_rate': 3.740597108758568e-05, 'epoch': 1.89} 19%|█▉ | 7797/41250 [18:49:57<81:01:08, 8.72s/it][2025-04-26 02:47:40,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:47:40,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.87 | bwd_microstep: 5695.21 | bwd_inner_microstep: 5682.57 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.25 [2025-04-26 02:47:40,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.87 | bwd: 5695.23 | bwd_inner: 5682.57 | bwd_allreduce: 12.62 | step: 18.25 19%|█▉ | 7798/41250 [18:50:05<80:44:02, 8.69s/it] {'loss': 0.1146, 'grad_norm': 1.634359359741211, 'learning_rate': 3.740519761021714e-05, 'epoch': 1.89} 19%|█▉ | 7798/41250 [18:50:05<80:44:02, 8.69s/it][2025-04-26 02:47:49,330] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:47:49,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.31 | bwd_microstep: 5738.68 | bwd_inner_microstep: 5696.57 | bwd_allreduce_microstep: 42.06 | step_microstep: 18.25 [2025-04-26 02:47:49,331] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.31 | bwd: 5738.69 | bwd_inner: 5696.57 | bwd_allreduce: 42.07 | step: 18.26 19%|█▉ | 7799/41250 [18:50:14<80:42:43, 8.69s/it] {'loss': 0.0668, 'grad_norm': 0.8654105067253113, 'learning_rate': 3.740442402554898e-05, 'epoch': 1.89} 19%|█▉ | 7799/41250 [18:50:14<80:42:43, 8.69s/it][2025-04-26 02:47:57,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:47:57,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.55 | bwd_microstep: 5700.70 | bwd_inner_microstep: 5687.74 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.58 [2025-04-26 02:47:57,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.55 | bwd: 5700.72 | bwd_inner: 5687.74 | bwd_allreduce: 12.94 | step: 18.58 19%|█▉ | 7800/41250 [18:50:23<80:32:42, 8.67s/it] {'loss': 0.2194, 'grad_norm': 4.828146934509277, 'learning_rate': 3.740365033358597e-05, 'epoch': 1.89} 19%|█▉ | 7800/41250 [18:50:23<80:32:42, 8.67s/it][2025-04-26 02:48:06,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.91 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:48:06,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.00 | bwd_microstep: 5847.51 | bwd_inner_microstep: 5691.98 | bwd_allreduce_microstep: 155.49 | step_microstep: 17.81 [2025-04-26 02:48:06,748] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.00 | bwd: 5847.52 | bwd_inner: 5691.98 | bwd_allreduce: 155.50 | step: 17.81 
19%|█▉ | 7801/41250 [18:50:32<80:52:48, 8.70s/it] {'loss': 0.1193, 'grad_norm': 4.992996692657471, 'learning_rate': 3.7402876534332874e-05, 'epoch': 1.89} 19%|█▉ | 7801/41250 [18:50:32<80:52:48, 8.70s/it][2025-04-26 02:48:15,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:48:15,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.83 | bwd_microstep: 5729.08 | bwd_inner_microstep: 5682.87 | bwd_allreduce_microstep: 46.16 | step_microstep: 17.97 [2025-04-26 02:48:15,401] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.83 | bwd: 5729.09 | bwd_inner: 5682.87 | bwd_allreduce: 46.17 | step: 17.97 19%|█▉ | 7802/41250 [18:50:40<80:44:26, 8.69s/it] {'loss': 0.1019, 'grad_norm': 1.7656053304672241, 'learning_rate': 3.7402102627794465e-05, 'epoch': 1.89} 19%|█▉ | 7802/41250 [18:50:40<80:44:26, 8.69s/it][2025-04-26 02:48:24,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.96 | optimizer_step: 0.95 [2025-04-26 02:48:24,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.22 | bwd_microstep: 5734.19 | bwd_inner_microstep: 5702.41 | bwd_allreduce_microstep: 31.75 | step_microstep: 18.62 [2025-04-26 02:48:24,068] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.22 | bwd: 5734.21 | bwd_inner: 5702.41 | bwd_allreduce: 31.76 | step: 18.63 19%|█▉ | 7803/41250 [18:50:49<80:40:00, 8.68s/it] {'loss': 0.062, 'grad_norm': 1.057460904121399, 'learning_rate': 3.7401328613975525e-05, 'epoch': 1.89} 19%|█▉ | 7803/41250 [18:50:49<80:40:00, 8.68s/it][2025-04-26 02:48:32,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:48:32,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.28 | bwd_microstep: 5894.24 | bwd_inner_microstep: 
5652.46 | bwd_allreduce_microstep: 241.74 | step_microstep: 18.31 [2025-04-26 02:48:32,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.28 | bwd: 5894.26 | bwd_inner: 5652.46 | bwd_allreduce: 241.75 | step: 18.32 19%|█▉ | 7804/41250 [18:50:58<80:59:35, 8.72s/it] {'loss': 0.1404, 'grad_norm': 2.93847918510437, 'learning_rate': 3.740055449288081e-05, 'epoch': 1.89} 19%|█▉ | 7804/41250 [18:50:58<80:59:35, 8.72s/it][2025-04-26 02:48:41,550] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 02:48:41,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.02 | bwd_microstep: 5772.00 | bwd_inner_microstep: 5657.43 | bwd_allreduce_microstep: 114.52 | step_microstep: 18.47 [2025-04-26 02:48:41,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.02 | bwd: 5772.01 | bwd_inner: 5657.43 | bwd_allreduce: 114.54 | step: 18.47 19%|█▉ | 7805/41250 [18:51:06<80:53:31, 8.71s/it] {'loss': 0.1989, 'grad_norm': 2.0157461166381836, 'learning_rate': 3.739978026451511e-05, 'epoch': 1.89} 19%|█▉ | 7805/41250 [18:51:06<80:53:31, 8.71s/it][2025-04-26 02:48:50,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 02:48:50,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.45 | bwd_microstep: 5716.80 | bwd_inner_microstep: 5637.29 | bwd_allreduce_microstep: 79.47 | step_microstep: 18.28 [2025-04-26 02:48:50,173] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.45 | bwd: 5716.81 | bwd_inner: 5637.29 | bwd_allreduce: 79.48 | step: 18.28 19%|█▉ | 7806/41250 [18:51:15<80:39:08, 8.68s/it] {'loss': 0.3219, 'grad_norm': 2.7822883129119873, 'learning_rate': 3.739900592888317e-05, 'epoch': 1.89} 19%|█▉ | 7806/41250 [18:51:15<80:39:08, 8.68s/it][2025-04-26 02:48:58,834] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:48:58,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.09 | bwd_microstep: 5760.29 | bwd_inner_microstep: 5654.36 | bwd_allreduce_microstep: 105.88 | step_microstep: 18.26 [2025-04-26 02:48:58,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.09 | bwd: 5760.30 | bwd_inner: 5654.36 | bwd_allreduce: 105.90 | step: 18.26 19%|█▉ | 7807/41250 [18:51:24<80:35:49, 8.68s/it] {'loss': 0.1667, 'grad_norm': 1.2466368675231934, 'learning_rate': 3.739823148598979e-05, 'epoch': 1.89} 19%|█▉ | 7807/41250 [18:51:24<80:35:49, 8.68s/it][2025-04-26 02:49:07,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:49:07,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.88 | bwd_microstep: 5720.24 | bwd_inner_microstep: 5647.94 | bwd_allreduce_microstep: 72.25 | step_microstep: 18.22 [2025-04-26 02:49:07,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.88 | bwd: 5720.25 | bwd_inner: 5647.94 | bwd_allreduce: 72.27 | step: 18.22 19%|█▉ | 7808/41250 [18:51:32<80:27:05, 8.66s/it] {'loss': 0.0573, 'grad_norm': 0.7113448977470398, 'learning_rate': 3.739745693583974e-05, 'epoch': 1.89} 19%|█▉ | 7808/41250 [18:51:32<80:27:05, 8.66s/it][2025-04-26 02:49:16,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.79 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:49:16,111] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.26 | bwd_microstep: 5704.78 | bwd_inner_microstep: 5692.06 | bwd_allreduce_microstep: 12.68 | step_microstep: 17.66 [2025-04-26 02:49:16,112] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.26 | bwd: 5704.79 | bwd_inner: 5692.06 | bwd_allreduce: 12.70 | step: 17.66 19%|█▉ | 7809/41250 [18:51:41<80:25:23, 
8.66s/it] {'loss': 0.0728, 'grad_norm': 1.8661201000213623, 'learning_rate': 3.739668227843778e-05, 'epoch': 1.89} 19%|█▉ | 7809/41250 [18:51:41<80:25:23, 8.66s/it][2025-04-26 02:49:24,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:49:24,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2888.66 | bwd_microstep: 5773.63 | bwd_inner_microstep: 5760.99 | bwd_allreduce_microstep: 12.59 | step_microstep: 18.60 [2025-04-26 02:49:24,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2888.66 | bwd: 5773.64 | bwd_inner: 5760.99 | bwd_allreduce: 12.60 | step: 18.60 19%|█▉ | 7810/41250 [18:51:50<80:40:22, 8.68s/it] {'loss': 0.0878, 'grad_norm': 1.2326395511627197, 'learning_rate': 3.7395907513788696e-05, 'epoch': 1.89} 19%|█▉ | 7810/41250 [18:51:50<80:40:22, 8.68s/it][2025-04-26 02:49:33,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:49:33,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.62 | bwd_microstep: 5880.56 | bwd_inner_microstep: 5663.17 | bwd_allreduce_microstep: 217.34 | step_microstep: 18.49 [2025-04-26 02:49:33,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.62 | bwd: 5880.58 | bwd_inner: 5663.17 | bwd_allreduce: 217.36 | step: 18.49 19%|█▉ | 7811/41250 [18:51:58<80:57:46, 8.72s/it] {'loss': 0.3369, 'grad_norm': 4.325037956237793, 'learning_rate': 3.7395132641897266e-05, 'epoch': 1.89} 19%|█▉ | 7811/41250 [18:51:58<80:57:46, 8.72s/it][2025-04-26 02:49:42,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-26 02:49:42,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.95 | bwd_microstep: 5736.32 | bwd_inner_microstep: 5699.92 | bwd_allreduce_microstep: 
36.35 | step_microstep: 18.87 [2025-04-26 02:49:42,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.95 | bwd: 5736.33 | bwd_inner: 5699.92 | bwd_allreduce: 36.37 | step: 18.88 19%|█▉ | 7812/41250 [18:52:07<80:48:56, 8.70s/it] {'loss': 0.0492, 'grad_norm': 2.26255202293396, 'learning_rate': 3.739435766276827e-05, 'epoch': 1.89} 19%|█▉ | 7812/41250 [18:52:07<80:48:56, 8.70s/it][2025-04-26 02:49:50,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 02:49:50,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.22 | bwd_microstep: 5708.39 | bwd_inner_microstep: 5689.36 | bwd_allreduce_microstep: 18.98 | step_microstep: 18.12 [2025-04-26 02:49:50,951] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.22 | bwd: 5708.40 | bwd_inner: 5689.36 | bwd_allreduce: 19.00 | step: 18.12 19%|█▉ | 7813/41250 [18:52:16<80:38:04, 8.68s/it] {'loss': 0.1334, 'grad_norm': 2.203641891479492, 'learning_rate': 3.739358257640648e-05, 'epoch': 1.89} 19%|█▉ | 7813/41250 [18:52:16<80:38:04, 8.68s/it][2025-04-26 02:49:59,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.99 [2025-04-26 02:49:59,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.57 | bwd_microstep: 5909.13 | bwd_inner_microstep: 5668.23 | bwd_allreduce_microstep: 240.86 | step_microstep: 18.17 [2025-04-26 02:49:59,774] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.57 | bwd: 5909.14 | bwd_inner: 5668.23 | bwd_allreduce: 240.87 | step: 18.18 19%|█▉ | 7814/41250 [18:52:25<81:01:44, 8.72s/it] {'loss': 0.0509, 'grad_norm': 0.7801550030708313, 'learning_rate': 3.739280738281667e-05, 'epoch': 1.89} 19%|█▉ | 7814/41250 [18:52:25<81:01:44, 8.72s/it][2025-04-26 02:50:08,594] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | 
optimizer_gradients: 0.96 | optimizer_step: 0.99 [2025-04-26 02:50:08,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.50 | bwd_microstep: 5909.28 | bwd_inner_microstep: 5665.96 | bwd_allreduce_microstep: 243.28 | step_microstep: 18.31 [2025-04-26 02:50:08,595] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.50 | bwd: 5909.29 | bwd_inner: 5665.96 | bwd_allreduce: 243.29 | step: 18.31 19%|█▉ | 7815/41250 [18:52:33<81:17:42, 8.75s/it] {'loss': 0.2537, 'grad_norm': 2.3311588764190674, 'learning_rate': 3.739203208200362e-05, 'epoch': 1.89} 19%|█▉ | 7815/41250 [18:52:33<81:17:42, 8.75s/it][2025-04-26 02:50:17,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 02:50:17,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.66 | bwd_microstep: 5787.07 | bwd_inner_microstep: 5665.22 | bwd_allreduce_microstep: 121.80 | step_microstep: 18.41 [2025-04-26 02:50:17,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.66 | bwd: 5787.08 | bwd_inner: 5665.22 | bwd_allreduce: 121.82 | step: 18.41 19%|█▉ | 7816/41250 [18:52:42<81:07:42, 8.74s/it] {'loss': 0.1072, 'grad_norm': 1.4336906671524048, 'learning_rate': 3.739125667397212e-05, 'epoch': 1.89} 19%|█▉ | 7816/41250 [18:52:42<81:07:42, 8.74s/it][2025-04-26 02:50:25,945] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-26 02:50:25,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.00 | bwd_microstep: 5715.61 | bwd_inner_microstep: 5702.57 | bwd_allreduce_microstep: 12.99 | step_microstep: 18.55 [2025-04-26 02:50:25,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.00 | bwd: 5715.62 | bwd_inner: 5702.57 | bwd_allreduce: 13.01 | step: 18.55 19%|█▉ | 7817/41250 [18:52:51<80:54:31, 8.71s/it] {'loss': 0.1225, 
'grad_norm': 2.370265245437622, 'learning_rate': 3.739048115872694e-05, 'epoch': 1.9} 19%|█▉ | 7817/41250 [18:52:51<80:54:31, 8.71s/it][2025-04-26 02:50:34,658] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:50:34,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.44 | bwd_microstep: 5780.09 | bwd_inner_microstep: 5691.82 | bwd_allreduce_microstep: 88.22 | step_microstep: 18.27 [2025-04-26 02:50:34,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.44 | bwd: 5780.10 | bwd_inner: 5691.82 | bwd_allreduce: 88.24 | step: 18.27 19%|█▉ | 7818/41250 [18:52:59<80:54:29, 8.71s/it] {'loss': 0.0956, 'grad_norm': 1.6837921142578125, 'learning_rate': 3.738970553627287e-05, 'epoch': 1.9} 19%|█▉ | 7818/41250 [18:52:59<80:54:29, 8.71s/it][2025-04-26 02:50:43,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-26 02:50:43,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.41 | bwd_microstep: 5778.64 | bwd_inner_microstep: 5652.45 | bwd_allreduce_microstep: 126.16 | step_microstep: 18.26 [2025-04-26 02:50:43,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.42 | bwd: 5778.66 | bwd_inner: 5652.45 | bwd_allreduce: 126.17 | step: 18.26 19%|█▉ | 7819/41250 [18:53:08<80:53:18, 8.71s/it] {'loss': 0.2173, 'grad_norm': 1.857789158821106, 'learning_rate': 3.738892980661468e-05, 'epoch': 1.9} 19%|█▉ | 7819/41250 [18:53:08<80:53:18, 8.71s/it][2025-04-26 02:50:52,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 02:50:52,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.84 | bwd_microstep: 5742.01 | bwd_inner_microstep: 5690.62 | bwd_allreduce_microstep: 51.34 | step_microstep: 18.65 
[2025-04-26 02:50:52,048] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.84 | bwd: 5742.02 | bwd_inner: 5690.62 | bwd_allreduce: 51.36 | step: 18.65 19%|█▉ | 7820/41250 [18:53:17<80:48:26, 8.70s/it] {'loss': 0.0788, 'grad_norm': 0.994238018989563, 'learning_rate': 3.738815396975715e-05, 'epoch': 1.9} 19%|█▉ | 7820/41250 [18:53:17<80:48:26, 8.70s/it][2025-04-26 02:51:00,677] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 1.07 | optimizer_step: 1.17 [2025-04-26 02:51:00,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.89 | bwd_microstep: 5715.82 | bwd_inner_microstep: 5668.42 | bwd_allreduce_microstep: 47.36 | step_microstep: 19.66 [2025-04-26 02:51:00,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.89 | bwd: 5715.84 | bwd_inner: 5668.42 | bwd_allreduce: 47.38 | step: 19.67 19%|█▉ | 7821/41250 [18:53:26<80:36:29, 8.68s/it] {'loss': 0.0314, 'grad_norm': 0.5286582708358765, 'learning_rate': 3.7387378025705085e-05, 'epoch': 1.9} 19%|█▉ | 7821/41250 [18:53:26<80:36:29, 8.68s/it][2025-04-26 02:51:09,321] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:51:09,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.62 | bwd_microstep: 5710.88 | bwd_inner_microstep: 5697.91 | bwd_allreduce_microstep: 12.92 | step_microstep: 19.05 [2025-04-26 02:51:09,322] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.62 | bwd: 5710.90 | bwd_inner: 5697.91 | bwd_allreduce: 12.94 | step: 19.05 19%|█▉ | 7822/41250 [18:53:34<80:30:12, 8.67s/it] {'loss': 0.0838, 'grad_norm': 1.7114256620407104, 'learning_rate': 3.738660197446325e-05, 'epoch': 1.9} 19%|█▉ | 7822/41250 [18:53:34<80:30:12, 8.67s/it][2025-04-26 02:51:17,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | 
optimizer_step: 0.90 [2025-04-26 02:51:17,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.41 | bwd_microstep: 5709.45 | bwd_inner_microstep: 5697.17 | bwd_allreduce_microstep: 12.23 | step_microstep: 18.77 [2025-04-26 02:51:17,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.41 | bwd: 5709.46 | bwd_inner: 5697.17 | bwd_allreduce: 12.25 | step: 18.77 19%|█▉ | 7823/41250 [18:53:43<80:26:28, 8.66s/it] {'loss': 0.0527, 'grad_norm': 0.9333617091178894, 'learning_rate': 3.738582581603643e-05, 'epoch': 1.9} 19%|█▉ | 7823/41250 [18:53:43<80:26:28, 8.66s/it][2025-04-26 02:51:26,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:51:26,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.59 | bwd_microstep: 5764.95 | bwd_inner_microstep: 5691.82 | bwd_allreduce_microstep: 73.09 | step_microstep: 18.87 [2025-04-26 02:51:26,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.59 | bwd: 5764.96 | bwd_inner: 5691.82 | bwd_allreduce: 73.10 | step: 18.87 19%|█▉ | 7824/41250 [18:53:51<80:31:35, 8.67s/it] {'loss': 0.3619, 'grad_norm': 3.1670925617218018, 'learning_rate': 3.7385049550429415e-05, 'epoch': 1.9} 19%|█▉ | 7824/41250 [18:53:51<80:31:35, 8.67s/it][2025-04-26 02:51:35,338] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:51:35,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.64 | bwd_microstep: 5740.26 | bwd_inner_microstep: 5692.43 | bwd_allreduce_microstep: 47.79 | step_microstep: 18.58 [2025-04-26 02:51:35,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.64 | bwd: 5740.28 | bwd_inner: 5692.43 | bwd_allreduce: 47.80 | step: 18.58 19%|█▉ | 7825/41250 [18:54:00<80:31:38, 8.67s/it] {'loss': 0.0934, 'grad_norm': 1.90762197971344, 
'learning_rate': 3.7384273177646994e-05, 'epoch': 1.9} 19%|█▉ | 7825/41250 [18:54:00<80:31:38, 8.67s/it][2025-04-26 02:51:44,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.06 | optimizer_step: 1.03 [2025-04-26 02:51:44,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.76 | bwd_microstep: 5744.21 | bwd_inner_microstep: 5688.87 | bwd_allreduce_microstep: 55.29 | step_microstep: 19.27 [2025-04-26 02:51:44,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.75 | bwd: 5744.22 | bwd_inner: 5688.87 | bwd_allreduce: 55.31 | step: 19.27 19%|█▉ | 7826/41250 [18:54:09<80:32:54, 8.68s/it] {'loss': 0.1957, 'grad_norm': 1.3773881196975708, 'learning_rate': 3.738349669769394e-05, 'epoch': 1.9} 19%|█▉ | 7826/41250 [18:54:09<80:32:54, 8.68s/it][2025-04-26 02:51:52,628] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 02:51:52,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.06 | bwd_microstep: 5688.89 | bwd_inner_microstep: 5658.96 | bwd_allreduce_microstep: 29.88 | step_microstep: 18.85 [2025-04-26 02:51:52,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.06 | bwd: 5688.90 | bwd_inner: 5658.96 | bwd_allreduce: 29.89 | step: 18.86 19%|█▉ | 7827/41250 [18:54:17<80:21:21, 8.66s/it] {'loss': 0.1056, 'grad_norm': 1.582063913345337, 'learning_rate': 3.7382720110575054e-05, 'epoch': 1.9} 19%|█▉ | 7827/41250 [18:54:17<80:21:21, 8.66s/it][2025-04-26 02:52:01,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-26 02:52:01,300] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.64 | bwd_microstep: 5752.61 | bwd_inner_microstep: 5648.83 | bwd_allreduce_microstep: 103.73 | step_microstep: 18.66 [2025-04-26 02:52:01,300] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.64 | bwd: 5752.62 | bwd_inner: 5648.83 | bwd_allreduce: 103.76 | step: 18.67 19%|█▉ | 7828/41250 [18:54:26<80:24:01, 8.66s/it] {'loss': 0.0151, 'grad_norm': 0.16523490846157074, 'learning_rate': 3.738194341629511e-05, 'epoch': 1.9} 19%|█▉ | 7828/41250 [18:54:26<80:24:01, 8.66s/it][2025-04-26 02:52:10,105] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 1.09 [2025-04-26 02:52:10,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.53 | bwd_microstep: 5868.19 | bwd_inner_microstep: 5713.36 | bwd_allreduce_microstep: 154.78 | step_microstep: 19.49 [2025-04-26 02:52:10,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.53 | bwd: 5868.20 | bwd_inner: 5713.36 | bwd_allreduce: 154.80 | step: 19.49 19%|█▉ | 7829/41250 [18:54:35<80:48:44, 8.70s/it] {'loss': 0.0686, 'grad_norm': 4.542836666107178, 'learning_rate': 3.738116661485891e-05, 'epoch': 1.9} 19%|█▉ | 7829/41250 [18:54:35<80:48:44, 8.70s/it][2025-04-26 02:52:18,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:52:18,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.90 | bwd_microstep: 5778.55 | bwd_inner_microstep: 5654.11 | bwd_allreduce_microstep: 124.40 | step_microstep: 18.56 [2025-04-26 02:52:18,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.90 | bwd: 5778.57 | bwd_inner: 5654.11 | bwd_allreduce: 124.42 | step: 18.56 19%|█▉ | 7830/41250 [18:54:44<80:44:54, 8.70s/it] {'loss': 0.0841, 'grad_norm': 2.6795032024383545, 'learning_rate': 3.738038970627124e-05, 'epoch': 1.9} 19%|█▉ | 7830/41250 [18:54:44<80:44:54, 8.70s/it][2025-04-26 02:52:27,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.94 [2025-04-26 
02:52:27,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.39 | bwd_microstep: 5707.50 | bwd_inner_microstep: 5650.88 | bwd_allreduce_microstep: 56.58 | step_microstep: 18.47 [2025-04-26 02:52:27,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.39 | bwd: 5707.52 | bwd_inner: 5650.88 | bwd_allreduce: 56.60 | step: 18.47 19%|█▉ | 7831/41250 [18:54:52<80:30:51, 8.67s/it] {'loss': 0.1405, 'grad_norm': 2.6074435710906982, 'learning_rate': 3.737961269053688e-05, 'epoch': 1.9} 19%|█▉ | 7831/41250 [18:54:52<80:30:51, 8.67s/it][2025-04-26 02:52:36,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.96 [2025-04-26 02:52:36,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.88 | bwd_microstep: 5784.92 | bwd_inner_microstep: 5641.55 | bwd_allreduce_microstep: 143.32 | step_microstep: 18.64 [2025-04-26 02:52:36,097] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.88 | bwd: 5784.94 | bwd_inner: 5641.56 | bwd_allreduce: 143.34 | step: 18.64 19%|█▉ | 7832/41250 [18:55:01<80:33:52, 8.68s/it] {'loss': 0.208, 'grad_norm': 2.362736701965332, 'learning_rate': 3.737883556766063e-05, 'epoch': 1.9} 19%|█▉ | 7832/41250 [18:55:01<80:33:52, 8.68s/it][2025-04-26 02:52:44,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:52:44,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.91 | bwd_microstep: 5760.88 | bwd_inner_microstep: 5674.44 | bwd_allreduce_microstep: 86.40 | step_microstep: 18.92 [2025-04-26 02:52:44,786] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.91 | bwd: 5760.90 | bwd_inner: 5674.44 | bwd_allreduce: 86.41 | step: 18.93 19%|█▉ | 7833/41250 [18:55:10<80:35:10, 8.68s/it] {'loss': 0.177, 'grad_norm': 2.253438711166382, 'learning_rate': 3.7378058337647274e-05, 
'epoch': 1.9} 19%|█▉ | 7833/41250 [18:55:10<80:35:10, 8.68s/it][2025-04-26 02:52:53,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:52:53,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2873.13 | bwd_microstep: 5763.91 | bwd_inner_microstep: 5751.00 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.74 [2025-04-26 02:52:53,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2873.13 | bwd: 5763.92 | bwd_inner: 5751.00 | bwd_allreduce: 12.88 | step: 18.74 19%|█▉ | 7834/41250 [18:55:18<80:40:59, 8.69s/it] {'loss': 0.077, 'grad_norm': 0.8875815868377686, 'learning_rate': 3.7377281000501606e-05, 'epoch': 1.9} 19%|█▉ | 7834/41250 [18:55:18<80:40:59, 8.69s/it][2025-04-26 02:53:02,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.31 | optimizer_step: 1.03 [2025-04-26 02:53:02,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.91 | bwd_microstep: 5751.21 | bwd_inner_microstep: 5646.60 | bwd_allreduce_microstep: 104.55 | step_microstep: 20.25 [2025-04-26 02:53:02,166] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.91 | bwd: 5751.22 | bwd_inner: 5646.60 | bwd_allreduce: 104.57 | step: 20.25 19%|█▉ | 7835/41250 [18:55:27<80:36:10, 8.68s/it] {'loss': 0.3269, 'grad_norm': 2.958845853805542, 'learning_rate': 3.737650355622842e-05, 'epoch': 1.9} 19%|█▉ | 7835/41250 [18:55:27<80:36:10, 8.68s/it][2025-04-26 02:53:10,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:53:10,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.06 | bwd_microstep: 5774.18 | bwd_inner_microstep: 5645.50 | bwd_allreduce_microstep: 128.64 | step_microstep: 18.90 [2025-04-26 02:53:10,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2822.06 | bwd: 5774.19 | bwd_inner: 5645.50 | bwd_allreduce: 128.66 | step: 18.90 19%|█▉ | 7836/41250 [18:55:36<80:35:29, 8.68s/it] {'loss': 0.0864, 'grad_norm': 1.009624719619751, 'learning_rate': 3.7375726004832506e-05, 'epoch': 1.9} 19%|█▉ | 7836/41250 [18:55:36<80:35:29, 8.68s/it][2025-04-26 02:53:19,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 0.94 [2025-04-26 02:53:19,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.72 | bwd_microstep: 5705.91 | bwd_inner_microstep: 5693.17 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.33 [2025-04-26 02:53:19,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.72 | bwd: 5705.92 | bwd_inner: 5693.17 | bwd_allreduce: 12.71 | step: 19.33 19%|█▉ | 7837/41250 [18:55:44<80:28:37, 8.67s/it] {'loss': 0.2965, 'grad_norm': 2.2630395889282227, 'learning_rate': 3.737494834631866e-05, 'epoch': 1.9} 19%|█▉ | 7837/41250 [18:55:44<80:28:37, 8.67s/it][2025-04-26 02:53:28,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 02:53:28,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.76 | bwd_microstep: 5670.48 | bwd_inner_microstep: 5635.49 | bwd_allreduce_microstep: 34.95 | step_microstep: 19.05 [2025-04-26 02:53:28,078] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.76 | bwd: 5670.49 | bwd_inner: 5635.49 | bwd_allreduce: 34.96 | step: 19.05 19%|█▉ | 7838/41250 [18:55:53<80:14:49, 8.65s/it] {'loss': 0.0739, 'grad_norm': 2.1472933292388916, 'learning_rate': 3.7374170580691674e-05, 'epoch': 1.9} 19%|█▉ | 7838/41250 [18:55:53<80:14:49, 8.65s/it][2025-04-26 02:53:36,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.01 | optimizer_step: 1.04 [2025-04-26 02:53:36,666] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2818.15 | bwd_microstep: 5685.15 | bwd_inner_microstep: 5629.97 | bwd_allreduce_microstep: 55.13 | step_microstep: 19.09 [2025-04-26 02:53:36,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.15 | bwd: 5685.16 | bwd_inner: 5629.97 | bwd_allreduce: 55.15 | step: 19.09 19%|█▉ | 7839/41250 [18:56:01<80:04:52, 8.63s/it] {'loss': 0.0508, 'grad_norm': 0.9591458439826965, 'learning_rate': 3.7373392707956346e-05, 'epoch': 1.9} 19%|█▉ | 7839/41250 [18:56:01<80:04:52, 8.63s/it][2025-04-26 02:53:45,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.24 | optimizer_step: 0.90 [2025-04-26 02:53:45,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.35 | bwd_microstep: 5775.65 | bwd_inner_microstep: 5634.64 | bwd_allreduce_microstep: 140.96 | step_microstep: 19.53 [2025-04-26 02:53:45,345] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.35 | bwd: 5775.66 | bwd_inner: 5634.64 | bwd_allreduce: 140.98 | step: 19.53 19%|█▉ | 7840/41250 [18:56:10<80:13:10, 8.64s/it] {'loss': 0.1812, 'grad_norm': 3.9073543548583984, 'learning_rate': 3.7372614728117463e-05, 'epoch': 1.9} 19%|█▉ | 7840/41250 [18:56:10<80:13:10, 8.64s/it][2025-04-26 02:53:54,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-26 02:53:54,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.42 | bwd_microstep: 5762.88 | bwd_inner_microstep: 5646.45 | bwd_allreduce_microstep: 116.37 | step_microstep: 19.08 [2025-04-26 02:53:54,020] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.42 | bwd: 5762.89 | bwd_inner: 5646.45 | bwd_allreduce: 116.39 | step: 19.08 19%|█▉ | 7841/41250 [18:56:19<80:18:23, 8.65s/it] {'loss': 0.2497, 'grad_norm': 1.7519707679748535, 'learning_rate': 3.7371836641179837e-05, 'epoch': 1.9} 19%|█▉ | 7841/41250 
[18:56:19<80:18:23, 8.65s/it][2025-04-26 02:54:02,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.07 | optimizer_step: 1.00 [2025-04-26 02:54:02,704] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.25 | bwd_microstep: 5770.08 | bwd_inner_microstep: 5651.93 | bwd_allreduce_microstep: 118.09 | step_microstep: 19.52 [2025-04-26 02:54:02,705] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.25 | bwd: 5770.09 | bwd_inner: 5651.93 | bwd_allreduce: 118.12 | step: 19.52 19%|█▉ | 7842/41250 [18:56:28<80:23:32, 8.66s/it] {'loss': 0.1601, 'grad_norm': 1.5731312036514282, 'learning_rate': 3.737105844714825e-05, 'epoch': 1.9} 19%|█▉ | 7842/41250 [18:56:28<80:23:32, 8.66s/it][2025-04-26 02:54:11,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:54:11,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.26 | bwd_microstep: 5854.32 | bwd_inner_microstep: 5696.23 | bwd_allreduce_microstep: 158.04 | step_microstep: 18.50 [2025-04-26 02:54:11,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.26 | bwd: 5854.33 | bwd_inner: 5696.23 | bwd_allreduce: 158.06 | step: 18.50 19%|█▉ | 7843/41250 [18:56:36<80:45:28, 8.70s/it] {'loss': 0.0978, 'grad_norm': 1.5212547779083252, 'learning_rate': 3.73702801460275e-05, 'epoch': 1.9} 19%|█▉ | 7843/41250 [18:56:36<80:45:28, 8.70s/it][2025-04-26 02:54:20,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 1.06 [2025-04-26 02:54:20,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.44 | bwd_microstep: 5744.05 | bwd_inner_microstep: 5636.32 | bwd_allreduce_microstep: 107.68 | step_microstep: 18.68 [2025-04-26 02:54:20,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.44 | bwd: 5744.06 | 
bwd_inner: 5636.32 | bwd_allreduce: 107.70 | step: 18.68 19%|█▉ | 7844/41250 [18:56:45<80:36:50, 8.69s/it] {'loss': 0.2212, 'grad_norm': 2.2539596557617188, 'learning_rate': 3.73695017378224e-05, 'epoch': 1.9} 19%|█▉ | 7844/41250 [18:56:45<80:36:50, 8.69s/it][2025-04-26 02:54:28,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-26 02:54:28,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.18 | bwd_microstep: 5757.24 | bwd_inner_microstep: 5685.24 | bwd_allreduce_microstep: 71.94 | step_microstep: 19.20 [2025-04-26 02:54:28,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.18 | bwd: 5757.25 | bwd_inner: 5685.24 | bwd_allreduce: 71.96 | step: 19.20 19%|█▉ | 7845/41250 [18:56:54<80:36:00, 8.69s/it] {'loss': 0.0602, 'grad_norm': 0.8322860598564148, 'learning_rate': 3.736872322253773e-05, 'epoch': 1.9} 19%|█▉ | 7845/41250 [18:56:54<80:36:00, 8.69s/it][2025-04-26 02:54:37,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:54:37,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.56 | bwd_microstep: 5752.17 | bwd_inner_microstep: 5685.27 | bwd_allreduce_microstep: 66.86 | step_microstep: 18.81 [2025-04-26 02:54:37,516] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.56 | bwd: 5752.19 | bwd_inner: 5685.27 | bwd_allreduce: 66.88 | step: 18.82 19%|█▉ | 7846/41250 [18:57:02<80:34:22, 8.68s/it] {'loss': 0.0635, 'grad_norm': 1.7356455326080322, 'learning_rate': 3.73679446001783e-05, 'epoch': 1.9} 19%|█▉ | 7846/41250 [18:57:02<80:34:22, 8.68s/it][2025-04-26 02:54:46,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:54:46,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2846.92 | bwd_microstep: 5702.56 | bwd_inner_microstep: 5689.84 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.78 [2025-04-26 02:54:46,150] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.92 | bwd: 5702.57 | bwd_inner: 5689.84 | bwd_allreduce: 12.69 | step: 18.78 19%|█▉ | 7847/41250 [18:57:11<80:25:58, 8.67s/it] {'loss': 0.373, 'grad_norm': 5.280000686645508, 'learning_rate': 3.736716587074891e-05, 'epoch': 1.9} 19%|█▉ | 7847/41250 [18:57:11<80:25:58, 8.67s/it][2025-04-26 02:54:54,760] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:54:54,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.66 | bwd_microstep: 5701.40 | bwd_inner_microstep: 5644.03 | bwd_allreduce_microstep: 57.33 | step_microstep: 18.77 [2025-04-26 02:54:54,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.66 | bwd: 5701.41 | bwd_inner: 5644.03 | bwd_allreduce: 57.34 | step: 18.77 19%|█▉ | 7848/41250 [18:57:20<80:16:17, 8.65s/it] {'loss': 0.0533, 'grad_norm': 1.4301989078521729, 'learning_rate': 3.736638703425435e-05, 'epoch': 1.9} 19%|█▉ | 7848/41250 [18:57:20<80:16:17, 8.65s/it][2025-04-26 02:55:03,725] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.03 | optimizer_step: 1.23 [2025-04-26 02:55:03,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.45 | bwd_microstep: 6033.56 | bwd_inner_microstep: 5701.49 | bwd_allreduce_microstep: 332.02 | step_microstep: 19.46 [2025-04-26 02:55:03,726] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.45 | bwd: 6033.57 | bwd_inner: 5701.49 | bwd_allreduce: 332.04 | step: 19.46 19%|█▉ | 7849/41250 [18:57:29<81:08:44, 8.75s/it] {'loss': 0.2723, 'grad_norm': 2.922126293182373, 'learning_rate': 3.7365608090699436e-05, 'epoch': 1.9} 19%|█▉ | 7849/41250 [18:57:29<81:08:44, 8.75s/it][2025-04-26 02:55:12,419] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:55:12,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.45 | bwd_microstep: 5781.20 | bwd_inner_microstep: 5642.53 | bwd_allreduce_microstep: 138.63 | step_microstep: 18.33 [2025-04-26 02:55:12,420] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.45 | bwd: 5781.22 | bwd_inner: 5642.53 | bwd_allreduce: 138.64 | step: 18.33 19%|█▉ | 7850/41250 [18:57:37<80:59:38, 8.73s/it] {'loss': 0.0634, 'grad_norm': 2.153057098388672, 'learning_rate': 3.736482904008896e-05, 'epoch': 1.9} 19%|█▉ | 7850/41250 [18:57:37<80:59:38, 8.73s/it][2025-04-26 02:55:21,101] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 02:55:21,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.78 | bwd_microstep: 5768.93 | bwd_inner_microstep: 5642.49 | bwd_allreduce_microstep: 126.39 | step_microstep: 18.81 [2025-04-26 02:55:21,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.78 | bwd: 5768.94 | bwd_inner: 5642.49 | bwd_allreduce: 126.41 | step: 18.81 19%|█▉ | 7851/41250 [18:57:46<80:51:31, 8.72s/it] {'loss': 0.1486, 'grad_norm': 2.012498378753662, 'learning_rate': 3.7364049882427734e-05, 'epoch': 1.9} 19%|█▉ | 7851/41250 [18:57:46<80:51:31, 8.72s/it][2025-04-26 02:55:29,905] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.12 | optimizer_step: 0.94 [2025-04-26 02:55:29,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.25 | bwd_microstep: 5888.46 | bwd_inner_microstep: 5651.24 | bwd_allreduce_microstep: 237.17 | step_microstep: 18.67 [2025-04-26 02:55:29,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.25 | bwd: 5888.47 | bwd_inner: 5651.24 | bwd_allreduce: 237.19 | step: 
18.68 19%|█▉ | 7852/41250 [18:57:55<81:06:11, 8.74s/it] {'loss': 0.0284, 'grad_norm': 1.0850852727890015, 'learning_rate': 3.736327061772055e-05, 'epoch': 1.9} 19%|█▉ | 7852/41250 [18:57:55<81:06:11, 8.74s/it][2025-04-26 02:55:38,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:55:38,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.00 | bwd_microstep: 5783.04 | bwd_inner_microstep: 5663.86 | bwd_allreduce_microstep: 119.14 | step_microstep: 18.54 [2025-04-26 02:55:38,604] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.00 | bwd: 5783.06 | bwd_inner: 5663.86 | bwd_allreduce: 119.15 | step: 18.55 19%|█▉ | 7853/41250 [18:58:03<80:58:30, 8.73s/it] {'loss': 0.2381, 'grad_norm': 2.7080068588256836, 'learning_rate': 3.7362491245972224e-05, 'epoch': 1.9} 19%|█▉ | 7853/41250 [18:58:03<80:58:30, 8.73s/it][2025-04-26 02:55:47,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:55:47,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.94 | bwd_microstep: 5674.57 | bwd_inner_microstep: 5646.50 | bwd_allreduce_microstep: 28.02 | step_microstep: 18.83 [2025-04-26 02:55:47,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.95 | bwd: 5674.58 | bwd_inner: 5646.50 | bwd_allreduce: 28.04 | step: 18.83 19%|█▉ | 7854/41250 [18:58:12<80:36:38, 8.69s/it] {'loss': 0.2632, 'grad_norm': 4.524087905883789, 'learning_rate': 3.736171176718755e-05, 'epoch': 1.9} 19%|█▉ | 7854/41250 [18:58:12<80:36:38, 8.69s/it][2025-04-26 02:55:55,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.05 | optimizer_step: 0.89 [2025-04-26 02:55:55,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.20 | bwd_microstep: 5709.17 | 
bwd_inner_microstep: 5695.81 | bwd_allreduce_microstep: 13.30 | step_microstep: 18.76 [2025-04-26 02:55:55,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.20 | bwd: 5709.18 | bwd_inner: 5695.81 | bwd_allreduce: 13.33 | step: 18.76 19%|█▉ | 7855/41250 [18:58:21<80:28:47, 8.68s/it] {'loss': 0.2729, 'grad_norm': 2.3465194702148438, 'learning_rate': 3.736093218137134e-05, 'epoch': 1.9} 19%|█▉ | 7855/41250 [18:58:21<80:28:47, 8.68s/it][2025-04-26 02:56:04,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 1.01 [2025-04-26 02:56:04,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.44 | bwd_microstep: 5774.66 | bwd_inner_microstep: 5657.80 | bwd_allreduce_microstep: 116.82 | step_microstep: 18.72 [2025-04-26 02:56:04,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.44 | bwd: 5774.67 | bwd_inner: 5657.80 | bwd_allreduce: 116.83 | step: 18.72 19%|█▉ | 7856/41250 [18:58:29<80:31:53, 8.68s/it] {'loss': 0.0409, 'grad_norm': 0.5965216755867004, 'learning_rate': 3.73601524885284e-05, 'epoch': 1.9} 19%|█▉ | 7856/41250 [18:58:29<80:31:53, 8.68s/it][2025-04-26 02:56:13,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-26 02:56:13,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.78 | bwd_microstep: 5733.49 | bwd_inner_microstep: 5720.67 | bwd_allreduce_microstep: 12.77 | step_microstep: 19.16 [2025-04-26 02:56:13,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.78 | bwd: 5733.50 | bwd_inner: 5720.67 | bwd_allreduce: 12.79 | step: 19.16 19%|█▉ | 7857/41250 [18:58:38<80:30:24, 8.68s/it] {'loss': 0.0488, 'grad_norm': 0.9591202139854431, 'learning_rate': 3.7359372688663526e-05, 'epoch': 1.9} 19%|█▉ | 7857/41250 [18:58:38<80:30:24, 8.68s/it][2025-04-26 02:56:21,827] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 02:56:21,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.04 | bwd_microstep: 5690.99 | bwd_inner_microstep: 5663.39 | bwd_allreduce_microstep: 27.54 | step_microstep: 19.01 [2025-04-26 02:56:21,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.04 | bwd: 5691.00 | bwd_inner: 5663.39 | bwd_allreduce: 27.57 | step: 19.02 19%|█▉ | 7858/41250 [18:58:47<80:19:21, 8.66s/it] {'loss': 0.2223, 'grad_norm': 1.4870494604110718, 'learning_rate': 3.7358592781781544e-05, 'epoch': 1.9} 19%|█▉ | 7858/41250 [18:58:47<80:19:21, 8.66s/it][2025-04-26 02:56:30,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:56:30,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.22 | bwd_microstep: 5779.49 | bwd_inner_microstep: 5638.51 | bwd_allreduce_microstep: 140.93 | step_microstep: 18.68 [2025-04-26 02:56:30,515] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.22 | bwd: 5779.50 | bwd_inner: 5638.51 | bwd_allreduce: 140.95 | step: 18.68 19%|█▉ | 7859/41250 [18:58:55<80:23:51, 8.67s/it] {'loss': 0.096, 'grad_norm': 3.3918004035949707, 'learning_rate': 3.735781276788724e-05, 'epoch': 1.91} 19%|█▉ | 7859/41250 [18:58:55<80:23:51, 8.67s/it][2025-04-26 02:56:39,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.08 [2025-04-26 02:56:39,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2862.63 | bwd_microstep: 5740.05 | bwd_inner_microstep: 5723.08 | bwd_allreduce_microstep: 16.93 | step_microstep: 18.76 [2025-04-26 02:56:39,202] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2862.63 | bwd: 5740.07 | bwd_inner: 5723.08 | bwd_allreduce: 16.94 | step: 18.77 19%|█▉ | 7860/41250 
[18:59:04<80:26:46, 8.67s/it] {'loss': 0.1094, 'grad_norm': 1.6689811944961548, 'learning_rate': 3.7357032646985445e-05, 'epoch': 1.91} 19%|█▉ | 7860/41250 [18:59:04<80:26:46, 8.67s/it][2025-04-26 02:56:47,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 02:56:47,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.08 | bwd_microstep: 5739.40 | bwd_inner_microstep: 5711.18 | bwd_allreduce_microstep: 28.17 | step_microstep: 18.64 [2025-04-26 02:56:47,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.08 | bwd: 5739.41 | bwd_inner: 5711.18 | bwd_allreduce: 28.19 | step: 18.64 19%|█▉ | 7861/41250 [18:59:13<80:27:39, 8.68s/it] {'loss': 0.0141, 'grad_norm': 0.18269117176532745, 'learning_rate': 3.735625241908096e-05, 'epoch': 1.91} 19%|█▉ | 7861/41250 [18:59:13<80:27:39, 8.68s/it][2025-04-26 02:56:56,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.96 [2025-04-26 02:56:56,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.19 | bwd_microstep: 5701.73 | bwd_inner_microstep: 5688.92 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.99 [2025-04-26 02:56:56,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.19 | bwd: 5701.74 | bwd_inner: 5688.92 | bwd_allreduce: 12.78 | step: 18.99 19%|█▉ | 7862/41250 [18:59:21<80:21:30, 8.66s/it] {'loss': 0.2421, 'grad_norm': 2.3027544021606445, 'learning_rate': 3.7355472084178577e-05, 'epoch': 1.91} 19%|█▉ | 7862/41250 [18:59:21<80:21:30, 8.66s/it][2025-04-26 02:57:05,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-26 02:57:05,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.19 | bwd_microstep: 5763.64 | bwd_inner_microstep: 5660.10 | 
bwd_allreduce_microstep: 103.49 | step_microstep: 18.72 [2025-04-26 02:57:05,206] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.19 | bwd: 5763.65 | bwd_inner: 5660.10 | bwd_allreduce: 103.51 | step: 18.72 19%|█▉ | 7863/41250 [18:59:30<80:25:02, 8.67s/it] {'loss': 0.0324, 'grad_norm': 0.6145033836364746, 'learning_rate': 3.7354691642283134e-05, 'epoch': 1.91} 19%|█▉ | 7863/41250 [18:59:30<80:25:02, 8.67s/it][2025-04-26 02:57:13,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.03 | optimizer_step: 1.02 [2025-04-26 02:57:13,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.93 | bwd_microstep: 5728.09 | bwd_inner_microstep: 5707.99 | bwd_allreduce_microstep: 20.05 | step_microstep: 18.97 [2025-04-26 02:57:13,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.93 | bwd: 5728.10 | bwd_inner: 5707.99 | bwd_allreduce: 20.07 | step: 18.97 19%|█▉ | 7864/41250 [18:59:39<80:25:45, 8.67s/it] {'loss': 0.1977, 'grad_norm': 1.9769057035446167, 'learning_rate': 3.735391109339943e-05, 'epoch': 1.91} 19%|█▉ | 7864/41250 [18:59:39<80:25:45, 8.67s/it][2025-04-26 02:57:22,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 02:57:22,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.09 | bwd_microstep: 5696.69 | bwd_inner_microstep: 5671.22 | bwd_allreduce_microstep: 25.42 | step_microstep: 18.74 [2025-04-26 02:57:22,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.09 | bwd: 5696.71 | bwd_inner: 5671.22 | bwd_allreduce: 25.44 | step: 18.74 19%|█▉ | 7865/41250 [18:59:47<80:17:20, 8.66s/it] {'loss': 0.0841, 'grad_norm': 1.62043035030365, 'learning_rate': 3.7353130437532276e-05, 'epoch': 1.91} 19%|█▉ | 7865/41250 [18:59:47<80:17:20, 8.66s/it][2025-04-26 02:57:31,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 02:57:31,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.69 | bwd_microstep: 5720.25 | bwd_inner_microstep: 5707.24 | bwd_allreduce_microstep: 12.96 | step_microstep: 19.16 [2025-04-26 02:57:31,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.69 | bwd: 5720.26 | bwd_inner: 5707.24 | bwd_allreduce: 12.98 | step: 19.17 19%|█▉ | 7866/41250 [18:59:56<80:18:12, 8.66s/it] {'loss': 0.0326, 'grad_norm': 1.0948899984359741, 'learning_rate': 3.735234967468649e-05, 'epoch': 1.91} 19%|█▉ | 7866/41250 [18:59:56<80:18:12, 8.66s/it][2025-04-26 02:57:39,845] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 02:57:39,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.84 | bwd_microstep: 5739.41 | bwd_inner_microstep: 5700.47 | bwd_allreduce_microstep: 38.90 | step_microstep: 18.81 [2025-04-26 02:57:39,846] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.84 | bwd: 5739.43 | bwd_inner: 5700.47 | bwd_allreduce: 38.92 | step: 18.82 19%|█▉ | 7867/41250 [19:00:05<80:20:43, 8.66s/it] {'loss': 0.0949, 'grad_norm': 2.093801736831665, 'learning_rate': 3.735156880486687e-05, 'epoch': 1.91} 19%|█▉ | 7867/41250 [19:00:05<80:20:43, 8.66s/it][2025-04-26 02:57:48,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 02:57:48,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.48 | bwd_microstep: 5699.96 | bwd_inner_microstep: 5651.73 | bwd_allreduce_microstep: 48.19 | step_microstep: 18.68 [2025-04-26 02:57:48,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.48 | bwd: 5699.98 | bwd_inner: 5651.73 | bwd_allreduce: 48.21 | step: 18.68 19%|█▉ | 7868/41250 [19:00:13<80:12:43, 8.65s/it] 
{'loss': 0.087, 'grad_norm': 2.89923357963562, 'learning_rate': 3.7350787828078247e-05, 'epoch': 1.91} 19%|█▉ | 7868/41250 [19:00:13<80:12:43, 8.65s/it][2025-04-26 02:57:57,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 1.05 [2025-04-26 02:57:57,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.11 | bwd_microstep: 5772.16 | bwd_inner_microstep: 5695.10 | bwd_allreduce_microstep: 77.02 | step_microstep: 19.89 [2025-04-26 02:57:57,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.11 | bwd: 5772.18 | bwd_inner: 5695.10 | bwd_allreduce: 77.04 | step: 19.89 19%|█▉ | 7869/41250 [19:00:22<80:22:01, 8.67s/it] {'loss': 0.1066, 'grad_norm': 2.04044508934021, 'learning_rate': 3.7350006744325435e-05, 'epoch': 1.91} 19%|█▉ | 7869/41250 [19:00:22<80:22:01, 8.67s/it][2025-04-26 02:58:05,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 02:58:05,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.05 | bwd_microstep: 5774.91 | bwd_inner_microstep: 5639.56 | bwd_allreduce_microstep: 135.30 | step_microstep: 18.81 [2025-04-26 02:58:05,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.05 | bwd: 5774.92 | bwd_inner: 5639.56 | bwd_allreduce: 135.32 | step: 18.81 19%|█▉ | 7870/41250 [19:00:31<80:23:37, 8.67s/it] {'loss': 0.3735, 'grad_norm': 2.2209160327911377, 'learning_rate': 3.734922555361324e-05, 'epoch': 1.91} 19%|█▉ | 7870/41250 [19:00:31<80:23:37, 8.67s/it][2025-04-26 02:58:14,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 02:58:14,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.89 | bwd_microstep: 5774.70 | bwd_inner_microstep: 5657.11 | bwd_allreduce_microstep: 117.55 | 
step_microstep: 18.47 [2025-04-26 02:58:14,532] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.89 | bwd: 5774.71 | bwd_inner: 5657.10 | bwd_allreduce: 117.57 | step: 18.47 19%|█▉ | 7871/41250 [19:00:39<80:26:13, 8.68s/it] {'loss': 0.1466, 'grad_norm': 1.7002633810043335, 'learning_rate': 3.734844425594648e-05, 'epoch': 1.91} 19%|█▉ | 7871/41250 [19:00:39<80:26:13, 8.68s/it][2025-04-26 02:58:23,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.04 | optimizer_step: 1.09 [2025-04-26 02:58:23,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.93 | bwd_microstep: 5753.83 | bwd_inner_microstep: 5652.38 | bwd_allreduce_microstep: 101.39 | step_microstep: 19.06 [2025-04-26 02:58:23,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.93 | bwd: 5753.84 | bwd_inner: 5652.39 | bwd_allreduce: 101.41 | step: 19.07 19%|█▉ | 7872/41250 [19:00:48<80:25:22, 8.67s/it] {'loss': 0.1421, 'grad_norm': 1.94554603099823, 'learning_rate': 3.734766285132998e-05, 'epoch': 1.91} 19%|█▉ | 7872/41250 [19:00:48<80:25:22, 8.67s/it][2025-04-26 02:58:31,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 02:58:31,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.83 | bwd_microstep: 5768.04 | bwd_inner_microstep: 5640.54 | bwd_allreduce_microstep: 127.45 | step_microstep: 18.55 [2025-04-26 02:58:31,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.83 | bwd: 5768.05 | bwd_inner: 5640.54 | bwd_allreduce: 127.47 | step: 18.55 19%|█▉ | 7873/41250 [19:00:57<80:26:43, 8.68s/it] {'loss': 0.0669, 'grad_norm': 1.2356709241867065, 'learning_rate': 3.734688133976854e-05, 'epoch': 1.91} 19%|█▉ | 7873/41250 [19:00:57<80:26:43, 8.68s/it][2025-04-26 02:58:40,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | 
optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:58:40,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.63 | bwd_microstep: 5695.41 | bwd_inner_microstep: 5683.01 | bwd_allreduce_microstep: 12.35 | step_microstep: 18.59 [2025-04-26 02:58:40,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.63 | bwd: 5695.42 | bwd_inner: 5683.01 | bwd_allreduce: 12.37 | step: 18.59 19%|█▉ | 7874/41250 [19:01:05<80:18:34, 8.66s/it] {'loss': 0.099, 'grad_norm': 1.8679242134094238, 'learning_rate': 3.7346099721266995e-05, 'epoch': 1.91} 19%|█▉ | 7874/41250 [19:01:05<80:18:34, 8.66s/it][2025-04-26 02:58:49,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 02:58:49,151] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.62 | bwd_microstep: 5706.37 | bwd_inner_microstep: 5693.50 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.80 [2025-04-26 02:58:49,152] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.63 | bwd: 5706.38 | bwd_inner: 5693.50 | bwd_allreduce: 12.84 | step: 18.80 19%|█▉ | 7875/41250 [19:01:14<80:13:36, 8.65s/it] {'loss': 0.2004, 'grad_norm': 4.163116931915283, 'learning_rate': 3.734531799583016e-05, 'epoch': 1.91} 19%|█▉ | 7875/41250 [19:01:14<80:13:36, 8.65s/it][2025-04-26 02:58:57,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 1.02 [2025-04-26 02:58:57,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.82 | bwd_microstep: 5764.22 | bwd_inner_microstep: 5648.74 | bwd_allreduce_microstep: 115.43 | step_microstep: 18.69 [2025-04-26 02:58:57,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.82 | bwd: 5764.24 | bwd_inner: 5648.74 | bwd_allreduce: 115.45 | step: 18.69 19%|█▉ | 7876/41250 [19:01:23<80:17:34, 8.66s/it] {'loss': 0.1251, 'grad_norm': 
4.656129360198975, 'learning_rate': 3.734453616346285e-05, 'epoch': 1.91} 19%|█▉ | 7876/41250 [19:01:23<80:17:34, 8.66s/it][2025-04-26 02:59:06,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 02:59:06,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.30 | bwd_microstep: 5694.14 | bwd_inner_microstep: 5634.93 | bwd_allreduce_microstep: 59.16 | step_microstep: 19.11 [2025-04-26 02:59:06,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.30 | bwd: 5694.16 | bwd_inner: 5634.93 | bwd_allreduce: 59.19 | step: 19.11 19%|█▉ | 7877/41250 [19:01:31<80:09:15, 8.65s/it] {'loss': 0.1463, 'grad_norm': 1.0463299751281738, 'learning_rate': 3.734375422416988e-05, 'epoch': 1.91} 19%|█▉ | 7877/41250 [19:01:31<80:09:15, 8.65s/it][2025-04-26 02:59:15,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 02:59:15,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.61 | bwd_microstep: 5751.37 | bwd_inner_microstep: 5640.50 | bwd_allreduce_microstep: 110.83 | step_microstep: 18.55 [2025-04-26 02:59:15,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.61 | bwd: 5751.39 | bwd_inner: 5640.50 | bwd_allreduce: 110.84 | step: 18.55 19%|█▉ | 7878/41250 [19:01:40<80:12:12, 8.65s/it] {'loss': 0.179, 'grad_norm': 3.9737510681152344, 'learning_rate': 3.734297217795608e-05, 'epoch': 1.91} 19%|█▉ | 7878/41250 [19:01:40<80:12:12, 8.65s/it][2025-04-26 02:59:23,716] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 02:59:23,716] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.64 | bwd_microstep: 5686.84 | bwd_inner_microstep: 5674.18 | bwd_allreduce_microstep: 12.62 | step_microstep: 18.57 [2025-04-26 
02:59:23,717] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.65 | bwd: 5686.85 | bwd_inner: 5674.18 | bwd_allreduce: 12.63 | step: 18.57 19%|█▉ | 7879/41250 [19:01:49<80:04:56, 8.64s/it] {'loss': 0.1542, 'grad_norm': 1.6142910718917847, 'learning_rate': 3.734219002482627e-05, 'epoch': 1.91} 19%|█▉ | 7879/41250 [19:01:49<80:04:56, 8.64s/it][2025-04-26 02:59:32,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 02:59:32,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.55 | bwd_microstep: 5684.64 | bwd_inner_microstep: 5636.01 | bwd_allreduce_microstep: 48.58 | step_microstep: 18.58 [2025-04-26 02:59:32,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.55 | bwd: 5684.65 | bwd_inner: 5636.01 | bwd_allreduce: 48.60 | step: 18.58 19%|█▉ | 7880/41250 [19:01:57<79:56:42, 8.62s/it] {'loss': 0.3117, 'grad_norm': 5.144911766052246, 'learning_rate': 3.734140776478526e-05, 'epoch': 1.91} 19%|█▉ | 7880/41250 [19:01:57<79:56:42, 8.62s/it][2025-04-26 02:59:41,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 02:59:41,058] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.40 | bwd_microstep: 5841.56 | bwd_inner_microstep: 5637.06 | bwd_allreduce_microstep: 204.46 | step_microstep: 18.79 [2025-04-26 02:59:41,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.40 | bwd: 5841.57 | bwd_inner: 5637.06 | bwd_allreduce: 204.47 | step: 18.79 19%|█▉ | 7881/41250 [19:02:06<80:17:42, 8.66s/it] {'loss': 0.1289, 'grad_norm': 1.4631273746490479, 'learning_rate': 3.734062539783789e-05, 'epoch': 1.91} 19%|█▉ | 7881/41250 [19:02:06<80:17:42, 8.66s/it][2025-04-26 02:59:49,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 
0.90 [2025-04-26 02:59:49,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.60 | bwd_microstep: 5667.92 | bwd_inner_microstep: 5638.46 | bwd_allreduce_microstep: 29.42 | step_microstep: 18.33 [2025-04-26 02:59:49,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.60 | bwd: 5667.94 | bwd_inner: 5638.46 | bwd_allreduce: 29.43 | step: 18.34 19%|█▉ | 7882/41250 [19:02:14<80:04:53, 8.64s/it] {'loss': 0.2742, 'grad_norm': 2.441119432449341, 'learning_rate': 3.733984292398898e-05, 'epoch': 1.91} 19%|█▉ | 7882/41250 [19:02:14<80:04:53, 8.64s/it][2025-04-26 02:59:58,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 02:59:58,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.05 | bwd_microstep: 5703.92 | bwd_inner_microstep: 5690.99 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.62 [2025-04-26 02:59:58,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.05 | bwd: 5703.93 | bwd_inner: 5690.99 | bwd_allreduce: 12.90 | step: 18.63 19%|█▉ | 7883/41250 [19:02:23<80:03:59, 8.64s/it] {'loss': 0.1631, 'grad_norm': 2.1868059635162354, 'learning_rate': 3.733906034324334e-05, 'epoch': 1.91} 19%|█▉ | 7883/41250 [19:02:23<80:03:59, 8.64s/it][2025-04-26 03:00:06,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:00:06,958] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.42 | bwd_microstep: 5773.31 | bwd_inner_microstep: 5634.67 | bwd_allreduce_microstep: 138.60 | step_microstep: 18.43 [2025-04-26 03:00:06,959] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.42 | bwd: 5773.32 | bwd_inner: 5634.67 | bwd_allreduce: 138.61 | step: 18.44 19%|█▉ | 7884/41250 [19:02:32<80:10:30, 8.65s/it] {'loss': 0.2225, 'grad_norm': 1.229912281036377, 'learning_rate': 
3.733827765560581e-05, 'epoch': 1.91} 19%|█▉ | 7884/41250 [19:02:32<80:10:30, 8.65s/it][2025-04-26 03:00:15,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:00:15,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.47 | bwd_microstep: 5772.90 | bwd_inner_microstep: 5643.08 | bwd_allreduce_microstep: 129.78 | step_microstep: 18.42 [2025-04-26 03:00:15,631] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.47 | bwd: 5772.91 | bwd_inner: 5643.08 | bwd_allreduce: 129.80 | step: 18.42 19%|█▉ | 7885/41250 [19:02:40<80:14:08, 8.66s/it] {'loss': 0.2659, 'grad_norm': 1.5700414180755615, 'learning_rate': 3.733749486108121e-05, 'epoch': 1.91} 19%|█▉ | 7885/41250 [19:02:40<80:14:08, 8.66s/it][2025-04-26 03:00:24,303] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.30 | optimizer_step: 1.05 [2025-04-26 03:00:24,304] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.92 | bwd_microstep: 5738.10 | bwd_inner_microstep: 5697.90 | bwd_allreduce_microstep: 40.14 | step_microstep: 20.06 [2025-04-26 03:00:24,304] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.92 | bwd: 5738.12 | bwd_inner: 5697.90 | bwd_allreduce: 40.17 | step: 20.06 19%|█▉ | 7886/41250 [19:02:49<80:17:22, 8.66s/it] {'loss': 0.0201, 'grad_norm': 0.3797388970851898, 'learning_rate': 3.733671195967436e-05, 'epoch': 1.91} 19%|█▉ | 7886/41250 [19:02:49<80:17:22, 8.66s/it][2025-04-26 03:00:32,961] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.08 | optimizer_step: 0.90 [2025-04-26 03:00:32,962] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.14 | bwd_microstep: 5730.92 | bwd_inner_microstep: 5638.32 | bwd_allreduce_microstep: 92.56 | step_microstep: 19.46 [2025-04-26 03:00:32,962] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.14 | bwd: 5730.94 | bwd_inner: 5638.32 | bwd_allreduce: 92.58 | step: 19.46 19%|█▉ | 7887/41250 [19:02:58<80:16:12, 8.66s/it] {'loss': 0.2755, 'grad_norm': 6.529089450836182, 'learning_rate': 3.733592895139008e-05, 'epoch': 1.91} 19%|█▉ | 7887/41250 [19:02:58<80:16:12, 8.66s/it][2025-04-26 03:00:41,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 0.90 [2025-04-26 03:00:41,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.73 | bwd_microstep: 5749.67 | bwd_inner_microstep: 5695.79 | bwd_allreduce_microstep: 53.82 | step_microstep: 19.03 [2025-04-26 03:00:41,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.73 | bwd: 5749.68 | bwd_inner: 5695.79 | bwd_allreduce: 53.85 | step: 19.04 19%|█▉ | 7888/41250 [19:03:06<80:22:13, 8.67s/it] {'loss': 0.2174, 'grad_norm': 1.9956766366958618, 'learning_rate': 3.733514583623323e-05, 'epoch': 1.91} 19%|█▉ | 7888/41250 [19:03:06<80:22:13, 8.67s/it][2025-04-26 03:00:50,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.11 | optimizer_step: 0.90 [2025-04-26 03:00:50,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.80 | bwd_microstep: 5772.92 | bwd_inner_microstep: 5643.62 | bwd_allreduce_microstep: 129.25 | step_microstep: 19.18 [2025-04-26 03:00:50,340] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.80 | bwd: 5772.94 | bwd_inner: 5643.62 | bwd_allreduce: 129.27 | step: 19.18 19%|█▉ | 7889/41250 [19:03:15<80:23:24, 8.67s/it] {'loss': 0.0567, 'grad_norm': 2.6411983966827393, 'learning_rate': 3.73343626142086e-05, 'epoch': 1.91} 19%|█▉ | 7889/41250 [19:03:15<80:23:24, 8.67s/it][2025-04-26 03:00:59,018] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-26 
03:00:59,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.03 | bwd_microstep: 5735.63 | bwd_inner_microstep: 5683.52 | bwd_allreduce_microstep: 52.06 | step_microstep: 18.95 [2025-04-26 03:00:59,019] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.03 | bwd: 5735.64 | bwd_inner: 5683.52 | bwd_allreduce: 52.08 | step: 18.95 19%|█▉ | 7890/41250 [19:03:24<80:23:54, 8.68s/it] {'loss': 0.0561, 'grad_norm': 0.5917049646377563, 'learning_rate': 3.7333579285321036e-05, 'epoch': 1.91} 19%|█▉ | 7890/41250 [19:03:24<80:23:54, 8.68s/it][2025-04-26 03:01:07,621] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 03:01:07,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.72 | bwd_microstep: 5692.34 | bwd_inner_microstep: 5652.13 | bwd_allreduce_microstep: 40.17 | step_microstep: 19.05 [2025-04-26 03:01:07,622] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.72 | bwd: 5692.35 | bwd_inner: 5652.13 | bwd_allreduce: 40.18 | step: 19.05 19%|█▉ | 7891/41250 [19:03:32<80:11:29, 8.65s/it] {'loss': 0.1774, 'grad_norm': 3.5513319969177246, 'learning_rate': 3.733279584957536e-05, 'epoch': 1.91} 19%|█▉ | 7891/41250 [19:03:32<80:11:29, 8.65s/it][2025-04-26 03:01:16,319] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.05 | optimizer_step: 0.91 [2025-04-26 03:01:16,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.03 | bwd_microstep: 5767.78 | bwd_inner_microstep: 5641.68 | bwd_allreduce_microstep: 126.05 | step_microstep: 19.22 [2025-04-26 03:01:16,320] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.03 | bwd: 5767.80 | bwd_inner: 5641.68 | bwd_allreduce: 126.07 | step: 19.22 19%|█▉ | 7892/41250 [19:03:41<80:19:18, 8.67s/it] {'loss': 0.0766, 'grad_norm': 0.9400420188903809, 'learning_rate': 3.733201230697642e-05, 
'epoch': 1.91} 19%|█▉ | 7892/41250 [19:03:41<80:19:18, 8.67s/it][2025-04-26 03:01:25,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.28 | optimizer_step: 0.90 [2025-04-26 03:01:25,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.61 | bwd_microstep: 5743.82 | bwd_inner_microstep: 5685.22 | bwd_allreduce_microstep: 58.55 | step_microstep: 19.68 [2025-04-26 03:01:25,001] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.61 | bwd: 5743.84 | bwd_inner: 5685.22 | bwd_allreduce: 58.57 | step: 19.68 19%|█▉ | 7893/41250 [19:03:50<80:20:44, 8.67s/it] {'loss': 0.2119, 'grad_norm': 4.222671985626221, 'learning_rate': 3.7331228657529026e-05, 'epoch': 1.91} 19%|█▉ | 7893/41250 [19:03:50<80:20:44, 8.67s/it][2025-04-26 03:01:33,648] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-26 03:01:33,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.85 | bwd_microstep: 5706.44 | bwd_inner_microstep: 5693.81 | bwd_allreduce_microstep: 12.59 | step_microstep: 19.18 [2025-04-26 03:01:33,649] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.86 | bwd: 5706.45 | bwd_inner: 5693.81 | bwd_allreduce: 12.60 | step: 19.19 19%|█▉ | 7894/41250 [19:03:58<80:15:57, 8.66s/it] {'loss': 0.2438, 'grad_norm': 2.6121819019317627, 'learning_rate': 3.733044490123802e-05, 'epoch': 1.91} 19%|█▉ | 7894/41250 [19:03:58<80:15:57, 8.66s/it][2025-04-26 03:01:42,263] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 0.96 [2025-04-26 03:01:42,264] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.71 | bwd_microstep: 5706.87 | bwd_inner_microstep: 5644.89 | bwd_allreduce_microstep: 61.93 | step_microstep: 18.81 [2025-04-26 03:01:42,264] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2824.71 | bwd: 5706.88 | bwd_inner: 5644.89 | bwd_allreduce: 61.95 | step: 18.81 19%|█▉ | 7895/41250 [19:04:07<80:07:38, 8.65s/it] {'loss': 0.0567, 'grad_norm': 3.4520974159240723, 'learning_rate': 3.7329661038108225e-05, 'epoch': 1.91} 19%|█▉ | 7895/41250 [19:04:07<80:07:38, 8.65s/it][2025-04-26 03:01:50,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:01:50,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.27 | bwd_microstep: 5801.22 | bwd_inner_microstep: 5691.38 | bwd_allreduce_microstep: 109.80 | step_microstep: 18.77 [2025-04-26 03:01:50,987] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.27 | bwd: 5801.23 | bwd_inner: 5691.38 | bwd_allreduce: 109.81 | step: 18.77 19%|█▉ | 7896/41250 [19:04:16<80:20:02, 8.67s/it] {'loss': 0.1284, 'grad_norm': 1.1819169521331787, 'learning_rate': 3.732887706814448e-05, 'epoch': 1.91} 19%|█▉ | 7896/41250 [19:04:16<80:20:02, 8.67s/it][2025-04-26 03:01:59,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:01:59,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.51 | bwd_microstep: 5794.03 | bwd_inner_microstep: 5642.38 | bwd_allreduce_microstep: 151.61 | step_microstep: 18.52 [2025-04-26 03:01:59,692] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.51 | bwd: 5794.04 | bwd_inner: 5642.38 | bwd_allreduce: 151.62 | step: 18.53 19%|█▉ | 7897/41250 [19:04:25<80:25:41, 8.68s/it] {'loss': 0.1777, 'grad_norm': 3.607475757598877, 'learning_rate': 3.732809299135162e-05, 'epoch': 1.91} 19%|█▉ | 7897/41250 [19:04:25<80:25:41, 8.68s/it][2025-04-26 03:02:08,387] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.97 | optimizer_gradients: 0.96 | optimizer_step: 1.01 [2025-04-26 03:02:08,388] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.59 | bwd_microstep: 5784.13 | bwd_inner_microstep: 5656.44 | bwd_allreduce_microstep: 127.65 | step_microstep: 18.19 [2025-04-26 03:02:08,388] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.59 | bwd: 5784.14 | bwd_inner: 5656.44 | bwd_allreduce: 127.67 | step: 18.20 19%|█▉ | 7898/41250 [19:04:33<80:27:57, 8.69s/it] {'loss': 0.1238, 'grad_norm': 1.451407551765442, 'learning_rate': 3.732730880773447e-05, 'epoch': 1.91} 19%|█▉ | 7898/41250 [19:04:33<80:27:57, 8.69s/it][2025-04-26 03:02:17,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.98 | optimizer_step: 0.94 [2025-04-26 03:02:17,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.50 | bwd_microstep: 5874.32 | bwd_inner_microstep: 5698.28 | bwd_allreduce_microstep: 175.99 | step_microstep: 18.36 [2025-04-26 03:02:17,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.50 | bwd: 5874.34 | bwd_inner: 5698.28 | bwd_allreduce: 176.01 | step: 18.37 19%|█▉ | 7899/41250 [19:04:42<80:48:43, 8.72s/it] {'loss': 0.1742, 'grad_norm': 0.9812632203102112, 'learning_rate': 3.732652451729787e-05, 'epoch': 1.91} 19%|█▉ | 7899/41250 [19:04:42<80:48:43, 8.72s/it][2025-04-26 03:02:25,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 1.04 | optimizer_step: 0.98 [2025-04-26 03:02:25,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.52 | bwd_microstep: 5827.36 | bwd_inner_microstep: 5645.14 | bwd_allreduce_microstep: 182.17 | step_microstep: 18.91 [2025-04-26 03:02:25,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.52 | bwd: 5827.37 | bwd_inner: 5645.14 | bwd_allreduce: 182.19 | step: 18.91 19%|█▉ | 7900/41250 [19:04:51<80:50:33, 8.73s/it] {'loss': 0.3378, 'grad_norm': 2.7800614833831787, 'learning_rate': 3.732574012004665e-05, 'epoch': 1.92} 
19%|█▉ | 7900/41250 [19:04:51<80:50:33, 8.73s/it][2025-04-26 03:02:34,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:02:34,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.17 | bwd_microstep: 5715.79 | bwd_inner_microstep: 5656.63 | bwd_allreduce_microstep: 59.11 | step_microstep: 18.32 [2025-04-26 03:02:34,556] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.17 | bwd: 5715.80 | bwd_inner: 5656.63 | bwd_allreduce: 59.13 | step: 18.32 19%|█▉ | 7901/41250 [19:04:59<80:33:00, 8.70s/it] {'loss': 0.1774, 'grad_norm': 1.8070062398910522, 'learning_rate': 3.732495561598565e-05, 'epoch': 1.92} 19%|█▉ | 7901/41250 [19:04:59<80:33:00, 8.70s/it][2025-04-26 03:02:43,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 03:02:43,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.43 | bwd_microstep: 5726.83 | bwd_inner_microstep: 5651.63 | bwd_allreduce_microstep: 75.15 | step_microstep: 18.95 [2025-04-26 03:02:43,191] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.43 | bwd: 5726.84 | bwd_inner: 5651.63 | bwd_allreduce: 75.17 | step: 18.95 19%|█▉ | 7902/41250 [19:05:08<80:22:53, 8.68s/it] {'loss': 0.0716, 'grad_norm': 1.2515931129455566, 'learning_rate': 3.7324171005119716e-05, 'epoch': 1.92} 19%|█▉ | 7902/41250 [19:05:08<80:22:53, 8.68s/it][2025-04-26 03:02:51,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:02:51,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.45 | bwd_microstep: 5757.69 | bwd_inner_microstep: 5694.09 | bwd_allreduce_microstep: 63.56 | step_microstep: 18.85 [2025-04-26 03:02:51,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 
2845.45 | bwd: 5757.71 | bwd_inner: 5694.09 | bwd_allreduce: 63.58 | step: 18.85 19%|█▉ | 7903/41250 [19:05:17<80:24:42, 8.68s/it] {'loss': 0.215, 'grad_norm': 1.8425160646438599, 'learning_rate': 3.732338628745367e-05, 'epoch': 1.92} 19%|█▉ | 7903/41250 [19:05:17<80:24:42, 8.68s/it][2025-04-26 03:03:00,624] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-26 03:03:00,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.93 | bwd_microstep: 5780.49 | bwd_inner_microstep: 5767.68 | bwd_allreduce_microstep: 12.77 | step_microstep: 18.98 [2025-04-26 03:03:00,625] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.93 | bwd: 5780.51 | bwd_inner: 5767.68 | bwd_allreduce: 12.78 | step: 18.98 19%|█▉ | 7904/41250 [19:05:25<80:35:16, 8.70s/it] {'loss': 0.2034, 'grad_norm': 1.8444890975952148, 'learning_rate': 3.732260146299236e-05, 'epoch': 1.92} 19%|█▉ | 7904/41250 [19:05:25<80:35:16, 8.70s/it][2025-04-26 03:03:09,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:03:09,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.67 | bwd_microstep: 5721.10 | bwd_inner_microstep: 5654.40 | bwd_allreduce_microstep: 66.65 | step_microstep: 18.81 [2025-04-26 03:03:09,268] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.67 | bwd: 5721.11 | bwd_inner: 5654.40 | bwd_allreduce: 66.67 | step: 18.81 19%|█▉ | 7905/41250 [19:05:34<80:25:24, 8.68s/it] {'loss': 0.06, 'grad_norm': 0.7454511523246765, 'learning_rate': 3.732181653174061e-05, 'epoch': 1.92} 19%|█▉ | 7905/41250 [19:05:34<80:25:24, 8.68s/it][2025-04-26 03:03:17,901] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:03:17,902] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd_microstep: 2836.08 | bwd_microstep: 5717.09 | bwd_inner_microstep: 5653.99 | bwd_allreduce_microstep: 63.05 | step_microstep: 18.68 [2025-04-26 03:03:17,902] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.08 | bwd: 5717.10 | bwd_inner: 5653.99 | bwd_allreduce: 63.07 | step: 18.69 19%|█▉ | 7906/41250 [19:05:43<80:17:24, 8.67s/it] {'loss': 0.1409, 'grad_norm': 1.1755908727645874, 'learning_rate': 3.732103149370327e-05, 'epoch': 1.92} 19%|█▉ | 7906/41250 [19:05:43<80:17:24, 8.67s/it][2025-04-26 03:03:26,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:03:26,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.96 | bwd_microstep: 5768.23 | bwd_inner_microstep: 5657.35 | bwd_allreduce_microstep: 110.84 | step_microstep: 18.65 [2025-04-26 03:03:26,584] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.96 | bwd: 5768.25 | bwd_inner: 5657.35 | bwd_allreduce: 110.85 | step: 18.65 19%|█▉ | 7907/41250 [19:05:51<80:19:31, 8.67s/it] {'loss': 0.0642, 'grad_norm': 0.9501145482063293, 'learning_rate': 3.732024634888518e-05, 'epoch': 1.92} 19%|█▉ | 7907/41250 [19:05:51<80:19:31, 8.67s/it][2025-04-26 03:03:35,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:03:35,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.30 | bwd_microstep: 5755.97 | bwd_inner_microstep: 5705.89 | bwd_allreduce_microstep: 50.04 | step_microstep: 18.51 [2025-04-26 03:03:35,271] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.30 | bwd: 5755.98 | bwd_inner: 5705.89 | bwd_allreduce: 50.05 | step: 18.52 19%|█▉ | 7908/41250 [19:06:00<80:21:43, 8.68s/it] {'loss': 0.0674, 'grad_norm': 1.5506538152694702, 'learning_rate': 3.731946109729118e-05, 'epoch': 1.92} 19%|█▉ | 7908/41250 [19:06:00<80:21:43, 
8.68s/it][2025-04-26 03:03:43,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:03:43,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.19 | bwd_microstep: 5718.52 | bwd_inner_microstep: 5705.73 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.81 [2025-04-26 03:03:43,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.20 | bwd: 5718.54 | bwd_inner: 5705.73 | bwd_allreduce: 12.76 | step: 18.81 19%|█▉ | 7909/41250 [19:06:09<80:17:12, 8.67s/it] {'loss': 0.0481, 'grad_norm': 0.77467280626297, 'learning_rate': 3.731867573892611e-05, 'epoch': 1.92} 19%|█▉ | 7909/41250 [19:06:09<80:17:12, 8.67s/it][2025-04-26 03:03:52,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 03:03:52,596] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.49 | bwd_microstep: 5765.75 | bwd_inner_microstep: 5650.96 | bwd_allreduce_microstep: 114.75 | step_microstep: 18.43 [2025-04-26 03:03:52,597] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.49 | bwd: 5765.76 | bwd_inner: 5650.96 | bwd_allreduce: 114.76 | step: 18.43 19%|█▉ | 7910/41250 [19:06:17<80:18:00, 8.67s/it] {'loss': 0.1384, 'grad_norm': 1.3182562589645386, 'learning_rate': 3.7317890273794804e-05, 'epoch': 1.92} 19%|█▉ | 7910/41250 [19:06:17<80:18:00, 8.67s/it][2025-04-26 03:04:01,332] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.03 | optimizer_step: 1.07 [2025-04-26 03:04:01,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2877.80 | bwd_microstep: 5775.19 | bwd_inner_microstep: 5762.27 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.19 [2025-04-26 03:04:01,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2877.80 | bwd: 5775.20 | bwd_inner: 5762.27 | 
bwd_allreduce: 12.88 | step: 19.20 19%|█▉ | 7911/41250 [19:06:26<80:28:55, 8.69s/it] {'loss': 0.157, 'grad_norm': 1.798346757888794, 'learning_rate': 3.731710470190212e-05, 'epoch': 1.92} 19%|█▉ | 7911/41250 [19:06:26<80:28:55, 8.69s/it][2025-04-26 03:04:10,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 1.13 | optimizer_step: 1.00 [2025-04-26 03:04:10,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.83 | bwd_microstep: 5810.20 | bwd_inner_microstep: 5652.04 | bwd_allreduce_microstep: 158.11 | step_microstep: 19.43 [2025-04-26 03:04:10,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.83 | bwd: 5810.21 | bwd_inner: 5652.04 | bwd_allreduce: 158.13 | step: 19.43 19%|█▉ | 7912/41250 [19:06:35<80:33:19, 8.70s/it] {'loss': 0.132, 'grad_norm': 1.1590380668640137, 'learning_rate': 3.731631902325288e-05, 'epoch': 1.92} 19%|█▉ | 7912/41250 [19:06:35<80:33:19, 8.70s/it][2025-04-26 03:04:18,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:04:18,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.33 | bwd_microstep: 5780.18 | bwd_inner_microstep: 5640.03 | bwd_allreduce_microstep: 140.10 | step_microstep: 18.51 [2025-04-26 03:04:18,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.33 | bwd: 5780.19 | bwd_inner: 5640.03 | bwd_allreduce: 140.12 | step: 18.51 19%|█▉ | 7913/41250 [19:06:44<80:30:00, 8.69s/it] {'loss': 0.0868, 'grad_norm': 0.9999778866767883, 'learning_rate': 3.731553323785194e-05, 'epoch': 1.92} 19%|█▉ | 7913/41250 [19:06:44<80:30:00, 8.69s/it][2025-04-26 03:04:27,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:04:27,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.34 | 
bwd_microstep: 5727.54 | bwd_inner_microstep: 5631.23 | bwd_allreduce_microstep: 96.26 | step_microstep: 18.97 [2025-04-26 03:04:27,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.34 | bwd: 5727.56 | bwd_inner: 5631.23 | bwd_allreduce: 96.28 | step: 18.98 19%|█▉ | 7914/41250 [19:06:52<80:19:18, 8.67s/it] {'loss': 0.1199, 'grad_norm': 1.7564719915390015, 'learning_rate': 3.731474734570416e-05, 'epoch': 1.92} 19%|█▉ | 7914/41250 [19:06:52<80:19:18, 8.67s/it][2025-04-26 03:04:35,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 03:04:35,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.85 | bwd_microstep: 5707.64 | bwd_inner_microstep: 5694.91 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.91 [2025-04-26 03:04:35,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.86 | bwd: 5707.66 | bwd_inner: 5694.91 | bwd_allreduce: 12.70 | step: 18.91 19%|█▉ | 7915/41250 [19:07:01<80:13:10, 8.66s/it] {'loss': 0.1059, 'grad_norm': 3.0658795833587646, 'learning_rate': 3.7313961346814356e-05, 'epoch': 1.92} 19%|█▉ | 7915/41250 [19:07:01<80:13:10, 8.66s/it][2025-04-26 03:04:44,690] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:04:44,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.22 | bwd_microstep: 5785.62 | bwd_inner_microstep: 5652.41 | bwd_allreduce_microstep: 133.15 | step_microstep: 18.89 [2025-04-26 03:04:44,691] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.22 | bwd: 5785.63 | bwd_inner: 5652.41 | bwd_allreduce: 133.18 | step: 18.89 19%|█▉ | 7916/41250 [19:07:10<80:17:45, 8.67s/it] {'loss': 0.0619, 'grad_norm': 0.9082417488098145, 'learning_rate': 3.7313175241187386e-05, 'epoch': 1.92} 19%|█▉ | 7916/41250 [19:07:10<80:17:45, 8.67s/it][2025-04-26 03:04:53,380] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.99 | optimizer_step: 1.04 [2025-04-26 03:04:53,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.66 | bwd_microstep: 5784.28 | bwd_inner_microstep: 5642.00 | bwd_allreduce_microstep: 142.24 | step_microstep: 18.67 [2025-04-26 03:04:53,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.66 | bwd: 5784.30 | bwd_inner: 5642.00 | bwd_allreduce: 142.26 | step: 18.67 19%|█▉ | 7917/41250 [19:07:18<80:20:40, 8.68s/it] {'loss': 0.0635, 'grad_norm': 1.8001476526260376, 'learning_rate': 3.731238902882809e-05, 'epoch': 1.92} 19%|█▉ | 7917/41250 [19:07:18<80:20:40, 8.68s/it][2025-04-26 03:05:02,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 03:05:02,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.71 | bwd_microstep: 5710.01 | bwd_inner_microstep: 5696.89 | bwd_allreduce_microstep: 13.07 | step_microstep: 18.93 [2025-04-26 03:05:02,024] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.71 | bwd: 5710.03 | bwd_inner: 5696.89 | bwd_allreduce: 13.09 | step: 18.93 19%|█▉ | 7918/41250 [19:07:27<80:15:33, 8.67s/it] {'loss': 0.1539, 'grad_norm': 1.3480944633483887, 'learning_rate': 3.731160270974133e-05, 'epoch': 1.92} 19%|█▉ | 7918/41250 [19:07:27<80:15:33, 8.67s/it][2025-04-26 03:05:10,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-26 03:05:10,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.22 | bwd_microstep: 5754.02 | bwd_inner_microstep: 5636.56 | bwd_allreduce_microstep: 117.42 | step_microstep: 18.47 [2025-04-26 03:05:10,688] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.23 | bwd: 5754.04 | bwd_inner: 5636.56 | bwd_allreduce: 117.44 | step: 
18.47 19%|█▉ | 7919/41250 [19:07:36<80:13:56, 8.67s/it] {'loss': 0.1666, 'grad_norm': 1.2967370748519897, 'learning_rate': 3.731081628393194e-05, 'epoch': 1.92} 19%|█▉ | 7919/41250 [19:07:36<80:13:56, 8.67s/it][2025-04-26 03:05:19,384] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 1.12 [2025-04-26 03:05:19,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.94 | bwd_microstep: 5787.69 | bwd_inner_microstep: 5649.29 | bwd_allreduce_microstep: 138.35 | step_microstep: 19.43 [2025-04-26 03:05:19,385] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.94 | bwd: 5787.70 | bwd_inner: 5649.29 | bwd_allreduce: 138.37 | step: 19.43 19%|█▉ | 7920/41250 [19:07:44<80:19:00, 8.68s/it] {'loss': 0.2194, 'grad_norm': 0.869810163974762, 'learning_rate': 3.731002975140477e-05, 'epoch': 1.92} 19%|█▉ | 7920/41250 [19:07:44<80:19:00, 8.68s/it][2025-04-26 03:05:28,119] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:05:28,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2881.07 | bwd_microstep: 5770.84 | bwd_inner_microstep: 5758.06 | bwd_allreduce_microstep: 12.74 | step_microstep: 18.87 [2025-04-26 03:05:28,120] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2881.07 | bwd: 5770.86 | bwd_inner: 5758.06 | bwd_allreduce: 12.76 | step: 18.88 19%|█▉ | 7921/41250 [19:07:53<80:28:51, 8.69s/it] {'loss': 0.0846, 'grad_norm': 1.3780314922332764, 'learning_rate': 3.730924311216468e-05, 'epoch': 1.92} 19%|█▉ | 7921/41250 [19:07:53<80:28:51, 8.69s/it][2025-04-26 03:05:36,729] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:05:36,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.95 | bwd_microstep: 5704.59 | 
bwd_inner_microstep: 5636.74 | bwd_allreduce_microstep: 67.79 | step_microstep: 18.83 [2025-04-26 03:05:36,730] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.95 | bwd: 5704.60 | bwd_inner: 5636.74 | bwd_allreduce: 67.81 | step: 18.83 19%|█▉ | 7922/41250 [19:08:02<80:14:51, 8.67s/it] {'loss': 0.2874, 'grad_norm': 1.7254716157913208, 'learning_rate': 3.73084563662165e-05, 'epoch': 1.92} 19%|█▉ | 7922/41250 [19:08:02<80:14:51, 8.67s/it][2025-04-26 03:05:45,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:05:45,315] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.16 | bwd_microstep: 5681.07 | bwd_inner_microstep: 5643.21 | bwd_allreduce_microstep: 37.82 | step_microstep: 18.87 [2025-04-26 03:05:45,316] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.17 | bwd: 5681.08 | bwd_inner: 5643.21 | bwd_allreduce: 37.83 | step: 18.87 19%|█▉ | 7923/41250 [19:08:10<80:00:56, 8.64s/it] {'loss': 0.3364, 'grad_norm': 1.4619330167770386, 'learning_rate': 3.7307669513565106e-05, 'epoch': 1.92} 19%|█▉ | 7923/41250 [19:08:10<80:00:56, 8.64s/it][2025-04-26 03:05:53,985] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:05:53,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.68 | bwd_microstep: 5743.39 | bwd_inner_microstep: 5681.31 | bwd_allreduce_microstep: 62.03 | step_microstep: 18.79 [2025-04-26 03:05:53,986] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.68 | bwd: 5743.40 | bwd_inner: 5681.31 | bwd_allreduce: 62.05 | step: 18.80 19%|█▉ | 7924/41250 [19:08:19<80:05:20, 8.65s/it] {'loss': 0.0769, 'grad_norm': 1.7622699737548828, 'learning_rate': 3.730688255421532e-05, 'epoch': 1.92} 19%|█▉ | 7924/41250 [19:08:19<80:05:20, 8.65s/it][2025-04-26 03:06:02,587] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.08 | optimizer_step: 0.97 [2025-04-26 03:06:02,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.56 | bwd_microstep: 5693.66 | bwd_inner_microstep: 5644.63 | bwd_allreduce_microstep: 48.98 | step_microstep: 19.28 [2025-04-26 03:06:02,587] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.56 | bwd: 5693.67 | bwd_inner: 5644.63 | bwd_allreduce: 49.00 | step: 19.28 19%|█▉ | 7925/41250 [19:08:27<79:56:51, 8.64s/it] {'loss': 0.0468, 'grad_norm': 0.5831716060638428, 'learning_rate': 3.7306095488172014e-05, 'epoch': 1.92} 19%|█▉ | 7925/41250 [19:08:27<79:56:51, 8.64s/it][2025-04-26 03:06:11,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 1.05 | optimizer_step: 1.04 [2025-04-26 03:06:11,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.70 | bwd_microstep: 5706.24 | bwd_inner_microstep: 5693.32 | bwd_allreduce_microstep: 12.87 | step_microstep: 19.71 [2025-04-26 03:06:11,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.70 | bwd: 5706.25 | bwd_inner: 5693.32 | bwd_allreduce: 12.88 | step: 19.71 19%|█▉ | 7926/41250 [19:08:36<79:58:21, 8.64s/it] {'loss': 0.1827, 'grad_norm': 1.321038842201233, 'learning_rate': 3.730530831544003e-05, 'epoch': 1.92} 19%|█▉ | 7926/41250 [19:08:36<79:58:21, 8.64s/it][2025-04-26 03:06:19,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.05 | optimizer_step: 0.95 [2025-04-26 03:06:19,812] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.44 | bwd_microstep: 5677.30 | bwd_inner_microstep: 5633.15 | bwd_allreduce_microstep: 44.09 | step_microstep: 19.33 [2025-04-26 03:06:19,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.44 | bwd: 5677.31 | bwd_inner: 5633.15 | bwd_allreduce: 44.12 | step: 19.33 19%|█▉ 
| 7927/41250 [19:08:45<79:48:19, 8.62s/it] {'loss': 0.3219, 'grad_norm': 6.648956298828125, 'learning_rate': 3.730452103602423e-05, 'epoch': 1.92} 19%|█▉ | 7927/41250 [19:08:45<79:48:19, 8.62s/it][2025-04-26 03:06:28,422] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 03:06:28,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.70 | bwd_microstep: 5705.16 | bwd_inner_microstep: 5639.31 | bwd_allreduce_microstep: 65.81 | step_microstep: 19.40 [2025-04-26 03:06:28,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.70 | bwd: 5705.18 | bwd_inner: 5639.31 | bwd_allreduce: 65.82 | step: 19.41 19%|█▉ | 7928/41250 [19:08:53<79:46:21, 8.62s/it] {'loss': 0.36, 'grad_norm': 3.318005084991455, 'learning_rate': 3.730373364992945e-05, 'epoch': 1.92} 19%|█▉ | 7928/41250 [19:08:53<79:46:21, 8.62s/it][2025-04-26 03:06:37,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-26 03:06:37,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.50 | bwd_microstep: 5764.58 | bwd_inner_microstep: 5645.90 | bwd_allreduce_microstep: 118.63 | step_microstep: 19.34 [2025-04-26 03:06:37,098] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.50 | bwd: 5764.60 | bwd_inner: 5645.90 | bwd_allreduce: 118.65 | step: 19.34 19%|█▉ | 7929/41250 [19:09:02<79:55:44, 8.64s/it] {'loss': 0.0881, 'grad_norm': 0.9827333092689514, 'learning_rate': 3.730294615716056e-05, 'epoch': 1.92} 19%|█▉ | 7929/41250 [19:09:02<79:55:44, 8.64s/it][2025-04-26 03:06:45,874] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 1.06 [2025-04-26 03:06:45,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.55 | bwd_microstep: 5846.07 | bwd_inner_microstep: 5679.12 | 
bwd_allreduce_microstep: 166.89 | step_microstep: 19.25 [2025-04-26 03:06:45,875] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.55 | bwd: 5846.08 | bwd_inner: 5679.12 | bwd_allreduce: 166.92 | step: 19.25 19%|█▉ | 7930/41250 [19:09:11<80:19:02, 8.68s/it] {'loss': 0.2453, 'grad_norm': 1.443005084991455, 'learning_rate': 3.7302158557722415e-05, 'epoch': 1.92} 19%|█▉ | 7930/41250 [19:09:11<80:19:02, 8.68s/it][2025-04-26 03:06:54,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.04 | optimizer_step: 0.91 [2025-04-26 03:06:54,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.65 | bwd_microstep: 5702.56 | bwd_inner_microstep: 5689.74 | bwd_allreduce_microstep: 12.78 | step_microstep: 19.19 [2025-04-26 03:06:54,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.65 | bwd: 5702.58 | bwd_inner: 5689.74 | bwd_allreduce: 12.79 | step: 19.19 19%|█▉ | 7931/41250 [19:09:19<80:11:49, 8.67s/it] {'loss': 0.0793, 'grad_norm': 2.276012420654297, 'learning_rate': 3.730137085161986e-05, 'epoch': 1.92} 19%|█▉ | 7931/41250 [19:09:19<80:11:49, 8.67s/it][2025-04-26 03:07:03,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:07:03,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.86 | bwd_microstep: 5699.49 | bwd_inner_microstep: 5680.18 | bwd_allreduce_microstep: 19.26 | step_microstep: 19.22 [2025-04-26 03:07:03,137] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.86 | bwd: 5699.51 | bwd_inner: 5680.18 | bwd_allreduce: 19.28 | step: 19.22 19%|█▉ | 7932/41250 [19:09:28<80:05:19, 8.65s/it] {'loss': 0.0806, 'grad_norm': 1.9377950429916382, 'learning_rate': 3.730058303885776e-05, 'epoch': 1.92} 19%|█▉ | 7932/41250 [19:09:28<80:05:19, 8.65s/it][2025-04-26 03:07:11,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.31 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-26 03:07:11,744] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.44 | bwd_microstep: 5694.73 | bwd_inner_microstep: 5656.87 | bwd_allreduce_microstep: 37.81 | step_microstep: 19.10 [2025-04-26 03:07:11,745] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.44 | bwd: 5694.74 | bwd_inner: 5656.87 | bwd_allreduce: 37.83 | step: 19.10 19%|█▉ | 7933/41250 [19:09:37<79:57:25, 8.64s/it] {'loss': 0.0568, 'grad_norm': 0.9302794337272644, 'learning_rate': 3.729979511944097e-05, 'epoch': 1.92} 19%|█▉ | 7933/41250 [19:09:37<79:57:25, 8.64s/it][2025-04-26 03:07:20,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 1.03 [2025-04-26 03:07:20,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.06 | bwd_microstep: 5762.19 | bwd_inner_microstep: 5685.25 | bwd_allreduce_microstep: 76.88 | step_microstep: 19.24 [2025-04-26 03:07:20,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.06 | bwd: 5762.21 | bwd_inner: 5685.25 | bwd_allreduce: 76.91 | step: 19.24 19%|█▉ | 7934/41250 [19:09:45<80:06:23, 8.66s/it] {'loss': 0.0664, 'grad_norm': 1.2978895902633667, 'learning_rate': 3.729900709337434e-05, 'epoch': 1.92} 19%|█▉ | 7934/41250 [19:09:45<80:06:23, 8.66s/it][2025-04-26 03:07:29,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 03:07:29,126] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.28 | bwd_microstep: 5758.32 | bwd_inner_microstep: 5690.73 | bwd_allreduce_microstep: 67.53 | step_microstep: 19.21 [2025-04-26 03:07:29,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.28 | bwd: 5758.33 | bwd_inner: 5690.73 | bwd_allreduce: 67.55 | step: 19.22 19%|█▉ | 7935/41250 [19:09:54<80:11:28, 8.67s/it] 
{'loss': 0.0931, 'grad_norm': 1.3641525506973267, 'learning_rate': 3.729821896066274e-05, 'epoch': 1.92} 19%|█▉ | 7935/41250 [19:09:54<80:11:28, 8.67s/it][2025-04-26 03:07:37,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:07:37,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.67 | bwd_microstep: 5768.69 | bwd_inner_microstep: 5654.96 | bwd_allreduce_microstep: 113.69 | step_microstep: 18.70 [2025-04-26 03:07:37,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.67 | bwd: 5768.71 | bwd_inner: 5654.96 | bwd_allreduce: 113.71 | step: 18.70 19%|█▉ | 7936/41250 [19:10:03<80:13:41, 8.67s/it] {'loss': 0.0904, 'grad_norm': 1.4645546674728394, 'learning_rate': 3.729743072131101e-05, 'epoch': 1.92} 19%|█▉ | 7936/41250 [19:10:03<80:13:41, 8.67s/it][2025-04-26 03:07:46,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:07:46,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.14 | bwd_microstep: 5770.28 | bwd_inner_microstep: 5685.69 | bwd_allreduce_microstep: 84.55 | step_microstep: 18.70 [2025-04-26 03:07:46,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.14 | bwd: 5770.29 | bwd_inner: 5685.69 | bwd_allreduce: 84.56 | step: 18.70 19%|█▉ | 7937/41250 [19:10:11<80:20:16, 8.68s/it] {'loss': 0.2087, 'grad_norm': 3.589425802230835, 'learning_rate': 3.7296642375324046e-05, 'epoch': 1.92} 19%|█▉ | 7937/41250 [19:10:11<80:20:16, 8.68s/it][2025-04-26 03:07:55,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:07:55,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.48 | bwd_microstep: 5769.10 | bwd_inner_microstep: 5655.86 | bwd_allreduce_microstep: 113.20 | 
step_microstep: 18.77 [2025-04-26 03:07:55,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.48 | bwd: 5769.11 | bwd_inner: 5655.86 | bwd_allreduce: 113.22 | step: 18.78 19%|█▉ | 7938/41250 [19:10:20<80:19:52, 8.68s/it] {'loss': 0.2117, 'grad_norm': 1.3596364259719849, 'learning_rate': 3.729585392270666e-05, 'epoch': 1.92} 19%|█▉ | 7938/41250 [19:10:20<80:19:52, 8.68s/it][2025-04-26 03:08:03,835] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:08:03,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.67 | bwd_microstep: 5713.50 | bwd_inner_microstep: 5701.01 | bwd_allreduce_microstep: 12.44 | step_microstep: 18.48 [2025-04-26 03:08:03,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.67 | bwd: 5713.51 | bwd_inner: 5701.01 | bwd_allreduce: 12.46 | step: 18.49 19%|█▉ | 7939/41250 [19:10:29<80:12:28, 8.67s/it] {'loss': 0.0765, 'grad_norm': 3.0703814029693604, 'learning_rate': 3.729506536346375e-05, 'epoch': 1.92} 19%|█▉ | 7939/41250 [19:10:29<80:12:28, 8.67s/it][2025-04-26 03:08:12,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:08:12,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.15 | bwd_microstep: 5754.07 | bwd_inner_microstep: 5692.79 | bwd_allreduce_microstep: 61.24 | step_microstep: 18.57 [2025-04-26 03:08:12,517] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.15 | bwd: 5754.09 | bwd_inner: 5692.79 | bwd_allreduce: 61.25 | step: 18.57 19%|█▉ | 7940/41250 [19:10:37<80:14:29, 8.67s/it] {'loss': 0.1183, 'grad_norm': 2.0854787826538086, 'learning_rate': 3.729427669760016e-05, 'epoch': 1.92} 19%|█▉ | 7940/41250 [19:10:37<80:14:29, 8.67s/it][2025-04-26 03:08:21,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | 
optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:08:21,223] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.55 | bwd_microstep: 5780.51 | bwd_inner_microstep: 5682.43 | bwd_allreduce_microstep: 98.04 | step_microstep: 18.34 [2025-04-26 03:08:21,224] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.55 | bwd: 5780.53 | bwd_inner: 5682.43 | bwd_allreduce: 98.06 | step: 18.34 19%|█▉ | 7941/41250 [19:10:46<80:19:59, 8.68s/it] {'loss': 0.0815, 'grad_norm': 1.9177812337875366, 'learning_rate': 3.7293487925120754e-05, 'epoch': 1.93} 19%|█▉ | 7941/41250 [19:10:46<80:19:59, 8.68s/it][2025-04-26 03:08:29,857] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.95 | optimizer_step: 0.98 [2025-04-26 03:08:29,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.20 | bwd_microstep: 5728.53 | bwd_inner_microstep: 5658.54 | bwd_allreduce_microstep: 69.95 | step_microstep: 18.43 [2025-04-26 03:08:29,858] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.20 | bwd: 5728.54 | bwd_inner: 5658.54 | bwd_allreduce: 69.96 | step: 18.43 19%|█▉ | 7942/41250 [19:10:55<80:11:53, 8.67s/it] {'loss': 0.1894, 'grad_norm': 2.014744997024536, 'learning_rate': 3.72926990460304e-05, 'epoch': 1.93} 19%|█▉ | 7942/41250 [19:10:55<80:11:53, 8.67s/it][2025-04-26 03:08:38,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:08:38,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5705.43 | bwd_inner_microstep: 5692.80 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.27 [2025-04-26 03:08:38,492] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.78 | bwd: 5705.44 | bwd_inner: 5692.80 | bwd_allreduce: 12.60 | step: 18.27 19%|█▉ | 7943/41250 [19:11:03<80:06:01, 8.66s/it] {'loss': 0.1267, 'grad_norm': 
2.4106807708740234, 'learning_rate': 3.7291910060333955e-05, 'epoch': 1.93} 19%|█▉ | 7943/41250 [19:11:03<80:06:01, 8.66s/it][2025-04-26 03:08:47,208] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:08:47,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.90 | bwd_microstep: 5772.05 | bwd_inner_microstep: 5698.96 | bwd_allreduce_microstep: 73.05 | step_microstep: 18.59 [2025-04-26 03:08:47,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.90 | bwd: 5772.06 | bwd_inner: 5698.96 | bwd_allreduce: 73.07 | step: 18.59 19%|█▉ | 7944/41250 [19:11:12<80:15:49, 8.68s/it] {'loss': 0.0336, 'grad_norm': 0.6657421588897705, 'learning_rate': 3.729112096803628e-05, 'epoch': 1.93} 19%|█▉ | 7944/41250 [19:11:12<80:15:49, 8.68s/it][2025-04-26 03:08:55,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.91 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:08:55,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.46 | bwd_microstep: 5754.04 | bwd_inner_microstep: 5695.41 | bwd_allreduce_microstep: 58.59 | step_microstep: 17.93 [2025-04-26 03:08:55,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.46 | bwd: 5754.05 | bwd_inner: 5695.41 | bwd_allreduce: 58.60 | step: 17.93 19%|█▉ | 7945/41250 [19:11:21<80:18:06, 8.68s/it] {'loss': 0.0312, 'grad_norm': 0.47408634424209595, 'learning_rate': 3.729033176914225e-05, 'epoch': 1.93} 19%|█▉ | 7945/41250 [19:11:21<80:18:06, 8.68s/it][2025-04-26 03:09:04,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.94 | optimizer_step: 0.89 [2025-04-26 03:09:04,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.69 | bwd_microstep: 5799.45 | bwd_inner_microstep: 5786.69 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.22 [2025-04-26 
03:09:04,667] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.69 | bwd: 5799.46 | bwd_inner: 5786.69 | bwd_allreduce: 12.73 | step: 18.23 19%|█▉ | 7946/41250 [19:11:29<80:32:37, 8.71s/it] {'loss': 0.3581, 'grad_norm': 3.2332303524017334, 'learning_rate': 3.728954246365673e-05, 'epoch': 1.93} 19%|█▉ | 7946/41250 [19:11:29<80:32:37, 8.71s/it][2025-04-26 03:09:13,313] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:09:13,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.80 | bwd_microstep: 5715.11 | bwd_inner_microstep: 5702.50 | bwd_allreduce_microstep: 12.58 | step_microstep: 18.30 [2025-04-26 03:09:13,314] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.80 | bwd: 5715.13 | bwd_inner: 5702.50 | bwd_allreduce: 12.59 | step: 18.30 19%|█▉ | 7947/41250 [19:11:38<80:22:32, 8.69s/it] {'loss': 0.0362, 'grad_norm': 0.6069901585578918, 'learning_rate': 3.7288753051584584e-05, 'epoch': 1.93} 19%|█▉ | 7947/41250 [19:11:38<80:22:32, 8.69s/it][2025-04-26 03:09:21,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:09:21,926] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.12 | bwd_microstep: 5707.55 | bwd_inner_microstep: 5645.21 | bwd_allreduce_microstep: 62.30 | step_microstep: 18.13 [2025-04-26 03:09:21,927] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.12 | bwd: 5707.56 | bwd_inner: 5645.21 | bwd_allreduce: 62.31 | step: 18.13 19%|█▉ | 7948/41250 [19:11:47<80:09:45, 8.67s/it] {'loss': 0.1015, 'grad_norm': 1.5355932712554932, 'learning_rate': 3.7287963532930665e-05, 'epoch': 1.93} 19%|█▉ | 7948/41250 [19:11:47<80:09:45, 8.67s/it][2025-04-26 03:09:30,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 0.99 | optimizer_step: 
0.89 [2025-04-26 03:09:30,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.17 | bwd_microstep: 5780.90 | bwd_inner_microstep: 5656.93 | bwd_allreduce_microstep: 123.91 | step_microstep: 18.80 [2025-04-26 03:09:30,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.17 | bwd: 5780.92 | bwd_inner: 5656.93 | bwd_allreduce: 123.93 | step: 18.80 19%|█▉ | 7949/41250 [19:11:55<80:14:57, 8.68s/it] {'loss': 0.0666, 'grad_norm': 1.1709758043289185, 'learning_rate': 3.728717390769986e-05, 'epoch': 1.93} 19%|█▉ | 7949/41250 [19:11:55<80:14:57, 8.68s/it][2025-04-26 03:09:39,327] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:09:39,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.74 | bwd_microstep: 5795.54 | bwd_inner_microstep: 5663.16 | bwd_allreduce_microstep: 132.34 | step_microstep: 18.38 [2025-04-26 03:09:39,328] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.74 | bwd: 5795.56 | bwd_inner: 5663.16 | bwd_allreduce: 132.36 | step: 18.38 19%|█▉ | 7950/41250 [19:12:04<80:19:33, 8.68s/it] {'loss': 0.2656, 'grad_norm': 2.4866485595703125, 'learning_rate': 3.728638417589702e-05, 'epoch': 1.93} 19%|█▉ | 7950/41250 [19:12:04<80:19:33, 8.68s/it][2025-04-26 03:09:47,999] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-26 03:09:48,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.21 | bwd_microstep: 5763.97 | bwd_inner_microstep: 5655.94 | bwd_allreduce_microstep: 107.98 | step_microstep: 18.93 [2025-04-26 03:09:48,000] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.21 | bwd: 5763.98 | bwd_inner: 5655.94 | bwd_allreduce: 108.00 | step: 18.93 19%|█▉ | 7951/41250 [19:12:13<80:17:41, 8.68s/it] {'loss': 0.0881, 'grad_norm': 1.0738067626953125, 'learning_rate': 
3.728559433752703e-05, 'epoch': 1.93} 19%|█▉ | 7951/41250 [19:12:13<80:17:41, 8.68s/it][2025-04-26 03:09:56,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:09:56,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.07 | bwd_microstep: 5770.74 | bwd_inner_microstep: 5662.43 | bwd_allreduce_microstep: 108.26 | step_microstep: 18.78 [2025-04-26 03:09:56,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.07 | bwd: 5770.75 | bwd_inner: 5662.43 | bwd_allreduce: 108.28 | step: 18.78 19%|█▉ | 7952/41250 [19:12:22<80:18:05, 8.68s/it] {'loss': 0.0178, 'grad_norm': 0.33761176466941833, 'learning_rate': 3.7284804392594747e-05, 'epoch': 1.93} 19%|█▉ | 7952/41250 [19:12:22<80:18:05, 8.68s/it][2025-04-26 03:10:05,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.22 | optimizer_step: 0.94 [2025-04-26 03:10:05,365] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.32 | bwd_microstep: 5769.53 | bwd_inner_microstep: 5654.96 | bwd_allreduce_microstep: 114.52 | step_microstep: 19.68 [2025-04-26 03:10:05,366] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.32 | bwd: 5769.54 | bwd_inner: 5654.96 | bwd_allreduce: 114.54 | step: 19.68 19%|█▉ | 7953/41250 [19:12:30<80:17:57, 8.68s/it] {'loss': 0.1516, 'grad_norm': 2.1915817260742188, 'learning_rate': 3.728401434110504e-05, 'epoch': 1.93} 19%|█▉ | 7953/41250 [19:12:30<80:17:57, 8.68s/it][2025-04-26 03:10:14,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:10:14,057] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.21 | bwd_microstep: 5772.20 | bwd_inner_microstep: 5649.09 | bwd_allreduce_microstep: 123.06 | step_microstep: 18.43 [2025-04-26 03:10:14,058] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.21 | bwd: 5772.21 | bwd_inner: 5649.09 | bwd_allreduce: 123.08 | step: 18.43 19%|█▉ | 7954/41250 [19:12:39<80:19:13, 8.68s/it] {'loss': 0.2289, 'grad_norm': 3.0541951656341553, 'learning_rate': 3.7283224183062795e-05, 'epoch': 1.93} 19%|█▉ | 7954/41250 [19:12:39<80:19:13, 8.68s/it][2025-04-26 03:10:22,695] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 03:10:22,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.34 | bwd_microstep: 5707.98 | bwd_inner_microstep: 5689.08 | bwd_allreduce_microstep: 18.86 | step_microstep: 18.79 [2025-04-26 03:10:22,696] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.34 | bwd: 5707.99 | bwd_inner: 5689.08 | bwd_allreduce: 18.88 | step: 18.79 19%|█▉ | 7955/41250 [19:12:48<80:11:29, 8.67s/it] {'loss': 0.1234, 'grad_norm': 2.425520896911621, 'learning_rate': 3.728243391847286e-05, 'epoch': 1.93} 19%|█▉ | 7955/41250 [19:12:48<80:11:29, 8.67s/it][2025-04-26 03:10:31,325] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.32 | optimizer_step: 1.03 [2025-04-26 03:10:31,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.59 | bwd_microstep: 5694.85 | bwd_inner_microstep: 5681.17 | bwd_allreduce_microstep: 13.62 | step_microstep: 19.98 [2025-04-26 03:10:31,326] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.59 | bwd: 5694.87 | bwd_inner: 5681.17 | bwd_allreduce: 13.65 | step: 19.98 19%|█▉ | 7956/41250 [19:12:56<80:04:37, 8.66s/it] {'loss': 0.1787, 'grad_norm': 2.1453001499176025, 'learning_rate': 3.728164354734013e-05, 'epoch': 1.93} 19%|█▉ | 7956/41250 [19:12:56<80:04:37, 8.66s/it][2025-04-26 03:10:39,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 1.03 [2025-04-26 
03:10:39,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.15 | bwd_microstep: 5707.43 | bwd_inner_microstep: 5635.36 | bwd_allreduce_microstep: 72.03 | step_microstep: 18.80 [2025-04-26 03:10:39,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.15 | bwd: 5707.44 | bwd_inner: 5635.36 | bwd_allreduce: 72.04 | step: 18.81 19%|█▉ | 7957/41250 [19:13:05<79:57:24, 8.65s/it] {'loss': 0.1238, 'grad_norm': 2.6602509021759033, 'learning_rate': 3.728085306966946e-05, 'epoch': 1.93} 19%|█▉ | 7957/41250 [19:13:05<79:57:24, 8.65s/it][2025-04-26 03:10:48,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:10:48,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.25 | bwd_microstep: 5698.76 | bwd_inner_microstep: 5686.21 | bwd_allreduce_microstep: 12.50 | step_microstep: 18.84 [2025-04-26 03:10:48,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.25 | bwd: 5698.77 | bwd_inner: 5686.21 | bwd_allreduce: 12.52 | step: 18.85 19%|█▉ | 7958/41250 [19:13:13<79:54:15, 8.64s/it] {'loss': 0.1629, 'grad_norm': 2.0360188484191895, 'learning_rate': 3.728006248546573e-05, 'epoch': 1.93} 19%|█▉ | 7958/41250 [19:13:13<79:54:15, 8.64s/it][2025-04-26 03:10:57,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:10:57,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.34 | bwd_microstep: 5713.37 | bwd_inner_microstep: 5700.47 | bwd_allreduce_microstep: 12.86 | step_microstep: 18.94 [2025-04-26 03:10:57,218] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.34 | bwd: 5713.39 | bwd_inner: 5700.47 | bwd_allreduce: 12.88 | step: 18.95 19%|█▉ | 7959/41250 [19:13:22<79:55:31, 8.64s/it] {'loss': 0.0838, 'grad_norm': 1.1416176557540894, 'learning_rate': 3.7279271794733814e-05, 
'epoch': 1.93} 19%|█▉ | 7959/41250 [19:13:22<79:55:31, 8.64s/it][2025-04-26 03:11:05,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:11:05,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.45 | bwd_microstep: 5731.96 | bwd_inner_microstep: 5697.14 | bwd_allreduce_microstep: 34.78 | step_microstep: 18.85 [2025-04-26 03:11:05,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.46 | bwd: 5731.97 | bwd_inner: 5697.14 | bwd_allreduce: 34.80 | step: 18.86 19%|█▉ | 7960/41250 [19:13:31<79:57:55, 8.65s/it] {'loss': 0.1522, 'grad_norm': 2.971947431564331, 'learning_rate': 3.727848099747859e-05, 'epoch': 1.93} 19%|█▉ | 7960/41250 [19:13:31<79:57:55, 8.65s/it][2025-04-26 03:11:14,525] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:11:14,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.13 | bwd_microstep: 5743.69 | bwd_inner_microstep: 5641.06 | bwd_allreduce_microstep: 102.59 | step_microstep: 18.32 [2025-04-26 03:11:14,526] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.13 | bwd: 5743.70 | bwd_inner: 5641.06 | bwd_allreduce: 102.60 | step: 18.32 19%|█▉ | 7961/41250 [19:13:39<79:57:56, 8.65s/it] {'loss': 0.4389, 'grad_norm': 2.9058303833007812, 'learning_rate': 3.727769009370493e-05, 'epoch': 1.93} 19%|█▉ | 7961/41250 [19:13:39<79:57:56, 8.65s/it][2025-04-26 03:11:23,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.00 | optimizer_step: 0.94 [2025-04-26 03:11:23,210] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.23 | bwd_microstep: 5741.24 | bwd_inner_microstep: 5714.06 | bwd_allreduce_microstep: 27.14 | step_microstep: 19.09 [2025-04-26 03:11:23,211] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2861.23 | bwd: 5741.26 | bwd_inner: 5714.06 | bwd_allreduce: 27.15 | step: 19.09 19%|█▉ | 7962/41250 [19:13:48<80:04:02, 8.66s/it] {'loss': 0.0986, 'grad_norm': 1.353115200996399, 'learning_rate': 3.72768990834177e-05, 'epoch': 1.93} 19%|█▉ | 7962/41250 [19:13:48<80:04:02, 8.66s/it][2025-04-26 03:11:31,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:11:31,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.07 | bwd_microstep: 5766.51 | bwd_inner_microstep: 5642.02 | bwd_allreduce_microstep: 124.45 | step_microstep: 18.44 [2025-04-26 03:11:31,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.07 | bwd: 5766.53 | bwd_inner: 5642.02 | bwd_allreduce: 124.47 | step: 18.44 19%|█▉ | 7963/41250 [19:13:57<80:06:56, 8.66s/it] {'loss': 0.0799, 'grad_norm': 1.5568349361419678, 'learning_rate': 3.727610796662179e-05, 'epoch': 1.93} 19%|█▉ | 7963/41250 [19:13:57<80:06:56, 8.66s/it][2025-04-26 03:11:40,511] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:11:40,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.62 | bwd_microstep: 5695.62 | bwd_inner_microstep: 5682.45 | bwd_allreduce_microstep: 13.13 | step_microstep: 18.95 [2025-04-26 03:11:40,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.62 | bwd: 5695.64 | bwd_inner: 5682.45 | bwd_allreduce: 13.15 | step: 18.95 19%|█▉ | 7964/41250 [19:14:05<79:59:57, 8.65s/it] {'loss': 0.2719, 'grad_norm': 2.615628242492676, 'learning_rate': 3.727531674332207e-05, 'epoch': 1.93} 19%|█▉ | 7964/41250 [19:14:05<79:59:57, 8.65s/it][2025-04-26 03:11:49,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.97 | optimizer_step: 0.99 [2025-04-26 03:11:49,196] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2827.59 | bwd_microstep: 5775.09 | bwd_inner_microstep: 5641.10 | bwd_allreduce_microstep: 133.95 | step_microstep: 18.14 [2025-04-26 03:11:49,196] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.59 | bwd: 5775.10 | bwd_inner: 5641.10 | bwd_allreduce: 133.96 | step: 18.14 19%|█▉ | 7965/41250 [19:14:14<80:05:06, 8.66s/it] {'loss': 0.2598, 'grad_norm': 2.9421896934509277, 'learning_rate': 3.727452541352342e-05, 'epoch': 1.93} 19%|█▉ | 7965/41250 [19:14:14<80:05:06, 8.66s/it][2025-04-26 03:11:57,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:11:57,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.48 | bwd_microstep: 5684.46 | bwd_inner_microstep: 5671.70 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.60 [2025-04-26 03:11:57,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.48 | bwd: 5684.47 | bwd_inner: 5671.70 | bwd_allreduce: 12.73 | step: 18.61 19%|█▉ | 7966/41250 [19:14:23<79:55:59, 8.65s/it] {'loss': 0.1485, 'grad_norm': 2.126108169555664, 'learning_rate': 3.7273733977230726e-05, 'epoch': 1.93} 19%|█▉ | 7966/41250 [19:14:23<79:55:59, 8.65s/it][2025-04-26 03:12:06,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:12:06,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.19 | bwd_microstep: 5747.63 | bwd_inner_microstep: 5674.99 | bwd_allreduce_microstep: 72.60 | step_microstep: 18.93 [2025-04-26 03:12:06,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.19 | bwd: 5747.65 | bwd_inner: 5674.99 | bwd_allreduce: 72.62 | step: 18.93 19%|█▉ | 7967/41250 [19:14:31<80:00:21, 8.65s/it] {'loss': 0.2438, 'grad_norm': 3.4644885063171387, 'learning_rate': 3.727294243444886e-05, 'epoch': 1.93} 19%|█▉ | 7967/41250 
[19:14:31<80:00:21, 8.65s/it][2025-04-26 03:12:15,081] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.06 | optimizer_step: 1.18 [2025-04-26 03:12:15,082] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.35 | bwd_microstep: 5697.38 | bwd_inner_microstep: 5640.07 | bwd_allreduce_microstep: 57.26 | step_microstep: 19.51 [2025-04-26 03:12:15,082] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.35 | bwd: 5697.40 | bwd_inner: 5640.07 | bwd_allreduce: 57.28 | step: 19.51 19%|█▉ | 7968/41250 [19:14:40<79:52:25, 8.64s/it] {'loss': 0.2062, 'grad_norm': 1.6537140607833862, 'learning_rate': 3.727215078518269e-05, 'epoch': 1.93} 19%|█▉ | 7968/41250 [19:14:40<79:52:25, 8.64s/it][2025-04-26 03:12:23,713] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.08 | optimizer_step: 1.02 [2025-04-26 03:12:23,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.04 | bwd_microstep: 5697.27 | bwd_inner_microstep: 5684.18 | bwd_allreduce_microstep: 13.05 | step_microstep: 19.09 [2025-04-26 03:12:23,714] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.04 | bwd: 5697.29 | bwd_inner: 5684.18 | bwd_allreduce: 13.07 | step: 19.09 19%|█▉ | 7969/41250 [19:14:49<79:50:49, 8.64s/it] {'loss': 0.0318, 'grad_norm': 0.5292767882347107, 'learning_rate': 3.727135902943712e-05, 'epoch': 1.93} 19%|█▉ | 7969/41250 [19:14:49<79:50:49, 8.64s/it][2025-04-26 03:12:32,306] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:12:32,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.78 | bwd_microstep: 5672.46 | bwd_inner_microstep: 5636.97 | bwd_allreduce_microstep: 35.45 | step_microstep: 18.61 [2025-04-26 03:12:32,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.78 | bwd: 5672.47 | 
bwd_inner: 5636.96 | bwd_allreduce: 35.47 | step: 18.61 19%|█▉ | 7970/41250 [19:14:57<79:43:17, 8.62s/it] {'loss': 0.2203, 'grad_norm': 2.873655080795288, 'learning_rate': 3.7270567167217014e-05, 'epoch': 1.93} 19%|█▉ | 7970/41250 [19:14:57<79:43:17, 8.62s/it][2025-04-26 03:12:40,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.99 | optimizer_step: 0.98 [2025-04-26 03:12:40,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.58 | bwd_microstep: 5702.04 | bwd_inner_microstep: 5689.13 | bwd_allreduce_microstep: 12.86 | step_microstep: 19.13 [2025-04-26 03:12:40,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.58 | bwd: 5702.05 | bwd_inner: 5689.13 | bwd_allreduce: 12.88 | step: 19.14 19%|█▉ | 7971/41250 [19:15:06<79:45:59, 8.63s/it] {'loss': 0.058, 'grad_norm': 0.9976211190223694, 'learning_rate': 3.7269775198527255e-05, 'epoch': 1.93} 19%|█▉ | 7971/41250 [19:15:06<79:45:59, 8.63s/it][2025-04-26 03:12:49,798] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:12:49,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.70 | bwd_microstep: 5946.76 | bwd_inner_microstep: 5645.73 | bwd_allreduce_microstep: 300.98 | step_microstep: 18.71 [2025-04-26 03:12:49,799] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.70 | bwd: 5946.77 | bwd_inner: 5645.73 | bwd_allreduce: 301.00 | step: 18.71 19%|█▉ | 7972/41250 [19:15:15<80:22:56, 8.70s/it] {'loss': 0.1117, 'grad_norm': 1.5577534437179565, 'learning_rate': 3.726898312337273e-05, 'epoch': 1.93} 19%|█▉ | 7972/41250 [19:15:15<80:22:56, 8.70s/it][2025-04-26 03:12:58,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.07 | optimizer_step: 1.20 [2025-04-26 03:12:58,468] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2834.22 | bwd_microstep: 5751.17 | bwd_inner_microstep: 5647.52 | bwd_allreduce_microstep: 103.60 | step_microstep: 19.35 [2025-04-26 03:12:58,469] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.22 | bwd: 5751.18 | bwd_inner: 5647.52 | bwd_allreduce: 103.62 | step: 19.35 19%|█▉ | 7973/41250 [19:15:23<80:18:43, 8.69s/it] {'loss': 0.0891, 'grad_norm': 1.380790114402771, 'learning_rate': 3.7268190941758325e-05, 'epoch': 1.93} 19%|█▉ | 7973/41250 [19:15:23<80:18:43, 8.69s/it][2025-04-26 03:13:07,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 03:13:07,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.41 | bwd_microstep: 5768.22 | bwd_inner_microstep: 5639.17 | bwd_allreduce_microstep: 128.99 | step_microstep: 18.80 [2025-04-26 03:13:07,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.41 | bwd: 5768.23 | bwd_inner: 5639.17 | bwd_allreduce: 129.01 | step: 18.80 19%|█▉ | 7974/41250 [19:15:32<80:16:54, 8.69s/it] {'loss': 0.1135, 'grad_norm': 2.4950740337371826, 'learning_rate': 3.726739865368891e-05, 'epoch': 1.93} 19%|█▉ | 7974/41250 [19:15:32<80:16:54, 8.69s/it][2025-04-26 03:13:15,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.03 | optimizer_step: 1.09 [2025-04-26 03:13:15,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.10 | bwd_microstep: 5777.79 | bwd_inner_microstep: 5676.00 | bwd_allreduce_microstep: 101.74 | step_microstep: 19.27 [2025-04-26 03:13:15,854] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.10 | bwd: 5777.81 | bwd_inner: 5676.00 | bwd_allreduce: 101.76 | step: 19.26 19%|█▉ | 7975/41250 [19:15:41<80:20:05, 8.69s/it] {'loss': 0.09, 'grad_norm': 2.202418565750122, 'learning_rate': 3.726660625916939e-05, 'epoch': 1.93} 19%|█▉ | 7975/41250 [19:15:41<80:20:05, 8.69s/it][2025-04-26 
03:13:24,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:13:24,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.02 | bwd_microstep: 5712.87 | bwd_inner_microstep: 5700.12 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.92 [2025-04-26 03:13:24,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.02 | bwd: 5712.88 | bwd_inner: 5700.12 | bwd_allreduce: 12.72 | step: 18.93 19%|█▉ | 7976/41250 [19:15:49<80:12:22, 8.68s/it] {'loss': 0.2236, 'grad_norm': 1.9117599725723267, 'learning_rate': 3.726581375820463e-05, 'epoch': 1.93} 19%|█▉ | 7976/41250 [19:15:49<80:12:22, 8.68s/it][2025-04-26 03:13:33,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.26 | optimizer_step: 1.00 [2025-04-26 03:13:33,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.66 | bwd_microstep: 5756.89 | bwd_inner_microstep: 5642.13 | bwd_allreduce_microstep: 114.71 | step_microstep: 19.57 [2025-04-26 03:13:33,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.66 | bwd: 5756.91 | bwd_inner: 5642.13 | bwd_allreduce: 114.73 | step: 19.57 19%|█▉ | 7977/41250 [19:15:58<80:10:53, 8.68s/it] {'loss': 0.2876, 'grad_norm': 3.8188090324401855, 'learning_rate': 3.7265021150799526e-05, 'epoch': 1.93} 19%|█▉ | 7977/41250 [19:15:58<80:10:53, 8.68s/it][2025-04-26 03:13:41,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:13:41,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.39 | bwd_microstep: 5778.97 | bwd_inner_microstep: 5637.22 | bwd_allreduce_microstep: 141.71 | step_microstep: 18.74 [2025-04-26 03:13:41,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.39 | bwd: 5778.98 | bwd_inner: 5637.22 | bwd_allreduce: 
141.72 | step: 18.75 19%|█▉ | 7978/41250 [19:16:07<80:12:36, 8.68s/it] {'loss': 0.1796, 'grad_norm': 2.0775225162506104, 'learning_rate': 3.7264228436958966e-05, 'epoch': 1.93} 19%|█▉ | 7978/41250 [19:16:07<80:12:36, 8.68s/it][2025-04-26 03:13:50,477] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:13:50,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.38 | bwd_microstep: 5692.12 | bwd_inner_microstep: 5679.32 | bwd_allreduce_microstep: 12.76 | step_microstep: 18.72 [2025-04-26 03:13:50,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.38 | bwd: 5692.14 | bwd_inner: 5679.32 | bwd_allreduce: 12.78 | step: 18.72 19%|█▉ | 7979/41250 [19:16:15<80:03:12, 8.66s/it] {'loss': 0.0254, 'grad_norm': 0.34262606501579285, 'learning_rate': 3.726343561668783e-05, 'epoch': 1.93} 19%|█▉ | 7979/41250 [19:16:15<80:03:12, 8.66s/it][2025-04-26 03:13:59,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:13:59,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.21 | bwd_microstep: 5733.29 | bwd_inner_microstep: 5698.80 | bwd_allreduce_microstep: 34.45 | step_microstep: 18.44 [2025-04-26 03:13:59,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.21 | bwd: 5733.31 | bwd_inner: 5698.80 | bwd_allreduce: 34.47 | step: 18.45 19%|█▉ | 7980/41250 [19:16:24<80:03:25, 8.66s/it] {'loss': 0.2225, 'grad_norm': 1.800246000289917, 'learning_rate': 3.726264268999101e-05, 'epoch': 1.93} 19%|█▉ | 7980/41250 [19:16:24<80:03:25, 8.66s/it][2025-04-26 03:14:07,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:14:07,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.87 | bwd_microstep: 5718.08 | 
bwd_inner_microstep: 5705.18 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.61 [2025-04-26 03:14:07,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.87 | bwd: 5718.09 | bwd_inner: 5705.18 | bwd_allreduce: 12.87 | step: 18.61 19%|█▉ | 7981/41250 [19:16:33<80:06:44, 8.67s/it] {'loss': 0.0619, 'grad_norm': 2.8412468433380127, 'learning_rate': 3.726184965687339e-05, 'epoch': 1.93} 19%|█▉ | 7981/41250 [19:16:33<80:06:44, 8.67s/it][2025-04-26 03:14:16,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:14:16,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.58 | bwd_microstep: 5702.90 | bwd_inner_microstep: 5690.07 | bwd_allreduce_microstep: 12.79 | step_microstep: 18.75 [2025-04-26 03:14:16,463] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.58 | bwd: 5702.92 | bwd_inner: 5690.07 | bwd_allreduce: 12.81 | step: 18.76 19%|█▉ | 7982/41250 [19:16:41<80:01:13, 8.66s/it] {'loss': 0.1225, 'grad_norm': 3.8491225242614746, 'learning_rate': 3.726105651733987e-05, 'epoch': 1.94} 19%|█▉ | 7982/41250 [19:16:41<80:01:13, 8.66s/it][2025-04-26 03:14:25,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.97 | optimizer_step: 1.04 [2025-04-26 03:14:25,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.10 | bwd_microstep: 5741.85 | bwd_inner_microstep: 5701.83 | bwd_allreduce_microstep: 39.97 | step_microstep: 18.29 [2025-04-26 03:14:25,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.10 | bwd: 5741.86 | bwd_inner: 5701.83 | bwd_allreduce: 39.99 | step: 18.29 19%|█▉ | 7983/41250 [19:16:50<80:04:35, 8.67s/it] {'loss': 0.304, 'grad_norm': 2.122929811477661, 'learning_rate': 3.7260263271395325e-05, 'epoch': 1.94} 19%|█▉ | 7983/41250 [19:16:50<80:04:35, 8.67s/it][2025-04-26 03:14:33,814] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:14:33,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.32 | bwd_microstep: 5736.71 | bwd_inner_microstep: 5692.01 | bwd_allreduce_microstep: 44.66 | step_microstep: 18.58 [2025-04-26 03:14:33,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.32 | bwd: 5736.72 | bwd_inner: 5692.01 | bwd_allreduce: 44.68 | step: 18.59 19%|█▉ | 7984/41250 [19:16:59<80:05:28, 8.67s/it] {'loss': 0.1011, 'grad_norm': 1.8959314823150635, 'learning_rate': 3.725946991904465e-05, 'epoch': 1.94} 19%|█▉ | 7984/41250 [19:16:59<80:05:28, 8.67s/it][2025-04-26 03:14:42,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:14:42,527] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.20 | bwd_microstep: 5775.21 | bwd_inner_microstep: 5696.44 | bwd_allreduce_microstep: 78.72 | step_microstep: 18.65 [2025-04-26 03:14:42,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.20 | bwd: 5775.22 | bwd_inner: 5696.44 | bwd_allreduce: 78.74 | step: 18.65 19%|█▉ | 7985/41250 [19:17:07<80:13:00, 8.68s/it] {'loss': 0.116, 'grad_norm': 1.9440454244613647, 'learning_rate': 3.7258676460292744e-05, 'epoch': 1.94} 19%|█▉ | 7985/41250 [19:17:07<80:13:00, 8.68s/it][2025-04-26 03:14:51,216] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 03:14:51,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.43 | bwd_microstep: 5754.29 | bwd_inner_microstep: 5694.46 | bwd_allreduce_microstep: 59.79 | step_microstep: 19.29 [2025-04-26 03:14:51,217] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.43 | bwd: 5754.30 | bwd_inner: 5694.46 | bwd_allreduce: 59.80 | step: 19.30 19%|█▉ | 7986/41250 
[19:17:16<80:13:55, 8.68s/it] {'loss': 0.1332, 'grad_norm': 1.590816617012024, 'learning_rate': 3.7257882895144485e-05, 'epoch': 1.94} 19%|█▉ | 7986/41250 [19:17:16<80:13:55, 8.68s/it][2025-04-26 03:14:59,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:14:59,912] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.33 | bwd_microstep: 5788.60 | bwd_inner_microstep: 5658.93 | bwd_allreduce_microstep: 129.62 | step_microstep: 18.43 [2025-04-26 03:14:59,913] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.33 | bwd: 5788.61 | bwd_inner: 5658.93 | bwd_allreduce: 129.64 | step: 18.44 19%|█▉ | 7987/41250 [19:17:25<80:16:03, 8.69s/it] {'loss': 0.0862, 'grad_norm': 3.6912381649017334, 'learning_rate': 3.725708922360478e-05, 'epoch': 1.94} 19%|█▉ | 7987/41250 [19:17:25<80:16:03, 8.69s/it][2025-04-26 03:15:08,706] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.22 | optimizer_step: 1.05 [2025-04-26 03:15:08,707] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.86 | bwd_microstep: 5861.09 | bwd_inner_microstep: 5689.66 | bwd_allreduce_microstep: 171.38 | step_microstep: 19.89 [2025-04-26 03:15:08,708] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.86 | bwd: 5861.11 | bwd_inner: 5689.66 | bwd_allreduce: 171.41 | step: 19.89 19%|█▉ | 7988/41250 [19:17:34<80:33:56, 8.72s/it] {'loss': 0.0945, 'grad_norm': 10.21113395690918, 'learning_rate': 3.725629544567851e-05, 'epoch': 1.94} 19%|█▉ | 7988/41250 [19:17:34<80:33:56, 8.72s/it][2025-04-26 03:15:17,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.03 | optimizer_step: 1.19 [2025-04-26 03:15:17,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.98 | bwd_microstep: 5773.10 | bwd_inner_microstep: 5649.54 | 
bwd_allreduce_microstep: 123.51 | step_microstep: 19.20 [2025-04-26 03:15:17,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.98 | bwd: 5773.11 | bwd_inner: 5649.54 | bwd_allreduce: 123.53 | step: 19.20 19%|█▉ | 7989/41250 [19:17:42<80:27:51, 8.71s/it] {'loss': 0.0335, 'grad_norm': 1.1417410373687744, 'learning_rate': 3.7255501561370576e-05, 'epoch': 1.94} 19%|█▉ | 7989/41250 [19:17:42<80:27:51, 8.71s/it][2025-04-26 03:15:26,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.99 | optimizer_step: 1.07 [2025-04-26 03:15:26,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2864.11 | bwd_microstep: 5764.06 | bwd_inner_microstep: 5721.51 | bwd_allreduce_microstep: 42.51 | step_microstep: 18.70 [2025-04-26 03:15:26,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2864.11 | bwd: 5764.07 | bwd_inner: 5721.51 | bwd_allreduce: 42.52 | step: 18.71 19%|█▉ | 7990/41250 [19:17:51<80:27:58, 8.71s/it] {'loss': 0.0433, 'grad_norm': 0.858853816986084, 'learning_rate': 3.7254707570685867e-05, 'epoch': 1.94} 19%|█▉ | 7990/41250 [19:17:51<80:27:58, 8.71s/it][2025-04-26 03:15:34,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.93 [2025-04-26 03:15:34,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.45 | bwd_microstep: 5765.00 | bwd_inner_microstep: 5720.62 | bwd_allreduce_microstep: 44.33 | step_microstep: 18.67 [2025-04-26 03:15:34,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.45 | bwd: 5765.01 | bwd_inner: 5720.62 | bwd_allreduce: 44.35 | step: 18.67 19%|█▉ | 7991/41250 [19:18:00<80:26:37, 8.71s/it] {'loss': 0.1062, 'grad_norm': 2.4419467449188232, 'learning_rate': 3.725391347362928e-05, 'epoch': 1.94} 19%|█▉ | 7991/41250 [19:18:00<80:26:37, 8.71s/it][2025-04-26 03:15:43,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.00 | optimizer_gradients: 1.00 | optimizer_step: 1.08 [2025-04-26 03:15:43,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.10 | bwd_microstep: 5781.62 | bwd_inner_microstep: 5659.61 | bwd_allreduce_microstep: 121.97 | step_microstep: 18.48 [2025-04-26 03:15:43,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.10 | bwd: 5781.64 | bwd_inner: 5659.61 | bwd_allreduce: 121.98 | step: 18.48 19%|█▉ | 7992/41250 [19:18:08<80:24:35, 8.70s/it] {'loss': 0.1424, 'grad_norm': 2.035543203353882, 'learning_rate': 3.725311927020571e-05, 'epoch': 1.94} 19%|█▉ | 7992/41250 [19:18:08<80:24:35, 8.70s/it][2025-04-26 03:15:52,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:15:52,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.21 | bwd_microstep: 5713.82 | bwd_inner_microstep: 5701.12 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.29 [2025-04-26 03:15:52,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.21 | bwd: 5713.83 | bwd_inner: 5701.12 | bwd_allreduce: 12.67 | step: 18.29 19%|█▉ | 7993/41250 [19:18:17<80:14:26, 8.69s/it] {'loss': 0.0962, 'grad_norm': 1.4063676595687866, 'learning_rate': 3.725232496042005e-05, 'epoch': 1.94} 19%|█▉ | 7993/41250 [19:18:17<80:14:26, 8.69s/it][2025-04-26 03:16:00,859] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:16:00,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.45 | bwd_microstep: 5769.27 | bwd_inner_microstep: 5719.34 | bwd_allreduce_microstep: 49.88 | step_microstep: 18.68 [2025-04-26 03:16:00,860] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.45 | bwd: 5769.28 | bwd_inner: 5719.34 | bwd_allreduce: 49.90 | step: 18.68 19%|█▉ | 7994/41250 [19:18:26<80:19:03, 8.69s/it] 
{'loss': 0.1295, 'grad_norm': 2.041811466217041, 'learning_rate': 3.7251530544277205e-05, 'epoch': 1.94} 19%|█▉ | 7994/41250 [19:18:26<80:19:03, 8.69s/it][2025-04-26 03:16:09,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-26 03:16:09,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.94 | bwd_microstep: 5705.02 | bwd_inner_microstep: 5692.09 | bwd_allreduce_microstep: 12.88 | step_microstep: 19.45 [2025-04-26 03:16:09,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.94 | bwd: 5705.03 | bwd_inner: 5692.09 | bwd_allreduce: 12.90 | step: 19.45 19%|█▉ | 7995/41250 [19:18:34<80:10:30, 8.68s/it] {'loss': 0.1729, 'grad_norm': 4.6222381591796875, 'learning_rate': 3.7250736021782066e-05, 'epoch': 1.94} 19%|█▉ | 7995/41250 [19:18:34<80:10:30, 8.68s/it][2025-04-26 03:16:18,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.55 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:16:18,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.82 | bwd_microstep: 5713.34 | bwd_inner_microstep: 5654.26 | bwd_allreduce_microstep: 59.03 | step_microstep: 19.16 [2025-04-26 03:16:18,130] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.82 | bwd: 5713.35 | bwd_inner: 5654.26 | bwd_allreduce: 59.05 | step: 19.16 19%|█▉ | 7996/41250 [19:18:43<80:01:30, 8.66s/it] {'loss': 0.0673, 'grad_norm': 1.7279244661331177, 'learning_rate': 3.7249941392939536e-05, 'epoch': 1.94} 19%|█▉ | 7996/41250 [19:18:43<80:01:30, 8.66s/it][2025-04-26 03:16:26,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:16:26,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.14 | bwd_microstep: 5765.01 | bwd_inner_microstep: 5698.88 | bwd_allreduce_microstep: 66.09 | 
step_microstep: 18.65 [2025-04-26 03:16:26,827] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.14 | bwd: 5765.02 | bwd_inner: 5698.88 | bwd_allreduce: 66.10 | step: 18.65 19%|█▉ | 7997/41250 [19:18:52<80:06:58, 8.67s/it] {'loss': 0.2204, 'grad_norm': 2.895005702972412, 'learning_rate': 3.7249146657754505e-05, 'epoch': 1.94} 19%|█▉ | 7997/41250 [19:18:52<80:06:58, 8.67s/it][2025-04-26 03:16:35,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.03 | optimizer_step: 1.00 [2025-04-26 03:16:35,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.14 | bwd_microstep: 5757.24 | bwd_inner_microstep: 5656.49 | bwd_allreduce_microstep: 100.70 | step_microstep: 19.47 [2025-04-26 03:16:35,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.14 | bwd: 5757.26 | bwd_inner: 5656.49 | bwd_allreduce: 100.72 | step: 19.47 19%|█▉ | 7998/41250 [19:19:00<80:05:52, 8.67s/it] {'loss': 0.03, 'grad_norm': 1.4967327117919922, 'learning_rate': 3.7248351816231875e-05, 'epoch': 1.94} 19%|█▉ | 7998/41250 [19:19:00<80:05:52, 8.67s/it][2025-04-26 03:16:44,174] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:16:44,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.18 | bwd_microstep: 5765.29 | bwd_inner_microstep: 5643.94 | bwd_allreduce_microstep: 121.30 | step_microstep: 18.58 [2025-04-26 03:16:44,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.18 | bwd: 5765.30 | bwd_inner: 5643.94 | bwd_allreduce: 121.32 | step: 18.58 19%|█▉ | 7999/41250 [19:19:09<80:07:03, 8.67s/it] {'loss': 0.1681, 'grad_norm': 2.1041159629821777, 'learning_rate': 3.724755686837655e-05, 'epoch': 1.94} 19%|█▉ | 7999/41250 [19:19:09<80:07:03, 8.67s/it][2025-04-26 03:16:53,006] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | 
optimizer_gradients: 0.98 | optimizer_step: 1.04 [2025-04-26 03:16:53,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.15 | bwd_microstep: 5926.67 | bwd_inner_microstep: 5637.03 | bwd_allreduce_microstep: 289.59 | step_microstep: 18.82 [2025-04-26 03:16:53,007] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.15 | bwd: 5926.68 | bwd_inner: 5637.03 | bwd_allreduce: 289.61 | step: 18.83 19%|█▉ | 8000/41250 [19:19:18<80:33:04, 8.72s/it] {'loss': 0.4024, 'grad_norm': 3.715576171875, 'learning_rate': 3.724676181419343e-05, 'epoch': 1.94} 19%|█▉ | 8000/41250 [19:19:18<80:33:04, 8.72s/it][2025-04-26 03:17:01,653] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:17:01,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.38 | bwd_microstep: 5712.60 | bwd_inner_microstep: 5699.88 | bwd_allreduce_microstep: 12.67 | step_microstep: 18.69 [2025-04-26 03:17:01,654] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.38 | bwd: 5712.61 | bwd_inner: 5699.88 | bwd_allreduce: 12.69 | step: 18.69 19%|█▉ | 8001/41250 [19:19:26<80:20:34, 8.70s/it] {'loss': 0.138, 'grad_norm': 2.2388648986816406, 'learning_rate': 3.724596665368742e-05, 'epoch': 1.94} 19%|█▉ | 8001/41250 [19:19:26<80:20:34, 8.70s/it][2025-04-26 03:17:10,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:17:10,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.60 | bwd_microstep: 5769.99 | bwd_inner_microstep: 5643.71 | bwd_allreduce_microstep: 126.24 | step_microstep: 18.63 [2025-04-26 03:17:10,335] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.60 | bwd: 5770.00 | bwd_inner: 5643.71 | bwd_allreduce: 126.25 | step: 18.64 19%|█▉ | 8002/41250 [19:19:35<80:17:59, 8.69s/it] {'loss': 0.2991, 'grad_norm': 
2.1534817218780518, 'learning_rate': 3.7245171386863414e-05, 'epoch': 1.94} 19%|█▉ | 8002/41250 [19:19:35<80:17:59, 8.69s/it][2025-04-26 03:17:18,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-26 03:17:18,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.56 | bwd_microstep: 5711.24 | bwd_inner_microstep: 5647.32 | bwd_allreduce_microstep: 63.88 | step_microstep: 18.77 [2025-04-26 03:17:18,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.56 | bwd: 5711.25 | bwd_inner: 5647.32 | bwd_allreduce: 63.89 | step: 18.78 19%|█▉ | 8003/41250 [19:19:44<80:05:08, 8.67s/it] {'loss': 0.0687, 'grad_norm': 1.2536883354187012, 'learning_rate': 3.724437601372632e-05, 'epoch': 1.94} 19%|█▉ | 8003/41250 [19:19:44<80:05:08, 8.67s/it][2025-04-26 03:17:27,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.98 | optimizer_step: 0.93 [2025-04-26 03:17:27,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.77 | bwd_microstep: 5702.24 | bwd_inner_microstep: 5651.61 | bwd_allreduce_microstep: 50.59 | step_microstep: 18.80 [2025-04-26 03:17:27,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.77 | bwd: 5702.26 | bwd_inner: 5651.61 | bwd_allreduce: 50.61 | step: 18.80 19%|█▉ | 8004/41250 [19:19:52<79:54:23, 8.65s/it] {'loss': 0.5366, 'grad_norm': 2.429692268371582, 'learning_rate': 3.7243580534281036e-05, 'epoch': 1.94} 19%|█▉ | 8004/41250 [19:19:52<79:54:23, 8.65s/it][2025-04-26 03:17:36,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:17:36,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.03 | bwd_microstep: 5748.28 | bwd_inner_microstep: 5689.85 | bwd_allreduce_microstep: 58.39 | step_microstep: 18.17 [2025-04-26 
03:17:36,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.03 | bwd: 5748.30 | bwd_inner: 5689.85 | bwd_allreduce: 58.41 | step: 18.17 19%|█▉ | 8005/41250 [19:20:01<79:57:57, 8.66s/it] {'loss': 0.4008, 'grad_norm': 4.968603134155273, 'learning_rate': 3.7242784948532477e-05, 'epoch': 1.94} 19%|█▉ | 8005/41250 [19:20:01<79:57:57, 8.66s/it][2025-04-26 03:17:44,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.03 | optimizer_step: 1.10 [2025-04-26 03:17:44,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.10 | bwd_microstep: 5782.20 | bwd_inner_microstep: 5644.51 | bwd_allreduce_microstep: 137.65 | step_microstep: 19.23 [2025-04-26 03:17:44,925] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.10 | bwd: 5782.22 | bwd_inner: 5644.51 | bwd_allreduce: 137.67 | step: 19.23 19%|█▉ | 8006/41250 [19:20:10<80:02:18, 8.67s/it] {'loss': 0.1793, 'grad_norm': 4.943327903747559, 'learning_rate': 3.724198925648553e-05, 'epoch': 1.94} 19%|█▉ | 8006/41250 [19:20:10<80:02:18, 8.67s/it][2025-04-26 03:17:53,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:17:53,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.89 | bwd_microstep: 5773.33 | bwd_inner_microstep: 5649.29 | bwd_allreduce_microstep: 123.98 | step_microstep: 19.03 [2025-04-26 03:17:53,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.89 | bwd: 5773.34 | bwd_inner: 5649.29 | bwd_allreduce: 124.00 | step: 19.03 19%|█▉ | 8007/41250 [19:20:18<80:05:28, 8.67s/it] {'loss': 0.2669, 'grad_norm': 2.868577718734741, 'learning_rate': 3.724119345814512e-05, 'epoch': 1.94} 19%|█▉ | 8007/41250 [19:20:18<80:05:28, 8.67s/it][2025-04-26 03:18:02,248] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 
0.97 [2025-04-26 03:18:02,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.09 | bwd_microstep: 5703.25 | bwd_inner_microstep: 5690.32 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.34 [2025-04-26 03:18:02,249] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.09 | bwd: 5703.27 | bwd_inner: 5690.32 | bwd_allreduce: 12.91 | step: 19.34 19%|█▉ | 8008/41250 [19:20:27<79:59:20, 8.66s/it] {'loss': 0.1935, 'grad_norm': 2.151076316833496, 'learning_rate': 3.724039755351614e-05, 'epoch': 1.94} 19%|█▉ | 8008/41250 [19:20:27<79:59:20, 8.66s/it][2025-04-26 03:18:10,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:18:10,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.10 | bwd_microstep: 5742.33 | bwd_inner_microstep: 5649.01 | bwd_allreduce_microstep: 93.28 | step_microstep: 18.41 [2025-04-26 03:18:10,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.10 | bwd: 5742.35 | bwd_inner: 5649.01 | bwd_allreduce: 93.29 | step: 18.41 19%|█▉ | 8009/41250 [19:20:36<79:58:16, 8.66s/it] {'loss': 0.2027, 'grad_norm': 1.5765413045883179, 'learning_rate': 3.7239601542603496e-05, 'epoch': 1.94} 19%|█▉ | 8009/41250 [19:20:36<79:58:16, 8.66s/it][2025-04-26 03:18:19,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 03:18:19,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.20 | bwd_microstep: 5709.77 | bwd_inner_microstep: 5678.33 | bwd_allreduce_microstep: 31.40 | step_microstep: 19.07 [2025-04-26 03:18:19,549] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.20 | bwd: 5709.79 | bwd_inner: 5678.33 | bwd_allreduce: 31.42 | step: 19.07 19%|█▉ | 8010/41250 [19:20:44<79:55:04, 8.66s/it] {'loss': 0.0625, 'grad_norm': 1.1015089750289917, 'learning_rate': 
3.7238805425412107e-05, 'epoch': 1.94} 19%|█▉ | 8010/41250 [19:20:44<79:55:04, 8.66s/it][2025-04-26 03:18:28,211] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:18:28,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.41 | bwd_microstep: 5744.86 | bwd_inner_microstep: 5646.42 | bwd_allreduce_microstep: 98.40 | step_microstep: 18.35 [2025-04-26 03:18:28,212] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.41 | bwd: 5744.87 | bwd_inner: 5646.42 | bwd_allreduce: 98.41 | step: 18.35 19%|█▉ | 8011/41250 [19:20:53<79:56:13, 8.66s/it] {'loss': 0.1231, 'grad_norm': 1.8531126976013184, 'learning_rate': 3.723800920194687e-05, 'epoch': 1.94} 19%|█▉ | 8011/41250 [19:20:53<79:56:13, 8.66s/it][2025-04-26 03:18:36,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.91 [2025-04-26 03:18:36,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.16 | bwd_microstep: 5835.30 | bwd_inner_microstep: 5665.67 | bwd_allreduce_microstep: 169.57 | step_microstep: 19.03 [2025-04-26 03:18:36,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.16 | bwd: 5835.32 | bwd_inner: 5665.67 | bwd_allreduce: 169.59 | step: 19.04 19%|█▉ | 8012/41250 [19:21:02<80:16:36, 8.69s/it] {'loss': 0.1424, 'grad_norm': 1.995123028755188, 'learning_rate': 3.723721287221269e-05, 'epoch': 1.94} 19%|█▉ | 8012/41250 [19:21:02<80:16:36, 8.69s/it][2025-04-26 03:18:45,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:18:45,664] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.97 | bwd_microstep: 5747.71 | bwd_inner_microstep: 5673.23 | bwd_allreduce_microstep: 74.43 | step_microstep: 18.69 [2025-04-26 03:18:45,664] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.97 | bwd: 5747.72 | bwd_inner: 5673.23 | bwd_allreduce: 74.45 | step: 18.69 19%|█▉ | 8013/41250 [19:21:10<80:12:20, 8.69s/it] {'loss': 0.2439, 'grad_norm': 1.8550572395324707, 'learning_rate': 3.723641643621449e-05, 'epoch': 1.94} 19%|█▉ | 8013/41250 [19:21:10<80:12:20, 8.69s/it][2025-04-26 03:18:54,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.19 | optimizer_step: 1.05 [2025-04-26 03:18:54,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.87 | bwd_microstep: 6000.08 | bwd_inner_microstep: 5644.75 | bwd_allreduce_microstep: 355.29 | step_microstep: 19.36 [2025-04-26 03:18:54,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.87 | bwd: 6000.09 | bwd_inner: 5644.75 | bwd_allreduce: 355.31 | step: 19.36 19%|█▉ | 8014/41250 [19:21:19<80:48:12, 8.75s/it] {'loss': 0.1482, 'grad_norm': 3.0175909996032715, 'learning_rate': 3.7235619893957175e-05, 'epoch': 1.94} 19%|█▉ | 8014/41250 [19:21:19<80:48:12, 8.75s/it][2025-04-26 03:19:03,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:19:03,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.91 | bwd_microstep: 5738.81 | bwd_inner_microstep: 5690.34 | bwd_allreduce_microstep: 48.42 | step_microstep: 18.90 [2025-04-26 03:19:03,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.91 | bwd: 5738.82 | bwd_inner: 5690.34 | bwd_allreduce: 48.44 | step: 18.90 19%|█▉ | 8015/41250 [19:21:28<80:34:21, 8.73s/it] {'loss': 0.1798, 'grad_norm': 3.0475103855133057, 'learning_rate': 3.7234823245445646e-05, 'epoch': 1.94} 19%|█▉ | 8015/41250 [19:21:28<80:34:21, 8.73s/it][2025-04-26 03:19:11,823] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 
03:19:11,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.29 | bwd_microstep: 5684.60 | bwd_inner_microstep: 5640.72 | bwd_allreduce_microstep: 43.83 | step_microstep: 18.92 [2025-04-26 03:19:11,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.29 | bwd: 5684.62 | bwd_inner: 5640.72 | bwd_allreduce: 43.85 | step: 18.92 19%|█▉ | 8016/41250 [19:21:37<80:11:24, 8.69s/it] {'loss': 0.0523, 'grad_norm': 0.6806823015213013, 'learning_rate': 3.723402649068483e-05, 'epoch': 1.94} 19%|█▉ | 8016/41250 [19:21:37<80:11:24, 8.69s/it][2025-04-26 03:19:20,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 03:19:20,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.46 | bwd_microstep: 5742.11 | bwd_inner_microstep: 5647.50 | bwd_allreduce_microstep: 94.56 | step_microstep: 19.25 [2025-04-26 03:19:20,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.46 | bwd: 5742.12 | bwd_inner: 5647.50 | bwd_allreduce: 94.58 | step: 19.25 19%|█▉ | 8017/41250 [19:21:45<80:05:24, 8.68s/it] {'loss': 0.0187, 'grad_norm': 0.5980530977249146, 'learning_rate': 3.7233229629679625e-05, 'epoch': 1.94} 19%|█▉ | 8017/41250 [19:21:45<80:05:24, 8.68s/it][2025-04-26 03:19:29,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:19:29,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.10 | bwd_microstep: 5753.36 | bwd_inner_microstep: 5637.15 | bwd_allreduce_microstep: 116.17 | step_microstep: 18.64 [2025-04-26 03:19:29,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.10 | bwd: 5753.37 | bwd_inner: 5637.15 | bwd_allreduce: 116.18 | step: 18.64 19%|█▉ | 8018/41250 [19:21:54<80:02:57, 8.67s/it] {'loss': 0.2258, 'grad_norm': 2.5408754348754883, 'learning_rate': 3.723243266243495e-05, 
'epoch': 1.94} 19%|█▉ | 8018/41250 [19:21:54<80:02:57, 8.67s/it][2025-04-26 03:19:37,850] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.06 | optimizer_step: 1.00 [2025-04-26 03:19:37,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.80 | bwd_microstep: 5777.43 | bwd_inner_microstep: 5682.24 | bwd_allreduce_microstep: 95.13 | step_microstep: 19.62 [2025-04-26 03:19:37,851] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.80 | bwd: 5777.44 | bwd_inner: 5682.24 | bwd_allreduce: 95.15 | step: 19.62 19%|█▉ | 8019/41250 [19:22:03<80:09:36, 8.68s/it] {'loss': 0.0518, 'grad_norm': 0.5113413333892822, 'learning_rate': 3.723163558895572e-05, 'epoch': 1.94} 19%|█▉ | 8019/41250 [19:22:03<80:09:36, 8.68s/it][2025-04-26 03:19:46,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.02 | optimizer_step: 0.92 [2025-04-26 03:19:46,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.68 | bwd_microstep: 5715.77 | bwd_inner_microstep: 5642.06 | bwd_allreduce_microstep: 73.66 | step_microstep: 19.05 [2025-04-26 03:19:46,476] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.68 | bwd: 5715.78 | bwd_inner: 5642.06 | bwd_allreduce: 73.68 | step: 19.05 19%|█▉ | 8020/41250 [19:22:11<79:59:24, 8.67s/it] {'loss': 0.1076, 'grad_norm': 0.969128429889679, 'learning_rate': 3.723083840924684e-05, 'epoch': 1.94} 19%|█▉ | 8020/41250 [19:22:11<79:59:24, 8.67s/it][2025-04-26 03:19:55,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:19:55,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2866.83 | bwd_microstep: 5752.56 | bwd_inner_microstep: 5692.90 | bwd_allreduce_microstep: 59.61 | step_microstep: 18.62 [2025-04-26 03:19:55,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2866.83 | bwd: 5752.57 | bwd_inner: 5692.90 | bwd_allreduce: 59.64 | step: 18.63 19%|█▉ | 8021/41250 [19:22:20<80:06:45, 8.68s/it] {'loss': 0.1457, 'grad_norm': 1.8292908668518066, 'learning_rate': 3.723004112331324e-05, 'epoch': 1.94} 19%|█▉ | 8021/41250 [19:22:20<80:06:45, 8.68s/it][2025-04-26 03:20:03,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.02 | optimizer_step: 1.19 [2025-04-26 03:20:03,877] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.93 | bwd_microstep: 5768.98 | bwd_inner_microstep: 5653.48 | bwd_allreduce_microstep: 115.45 | step_microstep: 19.35 [2025-04-26 03:20:03,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.93 | bwd: 5768.99 | bwd_inner: 5653.48 | bwd_allreduce: 115.47 | step: 19.36 19%|█▉ | 8022/41250 [19:22:29<80:08:26, 8.68s/it] {'loss': 0.0932, 'grad_norm': 1.2958565950393677, 'learning_rate': 3.722924373115981e-05, 'epoch': 1.94} 19%|█▉ | 8022/41250 [19:22:29<80:08:26, 8.68s/it][2025-04-26 03:20:12,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:20:12,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.62 | bwd_microstep: 5770.05 | bwd_inner_microstep: 5657.11 | bwd_allreduce_microstep: 112.90 | step_microstep: 18.34 [2025-04-26 03:20:12,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.62 | bwd: 5770.06 | bwd_inner: 5657.11 | bwd_allreduce: 112.92 | step: 18.34 19%|█▉ | 8023/41250 [19:22:37<80:08:06, 8.68s/it] {'loss': 0.146, 'grad_norm': 1.5735522508621216, 'learning_rate': 3.722844623279149e-05, 'epoch': 1.94} 19%|█▉ | 8023/41250 [19:22:37<80:08:06, 8.68s/it][2025-04-26 03:20:21,185] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:20:21,185] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2846.63 | bwd_microstep: 5692.00 | bwd_inner_microstep: 5679.55 | bwd_allreduce_microstep: 12.41 | step_microstep: 18.28 [2025-04-26 03:20:21,186] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.63 | bwd: 5692.01 | bwd_inner: 5679.55 | bwd_allreduce: 12.42 | step: 18.28 19%|█▉ | 8024/41250 [19:22:46<79:58:32, 8.67s/it] {'loss': 0.2243, 'grad_norm': 4.328475475311279, 'learning_rate': 3.7227648628213194e-05, 'epoch': 1.95} 19%|█▉ | 8024/41250 [19:22:46<79:58:32, 8.67s/it][2025-04-26 03:20:29,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:20:29,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.44 | bwd_microstep: 5719.10 | bwd_inner_microstep: 5647.70 | bwd_allreduce_microstep: 71.35 | step_microstep: 18.36 [2025-04-26 03:20:29,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.44 | bwd: 5719.11 | bwd_inner: 5647.70 | bwd_allreduce: 71.37 | step: 18.36 19%|█▉ | 8025/41250 [19:22:55<79:52:14, 8.65s/it] {'loss': 0.1952, 'grad_norm': 2.072726011276245, 'learning_rate': 3.7226850917429826e-05, 'epoch': 1.95} 19%|█▉ | 8025/41250 [19:22:55<79:52:14, 8.65s/it][2025-04-26 03:20:38,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.94 | optimizer_step: 0.90 [2025-04-26 03:20:38,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.38 | bwd_microstep: 5746.49 | bwd_inner_microstep: 5715.08 | bwd_allreduce_microstep: 31.37 | step_microstep: 18.24 [2025-04-26 03:20:38,500] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.38 | bwd: 5746.50 | bwd_inner: 5715.08 | bwd_allreduce: 31.38 | step: 18.24 19%|█▉ | 8026/41250 [19:23:03<79:57:15, 8.66s/it] {'loss': 0.1764, 'grad_norm': 2.5055482387542725, 'learning_rate': 3.722605310044631e-05, 'epoch': 1.95} 19%|█▉ | 8026/41250 
[19:23:03<79:57:15, 8.66s/it][2025-04-26 03:20:47,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:20:47,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.03 | bwd_microstep: 5697.65 | bwd_inner_microstep: 5636.58 | bwd_allreduce_microstep: 61.03 | step_microstep: 18.38 [2025-04-26 03:20:47,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.03 | bwd: 5697.66 | bwd_inner: 5636.58 | bwd_allreduce: 61.04 | step: 18.38 19%|█▉ | 8027/41250 [19:23:12<79:49:45, 8.65s/it] {'loss': 0.1511, 'grad_norm': 1.9101132154464722, 'learning_rate': 3.7225255177267565e-05, 'epoch': 1.95} 19%|█▉ | 8027/41250 [19:23:12<79:49:45, 8.65s/it][2025-04-26 03:20:55,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 03:20:55,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.35 | bwd_microstep: 5778.92 | bwd_inner_microstep: 5656.06 | bwd_allreduce_microstep: 122.81 | step_microstep: 19.25 [2025-04-26 03:20:55,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.35 | bwd: 5778.93 | bwd_inner: 5656.06 | bwd_allreduce: 122.83 | step: 19.25 19%|█▉ | 8028/41250 [19:23:21<79:56:12, 8.66s/it] {'loss': 0.1699, 'grad_norm': 2.2808969020843506, 'learning_rate': 3.722445714789851e-05, 'epoch': 1.95} 19%|█▉ | 8028/41250 [19:23:21<79:56:12, 8.66s/it][2025-04-26 03:21:04,454] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:21:04,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.22 | bwd_microstep: 5708.83 | bwd_inner_microstep: 5696.11 | bwd_allreduce_microstep: 12.67 | step_microstep: 19.06 [2025-04-26 03:21:04,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.22 | bwd: 5708.85 
| bwd_inner: 5696.11 | bwd_allreduce: 12.69 | step: 19.06 19%|█▉ | 8029/41250 [19:23:29<79:53:40, 8.66s/it] {'loss': 0.1856, 'grad_norm': 2.5550527572631836, 'learning_rate': 3.7223659012344064e-05, 'epoch': 1.95} 19%|█▉ | 8029/41250 [19:23:29<79:53:40, 8.66s/it][2025-04-26 03:21:13,134] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 03:21:13,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.94 | bwd_microstep: 5748.82 | bwd_inner_microstep: 5693.26 | bwd_allreduce_microstep: 55.51 | step_microstep: 19.19 [2025-04-26 03:21:13,135] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.94 | bwd: 5748.83 | bwd_inner: 5693.26 | bwd_allreduce: 55.53 | step: 19.19 19%|█▉ | 8030/41250 [19:23:38<79:57:08, 8.66s/it] {'loss': 0.1718, 'grad_norm': 2.7720580101013184, 'learning_rate': 3.7222860770609145e-05, 'epoch': 1.95} 19%|█▉ | 8030/41250 [19:23:38<79:57:08, 8.66s/it][2025-04-26 03:21:21,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:21:21,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.80 | bwd_microstep: 5723.08 | bwd_inner_microstep: 5639.14 | bwd_allreduce_microstep: 83.89 | step_microstep: 19.09 [2025-04-26 03:21:21,765] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.80 | bwd: 5723.10 | bwd_inner: 5639.14 | bwd_allreduce: 83.91 | step: 19.09 19%|█▉ | 8031/41250 [19:23:47<79:51:08, 8.65s/it] {'loss': 0.2667, 'grad_norm': 3.5089969635009766, 'learning_rate': 3.722206242269869e-05, 'epoch': 1.95} 19%|█▉ | 8031/41250 [19:23:47<79:51:08, 8.65s/it][2025-04-26 03:21:30,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:21:30,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2828.59 | bwd_microstep: 5774.88 | bwd_inner_microstep: 5650.63 | bwd_allreduce_microstep: 124.20 | step_microstep: 18.73 [2025-04-26 03:21:30,452] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.59 | bwd: 5774.89 | bwd_inner: 5650.63 | bwd_allreduce: 124.22 | step: 18.73 19%|█▉ | 8032/41250 [19:23:55<79:56:42, 8.66s/it] {'loss': 0.1017, 'grad_norm': 2.54472017288208, 'learning_rate': 3.722126396861759e-05, 'epoch': 1.95} 19%|█▉ | 8032/41250 [19:23:55<79:56:42, 8.66s/it][2025-04-26 03:21:39,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 1.01 [2025-04-26 03:21:39,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.14 | bwd_microstep: 5720.84 | bwd_inner_microstep: 5708.12 | bwd_allreduce_microstep: 12.68 | step_microstep: 19.55 [2025-04-26 03:21:39,104] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.14 | bwd: 5720.86 | bwd_inner: 5708.12 | bwd_allreduce: 12.70 | step: 19.55 19%|█▉ | 8033/41250 [19:24:04<79:54:38, 8.66s/it] {'loss': 0.2734, 'grad_norm': 1.4905842542648315, 'learning_rate': 3.72204654083708e-05, 'epoch': 1.95} 19%|█▉ | 8033/41250 [19:24:04<79:54:38, 8.66s/it][2025-04-26 03:21:47,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:21:47,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.33 | bwd_microstep: 5777.22 | bwd_inner_microstep: 5641.98 | bwd_allreduce_microstep: 135.18 | step_microstep: 18.79 [2025-04-26 03:21:47,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.33 | bwd: 5777.24 | bwd_inner: 5641.98 | bwd_allreduce: 135.20 | step: 18.79 19%|█▉ | 8034/41250 [19:24:13<79:59:51, 8.67s/it] {'loss': 0.0822, 'grad_norm': 1.0143953561782837, 'learning_rate': 3.7219666741963225e-05, 'epoch': 1.95} 19%|█▉ | 8034/41250 [19:24:13<79:59:51, 
8.67s/it][2025-04-26 03:21:56,643] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.15 | optimizer_step: 1.05 [2025-04-26 03:21:56,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.17 | bwd_microstep: 5910.77 | bwd_inner_microstep: 5690.70 | bwd_allreduce_microstep: 220.01 | step_microstep: 20.13 [2025-04-26 03:21:56,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.17 | bwd: 5910.79 | bwd_inner: 5690.70 | bwd_allreduce: 220.04 | step: 20.13 19%|█▉ | 8035/41250 [19:24:21<80:29:00, 8.72s/it] {'loss': 0.2496, 'grad_norm': 2.00915789604187, 'learning_rate': 3.7218867969399786e-05, 'epoch': 1.95} 19%|█▉ | 8035/41250 [19:24:21<80:29:00, 8.72s/it][2025-04-26 03:22:05,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-26 03:22:05,278] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.03 | bwd_microstep: 5712.77 | bwd_inner_microstep: 5656.64 | bwd_allreduce_microstep: 56.09 | step_microstep: 18.88 [2025-04-26 03:22:05,279] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.03 | bwd: 5712.78 | bwd_inner: 5656.64 | bwd_allreduce: 56.10 | step: 18.88 19%|█▉ | 8036/41250 [19:24:30<80:14:15, 8.70s/it] {'loss': 0.1367, 'grad_norm': 2.8705270290374756, 'learning_rate': 3.721806909068541e-05, 'epoch': 1.95} 19%|█▉ | 8036/41250 [19:24:30<80:14:15, 8.70s/it][2025-04-26 03:22:13,915] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 03:22:13,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.31 | bwd_microstep: 5721.24 | bwd_inner_microstep: 5654.47 | bwd_allreduce_microstep: 66.72 | step_microstep: 18.47 [2025-04-26 03:22:13,916] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.31 | bwd: 5721.25 | bwd_inner: 5654.47 | 
bwd_allreduce: 66.74 | step: 18.48 19%|█▉ | 8037/41250 [19:24:39<80:03:58, 8.68s/it] {'loss': 0.0378, 'grad_norm': 0.3668220341205597, 'learning_rate': 3.721727010582503e-05, 'epoch': 1.95} 19%|█▉ | 8037/41250 [19:24:39<80:03:58, 8.68s/it][2025-04-26 03:22:22,615] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.97 | optimizer_step: 0.94 [2025-04-26 03:22:22,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.90 | bwd_microstep: 5768.01 | bwd_inner_microstep: 5707.88 | bwd_allreduce_microstep: 60.08 | step_microstep: 18.29 [2025-04-26 03:22:22,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.90 | bwd: 5768.02 | bwd_inner: 5707.88 | bwd_allreduce: 60.10 | step: 18.30 19%|█▉ | 8038/41250 [19:24:47<80:07:25, 8.68s/it] {'loss': 0.0303, 'grad_norm': 0.5118327140808105, 'learning_rate': 3.7216471014823564e-05, 'epoch': 1.95} 19%|█▉ | 8038/41250 [19:24:47<80:07:25, 8.68s/it][2025-04-26 03:22:31,296] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:22:31,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.51 | bwd_microstep: 5738.97 | bwd_inner_microstep: 5695.55 | bwd_allreduce_microstep: 43.37 | step_microstep: 18.55 [2025-04-26 03:22:31,297] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.51 | bwd: 5738.98 | bwd_inner: 5695.55 | bwd_allreduce: 43.39 | step: 18.55 19%|█▉ | 8039/41250 [19:24:56<80:06:38, 8.68s/it] {'loss': 0.0625, 'grad_norm': 1.5809153318405151, 'learning_rate': 3.721567181768594e-05, 'epoch': 1.95} 19%|█▉ | 8039/41250 [19:24:56<80:06:38, 8.68s/it][2025-04-26 03:22:39,946] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 03:22:39,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.45 | 
bwd_microstep: 5712.17 | bwd_inner_microstep: 5699.32 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.26 [2025-04-26 03:22:39,947] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.45 | bwd: 5712.18 | bwd_inner: 5699.32 | bwd_allreduce: 12.82 | step: 19.26 19%|█▉ | 8040/41250 [19:25:05<80:00:53, 8.67s/it] {'loss': 0.1573, 'grad_norm': 2.904890775680542, 'learning_rate': 3.7214872514417094e-05, 'epoch': 1.95} 19%|█▉ | 8040/41250 [19:25:05<80:00:53, 8.67s/it][2025-04-26 03:22:48,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:22:48,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.73 | bwd_microstep: 5811.52 | bwd_inner_microstep: 5653.62 | bwd_allreduce_microstep: 157.86 | step_microstep: 18.42 [2025-04-26 03:22:48,663] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.73 | bwd: 5811.53 | bwd_inner: 5653.62 | bwd_allreduce: 157.88 | step: 18.43 19%|█▉ | 8041/41250 [19:25:13<80:07:45, 8.69s/it] {'loss': 0.0852, 'grad_norm': 0.7487773895263672, 'learning_rate': 3.7214073105021934e-05, 'epoch': 1.95} 19%|█▉ | 8041/41250 [19:25:13<80:07:45, 8.69s/it][2025-04-26 03:22:57,287] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:22:57,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.55 | bwd_microstep: 5696.62 | bwd_inner_microstep: 5684.16 | bwd_allreduce_microstep: 12.41 | step_microstep: 18.75 [2025-04-26 03:22:57,288] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.55 | bwd: 5696.63 | bwd_inner: 5684.16 | bwd_allreduce: 12.43 | step: 18.76 19%|█▉ | 8042/41250 [19:25:22<79:57:24, 8.67s/it] {'loss': 0.0638, 'grad_norm': 1.1290607452392578, 'learning_rate': 3.7213273589505405e-05, 'epoch': 1.95} 19%|█▉ | 8042/41250 [19:25:22<79:57:24, 8.67s/it][2025-04-26 03:23:05,979] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:23:05,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.29 | bwd_microstep: 5770.53 | bwd_inner_microstep: 5678.95 | bwd_allreduce_microstep: 91.54 | step_microstep: 18.37 [2025-04-26 03:23:05,980] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.29 | bwd: 5770.55 | bwd_inner: 5678.95 | bwd_allreduce: 91.55 | step: 18.37 19%|█▉ | 8043/41250 [19:25:31<80:01:13, 8.68s/it] {'loss': 0.1335, 'grad_norm': 1.3906670808792114, 'learning_rate': 3.7212473967872425e-05, 'epoch': 1.95} 19%|█▉ | 8043/41250 [19:25:31<80:01:13, 8.68s/it][2025-04-26 03:23:14,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.06 | optimizer_step: 0.89 [2025-04-26 03:23:14,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.03 | bwd_microstep: 5895.09 | bwd_inner_microstep: 5641.29 | bwd_allreduce_microstep: 253.75 | step_microstep: 18.75 [2025-04-26 03:23:14,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.03 | bwd: 5895.11 | bwd_inner: 5641.29 | bwd_allreduce: 253.77 | step: 18.75 20%|█▉ | 8044/41250 [19:25:40<80:22:04, 8.71s/it] {'loss': 0.0729, 'grad_norm': 2.741832733154297, 'learning_rate': 3.7211674240127935e-05, 'epoch': 1.95} 20%|█▉ | 8044/41250 [19:25:40<80:22:04, 8.71s/it][2025-04-26 03:23:23,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.13 | optimizer_step: 1.13 [2025-04-26 03:23:23,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.16 | bwd_microstep: 5766.29 | bwd_inner_microstep: 5657.88 | bwd_allreduce_microstep: 108.35 | step_microstep: 19.33 [2025-04-26 03:23:23,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.16 | bwd: 5766.31 | bwd_inner: 5657.88 | bwd_allreduce: 108.38 | step: 
19.33 20%|█▉ | 8045/41250 [19:25:48<80:16:24, 8.70s/it] {'loss': 0.2955, 'grad_norm': 1.4689806699752808, 'learning_rate': 3.7210874406276855e-05, 'epoch': 1.95} 20%|█▉ | 8045/41250 [19:25:48<80:16:24, 8.70s/it][2025-04-26 03:23:32,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:23:32,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.74 | bwd_microstep: 5747.39 | bwd_inner_microstep: 5658.29 | bwd_allreduce_microstep: 89.06 | step_microstep: 18.69 [2025-04-26 03:23:32,129] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.74 | bwd: 5747.40 | bwd_inner: 5658.29 | bwd_allreduce: 89.07 | step: 18.70 20%|█▉ | 8046/41250 [19:25:57<80:10:23, 8.69s/it] {'loss': 0.0753, 'grad_norm': 1.7786375284194946, 'learning_rate': 3.7210074466324116e-05, 'epoch': 1.95} 20%|█▉ | 8046/41250 [19:25:57<80:10:23, 8.69s/it][2025-04-26 03:23:40,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 1.01 [2025-04-26 03:23:40,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.04 | bwd_microstep: 5732.97 | bwd_inner_microstep: 5706.91 | bwd_allreduce_microstep: 26.02 | step_microstep: 18.57 [2025-04-26 03:23:40,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.04 | bwd: 5732.99 | bwd_inner: 5706.91 | bwd_allreduce: 26.04 | step: 18.57 20%|█▉ | 8047/41250 [19:26:06<80:06:02, 8.68s/it] {'loss': 0.0997, 'grad_norm': 2.149657726287842, 'learning_rate': 3.720927442027466e-05, 'epoch': 1.95} 20%|█▉ | 8047/41250 [19:26:06<80:06:02, 8.68s/it][2025-04-26 03:23:49,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:23:49,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.68 | bwd_microstep: 5772.97 | 
bwd_inner_microstep: 5642.14 | bwd_allreduce_microstep: 130.78 | step_microstep: 18.48 [2025-04-26 03:23:49,475] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.68 | bwd: 5772.98 | bwd_inner: 5642.14 | bwd_allreduce: 130.80 | step: 18.48 20%|█▉ | 8048/41250 [19:26:14<80:04:59, 8.68s/it] {'loss': 0.0481, 'grad_norm': 1.0658848285675049, 'learning_rate': 3.720847426813341e-05, 'epoch': 1.95} 20%|█▉ | 8048/41250 [19:26:14<80:04:59, 8.68s/it][2025-04-26 03:23:58,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.11 [2025-04-26 03:23:58,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.42 | bwd_microstep: 5737.52 | bwd_inner_microstep: 5684.27 | bwd_allreduce_microstep: 53.19 | step_microstep: 19.33 [2025-04-26 03:23:58,146] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.42 | bwd: 5737.53 | bwd_inner: 5684.27 | bwd_allreduce: 53.22 | step: 19.33 20%|█▉ | 8049/41250 [19:26:23<80:02:57, 8.68s/it] {'loss': 0.1033, 'grad_norm': 3.4217469692230225, 'learning_rate': 3.720767400990529e-05, 'epoch': 1.95} 20%|█▉ | 8049/41250 [19:26:23<80:02:57, 8.68s/it][2025-04-26 03:24:06,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 1.10 [2025-04-26 03:24:06,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.35 | bwd_microstep: 5774.99 | bwd_inner_microstep: 5645.35 | bwd_allreduce_microstep: 129.59 | step_microstep: 19.81 [2025-04-26 03:24:06,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.35 | bwd: 5775.01 | bwd_inner: 5645.35 | bwd_allreduce: 129.61 | step: 19.81 20%|█▉ | 8050/41250 [19:26:32<80:04:15, 8.68s/it] {'loss': 0.0914, 'grad_norm': 1.4040884971618652, 'learning_rate': 3.720687364559526e-05, 'epoch': 1.95} 20%|█▉ | 8050/41250 [19:26:32<80:04:15, 8.68s/it][2025-04-26 03:24:15,479] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:24:15,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.52 | bwd_microstep: 5738.35 | bwd_inner_microstep: 5652.84 | bwd_allreduce_microstep: 85.47 | step_microstep: 19.10 [2025-04-26 03:24:15,480] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.52 | bwd: 5738.36 | bwd_inner: 5652.84 | bwd_allreduce: 85.49 | step: 19.10 20%|█▉ | 8051/41250 [19:26:40<79:57:57, 8.67s/it] {'loss': 0.2452, 'grad_norm': 3.26017427444458, 'learning_rate': 3.720607317520823e-05, 'epoch': 1.95} 20%|█▉ | 8051/41250 [19:26:40<79:57:57, 8.67s/it][2025-04-26 03:24:24,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.00 | optimizer_step: 1.03 [2025-04-26 03:24:24,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.83 | bwd_microstep: 5729.68 | bwd_inner_microstep: 5644.51 | bwd_allreduce_microstep: 85.12 | step_microstep: 18.88 [2025-04-26 03:24:24,124] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.83 | bwd: 5729.70 | bwd_inner: 5644.51 | bwd_allreduce: 85.14 | step: 18.88 20%|█▉ | 8052/41250 [19:26:49<79:53:30, 8.66s/it] {'loss': 0.3179, 'grad_norm': 3.0283265113830566, 'learning_rate': 3.720527259874916e-05, 'epoch': 1.95} 20%|█▉ | 8052/41250 [19:26:49<79:53:30, 8.66s/it][2025-04-26 03:24:32,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.02 | optimizer_step: 1.04 [2025-04-26 03:24:32,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2880.64 | bwd_microstep: 5779.25 | bwd_inner_microstep: 5766.46 | bwd_allreduce_microstep: 12.75 | step_microstep: 19.06 [2025-04-26 03:24:32,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2880.64 | bwd: 5779.27 | bwd_inner: 5766.46 | bwd_allreduce: 12.77 | step: 19.06 20%|█▉ | 
8053/41250 [19:26:58<80:06:36, 8.69s/it] {'loss': 0.2131, 'grad_norm': 3.1179492473602295, 'learning_rate': 3.720447191622294e-05, 'epoch': 1.95} 20%|█▉ | 8053/41250 [19:26:58<80:06:36, 8.69s/it][2025-04-26 03:24:41,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 1.00 [2025-04-26 03:24:41,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.08 | bwd_microstep: 5727.35 | bwd_inner_microstep: 5688.41 | bwd_allreduce_microstep: 38.90 | step_microstep: 18.97 [2025-04-26 03:24:41,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.08 | bwd: 5727.36 | bwd_inner: 5688.41 | bwd_allreduce: 38.92 | step: 18.97 20%|█▉ | 8054/41250 [19:27:06<80:00:50, 8.68s/it] {'loss': 0.0696, 'grad_norm': 0.940660297870636, 'learning_rate': 3.7203671127634555e-05, 'epoch': 1.95} 20%|█▉ | 8054/41250 [19:27:06<80:00:50, 8.68s/it][2025-04-26 03:24:50,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:24:50,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.84 | bwd_microstep: 5750.41 | bwd_inner_microstep: 5639.62 | bwd_allreduce_microstep: 110.75 | step_microstep: 18.72 [2025-04-26 03:24:50,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.84 | bwd: 5750.42 | bwd_inner: 5639.62 | bwd_allreduce: 110.76 | step: 18.72 20%|█▉ | 8055/41250 [19:27:15<79:57:56, 8.67s/it] {'loss': 0.1269, 'grad_norm': 1.7883996963500977, 'learning_rate': 3.7202870232988915e-05, 'epoch': 1.95} 20%|█▉ | 8055/41250 [19:27:15<79:57:56, 8.67s/it][2025-04-26 03:24:58,777] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:24:58,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.48 | bwd_microstep: 5675.90 | bwd_inner_microstep: 5639.31 
| bwd_allreduce_microstep: 36.55 | step_microstep: 18.78 [2025-04-26 03:24:58,778] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.49 | bwd: 5675.91 | bwd_inner: 5639.31 | bwd_allreduce: 36.56 | step: 18.79 20%|█▉ | 8056/41250 [19:27:24<79:45:03, 8.65s/it] {'loss': 0.2067, 'grad_norm': 2.1056790351867676, 'learning_rate': 3.720206923229096e-05, 'epoch': 1.95} 20%|█▉ | 8056/41250 [19:27:24<79:45:03, 8.65s/it][2025-04-26 03:25:07,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:25:07,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.00 | bwd_microstep: 5737.83 | bwd_inner_microstep: 5695.58 | bwd_allreduce_microstep: 42.21 | step_microstep: 18.65 [2025-04-26 03:25:07,458] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.00 | bwd: 5737.85 | bwd_inner: 5695.58 | bwd_allreduce: 42.23 | step: 18.65 20%|█▉ | 8057/41250 [19:27:32<79:50:31, 8.66s/it] {'loss': 0.2265, 'grad_norm': 2.5936906337738037, 'learning_rate': 3.720126812554564e-05, 'epoch': 1.95} 20%|█▉ | 8057/41250 [19:27:32<79:50:31, 8.66s/it][2025-04-26 03:25:16,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.02 | optimizer_step: 1.00 [2025-04-26 03:25:16,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.67 | bwd_microstep: 5697.37 | bwd_inner_microstep: 5684.69 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.90 [2025-04-26 03:25:16,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.67 | bwd: 5697.39 | bwd_inner: 5684.69 | bwd_allreduce: 12.65 | step: 18.90 20%|█▉ | 8058/41250 [19:27:41<79:45:43, 8.65s/it] {'loss': 0.1098, 'grad_norm': 1.814013957977295, 'learning_rate': 3.720046691275788e-05, 'epoch': 1.95} 20%|█▉ | 8058/41250 [19:27:41<79:45:43, 8.65s/it][2025-04-26 03:25:24,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:25:24,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.48 | bwd_microstep: 5732.50 | bwd_inner_microstep: 5686.26 | bwd_allreduce_microstep: 46.20 | step_microstep: 18.43 [2025-04-26 03:25:24,764] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.48 | bwd: 5732.52 | bwd_inner: 5686.26 | bwd_allreduce: 46.22 | step: 18.44 20%|█▉ | 8059/41250 [19:27:50<79:48:46, 8.66s/it] {'loss': 0.1098, 'grad_norm': 4.067841053009033, 'learning_rate': 3.7199665593932625e-05, 'epoch': 1.95} 20%|█▉ | 8059/41250 [19:27:50<79:48:46, 8.66s/it][2025-04-26 03:25:33,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.56 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:25:33,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.99 | bwd_microstep: 5767.35 | bwd_inner_microstep: 5754.79 | bwd_allreduce_microstep: 12.52 | step_microstep: 18.73 [2025-04-26 03:25:33,499] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.99 | bwd: 5767.36 | bwd_inner: 5754.79 | bwd_allreduce: 12.53 | step: 18.73 20%|█▉ | 8060/41250 [19:27:58<80:01:37, 8.68s/it] {'loss': 0.0989, 'grad_norm': 2.0558574199676514, 'learning_rate': 3.719886416907481e-05, 'epoch': 1.95} 20%|█▉ | 8060/41250 [19:27:58<80:01:37, 8.68s/it][2025-04-26 03:25:42,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:25:42,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.02 | bwd_microstep: 5736.06 | bwd_inner_microstep: 5693.13 | bwd_allreduce_microstep: 42.88 | step_microstep: 18.26 [2025-04-26 03:25:42,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.02 | bwd: 5736.07 | bwd_inner: 5693.13 | bwd_allreduce: 42.90 | step: 18.26 20%|█▉ | 8061/41250 [19:28:07<79:59:25, 8.68s/it] 
{'loss': 0.1619, 'grad_norm': 3.383626937866211, 'learning_rate': 3.7198062638189386e-05, 'epoch': 1.95} 20%|█▉ | 8061/41250 [19:28:07<79:59:25, 8.68s/it][2025-04-26 03:25:50,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:25:50,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.97 | bwd_microstep: 5766.65 | bwd_inner_microstep: 5645.04 | bwd_allreduce_microstep: 121.58 | step_microstep: 18.22 [2025-04-26 03:25:50,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.96 | bwd: 5766.67 | bwd_inner: 5645.03 | bwd_allreduce: 121.59 | step: 18.22 20%|█▉ | 8062/41250 [19:28:16<79:59:55, 8.68s/it] {'loss': 0.113, 'grad_norm': 2.1355772018432617, 'learning_rate': 3.719726100128129e-05, 'epoch': 1.95} 20%|█▉ | 8062/41250 [19:28:16<79:59:55, 8.68s/it][2025-04-26 03:25:59,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.04 | optimizer_step: 0.90 [2025-04-26 03:25:59,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.69 | bwd_microstep: 5760.89 | bwd_inner_microstep: 5687.82 | bwd_allreduce_microstep: 73.02 | step_microstep: 18.70 [2025-04-26 03:25:59,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.69 | bwd: 5760.91 | bwd_inner: 5687.82 | bwd_allreduce: 73.04 | step: 18.70 20%|█▉ | 8063/41250 [19:28:24<80:02:32, 8.68s/it] {'loss': 0.269, 'grad_norm': 3.9537465572357178, 'learning_rate': 3.719645925835546e-05, 'epoch': 1.95} 20%|█▉ | 8063/41250 [19:28:24<80:02:32, 8.68s/it][2025-04-26 03:26:08,239] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.04 | optimizer_step: 1.03 [2025-04-26 03:26:08,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.42 | bwd_microstep: 5790.90 | bwd_inner_microstep: 5654.32 | bwd_allreduce_microstep: 136.53 | 
step_microstep: 19.32 [2025-04-26 03:26:08,240] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.42 | bwd: 5790.92 | bwd_inner: 5654.32 | bwd_allreduce: 136.55 | step: 19.32 20%|█▉ | 8064/41250 [19:28:33<80:05:19, 8.69s/it] {'loss': 0.1023, 'grad_norm': 1.46503484249115, 'learning_rate': 3.719565740941684e-05, 'epoch': 1.95} 20%|█▉ | 8064/41250 [19:28:33<80:05:19, 8.69s/it][2025-04-26 03:26:16,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:26:16,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.29 | bwd_microstep: 5714.42 | bwd_inner_microstep: 5632.32 | bwd_allreduce_microstep: 82.05 | step_microstep: 18.71 [2025-04-26 03:26:16,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.29 | bwd: 5714.44 | bwd_inner: 5632.32 | bwd_allreduce: 82.07 | step: 18.71 20%|█▉ | 8065/41250 [19:28:42<79:55:29, 8.67s/it] {'loss': 0.0382, 'grad_norm': 1.5900369882583618, 'learning_rate': 3.719485545447038e-05, 'epoch': 1.96} 20%|█▉ | 8065/41250 [19:28:42<79:55:29, 8.67s/it][2025-04-26 03:26:25,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:26:25,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.86 | bwd_microstep: 5782.89 | bwd_inner_microstep: 5645.20 | bwd_allreduce_microstep: 137.62 | step_microstep: 18.78 [2025-04-26 03:26:25,559] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.86 | bwd: 5782.91 | bwd_inner: 5645.20 | bwd_allreduce: 137.65 | step: 18.78 20%|█▉ | 8066/41250 [19:28:50<79:58:55, 8.68s/it] {'loss': 0.0486, 'grad_norm': 0.935218334197998, 'learning_rate': 3.719405339352101e-05, 'epoch': 1.96} 20%|█▉ | 8066/41250 [19:28:50<79:58:55, 8.68s/it][2025-04-26 03:26:34,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | 
optimizer_gradients: 1.02 | optimizer_step: 0.95 [2025-04-26 03:26:34,198] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.77 | bwd_microstep: 5707.57 | bwd_inner_microstep: 5643.24 | bwd_allreduce_microstep: 64.29 | step_microstep: 18.92 [2025-04-26 03:26:34,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.77 | bwd: 5707.59 | bwd_inner: 5643.24 | bwd_allreduce: 64.30 | step: 18.92 20%|█▉ | 8067/41250 [19:28:59<79:51:55, 8.66s/it] {'loss': 0.0659, 'grad_norm': 2.061981201171875, 'learning_rate': 3.7193251226573684e-05, 'epoch': 1.96} 20%|█▉ | 8067/41250 [19:28:59<79:51:55, 8.66s/it][2025-04-26 03:26:42,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:26:42,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.62 | bwd_microstep: 5762.71 | bwd_inner_microstep: 5657.83 | bwd_allreduce_microstep: 104.84 | step_microstep: 18.37 [2025-04-26 03:26:42,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.62 | bwd: 5762.73 | bwd_inner: 5657.83 | bwd_allreduce: 104.86 | step: 18.37 20%|█▉ | 8068/41250 [19:29:08<79:52:54, 8.67s/it] {'loss': 0.038, 'grad_norm': 0.8313356637954712, 'learning_rate': 3.719244895363335e-05, 'epoch': 1.96} 20%|█▉ | 8068/41250 [19:29:08<79:52:54, 8.67s/it][2025-04-26 03:26:51,551] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:26:51,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.22 | bwd_microstep: 5722.76 | bwd_inner_microstep: 5709.80 | bwd_allreduce_microstep: 12.92 | step_microstep: 18.62 [2025-04-26 03:26:51,552] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.22 | bwd: 5722.78 | bwd_inner: 5709.80 | bwd_allreduce: 12.94 | step: 18.62 20%|█▉ | 8069/41250 [19:29:16<79:55:19, 8.67s/it] {'loss': 0.283, 'grad_norm': 
2.7161600589752197, 'learning_rate': 3.719164657470495e-05, 'epoch': 1.96} 20%|█▉ | 8069/41250 [19:29:16<79:55:19, 8.67s/it][2025-04-26 03:27:00,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:27:00,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.42 | bwd_microstep: 5695.88 | bwd_inner_microstep: 5683.36 | bwd_allreduce_microstep: 12.47 | step_microstep: 18.67 [2025-04-26 03:27:00,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.42 | bwd: 5695.89 | bwd_inner: 5683.36 | bwd_allreduce: 12.49 | step: 18.67 20%|█▉ | 8070/41250 [19:29:25<79:48:04, 8.66s/it] {'loss': 0.0431, 'grad_norm': 0.9957391023635864, 'learning_rate': 3.719084408979343e-05, 'epoch': 1.96} 20%|█▉ | 8070/41250 [19:29:25<79:48:04, 8.66s/it][2025-04-26 03:27:08,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 03:27:08,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.36 | bwd_microstep: 5746.73 | bwd_inner_microstep: 5705.37 | bwd_allreduce_microstep: 41.32 | step_microstep: 19.15 [2025-04-26 03:27:08,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.36 | bwd: 5746.75 | bwd_inner: 5705.37 | bwd_allreduce: 41.34 | step: 19.15 20%|█▉ | 8071/41250 [19:29:34<79:51:57, 8.67s/it] {'loss': 0.2833, 'grad_norm': 3.3107197284698486, 'learning_rate': 3.7190041498903734e-05, 'epoch': 1.96} 20%|█▉ | 8071/41250 [19:29:34<79:51:57, 8.67s/it][2025-04-26 03:27:17,574] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.59 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:27:17,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.41 | bwd_microstep: 5770.92 | bwd_inner_microstep: 5707.43 | bwd_allreduce_microstep: 63.44 | step_microstep: 18.96 [2025-04-26 
03:27:17,575] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.41 | bwd: 5770.93 | bwd_inner: 5707.43 | bwd_allreduce: 63.46 | step: 18.96 20%|█▉ | 8072/41250 [19:29:42<79:59:25, 8.68s/it] {'loss': 0.2643, 'grad_norm': 1.9520643949508667, 'learning_rate': 3.718923880204081e-05, 'epoch': 1.96} 20%|█▉ | 8072/41250 [19:29:42<79:59:25, 8.68s/it][2025-04-26 03:27:26,220] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.01 | optimizer_step: 1.09 [2025-04-26 03:27:26,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.78 | bwd_microstep: 5712.26 | bwd_inner_microstep: 5699.47 | bwd_allreduce_microstep: 12.75 | step_microstep: 19.01 [2025-04-26 03:27:26,221] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.78 | bwd: 5712.28 | bwd_inner: 5699.47 | bwd_allreduce: 12.77 | step: 19.02 20%|█▉ | 8073/41250 [19:29:51<79:53:48, 8.67s/it] {'loss': 0.1703, 'grad_norm': 1.2667462825775146, 'learning_rate': 3.718843599920962e-05, 'epoch': 1.96} 20%|█▉ | 8073/41250 [19:29:51<79:53:48, 8.67s/it][2025-04-26 03:27:34,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:27:34,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.07 | bwd_microstep: 5761.83 | bwd_inner_microstep: 5702.21 | bwd_allreduce_microstep: 59.58 | step_microstep: 18.43 [2025-04-26 03:27:34,924] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.07 | bwd: 5761.85 | bwd_inner: 5702.21 | bwd_allreduce: 59.60 | step: 18.43 20%|█▉ | 8074/41250 [19:30:00<79:59:18, 8.68s/it] {'loss': 0.1212, 'grad_norm': 1.6826127767562866, 'learning_rate': 3.718763309041509e-05, 'epoch': 1.96} 20%|█▉ | 8074/41250 [19:30:00<79:59:18, 8.68s/it][2025-04-26 03:27:43,644] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 1.02 | optimizer_step: 
0.90 [2025-04-26 03:27:43,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.89 | bwd_microstep: 5805.87 | bwd_inner_microstep: 5656.86 | bwd_allreduce_microstep: 148.97 | step_microstep: 19.08 [2025-04-26 03:27:43,645] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.89 | bwd: 5805.89 | bwd_inner: 5656.86 | bwd_allreduce: 148.99 | step: 19.08 20%|█▉ | 8075/41250 [19:30:08<80:05:59, 8.69s/it] {'loss': 0.0785, 'grad_norm': 1.4023244380950928, 'learning_rate': 3.71868300756622e-05, 'epoch': 1.96} 20%|█▉ | 8075/41250 [19:30:08<80:05:59, 8.69s/it][2025-04-26 03:27:52,360] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.99 [2025-04-26 03:27:52,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2858.15 | bwd_microstep: 5771.48 | bwd_inner_microstep: 5704.90 | bwd_allreduce_microstep: 66.54 | step_microstep: 19.11 [2025-04-26 03:27:52,361] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2858.15 | bwd: 5771.49 | bwd_inner: 5704.90 | bwd_allreduce: 66.56 | step: 19.11 20%|█▉ | 8076/41250 [19:30:17<80:09:54, 8.70s/it] {'loss': 0.0369, 'grad_norm': 1.3606973886489868, 'learning_rate': 3.7186026954955865e-05, 'epoch': 1.96} 20%|█▉ | 8076/41250 [19:30:17<80:09:54, 8.70s/it][2025-04-26 03:28:01,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:28:01,047] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.25 | bwd_microstep: 5752.31 | bwd_inner_microstep: 5715.30 | bwd_allreduce_microstep: 36.96 | step_microstep: 18.57 [2025-04-26 03:28:01,048] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.25 | bwd: 5752.32 | bwd_inner: 5715.30 | bwd_allreduce: 36.98 | step: 18.57 20%|█▉ | 8077/41250 [19:30:26<80:07:23, 8.70s/it] {'loss': 0.0397, 'grad_norm': 0.6943707466125488, 'learning_rate': 
3.718522372830107e-05, 'epoch': 1.96} 20%|█▉ | 8077/41250 [19:30:26<80:07:23, 8.70s/it][2025-04-26 03:28:09,677] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:28:09,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.92 | bwd_microstep: 5720.29 | bwd_inner_microstep: 5669.13 | bwd_allreduce_microstep: 51.12 | step_microstep: 18.33 [2025-04-26 03:28:09,678] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.92 | bwd: 5720.31 | bwd_inner: 5669.13 | bwd_allreduce: 51.14 | step: 18.33 20%|█▉ | 8078/41250 [19:30:35<79:56:35, 8.68s/it] {'loss': 0.1795, 'grad_norm': 1.7341254949569702, 'learning_rate': 3.718442039570274e-05, 'epoch': 1.96} 20%|█▉ | 8078/41250 [19:30:35<79:56:35, 8.68s/it][2025-04-26 03:28:18,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:28:18,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2867.28 | bwd_microstep: 5760.90 | bwd_inner_microstep: 5706.71 | bwd_allreduce_microstep: 54.15 | step_microstep: 18.69 [2025-04-26 03:28:18,391] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2867.28 | bwd: 5760.91 | bwd_inner: 5706.71 | bwd_allreduce: 54.16 | step: 18.69 20%|█▉ | 8079/41250 [19:30:43<80:02:30, 8.69s/it] {'loss': 0.0443, 'grad_norm': 1.047080159187317, 'learning_rate': 3.718361695716584e-05, 'epoch': 1.96} 20%|█▉ | 8079/41250 [19:30:43<80:02:30, 8.69s/it][2025-04-26 03:28:27,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:28:27,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.38 | bwd_microstep: 5781.09 | bwd_inner_microstep: 5709.06 | bwd_allreduce_microstep: 71.98 | step_microstep: 18.66 [2025-04-26 03:28:27,106] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.38 | bwd: 5781.10 | bwd_inner: 5709.06 | bwd_allreduce: 72.00 | step: 18.66 20%|█▉ | 8080/41250 [19:30:52<80:07:05, 8.70s/it] {'loss': 0.0698, 'grad_norm': 2.0658302307128906, 'learning_rate': 3.7182813412695324e-05, 'epoch': 1.96} 20%|█▉ | 8080/41250 [19:30:52<80:07:05, 8.70s/it][2025-04-26 03:28:35,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:28:35,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.17 | bwd_microstep: 5794.89 | bwd_inner_microstep: 5650.05 | bwd_allreduce_microstep: 144.79 | step_microstep: 18.67 [2025-04-26 03:28:35,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.17 | bwd: 5794.90 | bwd_inner: 5650.05 | bwd_allreduce: 144.81 | step: 18.67 20%|█▉ | 8081/41250 [19:31:01<80:09:10, 8.70s/it] {'loss': 0.0659, 'grad_norm': 1.2712328433990479, 'learning_rate': 3.718200976229614e-05, 'epoch': 1.96} 20%|█▉ | 8081/41250 [19:31:01<80:09:10, 8.70s/it][2025-04-26 03:28:44,512] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:28:44,513] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.94 | bwd_microstep: 5765.02 | bwd_inner_microstep: 5694.79 | bwd_allreduce_microstep: 70.17 | step_microstep: 18.61 [2025-04-26 03:28:44,513] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.94 | bwd: 5765.03 | bwd_inner: 5694.79 | bwd_allreduce: 70.19 | step: 18.62 20%|█▉ | 8082/41250 [19:31:09<80:08:56, 8.70s/it] {'loss': 0.1048, 'grad_norm': 2.0721371173858643, 'learning_rate': 3.718120600597325e-05, 'epoch': 1.96} 20%|█▉ | 8082/41250 [19:31:09<80:08:56, 8.70s/it][2025-04-26 03:28:53,235] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.97 [2025-04-26 
03:28:53,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.10 | bwd_microstep: 5793.17 | bwd_inner_microstep: 5690.32 | bwd_allreduce_microstep: 102.81 | step_microstep: 19.17 [2025-04-26 03:28:53,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.10 | bwd: 5793.18 | bwd_inner: 5690.32 | bwd_allreduce: 102.82 | step: 19.18 20%|█▉ | 8083/41250 [19:31:18<80:12:38, 8.71s/it] {'loss': 0.3915, 'grad_norm': 3.8365774154663086, 'learning_rate': 3.71804021437316e-05, 'epoch': 1.96} 20%|█▉ | 8083/41250 [19:31:18<80:12:38, 8.71s/it][2025-04-26 03:29:01,929] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:29:01,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.32 | bwd_microstep: 5787.22 | bwd_inner_microstep: 5651.52 | bwd_allreduce_microstep: 135.65 | step_microstep: 18.35 [2025-04-26 03:29:01,930] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.32 | bwd: 5787.23 | bwd_inner: 5651.52 | bwd_allreduce: 135.67 | step: 18.35 20%|█▉ | 8084/41250 [19:31:27<80:10:19, 8.70s/it] {'loss': 0.0846, 'grad_norm': 4.437922954559326, 'learning_rate': 3.717959817557615e-05, 'epoch': 1.96} 20%|█▉ | 8084/41250 [19:31:27<80:10:19, 8.70s/it][2025-04-26 03:29:10,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:29:10,610] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.24 | bwd_microstep: 5771.63 | bwd_inner_microstep: 5642.66 | bwd_allreduce_microstep: 128.93 | step_microstep: 18.54 [2025-04-26 03:29:10,611] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.24 | bwd: 5771.65 | bwd_inner: 5642.66 | bwd_allreduce: 128.95 | step: 18.54 20%|█▉ | 8085/41250 [19:31:35<80:06:38, 8.70s/it] {'loss': 0.1522, 'grad_norm': 3.157036542892456, 'learning_rate': 3.717879410151186e-05, 
'epoch': 1.96} 20%|█▉ | 8085/41250 [19:31:35<80:06:38, 8.70s/it][2025-04-26 03:29:19,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:29:19,242] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.30 | bwd_microstep: 5729.54 | bwd_inner_microstep: 5644.22 | bwd_allreduce_microstep: 85.27 | step_microstep: 19.06 [2025-04-26 03:29:19,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.30 | bwd: 5729.56 | bwd_inner: 5644.22 | bwd_allreduce: 85.29 | step: 19.07 20%|█▉ | 8086/41250 [19:31:44<79:55:54, 8.68s/it] {'loss': 0.2163, 'grad_norm': 2.6744697093963623, 'learning_rate': 3.717798992154368e-05, 'epoch': 1.96} 20%|█▉ | 8086/41250 [19:31:44<79:55:54, 8.68s/it][2025-04-26 03:29:27,938] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:29:27,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.73 | bwd_microstep: 5787.05 | bwd_inner_microstep: 5648.39 | bwd_allreduce_microstep: 138.62 | step_microstep: 18.40 [2025-04-26 03:29:27,939] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.73 | bwd: 5787.07 | bwd_inner: 5648.39 | bwd_allreduce: 138.64 | step: 18.40 20%|█▉ | 8087/41250 [19:31:53<79:59:01, 8.68s/it] {'loss': 0.0313, 'grad_norm': 0.9520822763442993, 'learning_rate': 3.717718563567657e-05, 'epoch': 1.96} 20%|█▉ | 8087/41250 [19:31:53<79:59:01, 8.68s/it][2025-04-26 03:29:36,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:29:36,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.47 | bwd_microstep: 5764.05 | bwd_inner_microstep: 5683.47 | bwd_allreduce_microstep: 80.54 | step_microstep: 18.30 [2025-04-26 03:29:36,631] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2844.47 | bwd: 5764.07 | bwd_inner: 5683.47 | bwd_allreduce: 80.56 | step: 18.30 20%|█▉ | 8088/41250 [19:32:01<80:00:20, 8.69s/it] {'loss': 0.0533, 'grad_norm': 1.135956048965454, 'learning_rate': 3.717638124391549e-05, 'epoch': 1.96} 20%|█▉ | 8088/41250 [19:32:01<80:00:20, 8.69s/it][2025-04-26 03:29:45,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:29:45,266] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.99 | bwd_microstep: 5706.53 | bwd_inner_microstep: 5693.89 | bwd_allreduce_microstep: 12.60 | step_microstep: 18.31 [2025-04-26 03:29:45,267] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.99 | bwd: 5706.54 | bwd_inner: 5693.89 | bwd_allreduce: 12.62 | step: 18.31 20%|█▉ | 8089/41250 [19:32:10<79:52:02, 8.67s/it] {'loss': 0.2255, 'grad_norm': 3.5701375007629395, 'learning_rate': 3.71755767462654e-05, 'epoch': 1.96} 20%|█▉ | 8089/41250 [19:32:10<79:52:02, 8.67s/it][2025-04-26 03:29:53,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:29:53,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.10 | bwd_microstep: 5685.74 | bwd_inner_microstep: 5646.88 | bwd_allreduce_microstep: 38.81 | step_microstep: 18.66 [2025-04-26 03:29:53,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.10 | bwd: 5685.75 | bwd_inner: 5646.88 | bwd_allreduce: 38.83 | step: 18.66 20%|█▉ | 8090/41250 [19:32:19<79:39:39, 8.65s/it] {'loss': 0.0481, 'grad_norm': 2.518338680267334, 'learning_rate': 3.717477214273126e-05, 'epoch': 1.96} 20%|█▉ | 8090/41250 [19:32:19<79:39:39, 8.65s/it][2025-04-26 03:30:02,560] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.15 | optimizer_step: 1.05 [2025-04-26 03:30:02,561] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2848.61 | bwd_microstep: 5763.73 | bwd_inner_microstep: 5684.69 | bwd_allreduce_microstep: 78.97 | step_microstep: 19.88 [2025-04-26 03:30:02,561] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.61 | bwd: 5763.75 | bwd_inner: 5684.69 | bwd_allreduce: 79.00 | step: 19.89 20%|█▉ | 8091/41250 [19:32:27<79:47:58, 8.66s/it] {'loss': 0.1829, 'grad_norm': 1.4360474348068237, 'learning_rate': 3.717396743331802e-05, 'epoch': 1.96} 20%|█▉ | 8091/41250 [19:32:27<79:47:58, 8.66s/it][2025-04-26 03:30:11,171] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 03:30:11,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.51 | bwd_microstep: 5699.64 | bwd_inner_microstep: 5649.73 | bwd_allreduce_microstep: 49.87 | step_microstep: 18.92 [2025-04-26 03:30:11,172] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.51 | bwd: 5699.66 | bwd_inner: 5649.73 | bwd_allreduce: 49.88 | step: 18.92 20%|█▉ | 8092/41250 [19:32:36<79:39:03, 8.65s/it] {'loss': 0.104, 'grad_norm': 1.4138959646224976, 'learning_rate': 3.717316261803065e-05, 'epoch': 1.96} 20%|█▉ | 8092/41250 [19:32:36<79:39:03, 8.65s/it][2025-04-26 03:30:19,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.22 | optimizer_step: 0.96 [2025-04-26 03:30:19,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.89 | bwd_microstep: 5786.60 | bwd_inner_microstep: 5680.61 | bwd_allreduce_microstep: 105.94 | step_microstep: 19.33 [2025-04-26 03:30:19,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.89 | bwd: 5786.62 | bwd_inner: 5680.61 | bwd_allreduce: 105.96 | step: 19.34 20%|█▉ | 8093/41250 [19:32:45<79:49:40, 8.67s/it] {'loss': 0.1964, 'grad_norm': 2.300931453704834, 'learning_rate': 3.717235769687412e-05, 'epoch': 1.96} 20%|█▉ | 8093/41250 
[19:32:45<79:49:40, 8.67s/it][2025-04-26 03:30:28,529] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.30 | optimizer_step: 1.05 [2025-04-26 03:30:28,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.74 | bwd_microstep: 5709.00 | bwd_inner_microstep: 5694.99 | bwd_allreduce_microstep: 13.95 | step_microstep: 20.01 [2025-04-26 03:30:28,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.74 | bwd: 5709.02 | bwd_inner: 5694.99 | bwd_allreduce: 13.98 | step: 20.01 20%|█▉ | 8094/41250 [19:32:53<79:45:48, 8.66s/it] {'loss': 0.1533, 'grad_norm': 2.2575340270996094, 'learning_rate': 3.717155266985337e-05, 'epoch': 1.96} 20%|█▉ | 8094/41250 [19:32:53<79:45:48, 8.66s/it][2025-04-26 03:30:37,131] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 03:30:37,132] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.76 | bwd_microstep: 5694.94 | bwd_inner_microstep: 5649.68 | bwd_allreduce_microstep: 45.22 | step_microstep: 18.68 [2025-04-26 03:30:37,132] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.76 | bwd: 5694.95 | bwd_inner: 5649.68 | bwd_allreduce: 45.24 | step: 18.68 20%|█▉ | 8095/41250 [19:33:02<79:35:49, 8.64s/it] {'loss': 0.2047, 'grad_norm': 2.2775027751922607, 'learning_rate': 3.717074753697338e-05, 'epoch': 1.96} 20%|█▉ | 8095/41250 [19:33:02<79:35:49, 8.64s/it][2025-04-26 03:30:45,727] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:30:45,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.86 | bwd_microstep: 5688.80 | bwd_inner_microstep: 5645.25 | bwd_allreduce_microstep: 43.50 | step_microstep: 18.59 [2025-04-26 03:30:45,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.86 | bwd: 5688.81 | 
bwd_inner: 5645.25 | bwd_allreduce: 43.52 | step: 18.59 20%|█▉ | 8096/41250 [19:33:11<79:27:56, 8.63s/it] {'loss': 0.1034, 'grad_norm': 1.9214829206466675, 'learning_rate': 3.716994229823911e-05, 'epoch': 1.96} 20%|█▉ | 8096/41250 [19:33:11<79:27:56, 8.63s/it][2025-04-26 03:30:54,333] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:30:54,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.21 | bwd_microstep: 5701.73 | bwd_inner_microstep: 5643.66 | bwd_allreduce_microstep: 58.02 | step_microstep: 18.56 [2025-04-26 03:30:54,334] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.21 | bwd: 5701.74 | bwd_inner: 5643.66 | bwd_allreduce: 58.04 | step: 18.56 20%|█▉ | 8097/41250 [19:33:19<79:24:27, 8.62s/it] {'loss': 0.1646, 'grad_norm': 3.425051212310791, 'learning_rate': 3.7169136953655525e-05, 'epoch': 1.96} 20%|█▉ | 8097/41250 [19:33:19<79:24:27, 8.62s/it][2025-04-26 03:31:03,132] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.07 | optimizer_step: 1.18 [2025-04-26 03:31:03,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.38 | bwd_microstep: 5874.89 | bwd_inner_microstep: 5675.70 | bwd_allreduce_microstep: 199.14 | step_microstep: 19.59 [2025-04-26 03:31:03,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.38 | bwd: 5874.91 | bwd_inner: 5675.70 | bwd_allreduce: 199.16 | step: 19.59 20%|█▉ | 8098/41250 [19:33:28<79:53:17, 8.68s/it] {'loss': 0.1501, 'grad_norm': 1.5338010787963867, 'learning_rate': 3.7168331503227586e-05, 'epoch': 1.96} 20%|█▉ | 8098/41250 [19:33:28<79:53:17, 8.68s/it][2025-04-26 03:31:11,721] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:31:11,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
fwd_microstep: 2825.64 | bwd_microstep: 5678.00 | bwd_inner_microstep: 5643.73 | bwd_allreduce_microstep: 34.23 | step_microstep: 18.63 [2025-04-26 03:31:11,722] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.64 | bwd: 5678.01 | bwd_inner: 5643.73 | bwd_allreduce: 34.24 | step: 18.63 20%|█▉ | 8099/41250 [19:33:37<79:38:41, 8.65s/it] {'loss': 0.1838, 'grad_norm': 3.030965805053711, 'learning_rate': 3.716752594696026e-05, 'epoch': 1.96} 20%|█▉ | 8099/41250 [19:33:37<79:38:41, 8.65s/it][2025-04-26 03:31:20,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.01 | optimizer_step: 1.08 [2025-04-26 03:31:20,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.05 | bwd_microstep: 5754.34 | bwd_inner_microstep: 5651.73 | bwd_allreduce_microstep: 102.56 | step_microstep: 19.16 [2025-04-26 03:31:20,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.05 | bwd: 5754.36 | bwd_inner: 5651.73 | bwd_allreduce: 102.58 | step: 19.17 20%|█▉ | 8100/41250 [19:33:45<79:40:38, 8.65s/it] {'loss': 0.0891, 'grad_norm': 1.5660358667373657, 'learning_rate': 3.7166720284858514e-05, 'epoch': 1.96} 20%|█▉ | 8100/41250 [19:33:45<79:40:38, 8.65s/it][2025-04-26 03:31:28,988] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:31:28,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.54 | bwd_microstep: 5700.44 | bwd_inner_microstep: 5642.73 | bwd_allreduce_microstep: 57.66 | step_microstep: 18.81 [2025-04-26 03:31:28,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.54 | bwd: 5700.46 | bwd_inner: 5642.73 | bwd_allreduce: 57.68 | step: 18.81 20%|█▉ | 8101/41250 [19:33:54<79:33:03, 8.64s/it] {'loss': 0.0397, 'grad_norm': 0.9159672856330872, 'learning_rate': 3.716591451692731e-05, 'epoch': 1.96} 20%|█▉ | 8101/41250 [19:33:54<79:33:03, 
8.64s/it][2025-04-26 03:31:37,659] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 1.02 [2025-04-26 03:31:37,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.43 | bwd_microstep: 5741.13 | bwd_inner_microstep: 5678.87 | bwd_allreduce_microstep: 62.22 | step_microstep: 19.42 [2025-04-26 03:31:37,660] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.43 | bwd: 5741.15 | bwd_inner: 5678.87 | bwd_allreduce: 62.24 | step: 19.42 20%|█▉ | 8102/41250 [19:34:02<79:37:51, 8.65s/it] {'loss': 0.1749, 'grad_norm': 2.734135866165161, 'learning_rate': 3.716510864317162e-05, 'epoch': 1.96} 20%|█▉ | 8102/41250 [19:34:02<79:37:51, 8.65s/it][2025-04-26 03:31:46,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.03 | optimizer_gradients: 1.00 | optimizer_step: 1.02 [2025-04-26 03:31:46,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.75 | bwd_microstep: 5705.97 | bwd_inner_microstep: 5693.23 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.58 [2025-04-26 03:31:46,302] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.75 | bwd: 5705.98 | bwd_inner: 5693.23 | bwd_allreduce: 12.71 | step: 18.58 20%|█▉ | 8103/41250 [19:34:11<79:36:38, 8.65s/it] {'loss': 0.2181, 'grad_norm': 2.2048661708831787, 'learning_rate': 3.7164302663596414e-05, 'epoch': 1.96} 20%|█▉ | 8103/41250 [19:34:11<79:36:38, 8.65s/it][2025-04-26 03:31:54,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 0.98 | optimizer_step: 0.91 [2025-04-26 03:31:54,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.03 | bwd_microstep: 5746.62 | bwd_inner_microstep: 5650.39 | bwd_allreduce_microstep: 96.19 | step_microstep: 18.41 [2025-04-26 03:31:54,957] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.03 | bwd: 5746.64 | bwd_inner: 5650.39 | 
bwd_allreduce: 96.20 | step: 18.41 20%|█▉ | 8104/41250 [19:34:20<79:37:58, 8.65s/it] {'loss': 0.1634, 'grad_norm': 2.424443006515503, 'learning_rate': 3.716349657820666e-05, 'epoch': 1.96} 20%|█▉ | 8104/41250 [19:34:20<79:37:58, 8.65s/it][2025-04-26 03:32:03,566] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.09 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 03:32:03,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.22 | bwd_microstep: 5697.17 | bwd_inner_microstep: 5643.38 | bwd_allreduce_microstep: 53.73 | step_microstep: 18.52 [2025-04-26 03:32:03,567] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.22 | bwd: 5697.18 | bwd_inner: 5643.38 | bwd_allreduce: 53.75 | step: 18.52 20%|█▉ | 8105/41250 [19:34:28<79:31:53, 8.64s/it] {'loss': 0.027, 'grad_norm': 0.6579890847206116, 'learning_rate': 3.7162690387007316e-05, 'epoch': 1.96} 20%|█▉ | 8105/41250 [19:34:28<79:31:53, 8.64s/it][2025-04-26 03:32:12,246] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.05 [2025-04-26 03:32:12,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.91 | bwd_microstep: 5763.33 | bwd_inner_microstep: 5653.35 | bwd_allreduce_microstep: 109.93 | step_microstep: 19.13 [2025-04-26 03:32:12,247] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.91 | bwd: 5763.34 | bwd_inner: 5653.35 | bwd_allreduce: 109.95 | step: 19.13 20%|█▉ | 8106/41250 [19:34:37<79:38:05, 8.65s/it] {'loss': 0.1989, 'grad_norm': 1.873679280281067, 'learning_rate': 3.716188409000337e-05, 'epoch': 1.97} 20%|█▉ | 8106/41250 [19:34:37<79:38:05, 8.65s/it][2025-04-26 03:32:20,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:32:20,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.50 | 
bwd_microstep: 5720.41 | bwd_inner_microstep: 5638.52 | bwd_allreduce_microstep: 81.84 | step_microstep: 18.72 [2025-04-26 03:32:20,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.50 | bwd: 5720.42 | bwd_inner: 5638.52 | bwd_allreduce: 81.86 | step: 18.73 20%|█▉ | 8107/41250 [19:34:46<79:35:57, 8.65s/it] {'loss': 0.1091, 'grad_norm': 1.9352002143859863, 'learning_rate': 3.7161077687199784e-05, 'epoch': 1.97} 20%|█▉ | 8107/41250 [19:34:46<79:35:57, 8.65s/it][2025-04-26 03:32:29,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.05 | optimizer_step: 0.98 [2025-04-26 03:32:29,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.99 | bwd_microstep: 5696.73 | bwd_inner_microstep: 5652.95 | bwd_allreduce_microstep: 43.73 | step_microstep: 19.22 [2025-04-26 03:32:29,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.99 | bwd: 5696.74 | bwd_inner: 5652.95 | bwd_allreduce: 43.75 | step: 19.22 20%|█▉ | 8108/41250 [19:34:54<79:29:36, 8.63s/it] {'loss': 0.3419, 'grad_norm': 2.6685476303100586, 'learning_rate': 3.716027117860152e-05, 'epoch': 1.97} 20%|█▉ | 8108/41250 [19:34:54<79:29:36, 8.63s/it][2025-04-26 03:32:38,102] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:32:38,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.58 | bwd_microstep: 5704.04 | bwd_inner_microstep: 5650.41 | bwd_allreduce_microstep: 53.58 | step_microstep: 19.34 [2025-04-26 03:32:38,103] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.58 | bwd: 5704.05 | bwd_inner: 5650.41 | bwd_allreduce: 53.60 | step: 19.34 20%|█▉ | 8109/41250 [19:35:03<79:25:42, 8.63s/it] {'loss': 0.0525, 'grad_norm': 0.8145923614501953, 'learning_rate': 3.7159464564213576e-05, 'epoch': 1.97} 20%|█▉ | 8109/41250 [19:35:03<79:25:42, 8.63s/it][2025-04-26 03:32:46,784] 
[INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:32:46,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.88 | bwd_microstep: 5747.73 | bwd_inner_microstep: 5708.95 | bwd_allreduce_microstep: 38.73 | step_microstep: 19.27 [2025-04-26 03:32:46,785] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.88 | bwd: 5747.74 | bwd_inner: 5708.95 | bwd_allreduce: 38.75 | step: 19.27 20%|█▉ | 8110/41250 [19:35:12<79:34:27, 8.64s/it] {'loss': 0.1435, 'grad_norm': 4.756702899932861, 'learning_rate': 3.71586578440409e-05, 'epoch': 1.97} 20%|█▉ | 8110/41250 [19:35:12<79:34:27, 8.64s/it][2025-04-26 03:32:55,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:32:55,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.07 | bwd_microstep: 5737.65 | bwd_inner_microstep: 5687.44 | bwd_allreduce_microstep: 50.16 | step_microstep: 18.89 [2025-04-26 03:32:55,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.07 | bwd: 5737.66 | bwd_inner: 5687.44 | bwd_allreduce: 50.18 | step: 18.89 20%|█▉ | 8111/41250 [19:35:20<79:39:25, 8.65s/it] {'loss': 0.0367, 'grad_norm': 1.098236083984375, 'learning_rate': 3.7157851018088475e-05, 'epoch': 1.97} 20%|█▉ | 8111/41250 [19:35:20<79:39:25, 8.65s/it][2025-04-26 03:33:04,092] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.08 | optimizer_step: 1.09 [2025-04-26 03:33:04,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.42 | bwd_microstep: 5720.68 | bwd_inner_microstep: 5662.76 | bwd_allreduce_microstep: 57.86 | step_microstep: 19.62 [2025-04-26 03:33:04,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.42 | bwd: 5720.70 | bwd_inner: 5662.76 | bwd_allreduce: 57.88 | step: 19.62 
20%|█▉ | 8112/41250 [19:35:29<79:35:58, 8.65s/it] {'loss': 0.1648, 'grad_norm': 3.1636953353881836, 'learning_rate': 3.715704408636127e-05, 'epoch': 1.97} 20%|█▉ | 8112/41250 [19:35:29<79:35:58, 8.65s/it][2025-04-26 03:33:12,731] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:33:12,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.21 | bwd_microstep: 5704.88 | bwd_inner_microstep: 5692.08 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.93 [2025-04-26 03:33:12,732] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.21 | bwd: 5704.90 | bwd_inner: 5692.08 | bwd_allreduce: 12.77 | step: 18.93 20%|█▉ | 8113/41250 [19:35:38<79:34:14, 8.64s/it] {'loss': 0.1872, 'grad_norm': 3.73124623298645, 'learning_rate': 3.7156237048864256e-05, 'epoch': 1.97} 20%|█▉ | 8113/41250 [19:35:38<79:34:14, 8.64s/it][2025-04-26 03:33:21,434] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:33:21,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.86 | bwd_microstep: 5793.24 | bwd_inner_microstep: 5653.41 | bwd_allreduce_microstep: 139.78 | step_microstep: 18.90 [2025-04-26 03:33:21,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.86 | bwd: 5793.25 | bwd_inner: 5653.41 | bwd_allreduce: 139.80 | step: 18.90 20%|█▉ | 8114/41250 [19:35:46<79:43:38, 8.66s/it] {'loss': 0.1707, 'grad_norm': 1.8307074308395386, 'learning_rate': 3.715542990560243e-05, 'epoch': 1.97} 20%|█▉ | 8114/41250 [19:35:46<79:43:38, 8.66s/it][2025-04-26 03:33:30,148] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.95 [2025-04-26 03:33:30,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.06 | bwd_microstep: 5803.85 | bwd_inner_microstep: 
5650.27 | bwd_allreduce_microstep: 153.55 | step_microstep: 18.57 [2025-04-26 03:33:30,149] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.06 | bwd: 5803.87 | bwd_inner: 5650.27 | bwd_allreduce: 153.56 | step: 18.57 20%|█▉ | 8115/41250 [19:35:55<79:52:04, 8.68s/it] {'loss': 0.1966, 'grad_norm': 3.965644598007202, 'learning_rate': 3.715462265658074e-05, 'epoch': 1.97} 20%|█▉ | 8115/41250 [19:35:55<79:52:04, 8.68s/it][2025-04-26 03:33:38,843] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:33:38,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.52 | bwd_microstep: 5763.62 | bwd_inner_microstep: 5707.83 | bwd_allreduce_microstep: 55.75 | step_microstep: 18.62 [2025-04-26 03:33:38,844] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.52 | bwd: 5763.63 | bwd_inner: 5707.83 | bwd_allreduce: 55.77 | step: 18.62 20%|█▉ | 8116/41250 [19:36:04<79:54:51, 8.68s/it] {'loss': 0.0708, 'grad_norm': 1.5480328798294067, 'learning_rate': 3.7153815301804186e-05, 'epoch': 1.97} 20%|█▉ | 8116/41250 [19:36:04<79:54:51, 8.68s/it][2025-04-26 03:33:47,537] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.34 | optimizer_step: 1.05 [2025-04-26 03:33:47,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.90 | bwd_microstep: 5784.58 | bwd_inner_microstep: 5643.50 | bwd_allreduce_microstep: 141.01 | step_microstep: 20.25 [2025-04-26 03:33:47,538] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.90 | bwd: 5784.60 | bwd_inner: 5643.50 | bwd_allreduce: 141.04 | step: 20.25 20%|█▉ | 8117/41250 [19:36:12<79:57:13, 8.69s/it] {'loss': 0.1823, 'grad_norm': 1.9906527996063232, 'learning_rate': 3.715300784127773e-05, 'epoch': 1.97} 20%|█▉ | 8117/41250 [19:36:12<79:57:13, 8.69s/it][2025-04-26 03:33:56,234] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.20 | optimizer_step: 0.94 [2025-04-26 03:33:56,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.08 | bwd_microstep: 5758.76 | bwd_inner_microstep: 5691.79 | bwd_allreduce_microstep: 66.92 | step_microstep: 19.63 [2025-04-26 03:33:56,234] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.08 | bwd: 5758.78 | bwd_inner: 5691.79 | bwd_allreduce: 66.94 | step: 19.63 20%|█▉ | 8118/41250 [19:36:21<79:58:04, 8.69s/it] {'loss': 0.0362, 'grad_norm': 0.5473706722259521, 'learning_rate': 3.7152200275006356e-05, 'epoch': 1.97} 20%|█▉ | 8118/41250 [19:36:21<79:58:04, 8.69s/it][2025-04-26 03:34:04,906] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.06 | optimizer_step: 0.90 [2025-04-26 03:34:04,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.70 | bwd_microstep: 5734.30 | bwd_inner_microstep: 5721.52 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.92 [2025-04-26 03:34:04,907] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.70 | bwd: 5734.32 | bwd_inner: 5721.52 | bwd_allreduce: 12.75 | step: 18.92 20%|█▉ | 8119/41250 [19:36:30<79:55:05, 8.68s/it] {'loss': 0.0695, 'grad_norm': 2.1959173679351807, 'learning_rate': 3.715139260299504e-05, 'epoch': 1.97} 20%|█▉ | 8119/41250 [19:36:30<79:55:05, 8.68s/it][2025-04-26 03:34:13,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.94 | optimizer_gradients: 1.02 | optimizer_step: 0.89 [2025-04-26 03:34:13,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.96 | bwd_microstep: 5770.88 | bwd_inner_microstep: 5708.96 | bwd_allreduce_microstep: 61.87 | step_microstep: 18.46 [2025-04-26 03:34:13,612] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.96 | bwd: 5770.89 | bwd_inner: 5708.96 | bwd_allreduce: 61.89 | step: 18.46 20%|█▉ | 8120/41250 [19:36:38<79:58:31, 
8.69s/it] {'loss': 0.1463, 'grad_norm': 1.6770788431167603, 'learning_rate': 3.715058482524876e-05, 'epoch': 1.97} 20%|█▉ | 8120/41250 [19:36:38<79:58:31, 8.69s/it][2025-04-26 03:34:22,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 03:34:22,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.53 | bwd_microstep: 5703.46 | bwd_inner_microstep: 5690.85 | bwd_allreduce_microstep: 12.57 | step_microstep: 19.07 [2025-04-26 03:34:22,243] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.53 | bwd: 5703.47 | bwd_inner: 5690.85 | bwd_allreduce: 12.59 | step: 19.07 20%|█▉ | 8121/41250 [19:36:47<79:48:35, 8.67s/it] {'loss': 0.2027, 'grad_norm': 1.1402812004089355, 'learning_rate': 3.7149776941772494e-05, 'epoch': 1.97} 20%|█▉ | 8121/41250 [19:36:47<79:48:35, 8.67s/it][2025-04-26 03:34:31,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-26 03:34:31,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.15 | bwd_microstep: 5878.03 | bwd_inner_microstep: 5692.97 | bwd_allreduce_microstep: 185.01 | step_microstep: 19.29 [2025-04-26 03:34:31,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.15 | bwd: 5878.04 | bwd_inner: 5692.97 | bwd_allreduce: 185.03 | step: 19.29 20%|█▉ | 8122/41250 [19:36:56<80:11:09, 8.71s/it] {'loss': 0.1382, 'grad_norm': 1.8086050748825073, 'learning_rate': 3.714896895257123e-05, 'epoch': 1.97} 20%|█▉ | 8122/41250 [19:36:56<80:11:09, 8.71s/it][2025-04-26 03:34:39,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 03:34:39,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.43 | bwd_microstep: 5748.96 | bwd_inner_microstep: 5693.11 | bwd_allreduce_microstep: 
55.81 | step_microstep: 18.76 [2025-04-26 03:34:39,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.43 | bwd: 5748.98 | bwd_inner: 5693.11 | bwd_allreduce: 55.83 | step: 18.76 20%|█▉ | 8123/41250 [19:37:05<80:06:55, 8.71s/it] {'loss': 0.0622, 'grad_norm': 0.9602349400520325, 'learning_rate': 3.714816085764995e-05, 'epoch': 1.97} 20%|█▉ | 8123/41250 [19:37:05<80:06:55, 8.71s/it][2025-04-26 03:34:48,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:34:48,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.09 | bwd_microstep: 5755.88 | bwd_inner_microstep: 5706.58 | bwd_allreduce_microstep: 49.25 | step_microstep: 19.24 [2025-04-26 03:34:48,442] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.09 | bwd: 5755.90 | bwd_inner: 5706.58 | bwd_allreduce: 49.27 | step: 19.24 20%|█▉ | 8124/41250 [19:37:13<80:05:43, 8.70s/it] {'loss': 0.1755, 'grad_norm': 3.504256248474121, 'learning_rate': 3.7147352657013626e-05, 'epoch': 1.97} 20%|█▉ | 8124/41250 [19:37:13<80:05:43, 8.70s/it][2025-04-26 03:34:57,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:34:57,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.24 | bwd_microstep: 5721.75 | bwd_inner_microstep: 5708.88 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.54 [2025-04-26 03:34:57,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.24 | bwd: 5721.76 | bwd_inner: 5708.88 | bwd_allreduce: 12.85 | step: 18.55 20%|█▉ | 8125/41250 [19:37:22<79:58:50, 8.69s/it] {'loss': 0.1418, 'grad_norm': 2.9559104442596436, 'learning_rate': 3.714654435066725e-05, 'epoch': 1.97} 20%|█▉ | 8125/41250 [19:37:22<79:58:50, 8.69s/it][2025-04-26 03:35:05,794] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | 
optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-26 03:35:05,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.54 | bwd_microstep: 5749.22 | bwd_inner_microstep: 5704.78 | bwd_allreduce_microstep: 44.39 | step_microstep: 19.09 [2025-04-26 03:35:05,795] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.54 | bwd: 5749.24 | bwd_inner: 5704.78 | bwd_allreduce: 44.41 | step: 19.09 20%|█▉ | 8126/41250 [19:37:31<79:58:08, 8.69s/it] {'loss': 0.1429, 'grad_norm': 3.054588794708252, 'learning_rate': 3.71457359386158e-05, 'epoch': 1.97} 20%|█▉ | 8126/41250 [19:37:31<79:58:08, 8.69s/it][2025-04-26 03:35:14,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.02 [2025-04-26 03:35:14,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2891.57 | bwd_microstep: 5793.95 | bwd_inner_microstep: 5781.10 | bwd_allreduce_microstep: 12.80 | step_microstep: 19.55 [2025-04-26 03:35:14,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2891.57 | bwd: 5793.97 | bwd_inner: 5781.10 | bwd_allreduce: 12.82 | step: 19.55 20%|█▉ | 8127/41250 [19:37:39<80:11:25, 8.72s/it] {'loss': 0.3948, 'grad_norm': 3.2013323307037354, 'learning_rate': 3.7144927420864256e-05, 'epoch': 1.97} 20%|█▉ | 8127/41250 [19:37:39<80:11:25, 8.72s/it][2025-04-26 03:35:23,382] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.26 | optimizer_step: 0.97 [2025-04-26 03:35:23,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.08 | bwd_microstep: 5898.11 | bwd_inner_microstep: 5658.65 | bwd_allreduce_microstep: 239.41 | step_microstep: 19.60 [2025-04-26 03:35:23,383] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.08 | bwd: 5898.12 | bwd_inner: 5658.65 | bwd_allreduce: 239.43 | step: 19.60 20%|█▉ | 8128/41250 [19:37:48<80:28:08, 8.75s/it] {'loss': 0.0922, 'grad_norm': 
1.2439628839492798, 'learning_rate': 3.7144118797417604e-05, 'epoch': 1.97} 20%|█▉ | 8128/41250 [19:37:48<80:28:08, 8.75s/it][2025-04-26 03:35:32,194] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.26 | optimizer_step: 0.90 [2025-04-26 03:35:32,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.34 | bwd_microstep: 5879.79 | bwd_inner_microstep: 5685.71 | bwd_allreduce_microstep: 194.03 | step_microstep: 19.75 [2025-04-26 03:35:32,195] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.34 | bwd: 5879.80 | bwd_inner: 5685.71 | bwd_allreduce: 194.05 | step: 19.75 20%|█▉ | 8129/41250 [19:37:57<80:38:58, 8.77s/it] {'loss': 0.15, 'grad_norm': 1.6412625312805176, 'learning_rate': 3.7143310068280845e-05, 'epoch': 1.97} 20%|█▉ | 8129/41250 [19:37:57<80:38:58, 8.77s/it][2025-04-26 03:35:40,888] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:35:40,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.87 | bwd_microstep: 5783.74 | bwd_inner_microstep: 5645.79 | bwd_allreduce_microstep: 137.91 | step_microstep: 19.23 [2025-04-26 03:35:40,889] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.87 | bwd: 5783.75 | bwd_inner: 5645.79 | bwd_allreduce: 137.93 | step: 19.23 20%|█▉ | 8130/41250 [19:38:06<80:26:50, 8.74s/it] {'loss': 0.0316, 'grad_norm': 1.1450363397598267, 'learning_rate': 3.7142501233458945e-05, 'epoch': 1.97} 20%|█▉ | 8130/41250 [19:38:06<80:26:50, 8.74s/it][2025-04-26 03:35:49,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-26 03:35:49,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.25 | bwd_microstep: 5721.25 | bwd_inner_microstep: 5657.13 | bwd_allreduce_microstep: 64.06 | step_microstep: 19.05 [2025-04-26 
03:35:49,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.25 | bwd: 5721.26 | bwd_inner: 5657.13 | bwd_allreduce: 64.08 | step: 19.05 20%|█▉ | 8131/41250 [19:38:14<80:08:25, 8.71s/it] {'loss': 0.1159, 'grad_norm': 1.3682291507720947, 'learning_rate': 3.71416922929569e-05, 'epoch': 1.97} 20%|█▉ | 8131/41250 [19:38:14<80:08:25, 8.71s/it][2025-04-26 03:35:58,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.01 | optimizer_step: 1.15 [2025-04-26 03:35:58,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.86 | bwd_microstep: 5713.73 | bwd_inner_microstep: 5700.86 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.30 [2025-04-26 03:35:58,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.86 | bwd: 5713.74 | bwd_inner: 5700.86 | bwd_allreduce: 12.84 | step: 19.30 20%|█▉ | 8132/41250 [19:38:23<79:57:38, 8.69s/it] {'loss': 0.0556, 'grad_norm': 0.767991304397583, 'learning_rate': 3.7140883246779684e-05, 'epoch': 1.97} 20%|█▉ | 8132/41250 [19:38:23<79:57:38, 8.69s/it][2025-04-26 03:36:06,830] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 03:36:06,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.85 | bwd_microstep: 5709.87 | bwd_inner_microstep: 5696.95 | bwd_allreduce_microstep: 12.87 | step_microstep: 18.94 [2025-04-26 03:36:06,831] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.85 | bwd: 5709.89 | bwd_inner: 5696.95 | bwd_allreduce: 12.89 | step: 18.94 20%|█▉ | 8133/41250 [19:38:32<79:52:10, 8.68s/it] {'loss': 0.3865, 'grad_norm': 4.146049976348877, 'learning_rate': 3.71400740949323e-05, 'epoch': 1.97} 20%|█▉ | 8133/41250 [19:38:32<79:52:10, 8.68s/it][2025-04-26 03:36:15,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.89 
[2025-04-26 03:36:15,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.61 | bwd_microstep: 5771.99 | bwd_inner_microstep: 5644.02 | bwd_allreduce_microstep: 127.92 | step_microstep: 18.74 [2025-04-26 03:36:15,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.61 | bwd: 5772.00 | bwd_inner: 5644.02 | bwd_allreduce: 127.94 | step: 18.75 20%|█▉ | 8134/41250 [19:38:40<79:52:45, 8.68s/it] {'loss': 0.2145, 'grad_norm': 1.962787389755249, 'learning_rate': 3.7139264837419727e-05, 'epoch': 1.97} 20%|█▉ | 8134/41250 [19:38:40<79:52:45, 8.68s/it][2025-04-26 03:36:24,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:36:24,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.58 | bwd_microstep: 5744.82 | bwd_inner_microstep: 5688.61 | bwd_allreduce_microstep: 56.16 | step_microstep: 19.32 [2025-04-26 03:36:24,199] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.58 | bwd: 5744.83 | bwd_inner: 5688.61 | bwd_allreduce: 56.18 | step: 19.32 20%|█▉ | 8135/41250 [19:38:49<79:52:11, 8.68s/it] {'loss': 0.0449, 'grad_norm': 0.8213307857513428, 'learning_rate': 3.7138455474246965e-05, 'epoch': 1.97} 20%|█▉ | 8135/41250 [19:38:49<79:52:11, 8.68s/it][2025-04-26 03:36:32,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.25 | optimizer_step: 1.12 [2025-04-26 03:36:32,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.55 | bwd_microstep: 5749.80 | bwd_inner_microstep: 5696.20 | bwd_allreduce_microstep: 53.53 | step_microstep: 20.01 [2025-04-26 03:36:32,884] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.55 | bwd: 5749.81 | bwd_inner: 5696.20 | bwd_allreduce: 53.56 | step: 20.01 20%|█▉ | 8136/41250 [19:38:58<79:52:58, 8.68s/it] {'loss': 0.0824, 'grad_norm': 2.419579267501831, 'learning_rate': 
3.7137646005418986e-05, 'epoch': 1.97} 20%|█▉ | 8136/41250 [19:38:58<79:52:58, 8.68s/it][2025-04-26 03:36:41,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:36:41,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.02 | bwd_microstep: 5701.67 | bwd_inner_microstep: 5688.91 | bwd_allreduce_microstep: 12.72 | step_microstep: 18.76 [2025-04-26 03:36:41,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.02 | bwd: 5701.69 | bwd_inner: 5688.91 | bwd_allreduce: 12.74 | step: 18.77 20%|█▉ | 8137/41250 [19:39:06<79:43:58, 8.67s/it] {'loss': 0.0761, 'grad_norm': 2.1683919429779053, 'learning_rate': 3.713683643094079e-05, 'epoch': 1.97} 20%|█▉ | 8137/41250 [19:39:06<79:43:58, 8.67s/it][2025-04-26 03:36:50,153] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:36:50,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5704.44 | bwd_inner_microstep: 5691.84 | bwd_allreduce_microstep: 12.56 | step_microstep: 18.60 [2025-04-26 03:36:50,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.78 | bwd: 5704.45 | bwd_inner: 5691.84 | bwd_allreduce: 12.57 | step: 18.60 20%|█▉ | 8138/41250 [19:39:15<79:38:14, 8.66s/it] {'loss': 0.1103, 'grad_norm': 1.0689845085144043, 'learning_rate': 3.713602675081737e-05, 'epoch': 1.97} 20%|█▉ | 8138/41250 [19:39:15<79:38:14, 8.66s/it][2025-04-26 03:36:58,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:36:58,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.01 | bwd_microstep: 5757.52 | bwd_inner_microstep: 5674.00 | bwd_allreduce_microstep: 83.47 | step_microstep: 18.48 [2025-04-26 03:36:58,837] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.01 | bwd: 5757.53 | bwd_inner: 5674.00 | bwd_allreduce: 83.49 | step: 18.48 20%|█▉ | 8139/41250 [19:39:24<79:42:08, 8.67s/it] {'loss': 0.3278, 'grad_norm': 2.9847590923309326, 'learning_rate': 3.713521696505373e-05, 'epoch': 1.97} 20%|█▉ | 8139/41250 [19:39:24<79:42:08, 8.67s/it][2025-04-26 03:37:07,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.93 [2025-04-26 03:37:07,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.09 | bwd_microstep: 5761.08 | bwd_inner_microstep: 5637.75 | bwd_allreduce_microstep: 123.29 | step_microstep: 18.66 [2025-04-26 03:37:07,503] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.09 | bwd: 5761.09 | bwd_inner: 5637.75 | bwd_allreduce: 123.30 | step: 18.66 20%|█▉ | 8140/41250 [19:39:32<79:42:12, 8.67s/it] {'loss': 0.3283, 'grad_norm': 1.8545008897781372, 'learning_rate': 3.7134407073654835e-05, 'epoch': 1.97} 20%|█▉ | 8140/41250 [19:39:32<79:42:12, 8.67s/it][2025-04-26 03:37:16,163] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.01 | optimizer_step: 0.94 [2025-04-26 03:37:16,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.56 | bwd_microstep: 5731.84 | bwd_inner_microstep: 5699.14 | bwd_allreduce_microstep: 32.66 | step_microstep: 18.53 [2025-04-26 03:37:16,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.56 | bwd: 5731.86 | bwd_inner: 5699.13 | bwd_allreduce: 32.68 | step: 18.53 20%|█▉ | 8141/41250 [19:39:41<79:41:11, 8.66s/it] {'loss': 0.0401, 'grad_norm': 0.9681961536407471, 'learning_rate': 3.713359707662568e-05, 'epoch': 1.97} 20%|█▉ | 8141/41250 [19:39:41<79:41:11, 8.66s/it][2025-04-26 03:37:24,817] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 
03:37:24,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.66 | bwd_microstep: 5743.35 | bwd_inner_microstep: 5642.56 | bwd_allreduce_microstep: 100.74 | step_microstep: 18.74 [2025-04-26 03:37:24,818] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.66 | bwd: 5743.36 | bwd_inner: 5642.56 | bwd_allreduce: 100.76 | step: 18.74 20%|█▉ | 8142/41250 [19:39:50<79:39:36, 8.66s/it] {'loss': 0.1238, 'grad_norm': 1.681649088859558, 'learning_rate': 3.713278697397129e-05, 'epoch': 1.97} 20%|█▉ | 8142/41250 [19:39:50<79:39:36, 8.66s/it][2025-04-26 03:37:33,435] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 1.07 | optimizer_step: 0.99 [2025-04-26 03:37:33,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.71 | bwd_microstep: 5711.75 | bwd_inner_microstep: 5647.97 | bwd_allreduce_microstep: 63.73 | step_microstep: 19.36 [2025-04-26 03:37:33,436] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.71 | bwd: 5711.76 | bwd_inner: 5647.97 | bwd_allreduce: 63.75 | step: 19.36 20%|█▉ | 8143/41250 [19:39:58<79:31:59, 8.65s/it] {'loss': 0.0121, 'grad_norm': 0.26918742060661316, 'learning_rate': 3.7131976765696626e-05, 'epoch': 1.97} 20%|█▉ | 8143/41250 [19:39:58<79:31:59, 8.65s/it][2025-04-26 03:37:42,109] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:37:42,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.62 | bwd_microstep: 5769.45 | bwd_inner_microstep: 5646.53 | bwd_allreduce_microstep: 122.87 | step_microstep: 18.75 [2025-04-26 03:37:42,110] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.62 | bwd: 5769.46 | bwd_inner: 5646.53 | bwd_allreduce: 122.89 | step: 18.75 20%|█▉ | 8144/41250 [19:40:07<79:35:54, 8.66s/it] {'loss': 0.1161, 'grad_norm': 2.2523627281188965, 'learning_rate': 3.71311664518067e-05, 
'epoch': 1.97} 20%|█▉ | 8144/41250 [19:40:07<79:35:54, 8.66s/it][2025-04-26 03:37:50,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:37:50,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.45 | bwd_microstep: 5754.90 | bwd_inner_microstep: 5647.17 | bwd_allreduce_microstep: 107.69 | step_microstep: 18.41 [2025-04-26 03:37:50,771] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.45 | bwd: 5754.92 | bwd_inner: 5647.17 | bwd_allreduce: 107.71 | step: 18.42 20%|█▉ | 8145/41250 [19:40:16<79:36:44, 8.66s/it] {'loss': 0.2411, 'grad_norm': 3.1757421493530273, 'learning_rate': 3.71303560323065e-05, 'epoch': 1.97} 20%|█▉ | 8145/41250 [19:40:16<79:36:44, 8.66s/it][2025-04-26 03:37:59,413] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:37:59,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.78 | bwd_microstep: 5732.72 | bwd_inner_microstep: 5639.26 | bwd_allreduce_microstep: 93.42 | step_microstep: 18.39 [2025-04-26 03:37:59,414] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.78 | bwd: 5732.74 | bwd_inner: 5639.26 | bwd_allreduce: 93.44 | step: 18.40 20%|█▉ | 8146/41250 [19:40:24<79:34:07, 8.65s/it] {'loss': 0.2858, 'grad_norm': 3.970125436782837, 'learning_rate': 3.712954550720102e-05, 'epoch': 1.97} 20%|█▉ | 8146/41250 [19:40:24<79:34:07, 8.65s/it][2025-04-26 03:38:08,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.96 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:38:08,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.00 | bwd_microstep: 5771.47 | bwd_inner_microstep: 5649.35 | bwd_allreduce_microstep: 122.07 | step_microstep: 18.08 [2025-04-26 03:38:08,090] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2823.00 | bwd: 5771.49 | bwd_inner: 5649.35 | bwd_allreduce: 122.09 | step: 18.08 20%|█▉ | 8147/41250 [19:40:33<79:37:46, 8.66s/it] {'loss': 0.2489, 'grad_norm': 2.28961181640625, 'learning_rate': 3.712873487649526e-05, 'epoch': 1.98} 20%|█▉ | 8147/41250 [19:40:33<79:37:46, 8.66s/it][2025-04-26 03:38:16,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:38:16,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.23 | bwd_microstep: 5774.33 | bwd_inner_microstep: 5640.49 | bwd_allreduce_microstep: 133.80 | step_microstep: 18.13 [2025-04-26 03:38:16,763] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.23 | bwd: 5774.34 | bwd_inner: 5640.49 | bwd_allreduce: 133.82 | step: 18.14 20%|█▉ | 8148/41250 [19:40:42<79:39:54, 8.66s/it] {'loss': 0.1393, 'grad_norm': 0.7707547545433044, 'learning_rate': 3.712792414019423e-05, 'epoch': 1.98} 20%|█▉ | 8148/41250 [19:40:42<79:39:54, 8.66s/it][2025-04-26 03:38:25,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.96 | optimizer_step: 0.89 [2025-04-26 03:38:25,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2838.75 | bwd_microstep: 5740.98 | bwd_inner_microstep: 5679.90 | bwd_allreduce_microstep: 61.04 | step_microstep: 18.21 [2025-04-26 03:38:25,421] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.75 | bwd: 5741.00 | bwd_inner: 5679.90 | bwd_allreduce: 61.06 | step: 18.21 20%|█▉ | 8149/41250 [19:40:50<79:39:03, 8.66s/it] {'loss': 0.0273, 'grad_norm': 0.4961722493171692, 'learning_rate': 3.712711329830291e-05, 'epoch': 1.98} 20%|█▉ | 8149/41250 [19:40:50<79:39:03, 8.66s/it][2025-04-26 03:38:34,051] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:38:34,051] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | fwd_microstep: 2845.67 | bwd_microstep: 5698.94 | bwd_inner_microstep: 5686.61 | bwd_allreduce_microstep: 12.29 | step_microstep: 18.32 [2025-04-26 03:38:34,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.67 | bwd: 5698.96 | bwd_inner: 5686.61 | bwd_allreduce: 12.31 | step: 18.33 20%|█▉ | 8150/41250 [19:40:59<79:33:14, 8.65s/it] {'loss': 0.0843, 'grad_norm': 2.7875709533691406, 'learning_rate': 3.712630235082631e-05, 'epoch': 1.98} 20%|█▉ | 8150/41250 [19:40:59<79:33:14, 8.65s/it][2025-04-26 03:38:42,783] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:38:42,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.21 | bwd_microstep: 5777.66 | bwd_inner_microstep: 5765.00 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.25 [2025-04-26 03:38:42,784] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.22 | bwd: 5777.67 | bwd_inner: 5765.00 | bwd_allreduce: 12.63 | step: 18.25 20%|█▉ | 8151/41250 [19:41:08<79:46:20, 8.68s/it] {'loss': 0.0368, 'grad_norm': 0.7105482816696167, 'learning_rate': 3.7125491297769416e-05, 'epoch': 1.98} 20%|█▉ | 8151/41250 [19:41:08<79:46:20, 8.68s/it][2025-04-26 03:38:51,455] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.04 | optimizer_step: 1.09 [2025-04-26 03:38:51,456] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.90 | bwd_microstep: 5761.41 | bwd_inner_microstep: 5653.72 | bwd_allreduce_microstep: 107.64 | step_microstep: 19.36 [2025-04-26 03:38:51,456] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.90 | bwd: 5761.43 | bwd_inner: 5653.72 | bwd_allreduce: 107.66 | step: 19.37 20%|█▉ | 8152/41250 [19:41:16<79:45:34, 8.68s/it] {'loss': 0.0638, 'grad_norm': 1.6021090745925903, 'learning_rate': 3.712468013913724e-05, 'epoch': 1.98} 20%|█▉ | 8152/41250 
[19:41:16<79:45:34, 8.68s/it][2025-04-26 03:39:00,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.00 | optimizer_step: 0.91 [2025-04-26 03:39:00,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.33 | bwd_microstep: 5746.13 | bwd_inner_microstep: 5650.09 | bwd_allreduce_microstep: 96.00 | step_microstep: 18.55 [2025-04-26 03:39:00,108] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.33 | bwd: 5746.15 | bwd_inner: 5650.09 | bwd_allreduce: 96.01 | step: 18.55 20%|█▉ | 8153/41250 [19:41:25<79:41:35, 8.67s/it] {'loss': 0.2531, 'grad_norm': 2.191370964050293, 'learning_rate': 3.7123868874934786e-05, 'epoch': 1.98} 20%|█▉ | 8153/41250 [19:41:25<79:41:35, 8.67s/it][2025-04-26 03:39:08,739] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:39:08,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.14 | bwd_microstep: 5703.23 | bwd_inner_microstep: 5690.89 | bwd_allreduce_microstep: 12.31 | step_microstep: 18.95 [2025-04-26 03:39:08,740] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.14 | bwd: 5703.25 | bwd_inner: 5690.89 | bwd_allreduce: 12.32 | step: 18.95 20%|█▉ | 8154/41250 [19:41:34<79:35:34, 8.66s/it] {'loss': 0.1004, 'grad_norm': 1.6122548580169678, 'learning_rate': 3.712305750516704e-05, 'epoch': 1.98} 20%|█▉ | 8154/41250 [19:41:34<79:35:34, 8.66s/it][2025-04-26 03:39:17,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:39:17,493] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2882.46 | bwd_microstep: 5788.52 | bwd_inner_microstep: 5775.81 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.81 [2025-04-26 03:39:17,494] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2882.46 | bwd: 5788.54 | 
bwd_inner: 5775.82 | bwd_allreduce: 12.68 | step: 18.81 20%|█▉ | 8155/41250 [19:41:42<79:51:15, 8.69s/it] {'loss': 0.1768, 'grad_norm': 1.761583685874939, 'learning_rate': 3.712224602983901e-05, 'epoch': 1.98} 20%|█▉ | 8155/41250 [19:41:42<79:51:15, 8.69s/it][2025-04-26 03:39:26,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:39:26,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.16 | bwd_microstep: 5772.26 | bwd_inner_microstep: 5653.62 | bwd_allreduce_microstep: 118.59 | step_microstep: 18.65 [2025-04-26 03:39:26,181] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.16 | bwd: 5772.27 | bwd_inner: 5653.62 | bwd_allreduce: 118.61 | step: 18.66 20%|█▉ | 8156/41250 [19:41:51<79:51:13, 8.69s/it] {'loss': 0.1802, 'grad_norm': 2.288135290145874, 'learning_rate': 3.712143444895571e-05, 'epoch': 1.98} 20%|█▉ | 8156/41250 [19:41:51<79:51:13, 8.69s/it][2025-04-26 03:39:34,828] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.35 | optimizer_gradients: 1.06 | optimizer_step: 0.98 [2025-04-26 03:39:34,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.48 | bwd_microstep: 5710.44 | bwd_inner_microstep: 5697.40 | bwd_allreduce_microstep: 12.99 | step_microstep: 19.27 [2025-04-26 03:39:34,829] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.48 | bwd: 5710.46 | bwd_inner: 5697.40 | bwd_allreduce: 13.01 | step: 19.27 20%|█▉ | 8157/41250 [19:42:00<79:44:44, 8.68s/it] {'loss': 0.1377, 'grad_norm': 1.7571979761123657, 'learning_rate': 3.712062276252213e-05, 'epoch': 1.98} 20%|█▉ | 8157/41250 [19:42:00<79:44:44, 8.68s/it][2025-04-26 03:39:43,459] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:39:43,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 
2843.24 | bwd_microstep: 5703.10 | bwd_inner_microstep: 5690.42 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.07 [2025-04-26 03:39:43,460] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.25 | bwd: 5703.12 | bwd_inner: 5690.42 | bwd_allreduce: 12.66 | step: 18.08 20%|█▉ | 8158/41250 [19:42:08<79:37:04, 8.66s/it] {'loss': 0.1, 'grad_norm': 1.8426765203475952, 'learning_rate': 3.7119810970543284e-05, 'epoch': 1.98} 20%|█▉ | 8158/41250 [19:42:08<79:37:04, 8.66s/it][2025-04-26 03:39:52,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.81 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:39:52,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.86 | bwd_microstep: 5753.12 | bwd_inner_microstep: 5696.61 | bwd_allreduce_microstep: 56.46 | step_microstep: 17.83 [2025-04-26 03:39:52,145] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.86 | bwd: 5753.13 | bwd_inner: 5696.61 | bwd_allreduce: 56.48 | step: 17.83 20%|█▉ | 8159/41250 [19:42:17<79:40:54, 8.67s/it] {'loss': 0.2222, 'grad_norm': 3.6188395023345947, 'learning_rate': 3.711899907302416e-05, 'epoch': 1.98} 20%|█▉ | 8159/41250 [19:42:17<79:40:54, 8.67s/it][2025-04-26 03:40:00,897] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.38 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:40:00,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2883.81 | bwd_microstep: 5782.92 | bwd_inner_microstep: 5770.15 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.56 [2025-04-26 03:40:00,898] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2883.81 | bwd: 5782.93 | bwd_inner: 5770.15 | bwd_allreduce: 12.74 | step: 18.56 20%|█▉ | 8160/41250 [19:42:26<79:54:43, 8.69s/it] {'loss': 0.1159, 'grad_norm': 3.3805832862854004, 'learning_rate': 3.711818706996978e-05, 'epoch': 1.98} 20%|█▉ | 8160/41250 [19:42:26<79:54:43, 8.69s/it][2025-04-26 
03:40:09,539] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:40:09,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.05 | bwd_microstep: 5714.84 | bwd_inner_microstep: 5682.39 | bwd_allreduce_microstep: 32.39 | step_microstep: 18.48 [2025-04-26 03:40:09,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.05 | bwd: 5714.85 | bwd_inner: 5682.39 | bwd_allreduce: 32.41 | step: 18.48 20%|█▉ | 8161/41250 [19:42:34<79:45:54, 8.68s/it] {'loss': 0.0501, 'grad_norm': 0.8286653161048889, 'learning_rate': 3.7117374961385147e-05, 'epoch': 1.98} 20%|█▉ | 8161/41250 [19:42:34<79:45:54, 8.68s/it][2025-04-26 03:40:18,230] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 0.95 | optimizer_step: 0.98 [2025-04-26 03:40:18,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.60 | bwd_microstep: 5758.00 | bwd_inner_microstep: 5689.29 | bwd_allreduce_microstep: 68.67 | step_microstep: 18.23 [2025-04-26 03:40:18,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.60 | bwd: 5758.02 | bwd_inner: 5689.29 | bwd_allreduce: 68.69 | step: 18.23 20%|█▉ | 8162/41250 [19:42:43<79:47:54, 8.68s/it] {'loss': 0.3032, 'grad_norm': 1.6918202638626099, 'learning_rate': 3.7116562747275254e-05, 'epoch': 1.98} 20%|█▉ | 8162/41250 [19:42:43<79:47:54, 8.68s/it][2025-04-26 03:40:26,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:40:26,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.93 | bwd_microstep: 5741.75 | bwd_inner_microstep: 5641.76 | bwd_allreduce_microstep: 99.95 | step_microstep: 18.55 [2025-04-26 03:40:26,878] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.93 | bwd: 5741.76 | bwd_inner: 5641.76 | bwd_allreduce: 
99.96 | step: 18.55 20%|█▉ | 8163/41250 [19:42:52<79:42:01, 8.67s/it] {'loss': 0.2975, 'grad_norm': 4.528618335723877, 'learning_rate': 3.711575042764513e-05, 'epoch': 1.98} 20%|█▉ | 8163/41250 [19:42:52<79:42:01, 8.67s/it][2025-04-26 03:40:35,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:40:35,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.04 | bwd_microstep: 5712.96 | bwd_inner_microstep: 5700.04 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.58 [2025-04-26 03:40:35,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.04 | bwd: 5712.98 | bwd_inner: 5700.04 | bwd_allreduce: 12.90 | step: 18.59 20%|█▉ | 8164/41250 [19:43:00<79:36:51, 8.66s/it] {'loss': 0.0733, 'grad_norm': 1.3022431135177612, 'learning_rate': 3.711493800249976e-05, 'epoch': 1.98} 20%|█▉ | 8164/41250 [19:43:00<79:36:51, 8.66s/it][2025-04-26 03:40:44,127] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:40:44,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.61 | bwd_microstep: 5694.81 | bwd_inner_microstep: 5656.73 | bwd_allreduce_microstep: 38.04 | step_microstep: 18.23 [2025-04-26 03:40:44,128] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.61 | bwd: 5694.82 | bwd_inner: 5656.72 | bwd_allreduce: 38.05 | step: 18.24 20%|█▉ | 8165/41250 [19:43:09<79:27:41, 8.65s/it] {'loss': 0.1452, 'grad_norm': 1.9144206047058105, 'learning_rate': 3.7114125471844175e-05, 'epoch': 1.98} 20%|█▉ | 8165/41250 [19:43:09<79:27:41, 8.65s/it][2025-04-26 03:40:52,791] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.29 | optimizer_step: 1.07 [2025-04-26 03:40:52,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.81 | bwd_microstep: 5724.18 | 
bwd_inner_microstep: 5710.08 | bwd_allreduce_microstep: 14.04 | step_microstep: 20.33 [2025-04-26 03:40:52,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.81 | bwd: 5724.20 | bwd_inner: 5710.08 | bwd_allreduce: 14.07 | step: 20.33 20%|█▉ | 8166/41250 [19:43:18<79:30:52, 8.65s/it] {'loss': 0.0548, 'grad_norm': 1.3270039558410645, 'learning_rate': 3.711331283568336e-05, 'epoch': 1.98} 20%|█▉ | 8166/41250 [19:43:18<79:30:52, 8.65s/it][2025-04-26 03:41:01,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.21 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:41:01,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.96 | bwd_microstep: 5758.63 | bwd_inner_microstep: 5686.70 | bwd_allreduce_microstep: 71.88 | step_microstep: 18.44 [2025-04-26 03:41:01,486] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.96 | bwd: 5758.64 | bwd_inner: 5686.70 | bwd_allreduce: 71.90 | step: 18.44 20%|█▉ | 8167/41250 [19:43:26<79:37:20, 8.66s/it] {'loss': 0.0988, 'grad_norm': 1.2102493047714233, 'learning_rate': 3.7112500094022345e-05, 'epoch': 1.98} 20%|█▉ | 8167/41250 [19:43:26<79:37:20, 8.66s/it][2025-04-26 03:41:10,122] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.02 | optimizer_step: 0.93 [2025-04-26 03:41:10,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.06 | bwd_microstep: 5723.43 | bwd_inner_microstep: 5651.52 | bwd_allreduce_microstep: 71.87 | step_microstep: 19.48 [2025-04-26 03:41:10,123] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.06 | bwd: 5723.45 | bwd_inner: 5651.52 | bwd_allreduce: 71.89 | step: 19.50 20%|█▉ | 8168/41250 [19:43:35<79:32:45, 8.66s/it] {'loss': 0.21, 'grad_norm': 3.1497058868408203, 'learning_rate': 3.711168724686613e-05, 'epoch': 1.98} 20%|█▉ | 8168/41250 [19:43:35<79:32:45, 8.66s/it][2025-04-26 03:41:18,810] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:41:18,810] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.91 | bwd_microstep: 5771.81 | bwd_inner_microstep: 5657.92 | bwd_allreduce_microstep: 113.85 | step_microstep: 18.87 [2025-04-26 03:41:18,811] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.91 | bwd: 5771.83 | bwd_inner: 5657.92 | bwd_allreduce: 113.87 | step: 18.88 20%|█▉ | 8169/41250 [19:43:44<79:37:44, 8.67s/it] {'loss': 0.0953, 'grad_norm': 1.1460002660751343, 'learning_rate': 3.711087429421973e-05, 'epoch': 1.98} 20%|█▉ | 8169/41250 [19:43:44<79:37:44, 8.67s/it][2025-04-26 03:41:27,439] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.29 | optimizer_step: 1.07 [2025-04-26 03:41:27,440] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.55 | bwd_microstep: 5701.05 | bwd_inner_microstep: 5681.96 | bwd_allreduce_microstep: 19.02 | step_microstep: 20.43 [2025-04-26 03:41:27,441] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.55 | bwd: 5701.06 | bwd_inner: 5681.96 | bwd_allreduce: 19.06 | step: 20.43 20%|█▉ | 8170/41250 [19:43:52<79:31:59, 8.66s/it] {'loss': 0.1092, 'grad_norm': 1.6053060293197632, 'learning_rate': 3.7110061236088156e-05, 'epoch': 1.98} 20%|█▉ | 8170/41250 [19:43:52<79:31:59, 8.66s/it][2025-04-26 03:41:36,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.03 | optimizer_step: 0.93 [2025-04-26 03:41:36,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.96 | bwd_microstep: 5800.72 | bwd_inner_microstep: 5658.42 | bwd_allreduce_microstep: 142.26 | step_microstep: 19.21 [2025-04-26 03:41:36,154] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.97 | bwd: 5800.74 | bwd_inner: 5658.42 | bwd_allreduce: 142.28 | step: 19.21 20%|█▉ | 8171/41250 
[19:44:01<79:41:16, 8.67s/it] {'loss': 0.0439, 'grad_norm': 0.674374520778656, 'learning_rate': 3.710924807247642e-05, 'epoch': 1.98} 20%|█▉ | 8171/41250 [19:44:01<79:41:16, 8.67s/it][2025-04-26 03:41:44,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:41:44,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.70 | bwd_microstep: 5793.76 | bwd_inner_microstep: 5781.01 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.79 [2025-04-26 03:41:44,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.70 | bwd: 5793.78 | bwd_inner: 5781.01 | bwd_allreduce: 12.73 | step: 18.79 20%|█▉ | 8172/41250 [19:44:10<79:56:43, 8.70s/it] {'loss': 0.1745, 'grad_norm': 0.8979306221008301, 'learning_rate': 3.710843480338953e-05, 'epoch': 1.98} 20%|█▉ | 8172/41250 [19:44:10<79:56:43, 8.70s/it][2025-04-26 03:41:53,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 0.97 | optimizer_step: 1.00 [2025-04-26 03:41:53,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.18 | bwd_microstep: 5748.24 | bwd_inner_microstep: 5693.93 | bwd_allreduce_microstep: 54.26 | step_microstep: 18.56 [2025-04-26 03:41:53,599] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.18 | bwd: 5748.25 | bwd_inner: 5693.93 | bwd_allreduce: 54.28 | step: 18.57 20%|█▉ | 8173/41250 [19:44:18<79:52:49, 8.69s/it] {'loss': 0.1637, 'grad_norm': 5.220314025878906, 'learning_rate': 3.7107621428832514e-05, 'epoch': 1.98} 20%|█▉ | 8173/41250 [19:44:18<79:52:49, 8.69s/it][2025-04-26 03:42:02,253] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.03 | optimizer_step: 0.94 [2025-04-26 03:42:02,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.89 | bwd_microstep: 5714.19 | bwd_inner_microstep: 5701.28 | 
bwd_allreduce_microstep: 12.86 | step_microstep: 18.90 [2025-04-26 03:42:02,254] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.89 | bwd: 5714.21 | bwd_inner: 5701.28 | bwd_allreduce: 12.88 | step: 18.90 20%|█▉ | 8174/41250 [19:44:27<79:46:10, 8.68s/it] {'loss': 0.0249, 'grad_norm': 0.44565147161483765, 'learning_rate': 3.710680794881037e-05, 'epoch': 1.98} 20%|█▉ | 8174/41250 [19:44:27<79:46:10, 8.68s/it][2025-04-26 03:42:10,882] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:42:10,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.08 | bwd_microstep: 5700.60 | bwd_inner_microstep: 5687.73 | bwd_allreduce_microstep: 12.83 | step_microstep: 18.58 [2025-04-26 03:42:10,883] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.08 | bwd: 5700.61 | bwd_inner: 5687.73 | bwd_allreduce: 12.84 | step: 18.58 20%|█▉ | 8175/41250 [19:44:36<79:37:12, 8.67s/it] {'loss': 0.1292, 'grad_norm': 2.2255046367645264, 'learning_rate': 3.710599436332812e-05, 'epoch': 1.98} 20%|█▉ | 8175/41250 [19:44:36<79:37:12, 8.67s/it][2025-04-26 03:42:19,487] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:42:19,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.50 | bwd_microstep: 5706.08 | bwd_inner_microstep: 5636.95 | bwd_allreduce_microstep: 69.08 | step_microstep: 18.50 [2025-04-26 03:42:19,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.50 | bwd: 5706.09 | bwd_inner: 5636.95 | bwd_allreduce: 69.09 | step: 18.51 20%|█▉ | 8176/41250 [19:44:44<79:27:06, 8.65s/it] {'loss': 0.1926, 'grad_norm': 1.2540379762649536, 'learning_rate': 3.710518067239078e-05, 'epoch': 1.98} 20%|█▉ | 8176/41250 [19:44:44<79:27:06, 8.65s/it][2025-04-26 03:42:28,167] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.21 | optimizer_gradients: 1.31 | optimizer_step: 1.04 [2025-04-26 03:42:28,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.86 | bwd_microstep: 5770.84 | bwd_inner_microstep: 5654.95 | bwd_allreduce_microstep: 115.83 | step_microstep: 19.95 [2025-04-26 03:42:28,168] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.86 | bwd: 5770.86 | bwd_inner: 5654.95 | bwd_allreduce: 115.86 | step: 19.95 20%|█▉ | 8177/41250 [19:44:53<79:32:29, 8.66s/it] {'loss': 0.1077, 'grad_norm': 2.837503671646118, 'learning_rate': 3.710436687600337e-05, 'epoch': 1.98} 20%|█▉ | 8177/41250 [19:44:53<79:32:29, 8.66s/it][2025-04-26 03:42:36,891] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.06 | optimizer_step: 0.94 [2025-04-26 03:42:36,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2874.20 | bwd_microstep: 5762.43 | bwd_inner_microstep: 5749.18 | bwd_allreduce_microstep: 13.20 | step_microstep: 19.45 [2025-04-26 03:42:36,892] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2874.21 | bwd: 5762.45 | bwd_inner: 5749.18 | bwd_allreduce: 13.22 | step: 19.46 20%|█▉ | 8178/41250 [19:45:02<79:42:56, 8.68s/it] {'loss': 0.149, 'grad_norm': 1.0460646152496338, 'learning_rate': 3.71035529741709e-05, 'epoch': 1.98} 20%|█▉ | 8178/41250 [19:45:02<79:42:56, 8.68s/it][2025-04-26 03:42:45,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-26 03:42:45,530] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.44 | bwd_microstep: 5704.43 | bwd_inner_microstep: 5691.50 | bwd_allreduce_microstep: 12.89 | step_microstep: 19.14 [2025-04-26 03:42:45,531] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.44 | bwd: 5704.45 | bwd_inner: 5691.50 | bwd_allreduce: 12.91 | step: 19.15 20%|█▉ | 8179/41250 [19:45:10<79:36:30, 8.67s/it] 
{'loss': 0.0808, 'grad_norm': 0.9200416803359985, 'learning_rate': 3.710273896689839e-05, 'epoch': 1.98} 20%|█▉ | 8179/41250 [19:45:10<79:36:30, 8.67s/it][2025-04-26 03:42:54,115] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.03 | optimizer_step: 0.97 [2025-04-26 03:42:54,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.04 | bwd_microstep: 5675.68 | bwd_inner_microstep: 5647.55 | bwd_allreduce_microstep: 28.08 | step_microstep: 18.83 [2025-04-26 03:42:54,116] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.04 | bwd: 5675.70 | bwd_inner: 5647.55 | bwd_allreduce: 28.10 | step: 18.83 20%|█▉ | 8180/41250 [19:45:19<79:22:55, 8.64s/it] {'loss': 0.0733, 'grad_norm': 2.050414562225342, 'learning_rate': 3.710192485419086e-05, 'epoch': 1.98} 20%|█▉ | 8180/41250 [19:45:19<79:22:55, 8.64s/it][2025-04-26 03:43:02,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.98 | optimizer_gradients: 1.03 | optimizer_step: 1.06 [2025-04-26 03:43:02,736] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.34 | bwd_microstep: 5686.71 | bwd_inner_microstep: 5673.66 | bwd_allreduce_microstep: 12.99 | step_microstep: 18.85 [2025-04-26 03:43:02,737] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.34 | bwd: 5686.72 | bwd_inner: 5673.66 | bwd_allreduce: 13.02 | step: 18.85 20%|█▉ | 8181/41250 [19:45:28<79:19:19, 8.64s/it] {'loss': 0.0469, 'grad_norm': 0.834411084651947, 'learning_rate': 3.710111063605332e-05, 'epoch': 1.98} 20%|█▉ | 8181/41250 [19:45:28<79:19:19, 8.64s/it][2025-04-26 03:43:11,682] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.29 | optimizer_step: 1.05 [2025-04-26 03:43:11,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.01 | bwd_microstep: 6009.64 | bwd_inner_microstep: 5691.31 | bwd_allreduce_microstep: 318.27 | 
step_microstep: 20.20 [2025-04-26 03:43:11,683] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.01 | bwd: 6009.66 | bwd_inner: 5691.31 | bwd_allreduce: 318.30 | step: 20.20 20%|█▉ | 8182/41250 [19:45:37<80:10:55, 8.73s/it] {'loss': 0.0668, 'grad_norm': 1.012890100479126, 'learning_rate': 3.71002963124908e-05, 'epoch': 1.98} 20%|█▉ | 8182/41250 [19:45:37<80:10:55, 8.73s/it][2025-04-26 03:43:20,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:43:20,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2837.22 | bwd_microstep: 5733.95 | bwd_inner_microstep: 5674.36 | bwd_allreduce_microstep: 59.55 | step_microstep: 18.78 [2025-04-26 03:43:20,339] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2837.22 | bwd: 5733.97 | bwd_inner: 5674.36 | bwd_allreduce: 59.57 | step: 18.78 20%|█▉ | 8183/41250 [19:45:45<79:58:25, 8.71s/it] {'loss': 0.1057, 'grad_norm': 2.2232208251953125, 'learning_rate': 3.709948188350832e-05, 'epoch': 1.98} 20%|█▉ | 8183/41250 [19:45:45<79:58:25, 8.71s/it][2025-04-26 03:43:28,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.05 | optimizer_step: 0.94 [2025-04-26 03:43:28,952] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.94 | bwd_microstep: 5710.31 | bwd_inner_microstep: 5633.63 | bwd_allreduce_microstep: 76.62 | step_microstep: 19.49 [2025-04-26 03:43:28,953] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.94 | bwd: 5710.32 | bwd_inner: 5633.63 | bwd_allreduce: 76.65 | step: 19.49 20%|█▉ | 8184/41250 [19:45:54<79:42:52, 8.68s/it] {'loss': 0.0546, 'grad_norm': 1.293359637260437, 'learning_rate': 3.709866734911089e-05, 'epoch': 1.98} 20%|█▉ | 8184/41250 [19:45:54<79:42:52, 8.68s/it][2025-04-26 03:43:37,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | 
optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 03:43:37,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.30 | bwd_microstep: 5694.65 | bwd_inner_microstep: 5681.75 | bwd_allreduce_microstep: 12.85 | step_microstep: 18.94 [2025-04-26 03:43:37,573] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.30 | bwd: 5694.66 | bwd_inner: 5681.75 | bwd_allreduce: 12.87 | step: 18.94 20%|█▉ | 8185/41250 [19:46:02<79:33:06, 8.66s/it] {'loss': 0.1659, 'grad_norm': 3.572227716445923, 'learning_rate': 3.7097852709303545e-05, 'epoch': 1.98} 20%|█▉ | 8185/41250 [19:46:02<79:33:06, 8.66s/it][2025-04-26 03:43:46,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 1.05 [2025-04-26 03:43:46,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.22 | bwd_microstep: 5741.90 | bwd_inner_microstep: 5679.32 | bwd_allreduce_microstep: 62.54 | step_microstep: 19.04 [2025-04-26 03:43:46,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.22 | bwd: 5741.92 | bwd_inner: 5679.32 | bwd_allreduce: 62.56 | step: 19.05 20%|█▉ | 8186/41250 [19:46:11<79:33:32, 8.66s/it] {'loss': 0.3289, 'grad_norm': 6.123682498931885, 'learning_rate': 3.70970379640913e-05, 'epoch': 1.98} 20%|█▉ | 8186/41250 [19:46:11<79:33:32, 8.66s/it][2025-04-26 03:43:54,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 1.02 | optimizer_step: 1.14 [2025-04-26 03:43:54,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.84 | bwd_microstep: 5721.18 | bwd_inner_microstep: 5703.06 | bwd_allreduce_microstep: 18.08 | step_microstep: 19.56 [2025-04-26 03:43:54,896] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.84 | bwd: 5721.19 | bwd_inner: 5703.06 | bwd_allreduce: 18.09 | step: 19.56 20%|█▉ | 8187/41250 [19:46:20<79:32:44, 8.66s/it] {'loss': 0.4208, 'grad_norm': 
4.19537353515625, 'learning_rate': 3.709622311347918e-05, 'epoch': 1.98} 20%|█▉ | 8187/41250 [19:46:20<79:32:44, 8.66s/it][2025-04-26 03:44:03,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.01 | optimizer_step: 1.05 [2025-04-26 03:44:03,488] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.08 | bwd_microstep: 5687.74 | bwd_inner_microstep: 5646.63 | bwd_allreduce_microstep: 41.07 | step_microstep: 18.90 [2025-04-26 03:44:03,489] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.08 | bwd: 5687.76 | bwd_inner: 5646.62 | bwd_allreduce: 41.09 | step: 18.91 20%|█▉ | 8188/41250 [19:46:28<79:21:04, 8.64s/it] {'loss': 0.0856, 'grad_norm': 2.5497496128082275, 'learning_rate': 3.7095408157472206e-05, 'epoch': 1.98} 20%|█▉ | 8188/41250 [19:46:28<79:21:04, 8.64s/it][2025-04-26 03:44:12,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:44:12,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.93 | bwd_microstep: 5791.77 | bwd_inner_microstep: 5640.74 | bwd_allreduce_microstep: 150.99 | step_microstep: 18.65 [2025-04-26 03:44:12,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.93 | bwd: 5791.78 | bwd_inner: 5640.74 | bwd_allreduce: 151.00 | step: 18.65 20%|█▉ | 8189/41250 [19:46:37<79:30:52, 8.66s/it] {'loss': 0.0307, 'grad_norm': 0.8096205592155457, 'learning_rate': 3.70945930960754e-05, 'epoch': 1.99} 20%|█▉ | 8189/41250 [19:46:37<79:30:52, 8.66s/it][2025-04-26 03:44:20,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 03:44:20,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.56 | bwd_microstep: 5747.77 | bwd_inner_microstep: 5636.68 | bwd_allreduce_microstep: 111.04 | step_microstep: 18.95 [2025-04-26 
03:44:20,841] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.56 | bwd: 5747.78 | bwd_inner: 5636.68 | bwd_allreduce: 111.06 | step: 18.96 20%|█▉ | 8190/41250 [19:46:46<79:29:53, 8.66s/it] {'loss': 0.2478, 'grad_norm': 4.698051452636719, 'learning_rate': 3.709377792929379e-05, 'epoch': 1.99} 20%|█▉ | 8190/41250 [19:46:46<79:29:53, 8.66s/it][2025-04-26 03:44:29,504] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.99 | optimizer_step: 0.91 [2025-04-26 03:44:29,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.63 | bwd_microstep: 5726.24 | bwd_inner_microstep: 5687.02 | bwd_allreduce_microstep: 39.18 | step_microstep: 18.89 [2025-04-26 03:44:29,505] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.63 | bwd: 5726.25 | bwd_inner: 5687.02 | bwd_allreduce: 39.19 | step: 18.89 20%|█▉ | 8191/41250 [19:46:54<79:30:38, 8.66s/it] {'loss': 0.149, 'grad_norm': 4.353422164916992, 'learning_rate': 3.709296265713241e-05, 'epoch': 1.99} 20%|█▉ | 8191/41250 [19:46:54<79:30:38, 8.66s/it][2025-04-26 03:44:38,182] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:44:38,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.37 | bwd_microstep: 5773.24 | bwd_inner_microstep: 5636.08 | bwd_allreduce_microstep: 137.11 | step_microstep: 18.53 [2025-04-26 03:44:38,183] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.37 | bwd: 5773.25 | bwd_inner: 5636.08 | bwd_allreduce: 137.13 | step: 18.53 20%|█▉ | 8192/41250 [19:47:03<79:33:46, 8.66s/it] {'loss': 0.026, 'grad_norm': 0.6496707201004028, 'learning_rate': 3.7092147279596264e-05, 'epoch': 1.99} 20%|█▉ | 8192/41250 [19:47:03<79:33:46, 8.66s/it][2025-04-26 03:44:46,852] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 
0.89 [2025-04-26 03:44:46,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2817.08 | bwd_microstep: 5771.63 | bwd_inner_microstep: 5634.58 | bwd_allreduce_microstep: 137.02 | step_microstep: 18.40 [2025-04-26 03:44:46,853] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2817.08 | bwd: 5771.65 | bwd_inner: 5634.58 | bwd_allreduce: 137.03 | step: 18.41 20%|█▉ | 8193/41250 [19:47:12<79:34:32, 8.67s/it] {'loss': 0.2016, 'grad_norm': 1.6747370958328247, 'learning_rate': 3.7091331796690405e-05, 'epoch': 1.99} 20%|█▉ | 8193/41250 [19:47:12<79:34:32, 8.67s/it][2025-04-26 03:44:55,501] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:44:55,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.99 | bwd_microstep: 5706.85 | bwd_inner_microstep: 5693.98 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.69 [2025-04-26 03:44:55,502] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.99 | bwd: 5706.86 | bwd_inner: 5693.98 | bwd_allreduce: 12.84 | step: 18.69 20%|█▉ | 8194/41250 [19:47:20<79:31:38, 8.66s/it] {'loss': 0.0784, 'grad_norm': 2.6625406742095947, 'learning_rate': 3.709051620841984e-05, 'epoch': 1.99} 20%|█▉ | 8194/41250 [19:47:20<79:31:38, 8.66s/it][2025-04-26 03:45:04,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:45:04,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.98 | bwd_microstep: 5757.59 | bwd_inner_microstep: 5689.14 | bwd_allreduce_microstep: 68.41 | step_microstep: 17.92 [2025-04-26 03:45:04,189] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.97 | bwd: 5757.60 | bwd_inner: 5689.14 | bwd_allreduce: 68.43 | step: 17.93 20%|█▉ | 8195/41250 [19:47:29<79:35:40, 8.67s/it] {'loss': 0.3874, 'grad_norm': 4.749426364898682, 'learning_rate': 
3.7089700514789605e-05, 'epoch': 1.99} 20%|█▉ | 8195/41250 [19:47:29<79:35:40, 8.67s/it][2025-04-26 03:45:12,802] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.02 | optimizer_step: 1.10 [2025-04-26 03:45:12,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.32 | bwd_microstep: 5691.57 | bwd_inner_microstep: 5638.86 | bwd_allreduce_microstep: 52.65 | step_microstep: 19.07 [2025-04-26 03:45:12,803] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.32 | bwd: 5691.58 | bwd_inner: 5638.86 | bwd_allreduce: 52.68 | step: 19.07 20%|█▉ | 8196/41250 [19:47:38<79:26:49, 8.65s/it] {'loss': 0.0917, 'grad_norm': 1.3672428131103516, 'learning_rate': 3.708888471580473e-05, 'epoch': 1.99} 20%|█▉ | 8196/41250 [19:47:38<79:26:49, 8.65s/it][2025-04-26 03:45:21,478] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.16 | optimizer_gradients: 1.21 | optimizer_step: 0.89 [2025-04-26 03:45:21,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.65 | bwd_microstep: 5734.70 | bwd_inner_microstep: 5685.89 | bwd_allreduce_microstep: 48.76 | step_microstep: 19.17 [2025-04-26 03:45:21,479] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.65 | bwd: 5734.72 | bwd_inner: 5685.89 | bwd_allreduce: 48.78 | step: 19.17 20%|█▉ | 8197/41250 [19:47:46<79:30:20, 8.66s/it] {'loss': 0.0608, 'grad_norm': 1.6747575998306274, 'learning_rate': 3.7088068811470236e-05, 'epoch': 1.99} 20%|█▉ | 8197/41250 [19:47:46<79:30:20, 8.66s/it][2025-04-26 03:45:30,117] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 03:45:30,118] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.18 | bwd_microstep: 5708.00 | bwd_inner_microstep: 5695.00 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.83 [2025-04-26 03:45:30,118] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.18 | bwd: 5708.02 | bwd_inner: 5695.00 | bwd_allreduce: 12.98 | step: 18.84 20%|█▉ | 8198/41250 [19:47:55<79:26:49, 8.65s/it] {'loss': 0.2075, 'grad_norm': 2.5873777866363525, 'learning_rate': 3.7087252801791164e-05, 'epoch': 1.99} 20%|█▉ | 8198/41250 [19:47:55<79:26:49, 8.65s/it][2025-04-26 03:45:38,807] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.00 | optimizer_step: 0.96 [2025-04-26 03:45:38,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.84 | bwd_microstep: 5755.12 | bwd_inner_microstep: 5711.05 | bwd_allreduce_microstep: 44.02 | step_microstep: 18.50 [2025-04-26 03:45:38,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.84 | bwd: 5755.13 | bwd_inner: 5711.05 | bwd_allreduce: 44.04 | step: 18.51 20%|█▉ | 8199/41250 [19:48:04<79:32:39, 8.66s/it] {'loss': 0.071, 'grad_norm': 1.9904769659042358, 'learning_rate': 3.7086436686772535e-05, 'epoch': 1.99} 20%|█▉ | 8199/41250 [19:48:04<79:32:39, 8.66s/it][2025-04-26 03:45:47,490] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.95 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 03:45:47,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2834.97 | bwd_microstep: 5767.04 | bwd_inner_microstep: 5645.69 | bwd_allreduce_microstep: 121.30 | step_microstep: 18.71 [2025-04-26 03:45:47,491] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2834.97 | bwd: 5767.05 | bwd_inner: 5645.69 | bwd_allreduce: 121.32 | step: 18.72 20%|█▉ | 8200/41250 [19:48:12<79:35:41, 8.67s/it] {'loss': 0.0571, 'grad_norm': 0.7852814793586731, 'learning_rate': 3.7085620466419395e-05, 'epoch': 1.99} 20%|█▉ | 8200/41250 [19:48:12<79:35:41, 8.67s/it][2025-04-26 03:45:56,138] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 1.12 | optimizer_step: 1.05 [2025-04-26 
03:45:56,139] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.42 | bwd_microstep: 5708.99 | bwd_inner_microstep: 5695.18 | bwd_allreduce_microstep: 13.75 | step_microstep: 19.34 [2025-04-26 03:45:56,140] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.42 | bwd: 5709.01 | bwd_inner: 5695.18 | bwd_allreduce: 13.78 | step: 19.34 20%|█▉ | 8201/41250 [19:48:21<79:32:07, 8.66s/it] {'loss': 0.037, 'grad_norm': 1.3378111124038696, 'learning_rate': 3.708480414073675e-05, 'epoch': 1.99} 20%|█▉ | 8201/41250 [19:48:21<79:32:07, 8.66s/it][2025-04-26 03:46:04,855] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:46:04,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.31 | bwd_microstep: 5803.42 | bwd_inner_microstep: 5653.31 | bwd_allreduce_microstep: 150.07 | step_microstep: 18.20 [2025-04-26 03:46:04,856] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.31 | bwd: 5803.43 | bwd_inner: 5653.30 | bwd_allreduce: 150.08 | step: 18.21 20%|█▉ | 8202/41250 [19:48:30<79:40:26, 8.68s/it] {'loss': 0.0718, 'grad_norm': 1.4730015993118286, 'learning_rate': 3.7083987709729656e-05, 'epoch': 1.99} 20%|█▉ | 8202/41250 [19:48:30<79:40:26, 8.68s/it][2025-04-26 03:46:13,496] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:46:13,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.20 | bwd_microstep: 5708.44 | bwd_inner_microstep: 5695.48 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.69 [2025-04-26 03:46:13,497] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.20 | bwd: 5708.46 | bwd_inner: 5695.48 | bwd_allreduce: 12.93 | step: 18.69 20%|█▉ | 8203/41250 [19:48:38<79:34:05, 8.67s/it] {'loss': 0.0954, 'grad_norm': 6.324215888977051, 'learning_rate': 3.708317117340314e-05, 
'epoch': 1.99} 20%|█▉ | 8203/41250 [19:48:38<79:34:05, 8.67s/it][2025-04-26 03:46:22,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:46:22,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.05 | bwd_microstep: 5739.04 | bwd_inner_microstep: 5709.12 | bwd_allreduce_microstep: 29.87 | step_microstep: 18.54 [2025-04-26 03:46:22,177] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.05 | bwd: 5739.05 | bwd_inner: 5709.12 | bwd_allreduce: 29.89 | step: 18.55 20%|█▉ | 8204/41250 [19:48:47<79:35:54, 8.67s/it] {'loss': 0.111, 'grad_norm': 1.5860542058944702, 'learning_rate': 3.708235453176223e-05, 'epoch': 1.99} 20%|█▉ | 8204/41250 [19:48:47<79:35:54, 8.67s/it][2025-04-26 03:46:30,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:46:30,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.67 | bwd_microstep: 5706.42 | bwd_inner_microstep: 5644.70 | bwd_allreduce_microstep: 61.68 | step_microstep: 18.29 [2025-04-26 03:46:30,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.67 | bwd: 5706.44 | bwd_inner: 5644.70 | bwd_allreduce: 61.70 | step: 18.29 20%|█▉ | 8205/41250 [19:48:56<79:27:14, 8.66s/it] {'loss': 0.0803, 'grad_norm': 0.9463546276092529, 'learning_rate': 3.708153778481196e-05, 'epoch': 1.99} 20%|█▉ | 8205/41250 [19:48:56<79:27:14, 8.66s/it][2025-04-26 03:46:39,432] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:46:39,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.53 | bwd_microstep: 5718.96 | bwd_inner_microstep: 5656.70 | bwd_allreduce_microstep: 62.22 | step_microstep: 17.99 [2025-04-26 03:46:39,433] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd: 2833.53 | bwd: 5718.98 | bwd_inner: 5656.70 | bwd_allreduce: 62.23 | step: 18.00 20%|█▉ | 8206/41250 [19:49:04<79:23:47, 8.65s/it] {'loss': 0.4285, 'grad_norm': 3.3844218254089355, 'learning_rate': 3.708072093255738e-05, 'epoch': 1.99} 20%|█▉ | 8206/41250 [19:49:04<79:23:47, 8.65s/it][2025-04-26 03:46:48,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:46:48,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.90 | bwd_microstep: 5724.66 | bwd_inner_microstep: 5645.98 | bwd_allreduce_microstep: 78.64 | step_microstep: 18.47 [2025-04-26 03:46:48,067] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.90 | bwd: 5724.68 | bwd_inner: 5645.98 | bwd_allreduce: 78.66 | step: 18.47 20%|█▉ | 8207/41250 [19:49:13<79:21:05, 8.65s/it] {'loss': 0.1854, 'grad_norm': 6.885409832000732, 'learning_rate': 3.707990397500351e-05, 'epoch': 1.99} 20%|█▉ | 8207/41250 [19:49:13<79:21:05, 8.65s/it][2025-04-26 03:46:56,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.90 | optimizer_gradients: 0.95 | optimizer_step: 0.90 [2025-04-26 03:46:56,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2861.37 | bwd_microstep: 5771.71 | bwd_inner_microstep: 5718.16 | bwd_allreduce_microstep: 53.51 | step_microstep: 17.64 [2025-04-26 03:46:56,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2861.37 | bwd: 5771.72 | bwd_inner: 5718.16 | bwd_allreduce: 53.52 | step: 17.64 20%|█▉ | 8208/41250 [19:49:22<79:33:16, 8.67s/it] {'loss': 0.1815, 'grad_norm': 3.4806206226348877, 'learning_rate': 3.707908691215539e-05, 'epoch': 1.99} 20%|█▉ | 8208/41250 [19:49:22<79:33:16, 8.67s/it][2025-04-26 03:47:05,423] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.44 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:47:05,424] [INFO] [logging.py:128:log_dist] [Rank 
0] time (ms) | fwd_microstep: 2838.93 | bwd_microstep: 5713.55 | bwd_inner_microstep: 5668.76 | bwd_allreduce_microstep: 44.75 | step_microstep: 18.43 [2025-04-26 03:47:05,424] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2838.93 | bwd: 5713.57 | bwd_inner: 5668.76 | bwd_allreduce: 44.76 | step: 18.43 20%|█▉ | 8209/41250 [19:49:30<79:28:35, 8.66s/it] {'loss': 0.0655, 'grad_norm': 0.9509207010269165, 'learning_rate': 3.707826974401806e-05, 'epoch': 1.99} 20%|█▉ | 8209/41250 [19:49:30<79:28:35, 8.66s/it][2025-04-26 03:47:14,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 03:47:14,095] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.11 | bwd_microstep: 5731.07 | bwd_inner_microstep: 5718.30 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.06 [2025-04-26 03:47:14,096] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.11 | bwd: 5731.08 | bwd_inner: 5718.30 | bwd_allreduce: 12.74 | step: 18.06 20%|█▉ | 8210/41250 [19:49:39<79:30:15, 8.66s/it] {'loss': 0.02, 'grad_norm': 0.8001929521560669, 'learning_rate': 3.707745247059656e-05, 'epoch': 1.99} 20%|█▉ | 8210/41250 [19:49:39<79:30:15, 8.66s/it][2025-04-26 03:47:22,796] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.25 | optimizer_step: 0.98 [2025-04-26 03:47:22,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.87 | bwd_microstep: 5792.79 | bwd_inner_microstep: 5635.17 | bwd_allreduce_microstep: 157.57 | step_microstep: 19.59 [2025-04-26 03:47:22,797] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.87 | bwd: 5792.80 | bwd_inner: 5635.17 | bwd_allreduce: 157.59 | step: 19.60 20%|█▉ | 8211/41250 [19:49:48<79:36:47, 8.67s/it] {'loss': 0.0754, 'grad_norm': 3.1600027084350586, 'learning_rate': 3.7076635091895925e-05, 'epoch': 1.99} 20%|█▉ | 8211/41250 [19:49:48<79:36:47, 
8.67s/it][2025-04-26 03:47:31,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.93 [2025-04-26 03:47:31,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2860.76 | bwd_microstep: 5720.37 | bwd_inner_microstep: 5688.14 | bwd_allreduce_microstep: 32.19 | step_microstep: 19.21 [2025-04-26 03:47:31,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2860.76 | bwd: 5720.39 | bwd_inner: 5688.14 | bwd_allreduce: 32.21 | step: 19.21 20%|█▉ | 8212/41250 [19:49:56<79:36:30, 8.67s/it] {'loss': 0.1226, 'grad_norm': 6.197729110717773, 'learning_rate': 3.7075817607921186e-05, 'epoch': 1.99} 20%|█▉ | 8212/41250 [19:49:56<79:36:30, 8.67s/it][2025-04-26 03:47:40,156] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:47:40,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.08 | bwd_microstep: 5770.01 | bwd_inner_microstep: 5649.98 | bwd_allreduce_microstep: 119.98 | step_microstep: 18.62 [2025-04-26 03:47:40,157] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.08 | bwd: 5770.02 | bwd_inner: 5649.98 | bwd_allreduce: 119.99 | step: 18.62 20%|█▉ | 8213/41250 [19:50:05<79:37:49, 8.68s/it] {'loss': 0.2174, 'grad_norm': 4.82801628112793, 'learning_rate': 3.7075000018677395e-05, 'epoch': 1.99} 20%|█▉ | 8213/41250 [19:50:05<79:37:49, 8.68s/it][2025-04-26 03:47:48,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.54 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:47:48,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.89 | bwd_microstep: 5715.13 | bwd_inner_microstep: 5702.45 | bwd_allreduce_microstep: 12.64 | step_microstep: 18.85 [2025-04-26 03:47:48,815] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.89 | bwd: 5715.15 | bwd_inner: 5702.45 | 
bwd_allreduce: 12.66 | step: 18.86 20%|█▉ | 8214/41250 [19:50:14<79:34:32, 8.67s/it] {'loss': 0.4054, 'grad_norm': 3.589655637741089, 'learning_rate': 3.707418232416959e-05, 'epoch': 1.99} 20%|█▉ | 8214/41250 [19:50:14<79:34:32, 8.67s/it][2025-04-26 03:47:57,506] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 03:47:57,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.31 | bwd_microstep: 5775.89 | bwd_inner_microstep: 5640.05 | bwd_allreduce_microstep: 135.78 | step_microstep: 18.99 [2025-04-26 03:47:57,507] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.31 | bwd: 5775.90 | bwd_inner: 5640.05 | bwd_allreduce: 135.80 | step: 18.99 20%|█▉ | 8215/41250 [19:50:22<79:37:42, 8.68s/it] {'loss': 0.0618, 'grad_norm': 1.4045472145080566, 'learning_rate': 3.70733645244028e-05, 'epoch': 1.99} 20%|█▉ | 8215/41250 [19:50:22<79:37:42, 8.68s/it][2025-04-26 03:48:06,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 03:48:06,396] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2929.74 | bwd_microstep: 5877.26 | bwd_inner_microstep: 5864.39 | bwd_allreduce_microstep: 12.83 | step_microstep: 19.08 [2025-04-26 03:48:06,397] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2929.74 | bwd: 5877.28 | bwd_inner: 5864.39 | bwd_allreduce: 12.85 | step: 19.08 20%|█▉ | 8216/41250 [19:50:31<80:12:52, 8.74s/it] {'loss': 0.1578, 'grad_norm': 4.764684200286865, 'learning_rate': 3.707254661938208e-05, 'epoch': 1.99} 20%|█▉ | 8216/41250 [19:50:31<80:12:52, 8.74s/it][2025-04-26 03:48:15,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:48:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.92 | 
bwd_microstep: 5758.97 | bwd_inner_microstep: 5697.59 | bwd_allreduce_microstep: 61.33 | step_microstep: 18.72 [2025-04-26 03:48:15,090] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.92 | bwd: 5758.98 | bwd_inner: 5697.59 | bwd_allreduce: 61.35 | step: 18.72 20%|█▉ | 8217/41250 [19:50:40<80:04:36, 8.73s/it] {'loss': 0.1536, 'grad_norm': 2.607351779937744, 'learning_rate': 3.707172860911247e-05, 'epoch': 1.99} 20%|█▉ | 8217/41250 [19:50:40<80:04:36, 8.73s/it][2025-04-26 03:48:23,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:48:23,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2850.32 | bwd_microstep: 5761.27 | bwd_inner_microstep: 5695.67 | bwd_allreduce_microstep: 65.56 | step_microstep: 18.86 [2025-04-26 03:48:23,787] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2850.32 | bwd: 5761.28 | bwd_inner: 5695.67 | bwd_allreduce: 65.57 | step: 18.86 20%|█▉ | 8218/41250 [19:50:49<79:59:33, 8.72s/it] {'loss': 0.1946, 'grad_norm': 4.11305046081543, 'learning_rate': 3.707091049359901e-05, 'epoch': 1.99} 20%|█▉ | 8218/41250 [19:50:49<79:59:33, 8.72s/it][2025-04-26 03:48:32,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.12 | optimizer_step: 0.98 [2025-04-26 03:48:32,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.44 | bwd_microstep: 5747.53 | bwd_inner_microstep: 5648.32 | bwd_allreduce_microstep: 99.17 | step_microstep: 19.11 [2025-04-26 03:48:32,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.44 | bwd: 5747.55 | bwd_inner: 5648.32 | bwd_allreduce: 99.19 | step: 19.11 20%|█▉ | 8219/41250 [19:50:57<79:49:44, 8.70s/it] {'loss': 0.0709, 'grad_norm': 1.3769067525863647, 'learning_rate': 3.707009227284674e-05, 'epoch': 1.99} 20%|█▉ | 8219/41250 [19:50:57<79:49:44, 8.70s/it][2025-04-26 03:48:41,208] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:48:41,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2887.64 | bwd_microstep: 5789.01 | bwd_inner_microstep: 5776.23 | bwd_allreduce_microstep: 12.73 | step_microstep: 18.72 [2025-04-26 03:48:41,209] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2887.64 | bwd: 5789.02 | bwd_inner: 5776.23 | bwd_allreduce: 12.75 | step: 18.72 20%|█▉ | 8220/41250 [19:51:06<79:59:34, 8.72s/it] {'loss': 0.21, 'grad_norm': 4.793348789215088, 'learning_rate': 3.7069273946860714e-05, 'epoch': 1.99} 20%|█▉ | 8220/41250 [19:51:06<79:59:34, 8.72s/it][2025-04-26 03:48:49,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:48:49,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2820.20 | bwd_microstep: 5759.91 | bwd_inner_microstep: 5634.82 | bwd_allreduce_microstep: 125.04 | step_microstep: 18.35 [2025-04-26 03:48:49,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2820.20 | bwd: 5759.92 | bwd_inner: 5634.82 | bwd_allreduce: 125.06 | step: 18.35 20%|█▉ | 8221/41250 [19:51:15<79:49:37, 8.70s/it] {'loss': 0.3705, 'grad_norm': 3.300631523132324, 'learning_rate': 3.706845551564597e-05, 'epoch': 1.99} 20%|█▉ | 8221/41250 [19:51:15<79:49:37, 8.70s/it][2025-04-26 03:48:58,484] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.98 | optimizer_step: 0.98 [2025-04-26 03:48:58,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.67 | bwd_microstep: 5691.92 | bwd_inner_microstep: 5679.19 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.44 [2025-04-26 03:48:58,485] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.67 | bwd: 5691.93 | bwd_inner: 5679.19 | bwd_allreduce: 12.70 | step: 18.44 20%|█▉ | 
8222/41250 [19:51:23<79:35:50, 8.68s/it] {'loss': 0.0608, 'grad_norm': 0.9716383218765259, 'learning_rate': 3.7067636979207555e-05, 'epoch': 1.99} 20%|█▉ | 8222/41250 [19:51:23<79:35:50, 8.68s/it][2025-04-26 03:49:07,087] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.17 | optimizer_step: 1.06 [2025-04-26 03:49:07,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.79 | bwd_microstep: 5691.74 | bwd_inner_microstep: 5651.47 | bwd_allreduce_microstep: 40.20 | step_microstep: 20.09 [2025-04-26 03:49:07,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.80 | bwd: 5691.76 | bwd_inner: 5651.47 | bwd_allreduce: 40.23 | step: 20.09 20%|█▉ | 8223/41250 [19:51:32<79:23:46, 8.65s/it] {'loss': 0.2247, 'grad_norm': 2.7349510192871094, 'learning_rate': 3.7066818337550515e-05, 'epoch': 1.99} 20%|█▉ | 8223/41250 [19:51:32<79:23:46, 8.65s/it][2025-04-26 03:49:15,761] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.00 | optimizer_step: 0.92 [2025-04-26 03:49:15,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.12 | bwd_microstep: 5746.99 | bwd_inner_microstep: 5672.65 | bwd_allreduce_microstep: 74.30 | step_microstep: 19.19 [2025-04-26 03:49:15,762] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.12 | bwd: 5747.00 | bwd_inner: 5672.65 | bwd_allreduce: 74.31 | step: 19.19 20%|█▉ | 8224/41250 [19:51:41<79:26:40, 8.66s/it] {'loss': 0.2446, 'grad_norm': 5.319218635559082, 'learning_rate': 3.7065999590679894e-05, 'epoch': 1.99} 20%|█▉ | 8224/41250 [19:51:41<79:26:40, 8.66s/it][2025-04-26 03:49:24,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 0.99 [2025-04-26 03:49:24,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.10 | bwd_microstep: 5701.79 | bwd_inner_microstep: 5688.67 
| bwd_allreduce_microstep: 13.07 | step_microstep: 19.12 [2025-04-26 03:49:24,394] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.10 | bwd: 5701.80 | bwd_inner: 5688.67 | bwd_allreduce: 13.09 | step: 19.12 20%|█▉ | 8225/41250 [19:51:49<79:21:55, 8.65s/it] {'loss': 0.1084, 'grad_norm': 2.2792985439300537, 'learning_rate': 3.706518073860075e-05, 'epoch': 1.99} 20%|█▉ | 8225/41250 [19:51:49<79:21:55, 8.65s/it][2025-04-26 03:49:33,052] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.08 | optimizer_step: 0.90 [2025-04-26 03:49:33,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.99 | bwd_microstep: 5731.17 | bwd_inner_microstep: 5679.73 | bwd_allreduce_microstep: 51.40 | step_microstep: 18.98 [2025-04-26 03:49:33,053] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.99 | bwd: 5731.18 | bwd_inner: 5679.73 | bwd_allreduce: 51.41 | step: 18.98 20%|█▉ | 8226/41250 [19:51:58<79:22:59, 8.65s/it] {'loss': 0.0699, 'grad_norm': 1.3460910320281982, 'learning_rate': 3.7064361781318116e-05, 'epoch': 1.99} 20%|█▉ | 8226/41250 [19:51:58<79:22:59, 8.65s/it][2025-04-26 03:49:41,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 03:49:41,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.83 | bwd_microstep: 5746.96 | bwd_inner_microstep: 5692.93 | bwd_allreduce_microstep: 53.98 | step_microstep: 18.79 [2025-04-26 03:49:41,728] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.83 | bwd: 5746.97 | bwd_inner: 5692.93 | bwd_allreduce: 54.00 | step: 18.79 20%|█▉ | 8227/41250 [19:52:07<79:26:28, 8.66s/it] {'loss': 0.0484, 'grad_norm': 1.3344134092330933, 'learning_rate': 3.7063542718837055e-05, 'epoch': 1.99} 20%|█▉ | 8227/41250 [19:52:07<79:26:28, 8.66s/it][2025-04-26 03:49:50,375] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.06 | optimizer_gradients: 1.06 | optimizer_step: 1.02 [2025-04-26 03:49:50,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.85 | bwd_microstep: 5741.28 | bwd_inner_microstep: 5647.44 | bwd_allreduce_microstep: 93.78 | step_microstep: 19.43 [2025-04-26 03:49:50,376] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.85 | bwd: 5741.29 | bwd_inner: 5647.44 | bwd_allreduce: 93.80 | step: 19.43 20%|█▉ | 8228/41250 [19:52:15<79:24:26, 8.66s/it] {'loss': 0.0352, 'grad_norm': 0.5573418736457825, 'learning_rate': 3.706272355116261e-05, 'epoch': 1.99} 20%|█▉ | 8228/41250 [19:52:15<79:24:26, 8.66s/it][2025-04-26 03:49:58,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:49:58,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.67 | bwd_microstep: 5688.50 | bwd_inner_microstep: 5647.59 | bwd_allreduce_microstep: 40.86 | step_microstep: 18.68 [2025-04-26 03:49:58,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.67 | bwd: 5688.51 | bwd_inner: 5647.59 | bwd_allreduce: 40.88 | step: 18.68 20%|█▉ | 8229/41250 [19:52:24<79:13:57, 8.64s/it] {'loss': 0.3757, 'grad_norm': 3.6784462928771973, 'learning_rate': 3.706190427829982e-05, 'epoch': 1.99} 20%|█▉ | 8229/41250 [19:52:24<79:13:57, 8.64s/it][2025-04-26 03:50:07,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-26 03:50:07,619] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.60 | bwd_microstep: 5742.27 | bwd_inner_microstep: 5652.73 | bwd_allreduce_microstep: 89.49 | step_microstep: 19.00 [2025-04-26 03:50:07,620] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.60 | bwd: 5742.29 | bwd_inner: 5652.73 | bwd_allreduce: 89.51 | step: 19.00 20%|█▉ | 8230/41250 [19:52:32<79:15:27, 8.64s/it] 
{'loss': 0.1197, 'grad_norm': 1.3351025581359863, 'learning_rate': 3.706108490025375e-05, 'epoch': 2.0} 20%|█▉ | 8230/41250 [19:52:32<79:15:27, 8.64s/it][2025-04-26 03:50:16,260] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:50:16,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.49 | bwd_microstep: 5710.79 | bwd_inner_microstep: 5693.96 | bwd_allreduce_microstep: 16.78 | step_microstep: 18.74 [2025-04-26 03:50:16,261] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.49 | bwd: 5710.80 | bwd_inner: 5693.96 | bwd_allreduce: 16.80 | step: 18.75 20%|█▉ | 8231/41250 [19:52:41<79:15:25, 8.64s/it] {'loss': 0.1212, 'grad_norm': 2.002558946609497, 'learning_rate': 3.706026541702945e-05, 'epoch': 2.0} 20%|█▉ | 8231/41250 [19:52:41<79:15:25, 8.64s/it][2025-04-26 03:50:24,847] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 1.12 | optimizer_step: 0.96 [2025-04-26 03:50:24,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.89 | bwd_microstep: 5681.60 | bwd_inner_microstep: 5652.24 | bwd_allreduce_microstep: 29.31 | step_microstep: 18.72 [2025-04-26 03:50:24,848] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.89 | bwd: 5681.62 | bwd_inner: 5652.24 | bwd_allreduce: 29.33 | step: 18.72 20%|█▉ | 8232/41250 [19:52:50<79:06:43, 8.63s/it] {'loss': 0.2134, 'grad_norm': 2.6666271686553955, 'learning_rate': 3.7059445828631966e-05, 'epoch': 2.0} 20%|█▉ | 8232/41250 [19:52:50<79:06:43, 8.63s/it][2025-04-26 03:50:33,520] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.95 [2025-04-26 03:50:33,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.14 | bwd_microstep: 5746.21 | bwd_inner_microstep: 5681.77 | bwd_allreduce_microstep: 64.39 | 
step_microstep: 19.17 [2025-04-26 03:50:33,521] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.14 | bwd: 5746.23 | bwd_inner: 5681.77 | bwd_allreduce: 64.41 | step: 19.17 20%|█▉ | 8233/41250 [19:52:58<79:13:54, 8.64s/it] {'loss': 0.2329, 'grad_norm': 3.985957145690918, 'learning_rate': 3.705862613506636e-05, 'epoch': 2.0} 20%|█▉ | 8233/41250 [19:52:58<79:13:54, 8.64s/it][2025-04-26 03:50:42,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.98 | optimizer_step: 1.05 [2025-04-26 03:50:42,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.38 | bwd_microstep: 5747.12 | bwd_inner_microstep: 5648.54 | bwd_allreduce_microstep: 98.53 | step_microstep: 18.43 [2025-04-26 03:50:42,176] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.38 | bwd: 5747.13 | bwd_inner: 5648.54 | bwd_allreduce: 98.55 | step: 18.43 20%|█▉ | 8234/41250 [19:53:07<79:16:16, 8.64s/it] {'loss': 0.1034, 'grad_norm': 4.036986827850342, 'learning_rate': 3.705780633633767e-05, 'epoch': 2.0} 20%|█▉ | 8234/41250 [19:53:07<79:16:16, 8.64s/it][2025-04-26 03:50:50,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.25 | optimizer_step: 0.93 [2025-04-26 03:50:50,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.44 | bwd_microstep: 5741.15 | bwd_inner_microstep: 5652.82 | bwd_allreduce_microstep: 88.28 | step_microstep: 19.55 [2025-04-26 03:50:50,826] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.44 | bwd: 5741.16 | bwd_inner: 5652.82 | bwd_allreduce: 88.30 | step: 19.55 20%|█▉ | 8235/41250 [19:53:16<79:17:15, 8.65s/it] {'loss': 0.0472, 'grad_norm': 1.2205950021743774, 'learning_rate': 3.7056986432450966e-05, 'epoch': 2.0} 20%|█▉ | 8235/41250 [19:53:16<79:17:15, 8.65s/it][2025-04-26 03:50:59,472] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | 
optimizer_gradients: 1.04 | optimizer_step: 0.89 [2025-04-26 03:50:59,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.19 | bwd_microstep: 5716.42 | bwd_inner_microstep: 5687.01 | bwd_allreduce_microstep: 29.37 | step_microstep: 19.08 [2025-04-26 03:50:59,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.19 | bwd: 5716.44 | bwd_inner: 5687.01 | bwd_allreduce: 29.39 | step: 19.08 20%|█▉ | 8236/41250 [19:53:24<79:17:20, 8.65s/it] {'loss': 0.2486, 'grad_norm': 3.3851234912872314, 'learning_rate': 3.705616642341129e-05, 'epoch': 2.0} 20%|█▉ | 8236/41250 [19:53:24<79:17:20, 8.65s/it][2025-04-26 03:51:08,076] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.11 | optimizer_step: 1.01 [2025-04-26 03:51:08,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.62 | bwd_microstep: 5694.02 | bwd_inner_microstep: 5645.29 | bwd_allreduce_microstep: 48.67 | step_microstep: 19.62 [2025-04-26 03:51:08,077] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.63 | bwd: 5694.03 | bwd_inner: 5645.29 | bwd_allreduce: 48.70 | step: 19.62 20%|█▉ | 8237/41250 [19:53:33<79:10:44, 8.63s/it] {'loss': 0.092, 'grad_norm': 1.7295074462890625, 'learning_rate': 3.7055346309223706e-05, 'epoch': 2.0} 20%|█▉ | 8237/41250 [19:53:33<79:10:44, 8.63s/it][2025-04-26 03:51:16,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:51:16,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.85 | bwd_microstep: 5789.91 | bwd_inner_microstep: 5651.41 | bwd_allreduce_microstep: 138.46 | step_microstep: 18.87 [2025-04-26 03:51:16,775] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.85 | bwd: 5789.92 | bwd_inner: 5651.41 | bwd_allreduce: 138.48 | step: 18.87 20%|█▉ | 8238/41250 [19:53:42<79:20:43, 8.65s/it] {'loss': 0.0859, 'grad_norm': 
2.91619873046875, 'learning_rate': 3.705452608989327e-05, 'epoch': 2.0} 20%|█▉ | 8238/41250 [19:53:42<79:20:43, 8.65s/it][2025-04-26 03:51:25,389] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:51:25,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.00 | bwd_microstep: 5704.83 | bwd_inner_microstep: 5646.12 | bwd_allreduce_microstep: 58.67 | step_microstep: 18.63 [2025-04-26 03:51:25,390] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.00 | bwd: 5704.85 | bwd_inner: 5646.12 | bwd_allreduce: 58.68 | step: 18.64 20%|█▉ | 8239/41250 [19:53:50<79:14:17, 8.64s/it] {'loss': 0.2063, 'grad_norm': 2.5826733112335205, 'learning_rate': 3.705370576542503e-05, 'epoch': 2.0} 20%|█▉ | 8239/41250 [19:53:50<79:14:17, 8.64s/it][2025-04-26 03:51:34,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:51:34,070] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.09 | bwd_microstep: 5774.85 | bwd_inner_microstep: 5648.58 | bwd_allreduce_microstep: 126.23 | step_microstep: 18.82 [2025-04-26 03:51:34,071] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.09 | bwd: 5774.87 | bwd_inner: 5648.58 | bwd_allreduce: 126.24 | step: 18.82 20%|█▉ | 8240/41250 [19:53:59<79:20:34, 8.65s/it] {'loss': 0.1668, 'grad_norm': 4.265397071838379, 'learning_rate': 3.705288533582405e-05, 'epoch': 2.0} 20%|█▉ | 8240/41250 [19:53:59<79:20:34, 8.65s/it][2025-04-26 03:51:42,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.51 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:51:42,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.68 | bwd_microstep: 5694.17 | bwd_inner_microstep: 5656.68 | bwd_allreduce_microstep: 37.45 | step_microstep: 18.81 [2025-04-26 
03:51:42,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.68 | bwd: 5694.18 | bwd_inner: 5656.68 | bwd_allreduce: 37.46 | step: 18.81 20%|█▉ | 8241/41250 [19:54:07<79:12:26, 8.64s/it] {'loss': 0.0564, 'grad_norm': 0.8169367909431458, 'learning_rate': 3.705206480109539e-05, 'epoch': 2.0} 20%|█▉ | 8241/41250 [19:54:08<79:12:26, 8.64s/it][2025-04-26 03:51:51,371] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.05 | optimizer_step: 1.09 [2025-04-26 03:51:51,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.79 | bwd_microstep: 5771.99 | bwd_inner_microstep: 5687.54 | bwd_allreduce_microstep: 84.40 | step_microstep: 19.25 [2025-04-26 03:51:51,372] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.79 | bwd: 5772.01 | bwd_inner: 5687.54 | bwd_allreduce: 84.42 | step: 19.26 20%|█▉ | 8242/41250 [19:54:16<79:22:03, 8.66s/it] {'loss': 0.1822, 'grad_norm': 3.650533676147461, 'learning_rate': 3.7051244161244094e-05, 'epoch': 2.0} 20%|█▉ | 8242/41250 [19:54:16<79:22:03, 8.66s/it][2025-04-26 03:52:00,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.12 | optimizer_step: 0.91 [2025-04-26 03:52:00,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.37 | bwd_microstep: 5786.18 | bwd_inner_microstep: 5684.27 | bwd_allreduce_microstep: 101.87 | step_microstep: 18.90 [2025-04-26 03:52:00,084] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.37 | bwd: 5786.19 | bwd_inner: 5684.27 | bwd_allreduce: 101.88 | step: 18.90 20%|█▉ | 8243/41250 [19:54:25<79:31:12, 8.67s/it] {'loss': 0.1674, 'grad_norm': 1.8976936340332031, 'learning_rate': 3.7050423416275246e-05, 'epoch': 2.0} 20%|█▉ | 8243/41250 [19:54:25<79:31:12, 8.67s/it][2025-04-26 03:52:08,746] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.98 | optimizer_step: 
1.00 [2025-04-26 03:52:08,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2826.18 | bwd_microstep: 5754.59 | bwd_inner_microstep: 5656.56 | bwd_allreduce_microstep: 97.98 | step_microstep: 18.72 [2025-04-26 03:52:08,747] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2826.18 | bwd: 5754.60 | bwd_inner: 5656.56 | bwd_allreduce: 98.00 | step: 18.72 20%|█▉ | 8244/41250 [19:54:34<79:29:39, 8.67s/it] {'loss': 0.0861, 'grad_norm': 1.862705945968628, 'learning_rate': 3.704960256619388e-05, 'epoch': 2.0} 20%|█▉ | 8244/41250 [19:54:34<79:29:39, 8.67s/it][2025-04-26 03:52:17,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 03:52:17,410] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.91 | bwd_microstep: 5738.02 | bwd_inner_microstep: 5680.23 | bwd_allreduce_microstep: 57.75 | step_microstep: 19.25 [2025-04-26 03:52:17,411] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.91 | bwd: 5738.03 | bwd_inner: 5680.23 | bwd_allreduce: 57.77 | step: 19.25 20%|█▉ | 8245/41250 [19:54:42<79:28:06, 8.67s/it] {'loss': 0.0315, 'grad_norm': 0.5722662210464478, 'learning_rate': 3.704878161100507e-05, 'epoch': 2.0} 20%|█▉ | 8245/41250 [19:54:42<79:28:06, 8.67s/it][2025-04-26 03:52:26,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 03:52:26,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2885.12 | bwd_microstep: 5790.64 | bwd_inner_microstep: 5777.96 | bwd_allreduce_microstep: 12.63 | step_microstep: 18.89 [2025-04-26 03:52:26,170] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2885.13 | bwd: 5790.66 | bwd_inner: 5777.96 | bwd_allreduce: 12.65 | step: 18.89 20%|█▉ | 8246/41250 [19:54:51<79:42:54, 8.70s/it] {'loss': 0.1537, 'grad_norm': 1.7087929248809814, 'learning_rate': 
3.704796055071388e-05, 'epoch': 2.0} 20%|█▉ | 8246/41250 [19:54:51<79:42:54, 8.70s/it][2025-04-26 03:52:34,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:52:34,861] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.63 | bwd_microstep: 5760.97 | bwd_inner_microstep: 5699.13 | bwd_allreduce_microstep: 61.80 | step_microstep: 18.77 [2025-04-26 03:52:34,862] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.63 | bwd: 5760.99 | bwd_inner: 5699.13 | bwd_allreduce: 61.81 | step: 18.78 20%|█▉ | 8247/41250 [19:55:00<79:42:14, 8.69s/it] {'loss': 0.0604, 'grad_norm': 4.437797546386719, 'learning_rate': 3.704713938532536e-05, 'epoch': 2.0} 20%|█▉ | 8247/41250 [19:55:00<79:42:14, 8.69s/it][2025-04-26 03:52:43,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.00 | optimizer_step: 1.01 [2025-04-26 03:52:43,673] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.60 | bwd_microstep: 5887.28 | bwd_inner_microstep: 5692.20 | bwd_allreduce_microstep: 195.03 | step_microstep: 19.51 [2025-04-26 03:52:43,674] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.60 | bwd: 5887.30 | bwd_inner: 5692.20 | bwd_allreduce: 195.05 | step: 19.51 20%|█▉ | 8248/41250 [19:55:09<80:02:01, 8.73s/it] {'loss': 0.0333, 'grad_norm': 0.8060459494590759, 'learning_rate': 3.7046318114844577e-05, 'epoch': 2.0} 20%|█▉ | 8248/41250 [19:55:09<80:02:01, 8.73s/it][2025-04-26 03:52:52,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:52:52,305] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2827.68 | bwd_microstep: 5718.45 | bwd_inner_microstep: 5665.23 | bwd_allreduce_microstep: 53.18 | step_microstep: 18.61 [2025-04-26 03:52:52,305] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.68 | bwd: 5718.46 | bwd_inner: 5665.23 | bwd_allreduce: 53.19 | step: 18.61 20%|█▉ | 8249/41250 [19:55:17<79:45:07, 8.70s/it] {'loss': 0.1501, 'grad_norm': 3.424215078353882, 'learning_rate': 3.7045496739276606e-05, 'epoch': 2.0} 20%|█▉ | 8249/41250 [19:55:17<79:45:07, 8.70s/it]Failed to load video: /home/wangjiarui/AIGV6K/Allvideos/Animate/00778.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k Failed to load video: /home/wangjiarui/AIGV6K/Allvideos/ZeroScope/02039.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. Failed to load video: /home/wangjiarui/AIGV6K/Allvideos/Pyramid/01174.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. Failed to load video: /home/wangjiarui/AIGV6K/Allvideos/Animate/00776.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. Failed to load video: /home/wangjiarui/AIGV6K/Allvideos/Animate/00777.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. 
Using PIL to load images. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-26 03:53:03,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.02 [2025-04-26 03:53:03,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2875.54 | bwd_microstep: 5766.58 | bwd_inner_microstep: 5753.67 | bwd_allreduce_microstep: 12.86 | step_microstep: 19.32 [2025-04-26 03:53:03,204] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2875.54 | bwd: 5766.59 | bwd_inner: 5753.67 | bwd_allreduce: 12.88 | step: 19.32 20%|██ | 8250/41250 [19:55:28<85:47:47, 9.36s/it]evaluate!evaluate! {'loss': 0.1095, 'grad_norm': 1.589577078819275, 'learning_rate': 3.704467525862649e-05, 'epoch': 2.0} 20%|██ | 8250/41250 [19:55:28<85:47:47, 9.36s/it]evaluate! 1 1 1 [WARNING|trainer.py:803] 2025-04-26 03:53:05,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 03:53:05,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 03:53:06,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2 2 2 [WARNING|trainer.py:803] 2025-04-26 03:53:08,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 03:53:08,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 03:53:08,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 3 3 [WARNING|trainer.py:803] 2025-04-26 03:53:10,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 evaluate! [WARNING|trainer.py:803] 2025-04-26 03:53:11,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 evaluate! [WARNING|trainer.py:803] 2025-04-26 03:53:11,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 evaluate! 1 [WARNING|trainer.py:803] 2025-04-26 03:53:12,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1 1 [WARNING|trainer.py:803] 2025-04-26 03:53:13,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 03:53:13,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2 [WARNING|trainer.py:803] 2025-04-26 03:53:15,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2 2 [WARNING|trainer.py:803] 2025-04-26 03:53:16,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 03:53:16,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 [WARNING|trainer.py:803] 2025-04-26 03:53:17,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 3 3 [WARNING|trainer.py:803] 2025-04-26 03:53:18,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 [WARNING|trainer.py:803] 2025-04-26 03:53:18,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.3333333333333333 [2025-04-26 03:53:19,558] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:20,929] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:21,053] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-26 03:53:25,140] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:26,637] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:26,660] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to 
cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-26 03:53:30,689] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:32,290] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:32,324] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing 
from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-26 03:53:36,158] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:37,854] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 03:53:37,967] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [2025-04-26 03:53:56,093] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | optimizer_gradients: 1.15 | optimizer_step: 1.10 [2025-04-26 03:53:56,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.98 | bwd_microstep: 8277.17 | bwd_inner_microstep: 5689.41 | bwd_allreduce_microstep: 2587.68 | step_microstep: 20.92 [2025-04-26 03:53:56,094] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.98 | bwd: 8277.19 | bwd_inner: 5689.41 | bwd_allreduce: 2587.73 | step: 20.92 20%|██ | 8251/41250 
[19:56:21<205:30:10, 22.42s/it] {'loss': 0.115, 'grad_norm': 12.450986862182617, 'learning_rate': 3.704385367289931e-05, 'epoch': 2.0} 20%|██ | 8251/41250 [19:56:21<205:30:10, 22.42s/it][2025-04-26 03:54:04,749] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.04 | optimizer_step: 1.05 [2025-04-26 03:54:04,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.66 | bwd_microstep: 5730.65 | bwd_inner_microstep: 5685.53 | bwd_allreduce_microstep: 45.07 | step_microstep: 19.72 [2025-04-26 03:54:04,750] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.66 | bwd: 5730.66 | bwd_inner: 5685.53 | bwd_allreduce: 45.09 | step: 19.72 20%|██ | 8252/41250 [19:56:30<167:38:59, 18.29s/it] {'loss': 0.0292, 'grad_norm': 0.49935588240623474, 'learning_rate': 3.704303198210012e-05, 'epoch': 2.0} 20%|██ | 8252/41250 [19:56:30<167:38:59, 18.29s/it][2025-04-26 03:54:13,392] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:54:13,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.21 | bwd_microstep: 5710.25 | bwd_inner_microstep: 5665.63 | bwd_allreduce_microstep: 44.58 | step_microstep: 18.76 [2025-04-26 03:54:13,393] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.21 | bwd: 5710.27 | bwd_inner: 5665.63 | bwd_allreduce: 44.59 | step: 18.77 20%|██ | 8253/41250 [19:56:38<141:06:43, 15.40s/it] {'loss': 0.1062, 'grad_norm': 4.296016216278076, 'learning_rate': 3.704221018623399e-05, 'epoch': 2.0} 20%|██ | 8253/41250 [19:56:38<141:06:43, 15.40s/it][2025-04-26 03:54:21,970] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.43 | optimizer_gradients: 1.05 | optimizer_step: 1.02 [2025-04-26 03:54:21,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.33 | bwd_microstep: 5664.39 | bwd_inner_microstep: 5651.45 | 
bwd_allreduce_microstep: 12.89 | step_microstep: 19.28 [2025-04-26 03:54:21,971] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.33 | bwd: 5664.41 | bwd_inner: 5651.45 | bwd_allreduce: 12.91 | step: 19.29 20%|██ | 8254/41250 [19:56:47<122:21:45, 13.35s/it] {'loss': 0.046, 'grad_norm': 2.329895496368408, 'learning_rate': 3.704138828530599e-05, 'epoch': 2.0} 20%|██ | 8254/41250 [19:56:47<122:21:45, 13.35s/it][2025-04-26 03:54:30,608] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 03:54:30,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.39 | bwd_microstep: 5727.00 | bwd_inner_microstep: 5673.31 | bwd_allreduce_microstep: 53.64 | step_microstep: 18.08 [2025-04-26 03:54:30,609] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.39 | bwd: 5727.01 | bwd_inner: 5673.31 | bwd_allreduce: 53.66 | step: 18.09 20%|██ | 8255/41250 [19:56:55<109:23:53, 11.94s/it] {'loss': 0.2176, 'grad_norm': 3.610060930252075, 'learning_rate': 3.704056627932118e-05, 'epoch': 2.0} 20%|██ | 8255/41250 [19:56:55<109:23:53, 11.94s/it][2025-04-26 03:54:39,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.02 | optimizer_step: 0.96 [2025-04-26 03:54:39,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.37 | bwd_microstep: 5833.11 | bwd_inner_microstep: 5670.26 | bwd_allreduce_microstep: 162.80 | step_microstep: 19.09 [2025-04-26 03:54:39,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.37 | bwd: 5833.12 | bwd_inner: 5670.26 | bwd_allreduce: 162.82 | step: 19.09 20%|██ | 8256/41250 [19:57:04<100:41:39, 10.99s/it] {'loss': 0.1172, 'grad_norm': 1.948224425315857, 'learning_rate': 3.7039744168284634e-05, 'epoch': 2.0} 20%|██ | 8256/41250 [19:57:04<100:41:39, 10.99s/it][2025-04-26 03:54:48,015] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.37 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-26 03:54:48,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.15 | bwd_microstep: 5707.15 | bwd_inner_microstep: 5694.53 | bwd_allreduce_microstep: 12.57 | step_microstep: 19.09 [2025-04-26 03:54:48,016] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.15 | bwd: 5707.16 | bwd_inner: 5694.53 | bwd_allreduce: 12.59 | step: 19.09 20%|██ | 8257/41250 [19:57:13<94:13:40, 10.28s/it] {'loss': 0.1908, 'grad_norm': 3.118178606033325, 'learning_rate': 3.703892195220141e-05, 'epoch': 2.0} 20%|██ | 8257/41250 [19:57:13<94:13:40, 10.28s/it][2025-04-26 03:54:56,650] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 1.04 [2025-04-26 03:54:56,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.66 | bwd_microstep: 5731.56 | bwd_inner_microstep: 5635.16 | bwd_allreduce_microstep: 96.35 | step_microstep: 19.52 [2025-04-26 03:54:56,651] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.66 | bwd: 5731.57 | bwd_inner: 5635.16 | bwd_allreduce: 96.37 | step: 19.53 20%|██ | 8258/41250 [19:57:21<89:42:04, 9.79s/it] {'loss': 0.0301, 'grad_norm': 0.6445997953414917, 'learning_rate': 3.7038099631076596e-05, 'epoch': 2.0} 20%|██ | 8258/41250 [19:57:21<89:42:04, 9.79s/it][2025-04-26 03:55:05,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:55:05,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.57 | bwd_microstep: 5735.07 | bwd_inner_microstep: 5684.79 | bwd_allreduce_microstep: 50.24 | step_microstep: 18.84 [2025-04-26 03:55:05,311] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.57 | bwd: 5735.08 | bwd_inner: 5684.79 | bwd_allreduce: 50.26 | step: 18.84 20%|██ | 8259/41250 [19:57:30<86:35:41, 9.45s/it] 
{'loss': 0.1052, 'grad_norm': 5.181938171386719, 'learning_rate': 3.7037277204915236e-05, 'epoch': 2.0} 20%|██ | 8259/41250 [19:57:30<86:35:41, 9.45s/it][2025-04-26 03:55:13,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 0.91 [2025-04-26 03:55:13,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2835.51 | bwd_microstep: 5741.08 | bwd_inner_microstep: 5667.03 | bwd_allreduce_microstep: 74.01 | step_microstep: 18.64 [2025-04-26 03:55:13,974] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2835.51 | bwd: 5741.09 | bwd_inner: 5667.03 | bwd_allreduce: 74.02 | step: 18.64 20%|██ | 8260/41250 [19:57:39<84:25:52, 9.21s/it] {'loss': 0.0611, 'grad_norm': 2.0673670768737793, 'learning_rate': 3.703645467372242e-05, 'epoch': 2.0} 20%|██ | 8260/41250 [19:57:39<84:25:52, 9.21s/it][2025-04-26 03:55:22,693] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.08 | optimizer_gradients: 1.33 | optimizer_step: 1.06 [2025-04-26 03:55:22,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.39 | bwd_microstep: 5797.57 | bwd_inner_microstep: 5673.66 | bwd_allreduce_microstep: 123.85 | step_microstep: 20.15 [2025-04-26 03:55:22,694] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.39 | bwd: 5797.59 | bwd_inner: 5673.66 | bwd_allreduce: 123.88 | step: 20.15 20%|██ | 8261/41250 [19:57:48<83:04:45, 9.07s/it] {'loss': 0.1565, 'grad_norm': 2.20037841796875, 'learning_rate': 3.7035632037503206e-05, 'epoch': 2.0} 20%|██ | 8261/41250 [19:57:48<83:04:45, 9.07s/it][2025-04-26 03:55:31,293] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.57 | optimizer_gradients: 1.00 | optimizer_step: 0.90 [2025-04-26 03:55:31,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2836.15 | bwd_microstep: 5678.07 | bwd_inner_microstep: 5665.42 | bwd_allreduce_microstep: 12.61 | 
step_microstep: 18.76 [2025-04-26 03:55:31,294] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2836.15 | bwd: 5678.08 | bwd_inner: 5665.42 | bwd_allreduce: 12.63 | step: 18.77 20%|██ | 8262/41250 [19:57:56<81:47:13, 8.93s/it] {'loss': 0.1131, 'grad_norm': 2.456502914428711, 'learning_rate': 3.703480929626268e-05, 'epoch': 2.0} 20%|██ | 8262/41250 [19:57:56<81:47:13, 8.93s/it][2025-04-26 03:55:39,949] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 0.97 | optimizer_step: 0.89 [2025-04-26 03:55:39,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.34 | bwd_microstep: 5728.54 | bwd_inner_microstep: 5681.17 | bwd_allreduce_microstep: 47.33 | step_microstep: 18.40 [2025-04-26 03:55:39,950] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.34 | bwd: 5728.56 | bwd_inner: 5681.17 | bwd_allreduce: 47.34 | step: 18.40 20%|██ | 8263/41250 [19:58:05<81:02:33, 8.84s/it] {'loss': 0.0982, 'grad_norm': 2.8235065937042236, 'learning_rate': 3.7033986450005904e-05, 'epoch': 2.0} 20%|██ | 8263/41250 [19:58:05<81:02:33, 8.84s/it][2025-04-26 03:55:48,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.01 | optimizer_gradients: 1.00 | optimizer_step: 1.00 [2025-04-26 03:55:48,602] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.11 | bwd_microstep: 5718.84 | bwd_inner_microstep: 5669.91 | bwd_allreduce_microstep: 48.89 | step_microstep: 18.66 [2025-04-26 03:55:48,603] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.11 | bwd: 5718.85 | bwd_inner: 5669.90 | bwd_allreduce: 48.91 | step: 18.67 20%|██ | 8264/41250 [19:58:13<80:30:57, 8.79s/it] {'loss': 0.1634, 'grad_norm': 1.7482147216796875, 'learning_rate': 3.703316349873794e-05, 'epoch': 2.0} 20%|██ | 8264/41250 [19:58:13<80:30:57, 8.79s/it][2025-04-26 03:55:57,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | 
optimizer_gradients: 1.00 | optimizer_step: 0.89 [2025-04-26 03:55:57,187] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2818.17 | bwd_microstep: 5680.15 | bwd_inner_microstep: 5634.48 | bwd_allreduce_microstep: 45.62 | step_microstep: 18.81 [2025-04-26 03:55:57,188] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2818.17 | bwd: 5680.16 | bwd_inner: 5634.48 | bwd_allreduce: 45.64 | step: 18.81 20%|██ | 8265/41250 [19:58:22<79:57:19, 8.73s/it] {'loss': 0.0707, 'grad_norm': 1.8428782224655151, 'learning_rate': 3.7032340442463886e-05, 'epoch': 2.0} 20%|██ | 8265/41250 [19:58:22<79:57:19, 8.73s/it][2025-04-26 03:56:05,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 03:56:05,792] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.66 | bwd_microstep: 5674.97 | bwd_inner_microstep: 5662.13 | bwd_allreduce_microstep: 12.80 | step_microstep: 18.48 [2025-04-26 03:56:05,793] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.66 | bwd: 5674.98 | bwd_inner: 5662.13 | bwd_allreduce: 12.81 | step: 18.49 20%|██ | 8266/41250 [19:58:31<79:37:11, 8.69s/it] {'loss': 0.1675, 'grad_norm': 3.4046106338500977, 'learning_rate': 3.7031517281188795e-05, 'epoch': 2.0} 20%|██ | 8266/41250 [19:58:31<79:37:11, 8.69s/it][2025-04-26 03:56:14,406] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.21 | optimizer_step: 1.03 [2025-04-26 03:56:14,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.59 | bwd_microstep: 5684.31 | bwd_inner_microstep: 5670.94 | bwd_allreduce_microstep: 13.32 | step_microstep: 19.55 [2025-04-26 03:56:14,407] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.59 | bwd: 5684.32 | bwd_inner: 5670.94 | bwd_allreduce: 13.34 | step: 19.56 20%|██ | 8267/41250 [19:58:39<79:24:56, 8.67s/it] {'loss': 0.0446, 'grad_norm': 
1.0650625228881836, 'learning_rate': 3.7030694014917756e-05, 'epoch': 2.0} 20%|██ | 8267/41250 [19:58:39<79:24:56, 8.67s/it][2025-04-26 03:56:23,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-26 03:56:23,038] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.17 | bwd_microstep: 5705.73 | bwd_inner_microstep: 5693.00 | bwd_allreduce_microstep: 12.69 | step_microstep: 19.52 [2025-04-26 03:56:23,039] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.17 | bwd: 5705.74 | bwd_inner: 5693.00 | bwd_allreduce: 12.71 | step: 19.52 20%|██ | 8268/41250 [19:58:48<79:18:35, 8.66s/it] {'loss': 0.0378, 'grad_norm': 0.9459269046783447, 'learning_rate': 3.702987064365584e-05, 'epoch': 2.0} 20%|██ | 8268/41250 [19:58:48<79:18:35, 8.66s/it][2025-04-26 03:56:31,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:56:31,616] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.86 | bwd_microstep: 5669.68 | bwd_inner_microstep: 5647.07 | bwd_allreduce_microstep: 22.57 | step_microstep: 18.67 [2025-04-26 03:56:31,617] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.86 | bwd: 5669.69 | bwd_inner: 5647.07 | bwd_allreduce: 22.59 | step: 18.67 20%|██ | 8269/41250 [19:58:56<79:05:16, 8.63s/it] {'loss': 0.0848, 'grad_norm': 1.1309731006622314, 'learning_rate': 3.702904716740811e-05, 'epoch': 2.0} 20%|██ | 8269/41250 [19:58:56<79:05:16, 8.63s/it][2025-04-26 03:56:40,282] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 03:56:40,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2822.58 | bwd_microstep: 5755.74 | bwd_inner_microstep: 5640.81 | bwd_allreduce_microstep: 114.89 | step_microstep: 18.52 [2025-04-26 
03:56:40,283] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2822.58 | bwd: 5755.76 | bwd_inner: 5640.81 | bwd_allreduce: 114.91 | step: 18.52 20%|██ | 8270/41250 [19:59:05<79:10:42, 8.64s/it] {'loss': 0.0547, 'grad_norm': 1.361976146697998, 'learning_rate': 3.702822358617966e-05, 'epoch': 2.0} 20%|██ | 8270/41250 [19:59:05<79:10:42, 8.64s/it][2025-04-26 03:56:48,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.02 | optimizer_gradients: 1.02 | optimizer_step: 1.22 [2025-04-26 03:56:48,955] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2841.55 | bwd_microstep: 5748.11 | bwd_inner_microstep: 5669.16 | bwd_allreduce_microstep: 78.90 | step_microstep: 19.23 [2025-04-26 03:56:48,956] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2841.55 | bwd: 5748.12 | bwd_inner: 5669.16 | bwd_allreduce: 78.92 | step: 19.23 20%|██ | 8271/41250 [19:59:14<79:15:37, 8.65s/it] {'loss': 0.0869, 'grad_norm': 2.0236308574676514, 'learning_rate': 3.7027399899975564e-05, 'epoch': 2.01} 20%|██ | 8271/41250 [19:59:14<79:15:37, 8.65s/it][2025-04-26 03:56:57,540] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.01 | optimizer_step: 0.97 [2025-04-26 03:56:57,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2819.57 | bwd_microstep: 5681.01 | bwd_inner_microstep: 5633.01 | bwd_allreduce_microstep: 47.95 | step_microstep: 18.84 [2025-04-26 03:56:57,541] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2819.57 | bwd: 5681.02 | bwd_inner: 5633.01 | bwd_allreduce: 47.97 | step: 18.84 20%|██ | 8272/41250 [19:59:22<79:04:15, 8.63s/it] {'loss': 0.1818, 'grad_norm': 4.344736576080322, 'learning_rate': 3.702657610880089e-05, 'epoch': 2.01} 20%|██ | 8272/41250 [19:59:22<79:04:15, 8.63s/it][2025-04-26 03:57:06,132] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 0.99 | optimizer_step: 0.96 
[2025-04-26 03:57:06,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2816.14 | bwd_microstep: 5691.71 | bwd_inner_microstep: 5648.35 | bwd_allreduce_microstep: 43.32 | step_microstep: 18.54 [2025-04-26 03:57:06,133] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2816.14 | bwd: 5691.73 | bwd_inner: 5648.35 | bwd_allreduce: 43.34 | step: 18.54 20%|██ | 8273/41250 [19:59:31<78:57:35, 8.62s/it] {'loss': 0.0927, 'grad_norm': 1.002933144569397, 'learning_rate': 3.702575221266073e-05, 'epoch': 2.01} 20%|██ | 8273/41250 [19:59:31<78:57:35, 8.62s/it][2025-04-26 03:57:14,941] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 03:57:14,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.31 | bwd_microstep: 5881.99 | bwd_inner_microstep: 5676.17 | bwd_allreduce_microstep: 205.78 | step_microstep: 18.79 [2025-04-26 03:57:14,942] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.31 | bwd: 5882.01 | bwd_inner: 5676.17 | bwd_allreduce: 205.80 | step: 18.79 20%|██ | 8274/41250 [19:59:40<79:28:39, 8.68s/it] {'loss': 0.1688, 'grad_norm': 4.633322238922119, 'learning_rate': 3.7024928211560155e-05, 'epoch': 2.01} 20%|██ | 8274/41250 [19:59:40<79:28:39, 8.68s/it][2025-04-26 03:57:23,563] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.03 | optimizer_step: 0.96 [2025-04-26 03:57:23,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.24 | bwd_microstep: 5712.03 | bwd_inner_microstep: 5647.67 | bwd_allreduce_microstep: 64.31 | step_microstep: 19.00 [2025-04-26 03:57:23,564] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.24 | bwd: 5712.04 | bwd_inner: 5647.67 | bwd_allreduce: 64.33 | step: 19.00 20%|██ | 8275/41250 [19:59:48<79:19:46, 8.66s/it] {'loss': 0.0777, 'grad_norm': 3.6649184226989746, 'learning_rate': 
3.702410410550424e-05, 'epoch': 2.01} 20%|██ | 8275/41250 [19:59:48<79:19:46, 8.66s/it][2025-04-26 03:57:32,214] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.40 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 03:57:32,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.82 | bwd_microstep: 5721.05 | bwd_inner_microstep: 5699.78 | bwd_allreduce_microstep: 21.22 | step_microstep: 18.84 [2025-04-26 03:57:32,215] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.82 | bwd: 5721.06 | bwd_inner: 5699.78 | bwd_allreduce: 21.24 | step: 18.84 20%|██ | 8276/41250 [19:59:57<79:17:45, 8.66s/it] {'loss': 0.0173, 'grad_norm': 0.8510460257530212, 'learning_rate': 3.702327989449808e-05, 'epoch': 2.01} 20%|██ | 8276/41250 [19:59:57<79:17:45, 8.66s/it][2025-04-26 03:57:40,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.03 | optimizer_step: 0.98 [2025-04-26 03:57:40,838] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.78 | bwd_microstep: 5717.02 | bwd_inner_microstep: 5649.11 | bwd_allreduce_microstep: 67.86 | step_microstep: 19.00 [2025-04-26 03:57:40,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.78 | bwd: 5717.04 | bwd_inner: 5649.11 | bwd_allreduce: 67.88 | step: 19.01 20%|██ | 8277/41250 [20:00:06<79:12:03, 8.65s/it] {'loss': 0.0327, 'grad_norm': 1.2745857238769531, 'learning_rate': 3.702245557854674e-05, 'epoch': 2.01} 20%|██ | 8277/41250 [20:00:06<79:12:03, 8.65s/it][2025-04-26 03:57:49,518] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.48 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 03:57:49,519] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.35 | bwd_microstep: 5746.60 | bwd_inner_microstep: 5685.36 | bwd_allreduce_microstep: 61.19 | step_microstep: 18.87 [2025-04-26 03:57:49,519] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.35 | bwd: 5746.61 | bwd_inner: 5685.36 | bwd_allreduce: 61.21 | step: 18.87 20%|██ | 8278/41250 [20:00:14<79:17:24, 8.66s/it] {'loss': 0.0365, 'grad_norm': 1.8157235383987427, 'learning_rate': 3.7021631157655325e-05, 'epoch': 2.01} 20%|██ | 8278/41250 [20:00:14<79:17:24, 8.66s/it][2025-04-26 03:57:58,141] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.18 | optimizer_gradients: 1.13 | optimizer_step: 1.07 [2025-04-26 03:57:58,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.42 | bwd_microstep: 5720.10 | bwd_inner_microstep: 5643.10 | bwd_allreduce_microstep: 76.94 | step_microstep: 20.23 [2025-04-26 03:57:58,142] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.42 | bwd: 5720.12 | bwd_inner: 5643.10 | bwd_allreduce: 76.97 | step: 20.23 20%|██ | 8279/41250 [20:00:23<79:11:55, 8.65s/it] {'loss': 0.1018, 'grad_norm': 1.8680955171585083, 'learning_rate': 3.7020806631828885e-05, 'epoch': 2.01} 20%|██ | 8279/41250 [20:00:23<79:11:55, 8.65s/it][2025-04-26 03:58:06,820] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 1.01 | optimizer_step: 0.92 [2025-04-26 03:58:06,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2832.48 | bwd_microstep: 5760.50 | bwd_inner_microstep: 5651.20 | bwd_allreduce_microstep: 109.25 | step_microstep: 18.96 [2025-04-26 03:58:06,821] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2832.48 | bwd: 5760.51 | bwd_inner: 5651.20 | bwd_allreduce: 109.27 | step: 18.96 20%|██ | 8280/41250 [20:00:32<79:16:40, 8.66s/it] {'loss': 0.0927, 'grad_norm': 2.336892604827881, 'learning_rate': 3.7019982001072525e-05, 'epoch': 2.01} 20%|██ | 8280/41250 [20:00:32<79:16:40, 8.66s/it][2025-04-26 03:58:15,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 
03:58:15,589] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2886.72 | bwd_microstep: 5799.50 | bwd_inner_microstep: 5786.57 | bwd_allreduce_microstep: 12.88 | step_microstep: 18.52 [2025-04-26 03:58:15,590] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2886.72 | bwd: 5799.51 | bwd_inner: 5786.57 | bwd_allreduce: 12.90 | step: 18.52 20%|██ | 8281/41250 [20:00:40<79:34:58, 8.69s/it] {'loss': 0.2149, 'grad_norm': 2.2950174808502197, 'learning_rate': 3.701915726539133e-05, 'epoch': 2.01} 20%|██ | 8281/41250 [20:00:40<79:34:58, 8.69s/it][2025-04-26 03:58:24,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.07 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 03:58:24,307] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.15 | bwd_microstep: 5788.07 | bwd_inner_microstep: 5695.16 | bwd_allreduce_microstep: 92.85 | step_microstep: 18.76 [2025-04-26 03:58:24,308] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.15 | bwd: 5788.08 | bwd_inner: 5695.16 | bwd_allreduce: 92.88 | step: 18.76 20%|██ | 8282/41250 [20:00:49<79:39:31, 8.70s/it] {'loss': 0.0608, 'grad_norm': 1.9427648782730103, 'learning_rate': 3.701833242479037e-05, 'epoch': 2.01} 20%|██ | 8282/41250 [20:00:49<79:39:31, 8.70s/it][2025-04-26 03:58:32,932] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.20 | optimizer_step: 0.90 [2025-04-26 03:58:32,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.97 | bwd_microstep: 5707.95 | bwd_inner_microstep: 5657.14 | bwd_allreduce_microstep: 50.76 | step_microstep: 19.24 [2025-04-26 03:58:32,933] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.97 | bwd: 5707.96 | bwd_inner: 5657.14 | bwd_allreduce: 50.78 | step: 19.24 20%|██ | 8283/41250 [20:00:58<79:27:18, 8.68s/it] {'loss': 0.0121, 'grad_norm': 0.5638201236724854, 'learning_rate': 3.7017507479274737e-05, 
'epoch': 2.01} 20%|██ | 8283/41250 [20:00:58<79:27:18, 8.68s/it][2025-04-26 03:58:41,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.32 | optimizer_gradients: 1.02 | optimizer_step: 1.02 [2025-04-26 03:58:41,629] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.47 | bwd_microstep: 5756.56 | bwd_inner_microstep: 5712.32 | bwd_allreduce_microstep: 44.20 | step_microstep: 19.30 [2025-04-26 03:58:41,630] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.47 | bwd: 5756.58 | bwd_inner: 5712.32 | bwd_allreduce: 44.21 | step: 19.30 20%|██ | 8284/41250 [20:01:06<79:30:37, 8.68s/it] {'loss': 0.2291, 'grad_norm': 1.5324170589447021, 'learning_rate': 3.701668242884952e-05, 'epoch': 2.01} 20%|██ | 8284/41250 [20:01:06<79:30:37, 8.68s/it][2025-04-26 03:58:50,417] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.47 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 03:58:50,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.50 | bwd_microstep: 5854.79 | bwd_inner_microstep: 5697.00 | bwd_allreduce_microstep: 157.75 | step_microstep: 18.92 [2025-04-26 03:58:50,418] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2848.50 | bwd: 5854.81 | bwd_inner: 5697.00 | bwd_allreduce: 157.76 | step: 18.92 20%|██ | 8285/41250 [20:01:15<79:47:41, 8.71s/it] {'loss': 0.2085, 'grad_norm': 2.370657205581665, 'learning_rate': 3.70158572735198e-05, 'epoch': 2.01} 20%|██ | 8285/41250 [20:01:15<79:47:41, 8.71s/it][2025-04-26 03:58:59,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.97 | optimizer_step: 1.02 [2025-04-26 03:58:59,059] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2848.38 | bwd_microstep: 5708.10 | bwd_inner_microstep: 5695.14 | bwd_allreduce_microstep: 12.91 | step_microstep: 18.73 [2025-04-26 03:58:59,060] [INFO] [logging.py:128:log_dist] [Rank 0] time 
(ms) | fwd: 2848.38 | bwd: 5708.11 | bwd_inner: 5695.14 | bwd_allreduce: 12.93 | step: 18.72 20%|██ | 8286/41250 [20:01:24<79:35:37, 8.69s/it] {'loss': 0.0157, 'grad_norm': 0.36310020089149475, 'learning_rate': 3.701503201329067e-05, 'epoch': 2.01} 20%|██ | 8286/41250 [20:01:24<79:35:37, 8.69s/it][2025-04-26 03:59:07,833] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 0.97 | optimizer_step: 1.01 [2025-04-26 03:59:07,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2890.60 | bwd_microstep: 5797.45 | bwd_inner_microstep: 5784.80 | bwd_allreduce_microstep: 12.61 | step_microstep: 18.71 [2025-04-26 03:59:07,834] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2890.60 | bwd: 5797.47 | bwd_inner: 5784.80 | bwd_allreduce: 12.63 | step: 18.72 20%|██ | 8287/41250 [20:01:33<79:48:56, 8.72s/it] {'loss': 0.0341, 'grad_norm': 0.7625123262405396, 'learning_rate': 3.7014206648167214e-05, 'epoch': 2.01} 20%|██ | 8287/41250 [20:01:33<79:48:56, 8.72s/it][2025-04-26 03:59:16,509] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.13 | optimizer_gradients: 1.07 | optimizer_step: 1.03 [2025-04-26 03:59:16,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2825.35 | bwd_microstep: 5764.18 | bwd_inner_microstep: 5655.47 | bwd_allreduce_microstep: 108.66 | step_microstep: 19.20 [2025-04-26 03:59:16,510] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2825.35 | bwd: 5764.19 | bwd_inner: 5655.47 | bwd_allreduce: 108.68 | step: 19.20 20%|██ | 8288/41250 [20:01:41<79:42:15, 8.71s/it] {'loss': 0.0073, 'grad_norm': 0.11478783190250397, 'learning_rate': 3.7013381178154524e-05, 'epoch': 2.01} 20%|██ | 8288/41250 [20:01:41<79:42:15, 8.71s/it][2025-04-26 03:59:25,143] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-26 03:59:25,144] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.29 | bwd_microstep: 5721.10 | bwd_inner_microstep: 5663.84 | bwd_allreduce_microstep: 57.21 | step_microstep: 19.33 [2025-04-26 03:59:25,144] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.29 | bwd: 5721.12 | bwd_inner: 5663.84 | bwd_allreduce: 57.23 | step: 19.33 20%|██ | 8289/41250 [20:01:50<79:30:19, 8.68s/it] {'loss': 0.0228, 'grad_norm': 2.6264734268188477, 'learning_rate': 3.701255560325769e-05, 'epoch': 2.01} 20%|██ | 8289/41250 [20:01:50<79:30:19, 8.68s/it][2025-04-26 03:59:33,813] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.53 | optimizer_gradients: 1.00 | optimizer_step: 0.97 [2025-04-26 03:59:33,814] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.55 | bwd_microstep: 5725.11 | bwd_inner_microstep: 5712.10 | bwd_allreduce_microstep: 12.96 | step_microstep: 19.10 [2025-04-26 03:59:33,814] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.55 | bwd: 5725.13 | bwd_inner: 5712.10 | bwd_allreduce: 12.98 | step: 19.11 20%|██ | 8290/41250 [20:01:59<79:27:59, 8.68s/it] {'loss': 0.0064, 'grad_norm': 0.1403326392173767, 'learning_rate': 3.701172992348179e-05, 'epoch': 2.01} 20%|██ | 8290/41250 [20:01:59<79:27:59, 8.68s/it][2025-04-26 03:59:42,591] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.34 | optimizer_step: 0.94 [2025-04-26 03:59:42,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2899.54 | bwd_microstep: 5793.38 | bwd_inner_microstep: 5780.19 | bwd_allreduce_microstep: 13.13 | step_microstep: 20.18 [2025-04-26 03:59:42,592] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2899.55 | bwd: 5793.39 | bwd_inner: 5780.19 | bwd_allreduce: 13.15 | step: 20.18 20%|██ | 8291/41250 [20:02:07<79:44:00, 8.71s/it] {'loss': 0.0742, 'grad_norm': 1.7737482786178589, 'learning_rate': 3.701090413883192e-05, 'epoch': 2.01} 20%|██ | 
8291/41250 [20:02:07<79:44:00, 8.71s/it][2025-04-26 03:59:51,289] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.07 | optimizer_step: 0.89 [2025-04-26 03:59:51,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2843.54 | bwd_microstep: 5771.59 | bwd_inner_microstep: 5662.20 | bwd_allreduce_microstep: 109.33 | step_microstep: 19.10 [2025-04-26 03:59:51,290] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2843.54 | bwd: 5771.60 | bwd_inner: 5662.20 | bwd_allreduce: 109.36 | step: 19.10 20%|██ | 8292/41250 [20:02:16<79:42:30, 8.71s/it] {'loss': 0.1531, 'grad_norm': 1.8178794384002686, 'learning_rate': 3.701007824931317e-05, 'epoch': 2.01} 20%|██ | 8292/41250 [20:02:16<79:42:30, 8.71s/it][2025-04-26 03:59:59,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.22 | optimizer_gradients: 1.02 | optimizer_step: 0.94 [2025-04-26 03:59:59,920] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.33 | bwd_microstep: 5714.81 | bwd_inner_microstep: 5659.22 | bwd_allreduce_microstep: 55.54 | step_microstep: 18.73 [2025-04-26 03:59:59,921] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.33 | bwd: 5714.83 | bwd_inner: 5659.22 | bwd_allreduce: 55.57 | step: 18.73 20%|██ | 8293/41250 [20:02:25<79:29:15, 8.68s/it] {'loss': 0.1812, 'grad_norm': 3.2375411987304688, 'learning_rate': 3.700925225493063e-05, 'epoch': 2.01} 20%|██ | 8293/41250 [20:02:25<79:29:15, 8.68s/it][2025-04-26 04:00:08,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-26 04:00:08,836] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2936.51 | bwd_microstep: 5896.91 | bwd_inner_microstep: 5883.68 | bwd_allreduce_microstep: 13.18 | step_microstep: 18.61 [2025-04-26 04:00:08,837] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2936.51 | 
bwd: 5896.92 | bwd_inner: 5883.68 | bwd_allreduce: 13.20 | step: 18.61 20%|██ | 8294/41250 [20:02:34<80:07:34, 8.75s/it] {'loss': 0.0311, 'grad_norm': 0.8989822268486023, 'learning_rate': 3.70084261556894e-05, 'epoch': 2.01} 20%|██ | 8294/41250 [20:02:34<80:07:34, 8.75s/it][2025-04-26 04:00:17,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 4.99 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 04:00:17,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2839.06 | bwd_microstep: 5774.46 | bwd_inner_microstep: 5664.62 | bwd_allreduce_microstep: 109.79 | step_microstep: 18.11 [2025-04-26 04:00:17,534] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2839.06 | bwd: 5774.47 | bwd_inner: 5664.62 | bwd_allreduce: 109.81 | step: 18.11 20%|██ | 8295/41250 [20:02:42<79:58:22, 8.74s/it] {'loss': 0.1033, 'grad_norm': 3.6340842247009277, 'learning_rate': 3.700759995159457e-05, 'epoch': 2.01} 20%|██ | 8295/41250 [20:02:42<79:58:22, 8.74s/it][2025-04-26 04:00:26,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.22 | optimizer_step: 0.90 [2025-04-26 04:00:26,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.46 | bwd_microstep: 5755.26 | bwd_inner_microstep: 5688.72 | bwd_allreduce_microstep: 66.49 | step_microstep: 18.89 [2025-04-26 04:00:26,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.46 | bwd: 5755.27 | bwd_inner: 5688.72 | bwd_allreduce: 66.51 | step: 18.89 20%|██ | 8296/41250 [20:02:51<79:51:53, 8.72s/it] {'loss': 0.0133, 'grad_norm': 0.32194051146507263, 'learning_rate': 3.7006773642651235e-05, 'epoch': 2.01} 20%|██ | 8296/41250 [20:02:51<79:51:53, 8.72s/it][2025-04-26 04:00:34,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.23 | optimizer_gradients: 1.23 | optimizer_step: 1.00 [2025-04-26 04:00:34,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) 
| fwd_microstep: 2827.30 | bwd_microstep: 5724.30 | bwd_inner_microstep: 5642.10 | bwd_allreduce_microstep: 82.15 | step_microstep: 19.34 [2025-04-26 04:00:34,870] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2827.30 | bwd: 5724.31 | bwd_inner: 5642.10 | bwd_allreduce: 82.17 | step: 19.35 20%|██ | 8297/41250 [20:03:00<79:37:39, 8.70s/it] {'loss': 0.1313, 'grad_norm': 2.1083168983459473, 'learning_rate': 3.700594722886448e-05, 'epoch': 2.01} 20%|██ | 8297/41250 [20:03:00<79:37:39, 8.70s/it][2025-04-26 04:00:43,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 04:00:43,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.28 | bwd_microstep: 5794.88 | bwd_inner_microstep: 5658.15 | bwd_allreduce_microstep: 136.69 | step_microstep: 18.49 [2025-04-26 04:00:43,581] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.28 | bwd: 5794.90 | bwd_inner: 5658.15 | bwd_allreduce: 136.71 | step: 18.49 20%|██ | 8298/41250 [20:03:08<79:39:17, 8.70s/it] {'loss': 0.0065, 'grad_norm': 0.0979618951678276, 'learning_rate': 3.700512071023941e-05, 'epoch': 2.01} 20%|██ | 8298/41250 [20:03:08<79:39:17, 8.70s/it][2025-04-26 04:00:52,274] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 04:00:52,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.43 | bwd_microstep: 5784.81 | bwd_inner_microstep: 5645.13 | bwd_allreduce_microstep: 139.63 | step_microstep: 18.83 [2025-04-26 04:00:52,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.43 | bwd: 5784.82 | bwd_inner: 5645.13 | bwd_allreduce: 139.64 | step: 18.83 20%|██ | 8299/41250 [20:03:17<79:37:50, 8.70s/it] {'loss': 0.0443, 'grad_norm': 0.9721679091453552, 'learning_rate': 3.700429408678111e-05, 'epoch': 2.01} 20%|██ | 8299/41250 [20:03:17<79:37:50, 
8.70s/it][2025-04-26 04:01:00,899] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.49 | optimizer_gradients: 1.01 | optimizer_step: 0.89 [2025-04-26 04:01:00,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.47 | bwd_microstep: 5713.82 | bwd_inner_microstep: 5659.34 | bwd_allreduce_microstep: 54.43 | step_microstep: 19.14 [2025-04-26 04:01:00,900] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.47 | bwd: 5713.83 | bwd_inner: 5659.34 | bwd_allreduce: 54.45 | step: 19.13 20%|██ | 8300/41250 [20:03:26<79:25:30, 8.68s/it] {'loss': 0.0445, 'grad_norm': 1.1234980821609497, 'learning_rate': 3.700346735849468e-05, 'epoch': 2.01} 20%|██ | 8300/41250 [20:03:26<79:25:30, 8.68s/it][2025-04-26 04:01:09,570] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.05 | optimizer_step: 0.97 [2025-04-26 04:01:09,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.60 | bwd_microstep: 5734.82 | bwd_inner_microstep: 5704.30 | bwd_allreduce_microstep: 30.47 | step_microstep: 19.02 [2025-04-26 04:01:09,571] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.60 | bwd: 5734.83 | bwd_inner: 5704.30 | bwd_allreduce: 30.49 | step: 19.03 20%|██ | 8301/41250 [20:03:34<79:24:11, 8.68s/it] {'loss': 0.2106, 'grad_norm': 3.0970489978790283, 'learning_rate': 3.700264052538522e-05, 'epoch': 2.01} 20%|██ | 8301/41250 [20:03:34<79:24:11, 8.68s/it][2025-04-26 04:01:18,200] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.07 | optimizer_step: 0.95 [2025-04-26 04:01:18,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.27 | bwd_microstep: 5696.56 | bwd_inner_microstep: 5683.51 | bwd_allreduce_microstep: 12.99 | step_microstep: 19.58 [2025-04-26 04:01:18,201] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.27 | bwd: 5696.57 | bwd_inner: 5683.51 | 
bwd_allreduce: 13.02 | step: 19.59 20%|██ | 8302/41250 [20:03:43<79:16:32, 8.66s/it] {'loss': 0.2904, 'grad_norm': 3.527890920639038, 'learning_rate': 3.700181358745783e-05, 'epoch': 2.01} 20%|██ | 8302/41250 [20:03:43<79:16:32, 8.66s/it][2025-04-26 04:01:26,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 04:01:26,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.78 | bwd_microstep: 5739.62 | bwd_inner_microstep: 5681.08 | bwd_allreduce_microstep: 58.50 | step_microstep: 18.83 [2025-04-26 04:01:26,869] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.78 | bwd: 5739.63 | bwd_inner: 5681.08 | bwd_allreduce: 58.51 | step: 18.84 20%|██ | 8303/41250 [20:03:52<79:17:23, 8.66s/it] {'loss': 0.0343, 'grad_norm': 0.7391858100891113, 'learning_rate': 3.70009865447176e-05, 'epoch': 2.01} 20%|██ | 8303/41250 [20:03:52<79:17:23, 8.66s/it][2025-04-26 04:01:35,543] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 1.01 | optimizer_step: 0.98 [2025-04-26 04:01:35,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.80 | bwd_microstep: 5737.76 | bwd_inner_microstep: 5714.42 | bwd_allreduce_microstep: 23.29 | step_microstep: 19.58 [2025-04-26 04:01:35,544] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.80 | bwd: 5737.77 | bwd_inner: 5714.42 | bwd_allreduce: 23.31 | step: 19.59 20%|██ | 8304/41250 [20:04:00<79:19:05, 8.67s/it] {'loss': 0.0826, 'grad_norm': 1.9620847702026367, 'learning_rate': 3.7000159397169636e-05, 'epoch': 2.01} 20%|██ | 8304/41250 [20:04:00<79:19:05, 8.67s/it][2025-04-26 04:01:44,231] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.45 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 04:01:44,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.97 | 
bwd_microstep: 5767.25 | bwd_inner_microstep: 5648.63 | bwd_allreduce_microstep: 118.58 | step_microstep: 18.66 [2025-04-26 04:01:44,232] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.97 | bwd: 5767.26 | bwd_inner: 5648.63 | bwd_allreduce: 118.59 | step: 18.66 20%|██ | 8305/41250 [20:04:09<79:22:16, 8.67s/it] {'loss': 0.1885, 'grad_norm': 1.8427672386169434, 'learning_rate': 3.699933214481902e-05, 'epoch': 2.01} 20%|██ | 8305/41250 [20:04:09<79:22:16, 8.67s/it][2025-04-26 04:01:52,867] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 1.00 | optimizer_step: 0.93 [2025-04-26 04:01:52,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.79 | bwd_microstep: 5707.55 | bwd_inner_microstep: 5694.89 | bwd_allreduce_microstep: 12.61 | step_microstep: 19.05 [2025-04-26 04:01:52,868] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.79 | bwd: 5707.57 | bwd_inner: 5694.89 | bwd_allreduce: 12.63 | step: 19.05 20%|██ | 8306/41250 [20:04:18<79:16:03, 8.66s/it] {'loss': 0.0998, 'grad_norm': 4.431005954742432, 'learning_rate': 3.699850478767088e-05, 'epoch': 2.01} 20%|██ | 8306/41250 [20:04:18<79:16:03, 8.66s/it][2025-04-26 04:02:01,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.98 | optimizer_step: 0.89 [2025-04-26 04:02:01,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.83 | bwd_microstep: 5732.05 | bwd_inner_microstep: 5700.19 | bwd_allreduce_microstep: 31.82 | step_microstep: 18.65 [2025-04-26 04:02:01,535] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.83 | bwd: 5732.06 | bwd_inner: 5700.19 | bwd_allreduce: 31.84 | step: 18.65 20%|██ | 8307/41250 [20:04:26<79:16:46, 8.66s/it] {'loss': 0.1107, 'grad_norm': 1.9958207607269287, 'learning_rate': 3.69976773257303e-05, 'epoch': 2.01} 20%|██ | 8307/41250 [20:04:26<79:16:46, 8.66s/it][2025-04-26 04:02:10,174] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.37 | optimizer_gradients: 1.03 | optimizer_step: 0.89 [2025-04-26 04:02:10,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2845.35 | bwd_microstep: 5706.70 | bwd_inner_microstep: 5693.82 | bwd_allreduce_microstep: 12.82 | step_microstep: 18.91 [2025-04-26 04:02:10,175] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2845.35 | bwd: 5706.71 | bwd_inner: 5693.82 | bwd_allreduce: 12.84 | step: 18.91 20%|██ | 8308/41250 [20:04:35<79:13:12, 8.66s/it] {'loss': 0.0283, 'grad_norm': 0.6256017088890076, 'learning_rate': 3.699684975900237e-05, 'epoch': 2.01} 20%|██ | 8308/41250 [20:04:35<79:13:12, 8.66s/it][2025-04-26 04:02:18,824] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.20 | optimizer_gradients: 1.00 | optimizer_step: 1.06 [2025-04-26 04:02:18,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.67 | bwd_microstep: 5739.04 | bwd_inner_microstep: 5641.65 | bwd_allreduce_microstep: 97.34 | step_microstep: 18.59 [2025-04-26 04:02:18,825] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.68 | bwd: 5739.05 | bwd_inner: 5641.65 | bwd_allreduce: 97.36 | step: 18.60 20%|██ | 8309/41250 [20:04:44<79:11:07, 8.65s/it] {'loss': 0.2795, 'grad_norm': 3.5431010723114014, 'learning_rate': 3.699602208749221e-05, 'epoch': 2.01} 20%|██ | 8309/41250 [20:04:44<79:11:07, 8.65s/it][2025-04-26 04:02:27,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.20 | optimizer_step: 0.99 [2025-04-26 04:02:27,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.08 | bwd_microstep: 5761.77 | bwd_inner_microstep: 5652.52 | bwd_allreduce_microstep: 109.20 | step_microstep: 19.64 [2025-04-26 04:02:27,498] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.08 | bwd: 5761.79 | bwd_inner: 5652.52 | bwd_allreduce: 109.23 | step: 19.64 
20%|██ | 8310/41250 [20:04:52<79:14:23, 8.66s/it] {'loss': 0.1287, 'grad_norm': 3.353853225708008, 'learning_rate': 3.6995194311204924e-05, 'epoch': 2.01} 20%|██ | 8310/41250 [20:04:52<79:14:23, 8.66s/it][2025-04-26 04:02:36,179] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 1.02 | optimizer_step: 0.90 [2025-04-26 04:02:36,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2823.24 | bwd_microstep: 5776.21 | bwd_inner_microstep: 5657.71 | bwd_allreduce_microstep: 118.45 | step_microstep: 18.75 [2025-04-26 04:02:36,180] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2823.24 | bwd: 5776.23 | bwd_inner: 5657.71 | bwd_allreduce: 118.47 | step: 18.75 20%|██ | 8311/41250 [20:05:01<79:18:28, 8.67s/it] {'loss': 0.0164, 'grad_norm': 0.4206230938434601, 'learning_rate': 3.6994366430145604e-05, 'epoch': 2.01} 20%|██ | 8311/41250 [20:05:01<79:18:28, 8.67s/it][2025-04-26 04:02:44,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.09 | optimizer_step: 0.98 [2025-04-26 04:02:44,804] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2840.78 | bwd_microstep: 5697.23 | bwd_inner_microstep: 5683.84 | bwd_allreduce_microstep: 13.34 | step_microstep: 19.44 [2025-04-26 04:02:44,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2840.78 | bwd: 5697.25 | bwd_inner: 5683.84 | bwd_allreduce: 13.36 | step: 19.44 20%|██ | 8312/41250 [20:05:10<79:10:32, 8.65s/it] {'loss': 0.3421, 'grad_norm': 3.744706392288208, 'learning_rate': 3.6993538444319354e-05, 'epoch': 2.02} 20%|██ | 8312/41250 [20:05:10<79:10:32, 8.65s/it][2025-04-26 04:02:53,482] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.46 | optimizer_gradients: 0.96 | optimizer_step: 0.90 [2025-04-26 04:02:53,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.22 | bwd_microstep: 5771.78 | 
bwd_inner_microstep: 5653.04 | bwd_allreduce_microstep: 118.70 | step_microstep: 18.60 [2025-04-26 04:02:53,483] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.22 | bwd: 5771.79 | bwd_inner: 5653.04 | bwd_allreduce: 118.72 | step: 18.61 20%|██ | 8313/41250 [20:05:18<79:14:24, 8.66s/it] {'loss': 0.0286, 'grad_norm': 0.9030571579933167, 'learning_rate': 3.699271035373129e-05, 'epoch': 2.02} 20%|██ | 8313/41250 [20:05:18<79:14:24, 8.66s/it][2025-04-26 04:03:02,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.00 | optimizer_gradients: 0.99 | optimizer_step: 0.92 [2025-04-26 04:03:02,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2884.38 | bwd_microstep: 5788.09 | bwd_inner_microstep: 5775.34 | bwd_allreduce_microstep: 12.70 | step_microstep: 18.69 [2025-04-26 04:03:02,238] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2884.38 | bwd: 5788.10 | bwd_inner: 5775.34 | bwd_allreduce: 12.72 | step: 18.70 20%|██ | 8314/41250 [20:05:27<79:29:49, 8.69s/it] {'loss': 0.245, 'grad_norm': 1.8476160764694214, 'learning_rate': 3.69918821583865e-05, 'epoch': 2.02} 20%|██ | 8314/41250 [20:05:27<79:29:49, 8.69s/it][2025-04-26 04:03:10,839] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 04:03:10,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2821.74 | bwd_microstep: 5699.50 | bwd_inner_microstep: 5636.38 | bwd_allreduce_microstep: 63.07 | step_microstep: 18.71 [2025-04-26 04:03:10,840] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2821.74 | bwd: 5699.52 | bwd_inner: 5636.38 | bwd_allreduce: 63.09 | step: 18.71 20%|██ | 8315/41250 [20:05:36<79:15:15, 8.66s/it] {'loss': 0.0693, 'grad_norm': 2.2404050827026367, 'learning_rate': 3.69910538582901e-05, 'epoch': 2.02} 20%|██ | 8315/41250 [20:05:36<79:15:15, 8.66s/it][2025-04-26 04:03:19,528] [INFO] [logging.py:128:log_dist] 
[Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 04:03:19,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.78 | bwd_microstep: 5757.63 | bwd_inner_microstep: 5700.17 | bwd_allreduce_microstep: 57.41 | step_microstep: 18.68 [2025-04-26 04:03:19,528] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.79 | bwd: 5757.64 | bwd_inner: 5700.17 | bwd_allreduce: 57.43 | step: 18.68 20%|██ | 8316/41250 [20:05:44<79:19:16, 8.67s/it] {'loss': 0.1501, 'grad_norm': 1.1123743057250977, 'learning_rate': 3.69902254534472e-05, 'epoch': 2.02} 20%|██ | 8316/41250 [20:05:44<79:19:16, 8.67s/it][2025-04-26 04:03:28,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.50 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 04:03:28,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2842.28 | bwd_microstep: 5702.58 | bwd_inner_microstep: 5689.85 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.87 [2025-04-26 04:03:28,155] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2842.28 | bwd: 5702.60 | bwd_inner: 5689.85 | bwd_allreduce: 12.70 | step: 18.87 20%|██ | 8317/41250 [20:05:53<79:12:06, 8.66s/it] {'loss': 0.1928, 'grad_norm': 1.413202166557312, 'learning_rate': 3.69893969438629e-05, 'epoch': 2.02} 20%|██ | 8317/41250 [20:05:53<79:12:06, 8.66s/it][2025-04-26 04:03:36,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.29 | optimizer_gradients: 1.03 | optimizer_step: 1.01 [2025-04-26 04:03:36,808] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.83 | bwd_microstep: 5719.91 | bwd_inner_microstep: 5687.81 | bwd_allreduce_microstep: 32.05 | step_microstep: 19.32 [2025-04-26 04:03:36,809] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.83 | bwd: 5719.92 | bwd_inner: 5687.81 | bwd_allreduce: 32.07 | step: 19.32 20%|██ | 8318/41250 
[20:06:02<79:11:07, 8.66s/it] {'loss': 0.0111, 'grad_norm': 0.30280834436416626, 'learning_rate': 3.698856832954231e-05, 'epoch': 2.02} 20%|██ | 8318/41250 [20:06:02<79:11:07, 8.66s/it][2025-04-26 04:03:45,473] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.27 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 04:03:45,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2852.29 | bwd_microstep: 5731.23 | bwd_inner_microstep: 5718.23 | bwd_allreduce_microstep: 12.96 | step_microstep: 18.52 [2025-04-26 04:03:45,474] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2852.29 | bwd: 5731.24 | bwd_inner: 5718.23 | bwd_allreduce: 12.98 | step: 18.52 20%|██ | 8319/41250 [20:06:10<79:12:24, 8.66s/it] {'loss': 0.0874, 'grad_norm': 1.850829005241394, 'learning_rate': 3.698773961049053e-05, 'epoch': 2.02} 20%|██ | 8319/41250 [20:06:10<79:12:24, 8.66s/it][2025-04-26 04:03:54,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.17 | optimizer_gradients: 0.98 | optimizer_step: 0.99 [2025-04-26 04:03:54,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2879.45 | bwd_microstep: 5766.14 | bwd_inner_microstep: 5753.45 | bwd_allreduce_microstep: 12.65 | step_microstep: 18.57 [2025-04-26 04:03:54,203] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2879.45 | bwd: 5766.16 | bwd_inner: 5753.45 | bwd_allreduce: 12.66 | step: 18.57 20%|██ | 8320/41250 [20:06:19<79:23:49, 8.68s/it] {'loss': 0.1201, 'grad_norm': 14.866388320922852, 'learning_rate': 3.698691078671268e-05, 'epoch': 2.02} 20%|██ | 8320/41250 [20:06:19<79:23:49, 8.68s/it][2025-04-26 04:04:02,894] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.39 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 04:04:02,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.24 | bwd_microstep: 5760.78 | bwd_inner_microstep: 5689.70 | 
bwd_allreduce_microstep: 71.04 | step_microstep: 18.62 [2025-04-26 04:04:02,895] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.24 | bwd: 5760.80 | bwd_inner: 5689.70 | bwd_allreduce: 71.05 | step: 18.62 20%|██ | 8321/41250 [20:06:28<79:25:36, 8.68s/it] {'loss': 0.1847, 'grad_norm': 1.0680114030838013, 'learning_rate': 3.698608185821387e-05, 'epoch': 2.02} 20%|██ | 8321/41250 [20:06:28<79:25:36, 8.68s/it][2025-04-26 04:04:11,522] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.31 | optimizer_gradients: 0.97 | optimizer_step: 1.07 [2025-04-26 04:04:11,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2855.46 | bwd_microstep: 5691.06 | bwd_inner_microstep: 5678.33 | bwd_allreduce_microstep: 12.69 | step_microstep: 18.83 [2025-04-26 04:04:11,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2855.46 | bwd: 5691.07 | bwd_inner: 5678.33 | bwd_allreduce: 12.70 | step: 18.83 20%|██ | 8322/41250 [20:06:36<79:16:22, 8.67s/it] {'loss': 0.0912, 'grad_norm': 3.6967267990112305, 'learning_rate': 3.698525282499921e-05, 'epoch': 2.02} 20%|██ | 8322/41250 [20:06:36<79:16:22, 8.67s/it][2025-04-26 04:04:20,236] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 04:04:20,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.29 | bwd_microstep: 5802.31 | bwd_inner_microstep: 5663.36 | bwd_allreduce_microstep: 138.91 | step_microstep: 18.80 [2025-04-26 04:04:20,237] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.29 | bwd: 5802.33 | bwd_inner: 5663.36 | bwd_allreduce: 138.92 | step: 18.80 20%|██ | 8323/41250 [20:06:45<79:24:09, 8.68s/it] {'loss': 0.1747, 'grad_norm': 7.851690292358398, 'learning_rate': 3.69844236870738e-05, 'epoch': 2.02} 20%|██ | 8323/41250 [20:06:45<79:24:09, 8.68s/it][2025-04-26 04:04:28,879] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | 
optimizer_allgather: 5.32 | optimizer_gradients: 1.05 | optimizer_step: 1.25 [2025-04-26 04:04:28,880] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2833.81 | bwd_microstep: 5725.34 | bwd_inner_microstep: 5659.43 | bwd_allreduce_microstep: 65.86 | step_microstep: 19.86 [2025-04-26 04:04:28,881] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2833.81 | bwd: 5725.36 | bwd_inner: 5659.43 | bwd_allreduce: 65.88 | step: 19.86 20%|██ | 8324/41250 [20:06:54<79:17:48, 8.67s/it] {'loss': 0.148, 'grad_norm': 2.651956558227539, 'learning_rate': 3.698359444444276e-05, 'epoch': 2.02} 20%|██ | 8324/41250 [20:06:54<79:17:48, 8.67s/it][2025-04-26 04:04:37,523] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.25 | optimizer_gradients: 1.02 | optimizer_step: 0.91 [2025-04-26 04:04:37,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.96 | bwd_microstep: 5714.78 | bwd_inner_microstep: 5701.91 | bwd_allreduce_microstep: 12.82 | step_microstep: 19.01 [2025-04-26 04:04:37,524] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.96 | bwd: 5714.79 | bwd_inner: 5701.91 | bwd_allreduce: 12.83 | step: 19.01 20%|██ | 8325/41250 [20:07:02<79:13:15, 8.66s/it] {'loss': 0.2193, 'grad_norm': 1.4294992685317993, 'learning_rate': 3.698276509711121e-05, 'epoch': 2.02} 20%|██ | 8325/41250 [20:07:02<79:13:15, 8.66s/it][2025-04-26 04:04:46,164] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 04:04:46,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.62 | bwd_microstep: 5727.30 | bwd_inner_microstep: 5658.52 | bwd_allreduce_microstep: 68.73 | step_microstep: 19.01 [2025-04-26 04:04:46,165] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.62 | bwd: 5727.32 | bwd_inner: 5658.52 | bwd_allreduce: 68.75 | step: 19.00 20%|██ | 8326/41250 [20:07:11<79:10:09, 8.66s/it] 
{'loss': 0.0741, 'grad_norm': 0.9998537302017212, 'learning_rate': 3.698193564508424e-05, 'epoch': 2.02} 20%|██ | 8326/41250 [20:07:11<79:10:09, 8.66s/it][2025-04-26 04:04:54,977] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 1.02 | optimizer_step: 0.97 [2025-04-26 04:04:54,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.57 | bwd_microstep: 5878.98 | bwd_inner_microstep: 5692.33 | bwd_allreduce_microstep: 186.60 | step_microstep: 19.40 [2025-04-26 04:04:54,978] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.57 | bwd: 5879.00 | bwd_inner: 5692.33 | bwd_allreduce: 186.63 | step: 19.41 20%|██ | 8327/41250 [20:07:20<79:35:21, 8.70s/it] {'loss': 0.109, 'grad_norm': 2.0138089656829834, 'learning_rate': 3.6981106088366984e-05, 'epoch': 2.02} 20%|██ | 8327/41250 [20:07:20<79:35:21, 8.70s/it][2025-04-26 04:05:03,665] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.41 | optimizer_gradients: 1.03 | optimizer_step: 0.90 [2025-04-26 04:05:03,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.49 | bwd_microstep: 5770.55 | bwd_inner_microstep: 5659.81 | bwd_allreduce_microstep: 110.69 | step_microstep: 18.98 [2025-04-26 04:05:03,666] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.49 | bwd: 5770.56 | bwd_inner: 5659.81 | bwd_allreduce: 110.71 | step: 18.98 20%|██ | 8328/41250 [20:07:28<79:32:49, 8.70s/it] {'loss': 0.1871, 'grad_norm': 4.311672687530518, 'learning_rate': 3.6980276426964545e-05, 'epoch': 2.02} 20%|██ | 8328/41250 [20:07:28<79:32:49, 8.70s/it][2025-04-26 04:05:12,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.30 | optimizer_gradients: 1.25 | optimizer_step: 0.90 [2025-04-26 04:05:12,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.08 | bwd_microstep: 5779.41 | bwd_inner_microstep: 5698.68 | bwd_allreduce_microstep: 80.67 | 
step_microstep: 19.08 [2025-04-26 04:05:12,379] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.08 | bwd: 5779.42 | bwd_inner: 5698.68 | bwd_allreduce: 80.70 | step: 19.08 20%|██ | 8329/41250 [20:07:37<79:35:08, 8.70s/it] {'loss': 0.0702, 'grad_norm': 2.312218427658081, 'learning_rate': 3.697944666088204e-05, 'epoch': 2.02} 20%|██ | 8329/41250 [20:07:37<79:35:08, 8.70s/it][2025-04-26 04:05:21,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.05 | optimizer_gradients: 0.97 | optimizer_step: 1.05 [2025-04-26 04:05:21,088] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2856.52 | bwd_microstep: 5768.37 | bwd_inner_microstep: 5703.08 | bwd_allreduce_microstep: 65.25 | step_microstep: 18.52 [2025-04-26 04:05:21,089] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2856.52 | bwd: 5768.38 | bwd_inner: 5703.08 | bwd_allreduce: 65.26 | step: 18.52 20%|██ | 8330/41250 [20:07:46<79:35:46, 8.70s/it] {'loss': 0.0496, 'grad_norm': 1.0645071268081665, 'learning_rate': 3.6978616790124594e-05, 'epoch': 2.02} 20%|██ | 8330/41250 [20:07:46<79:35:46, 8.70s/it][2025-04-26 04:05:29,805] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.11 | optimizer_gradients: 1.16 | optimizer_step: 0.92 [2025-04-26 04:05:29,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2830.23 | bwd_microstep: 5802.40 | bwd_inner_microstep: 5670.19 | bwd_allreduce_microstep: 132.16 | step_microstep: 18.88 [2025-04-26 04:05:29,806] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2830.23 | bwd: 5802.41 | bwd_inner: 5670.19 | bwd_allreduce: 132.18 | step: 18.88 20%|██ | 8331/41250 [20:07:55<79:38:01, 8.71s/it] {'loss': 0.3554, 'grad_norm': 4.27353048324585, 'learning_rate': 3.6977786814697314e-05, 'epoch': 2.02} 20%|██ | 8331/41250 [20:07:55<79:38:01, 8.71s/it][2025-04-26 04:05:38,461] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.15 | 
optimizer_gradients: 1.01 | optimizer_step: 0.96 [2025-04-26 04:05:38,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2824.73 | bwd_microstep: 5744.99 | bwd_inner_microstep: 5652.73 | bwd_allreduce_microstep: 92.21 | step_microstep: 18.90 [2025-04-26 04:05:38,462] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2824.73 | bwd: 5745.01 | bwd_inner: 5652.73 | bwd_allreduce: 92.23 | step: 18.91 20%|██ | 8332/41250 [20:08:03<79:29:03, 8.69s/it] {'loss': 0.0411, 'grad_norm': 0.8492656350135803, 'learning_rate': 3.697695673460531e-05, 'epoch': 2.02} 20%|██ | 8332/41250 [20:08:03<79:29:03, 8.69s/it][2025-04-26 04:05:47,106] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.01 [2025-04-26 04:05:47,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.01 | bwd_microstep: 5731.89 | bwd_inner_microstep: 5657.84 | bwd_allreduce_microstep: 74.01 | step_microstep: 19.14 [2025-04-26 04:05:47,107] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.01 | bwd: 5731.90 | bwd_inner: 5657.84 | bwd_allreduce: 74.03 | step: 19.14 20%|██ | 8333/41250 [20:08:12<79:21:08, 8.68s/it] {'loss': 0.0424, 'grad_norm': 0.6605932116508484, 'learning_rate': 3.697612654985371e-05, 'epoch': 2.02} 20%|██ | 8333/41250 [20:08:12<79:21:08, 8.68s/it][2025-04-26 04:05:55,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.04 | optimizer_gradients: 1.01 | optimizer_step: 0.90 [2025-04-26 04:05:55,768] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2853.97 | bwd_microstep: 5720.66 | bwd_inner_microstep: 5707.96 | bwd_allreduce_microstep: 12.66 | step_microstep: 18.62 [2025-04-26 04:05:55,769] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2853.98 | bwd: 5720.68 | bwd_inner: 5707.96 | bwd_allreduce: 12.68 | step: 18.63 20%|██ | 8334/41250 [20:08:21<79:18:08, 8.67s/it] {'loss': 0.0767, 'grad_norm': 
3.2833404541015625, 'learning_rate': 3.6975296260447636e-05, 'epoch': 2.02} 20%|██ | 8334/41250 [20:08:21<79:18:08, 8.67s/it][2025-04-26 04:06:04,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.34 | optimizer_gradients: 0.95 | optimizer_step: 0.89 [2025-04-26 04:06:04,447] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2859.34 | bwd_microstep: 5735.49 | bwd_inner_microstep: 5722.77 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.46 [2025-04-26 04:06:04,448] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2859.34 | bwd: 5735.51 | bwd_inner: 5722.77 | bwd_allreduce: 12.70 | step: 18.46 20%|██ | 8335/41250 [20:08:29<79:18:48, 8.67s/it] {'loss': 0.3624, 'grad_norm': 2.280608892440796, 'learning_rate': 3.697446586639219e-05, 'epoch': 2.02} 20%|██ | 8335/41250 [20:08:29<79:18:48, 8.67s/it][2025-04-26 04:06:13,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.36 | optimizer_gradients: 0.98 | optimizer_step: 0.90 [2025-04-26 04:06:13,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2849.53 | bwd_microstep: 5788.55 | bwd_inner_microstep: 5695.67 | bwd_allreduce_microstep: 92.83 | step_microstep: 18.64 [2025-04-26 04:06:13,169] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2849.53 | bwd: 5788.56 | bwd_inner: 5695.67 | bwd_allreduce: 92.85 | step: 18.65 20%|██ | 8336/41250 [20:08:38<79:26:35, 8.69s/it] {'loss': 0.1231, 'grad_norm': 2.110623598098755, 'learning_rate': 3.697363536769251e-05, 'epoch': 2.02} 20%|██ | 8336/41250 [20:08:38<79:26:35, 8.69s/it][2025-04-26 04:06:21,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.28 | optimizer_gradients: 1.01 | optimizer_step: 0.99 [2025-04-26 04:06:21,871] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2857.97 | bwd_microstep: 5760.24 | bwd_inner_microstep: 5705.06 | bwd_allreduce_microstep: 55.14 | step_microstep: 19.05 [2025-04-26 
04:06:21,872] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2857.97 | bwd: 5760.25 | bwd_inner: 5705.06 | bwd_allreduce: 55.15 | step: 19.05 20%|██ | 8337/41250 [20:08:47<79:28:35, 8.69s/it] {'loss': 0.1009, 'grad_norm': 1.7899471521377563, 'learning_rate': 3.697280476435369e-05, 'epoch': 2.02} 20%|██ | 8337/41250 [20:08:47<79:28:35, 8.69s/it][2025-04-26 04:06:30,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.06 | optimizer_gradients: 1.09 | optimizer_step: 0.97 [2025-04-26 04:06:30,557] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2828.99 | bwd_microstep: 5772.77 | bwd_inner_microstep: 5667.28 | bwd_allreduce_microstep: 105.44 | step_microstep: 19.09 [2025-04-26 04:06:30,558] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2828.99 | bwd: 5772.79 | bwd_inner: 5667.28 | bwd_allreduce: 105.47 | step: 19.09 20%|██ | 8338/41250 [20:08:55<79:27:13, 8.69s/it] {'loss': 0.0854, 'grad_norm': 0.8198347687721252, 'learning_rate': 3.697197405638088e-05, 'epoch': 2.02} 20%|██ | 8338/41250 [20:08:55<79:27:13, 8.69s/it][2025-04-26 04:06:39,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.19 | optimizer_gradients: 1.00 | optimizer_step: 1.07 [2025-04-26 04:06:39,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2846.14 | bwd_microstep: 5788.24 | bwd_inner_microstep: 5691.94 | bwd_allreduce_microstep: 96.25 | step_microstep: 18.98 [2025-04-26 04:06:39,277] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2846.14 | bwd: 5788.26 | bwd_inner: 5691.94 | bwd_allreduce: 96.27 | step: 18.98 20%|██ | 8339/41250 [20:09:04<79:31:42, 8.70s/it] {'loss': 0.0542, 'grad_norm': 1.593015432357788, 'learning_rate': 3.6971143243779186e-05, 'epoch': 2.02} 20%|██ | 8339/41250 [20:09:04<79:31:42, 8.70s/it][2025-04-26 04:06:47,989] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.24 | optimizer_gradients: 0.99 | optimizer_step: 
0.90 [2025-04-26 04:06:47,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2844.30 | bwd_microstep: 5785.16 | bwd_inner_microstep: 5697.53 | bwd_allreduce_microstep: 87.58 | step_microstep: 18.84 [2025-04-26 04:06:47,990] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2844.30 | bwd: 5785.17 | bwd_inner: 5697.53 | bwd_allreduce: 87.60 | step: 18.84 20%|██ | 8340/41250 [20:09:13<79:33:49, 8.70s/it] {'loss': 0.01, 'grad_norm': 0.16331017017364502, 'learning_rate': 3.697031232655373e-05, 'epoch': 2.02} 20%|██ | 8340/41250 [20:09:13<79:33:49, 8.70s/it][2025-04-26 04:06:56,646] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.10 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 04:06:56,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.21 | bwd_microstep: 5718.72 | bwd_inner_microstep: 5705.96 | bwd_allreduce_microstep: 12.71 | step_microstep: 18.60 [2025-04-26 04:06:56,647] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.21 | bwd: 5718.73 | bwd_inner: 5705.96 | bwd_allreduce: 12.73 | step: 18.60 20%|██ | 8341/41250 [20:09:21<79:26:05, 8.69s/it] {'loss': 0.0516, 'grad_norm': 1.0531275272369385, 'learning_rate': 3.696948130470964e-05, 'epoch': 2.02} 20%|██ | 8341/41250 [20:09:21<79:26:05, 8.69s/it][2025-04-26 04:07:05,272] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.12 | optimizer_gradients: 1.17 | optimizer_step: 1.05 [2025-04-26 04:07:05,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2831.78 | bwd_microstep: 5713.64 | bwd_inner_microstep: 5661.18 | bwd_allreduce_microstep: 52.41 | step_microstep: 19.61 [2025-04-26 04:07:05,273] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2831.78 | bwd: 5713.65 | bwd_inner: 5661.18 | bwd_allreduce: 52.43 | step: 19.61 20%|██ | 8342/41250 [20:09:30<79:15:36, 8.67s/it] {'loss': 0.0334, 'grad_norm': 0.6125720143318176, 'learning_rate': 
3.696865017825204e-05, 'epoch': 2.02} 20%|██ | 8342/41250 [20:09:30<79:15:36, 8.67s/it][2025-04-26 04:07:13,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.26 | optimizer_gradients: 0.97 | optimizer_step: 0.90 [2025-04-26 04:07:13,922] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.87 | bwd_microstep: 5715.47 | bwd_inner_microstep: 5702.75 | bwd_allreduce_microstep: 12.68 | step_microstep: 18.44 [2025-04-26 04:07:13,923] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.87 | bwd: 5715.48 | bwd_inner: 5702.75 | bwd_allreduce: 12.69 | step: 18.44 20%|██ | 8343/41250 [20:09:39<79:11:45, 8.66s/it] {'loss': 0.0978, 'grad_norm': 1.9450950622558594, 'learning_rate': 3.696781894718604e-05, 'epoch': 2.02} 20%|██ | 8343/41250 [20:09:39<79:11:45, 8.66s/it][2025-04-26 04:07:22,547] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.90 [2025-04-26 04:07:22,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.40 | bwd_microstep: 5695.86 | bwd_inner_microstep: 5683.07 | bwd_allreduce_microstep: 12.75 | step_microstep: 18.89 [2025-04-26 04:07:22,548] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.40 | bwd: 5695.88 | bwd_inner: 5683.07 | bwd_allreduce: 12.77 | step: 18.89 20%|██ | 8344/41250 [20:09:47<79:05:19, 8.65s/it] {'loss': 0.0381, 'grad_norm': 0.7217829823493958, 'learning_rate': 3.6966987611516776e-05, 'epoch': 2.02} 20%|██ | 8344/41250 [20:09:47<79:05:19, 8.65s/it][2025-04-26 04:07:31,275] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.14 | optimizer_gradients: 0.99 | optimizer_step: 1.00 [2025-04-26 04:07:31,276] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2854.34 | bwd_microstep: 5789.89 | bwd_inner_microstep: 5711.27 | bwd_allreduce_microstep: 78.57 | step_microstep: 18.50 [2025-04-26 04:07:31,276] [INFO] 
[logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2854.34 | bwd: 5789.90 | bwd_inner: 5711.27 | bwd_allreduce: 78.59 | step: 18.50 20%|██ | 8345/41250 [20:09:56<79:17:33, 8.68s/it] {'loss': 0.0689, 'grad_norm': 1.4117976427078247, 'learning_rate': 3.6966156171249365e-05, 'epoch': 2.02} 20%|██ | 8345/41250 [20:09:56<79:17:33, 8.68s/it][2025-04-26 04:07:39,972] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.52 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 04:07:39,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2847.95 | bwd_microstep: 5762.62 | bwd_inner_microstep: 5692.81 | bwd_allreduce_microstep: 69.77 | step_microstep: 19.06 [2025-04-26 04:07:39,973] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2847.95 | bwd: 5762.63 | bwd_inner: 5692.81 | bwd_allreduce: 69.78 | step: 19.06 20%|██ | 8346/41250 [20:10:05<79:21:01, 8.68s/it] {'loss': 0.0305, 'grad_norm': 0.42253735661506653, 'learning_rate': 3.696532462638895e-05, 'epoch': 2.02} 20%|██ | 8346/41250 [20:10:05<79:21:01, 8.68s/it][2025-04-26 04:07:48,684] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.33 | optimizer_gradients: 1.01 | optimizer_step: 1.03 [2025-04-26 04:07:48,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2829.14 | bwd_microstep: 5798.91 | bwd_inner_microstep: 5642.84 | bwd_allreduce_microstep: 156.02 | step_microstep: 19.31 [2025-04-26 04:07:48,685] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2829.14 | bwd: 5798.92 | bwd_inner: 5642.84 | bwd_allreduce: 156.04 | step: 19.31 20%|██ | 8347/41250 [20:10:14<79:26:09, 8.69s/it] {'loss': 0.3426, 'grad_norm': 2.860227108001709, 'learning_rate': 3.6964492976940635e-05, 'epoch': 2.02} 20%|██ | 8347/41250 [20:10:14<79:26:09, 8.69s/it][2025-04-26 04:07:57,380] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | optimizer_allgather: 5.42 | optimizer_gradients: 0.99 | optimizer_step: 0.89 [2025-04-26 
04:07:57,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd_microstep: 2851.39 | bwd_microstep: 5754.35 | bwd_inner_microstep: 5689.27 | bwd_allreduce_microstep: 65.04 | step_microstep: 18.80 [2025-04-26 04:07:57,381] [INFO] [logging.py:128:log_dist] [Rank 0] time (ms) | fwd: 2851.39 | bwd: 5754.36 | bwd_inner: 5689.26 | bwd_allreduce: 65.06 | step: 18.81 20%|██ | 8348/41250 [20:10:22<79:26:35, 8.69s/it] {'loss': 0.1511, 'grad_norm': 1.5930591821670532, 'learning_rate': 3.696366122290957e-05, 'epoch': 2.02} 20%|██ | 8348/41250 [20:10:22<79:26:35, 8.69s/it]W0426 04:08:06.364300 784474 site-packages/torch/distributed/run.py:793] W0426 04:08:06.364300 784474 site-packages/torch/distributed/run.py:793] ***************************************** W0426 04:08:06.364300 784474 site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0426 04:08:06.364300 784474 site-packages/torch/distributed/run.py:793] ***************************************** [2025-04-26 04:08:08,311] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 04:08:08,312] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-04-26 04:08:08,324] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) petrel_client is not installed. If you read data locally instead of from ceph, ignore it. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. petrel_client is not installed. If you read data locally instead of from ceph, ignore it. Replace train sampler!! petrel_client is not installed. Using PIL to load images. [2025-04-26 04:08:11,586] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-26 04:08:11,586] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl Replace train sampler!!Replace train sampler!! petrel_client is not installed. 
Using PIL to load images.petrel_client is not installed. Using PIL to load images. 04/26/2025 04:08:11 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 04/26/2025 04:08:11 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, 
include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr26_04-08-11_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) 04/26/2025 04:08:11 - INFO - __main__ - Loading Tokenizer: /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file 
tokenizer.model [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2032] 2025-04-26 04:08:11,770 >> loading file chat_template.jinja [2025-04-26 04:08:11,823] [INFO] [comm.py:652:init_distributed] cdb=None [2025-04-26 04:08:11,824] [INFO] [comm.py:652:init_distributed] cdb=None 04/26/2025 04:08:11 - WARNING - __main__ - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: False 04/26/2025 04:08:11 - WARNING - __main__ - Process rank: 2, device: cuda:2, n_gpu: 1distributed training: True, 16-bits training: False [INFO|tokenization_utils_base.py:2304] 2025-04-26 04:08:12,243 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 04/26/2025 04:08:12 - INFO - __main__ - Loading InternVLChatModel... 
[INFO|configuration_utils.py:694] 2025-04-26 04:08:12,384 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/config.json [INFO|configuration_utils.py:768] 2025-04-26 04:08:12,386 >> Model config InternVLChatConfig { "_commit_hash": null, "_name_or_path": "/mnt/petrelfs/wangweiyun/workspace_wwy/open_source/InternVL/internvl_chat/work_dirs/internvl_chat_v3_0/InternVL3_0-9B-MPO-try0-2", "architectures": [ "InternVLChatModel" ], "auto_map": { "AutoConfig": "configuration_internvl_chat.InternVLChatConfig", "AutoModel": "modeling_internvl_chat.InternVLChatModel", "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel" }, "downsample_ratio": 0.5, "dynamic_image_size": true, "force_image_size": 448, "hidden_size": 4096, "image_fold": null, "llm_config": { "_attn_implementation_autoset": true, "_name_or_path": "/mnt/petrelfs/share_data/wangweiyun/share_ckpt/hf_home/internlm2-chat-7b", "add_cross_attention": false, "architectures": [ "InternLM2ForCausalLM" ], "attn_implementation": "flash_attention_2", "auto_map": { "AutoConfig": "configuration_internlm2.InternLM2Config", "AutoModel": "modeling_internlm2.InternLM2ForCausalLM", "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bias": false, "bos_token_id": 1, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": 2, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "silu", "hidden_size": 4096, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "initializer_range": 0.02, "intermediate_size": 10240, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "length_penalty": 1.0, "max_length": 20, "max_position_embeddings": 32768, 
"min_length": 0, "model_type": "internlm2", "moe_config": null, "no_repeat_ngram_size": 0, "num_attention_heads": 32, "num_beam_groups": 1, "num_beams": 1, "num_hidden_layers": 48, "num_key_value_heads": 2, "num_return_sequences": 1, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": 2, "prefix": null, "pretraining_tp": 1, "problem_type": null, "pruned_heads": {}, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 2.0, "type": "dynamic" }, "rope_theta": 50000000, "sep_token_id": null, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": false, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_cache": false, "vocab_size": 128142 }, "max_dynamic_patch": 12, "min_dynamic_patch": 1, "model_type": "internvl_chat", "pad2square": false, "ps_version": "v2", "select_layer": -1, "system_message": null, "template": "internvl2_5", "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": null, "use_backbone_lora": 0, "use_img_start_end_token": true, "use_llm_lora": 0, "use_thumbnail": true, "vision_config": { "_attn_implementation_autoset": true, "_name_or_path": "pretrained/intern_vit_6b_448px_v1_2/", "add_cross_attention": false, "architectures": [ "InternVisionModel" ], "attention_dropout": 0.0, "auto_map": { "AutoConfig": "configuration_intern_vit.InternVisionConfig", "AutoModel": "modeling_intern_vit.InternVisionModel" }, "bad_words_ids": null, "begin_suppress_tokens": null, "bos_token_id": null, "capacity_factor": 1.2, "chunk_size_feed_forward": 0, "cross_attention_hidden_size": null, "decoder_start_token_id": null, "diversity_penalty": 0.0, "do_sample": false, 
"drop_path_rate": 0.1, "dropout": 0.0, "early_stopping": false, "encoder_no_repeat_ngram_size": 0, "eos_token_id": null, "eval_capacity_factor": 1.4, "exponential_decay_length_penalty": null, "finetuning_task": null, "forced_bos_token_id": null, "forced_eos_token_id": null, "hidden_act": "gelu", "hidden_size": 1024, "id2label": { "0": "LABEL_0", "1": "LABEL_1" }, "image_size": 448, "initializer_factor": 0.1, "initializer_range": 1e-10, "intermediate_size": 4096, "is_decoder": false, "is_encoder_decoder": false, "label2id": { "LABEL_0": 0, "LABEL_1": 1 }, "laux_allreduce": "all_nodes", "layer_norm_eps": 1e-06, "length_penalty": 1.0, "max_length": 20, "min_length": 0, "model_type": "intern_vit_6b", "moe_coeff_ratio": 0.5, "moe_intermediate_size": 768, "moe_output_scale": 4.0, "no_repeat_ngram_size": 0, "noisy_gate_policy": "RSample_before", "norm_type": "layer_norm", "num_attention_heads": 16, "num_beam_groups": 1, "num_beams": 1, "num_channels": 3, "num_experts": 8, "num_hidden_layers": 24, "num_return_sequences": 1, "num_routed_experts": 4, "num_shared_experts": 4, "output_attentions": false, "output_hidden_states": false, "output_scores": false, "pad_token_id": null, "patch_size": 14, "prefix": null, "problem_type": null, "pruned_heads": {}, "qk_normalization": false, "qkv_bias": true, "remove_invalid_values": false, "repetition_penalty": 1.0, "return_dict": true, "return_dict_in_generate": false, "sep_token_id": null, "shared_expert_intermediate_size": 3072, "suppress_tokens": null, "task_specific_params": null, "temperature": 1.0, "tf_legacy_loss": false, "tie_encoder_decoder": false, "tie_word_embeddings": true, "tokenizer_class": null, "top_k": 50, "top_p": 1.0, "torch_dtype": "bfloat16", "torchscript": false, "transformers_version": "4.48.3", "typical_p": 1.0, "use_bfloat16": false, "use_flash_attn": true, "use_moe": false, "use_residual": true, "use_rts": false, "use_weighted_residual": false } } 04/26/2025 04:08:12 - INFO - __main__ - Using 
flash_attention_2 for InternLM [INFO|modeling_utils.py:3901] 2025-04-26 04:08:12,387 >> loading weights file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/model.safetensors.index.json [INFO|modeling_utils.py:1582] 2025-04-26 04:08:12,388 >> Instantiating InternVLChatModel model under default dtype torch.bfloat16. [INFO|configuration_utils.py:1140] 2025-04-26 04:08:12,390 >> Generate config GenerationConfig {} this model [WARNING|logging.py:328] 2025-04-26 04:08:12,457 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it. [INFO|configuration_utils.py:1140] 2025-04-26 04:08:12,457 >> Generate config GenerationConfig { "bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 2, "use_cache": false } this model this model [WARNING|logging.py:328] 2025-04-26 04:08:12,588 >> InternLM2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. 
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). - If you are not the owner of the model architecture class, please contact the model code owner to update it.
motion_mlp.weight1
Parameter containing:
tensor([[-3.9497e-35, 9.9609e-01, -2.1606e-02, ..., 9.8438e-01, -4.3195e-16, 8.5547e-01],
        [-8.8500e+08, 9.9219e-01, 4.3293e+12, ..., 9.9219e-01, -6.0423e-25, 8.7891e-01],
        [-2.0900e+02, 9.8828e-01, -1.3916e-02, ..., 9.9609e-01, 9.5695e-34, 9.0234e-01],
        ...,
        [ 6.8247e-06, 3.7575e-04, 8.9720e+14, ..., -8.5547e-01, 5.7894e+35, -8.9844e-01],
        [-1.3990e-30, -3.5156e-02, 2.8187e+22, ..., -8.8672e-01, -2.0146e+36, -8.7500e-01],
        [ 4.3335e-03, -7.0801e-02, 1.1744e+10, ..., -9.1406e-01, -1.3209e+37, -8.5156e-01]], requires_grad=True)
motion_mlp.weight2
Parameter containing:
tensor([[0.0038, 0.0076, 0.0004, ..., 0.0018, 0.0058, 0.0042],
        [0.0016, 0.0024, 0.0002, ..., 0.0030, 0.0067, 0.0039],
        [0.0049, 0.0073, 0.0011, ..., 0.0092, 0.0042, 0.0098],
        ...,
        [0.0022, 0.0044, 0.0077, ..., 0.0032, 0.0088, 0.0034],
        [0.0026, 0.0081, 0.0054, ..., 0.0020, 0.0035, 0.0063],
        [0.0018, 0.0065, 0.0002, ..., 0.0020, 0.0045, 0.0099]], requires_grad=True)
motion_mlp.bias
Parameter containing:
tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True)
motion_mlp.weight1
Parameter containing:
tensor([[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]], requires_grad=True)
motion_mlp.weight2
Parameter containing:
tensor([[0.0052, 0.0048, 0.0044, ..., 0.0095, 0.0072, 0.0018],
        [0.0027, 0.0013, 0.0090, ..., 0.0052, 0.0060, 0.0030],
        [0.0056, 0.0089, 0.0038, ..., 0.0074, 0.0054, 0.0017],
        ...,
        [0.0082, 0.0093, 0.0095, ..., 0.0005, 0.0048, 0.0056],
        [0.0064, 0.0098, 0.0072, ..., 0.0095, 0.0014, 0.0092],
        [0.0061, 0.0058, 0.0096, ..., 0.0016, 0.0072, 0.0038]], requires_grad=True)
motion_mlp.bias
Parameter containing:
tensor([0., 0., 0., ..., 0., 0., 0.], requires_grad=True)
None False
Setting backbone: fragments_backbone
Setting backbone: fragments_backbone
Setting backbone: fragments_backbone
Loading checkpoint shards: 0%| | 0/4 [00:00> All model checkpoint weights were used when initializing InternVLChatModel.
[WARNING|modeling_utils.py:4890] 2025-04-26 04:08:18,598 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 
'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
[INFO|configuration_utils.py:1093] 2025-04-26 04:08:18,609 >> loading configuration file /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B/generation_config.json [INFO|configuration_utils.py:1140] 2025-04-26 04:08:18,609 >> Generate config GenerationConfig {} 04/26/2025 04:08:18 - INFO - __main__ - Finished 04/26/2025 04:08:18 - INFO - __main__ - model.config.force_image_size: 448 04/26/2025 04:08:18 - INFO - __main__ - data_args.force_image_size: 448 04/26/2025 04:08:18 - INFO - __main__ - model.config.vision_config.image_size: 448 04/26/2025 04:08:18 - INFO - __main__ - [Dataset] num_image_token: 256 04/26/2025 04:08:18 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/26/2025 04:08:18 - INFO - __main__ - [Dataset] use_thumbnail: True 04/26/2025 04:08:18 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/26/2025 04:08:18 - INFO - __main__ - Formatting inputs...Skip in lazy mode Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.72it/s] Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.43it/s] [WARNING|modeling_utils.py:4890] 2025-04-26 04:08:18,634 >> Some weights of InternVLChatModel were not initialized from the model checkpoint at /home/wangjiarui/InternVL/internvl_chat/InternVL3-9B and are newly initialized: ['evaluator.fragments_backbone.layers.0.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.bias', 
'evaluator.fragments_backbone.layers.0.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.0.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.0.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.0.downsample.norm.bias', 'evaluator.fragments_backbone.layers.0.downsample.norm.weight', 'evaluator.fragments_backbone.layers.0.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_bias_table', 
'evaluator.fragments_backbone.layers.1.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.1.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.1.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.1.downsample.norm.bias', 'evaluator.fragments_backbone.layers.1.downsample.norm.weight', 'evaluator.fragments_backbone.layers.1.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.bias', 
'evaluator.fragments_backbone.layers.2.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.1.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.fragment_position_bias_table', 
'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.2.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.2.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.3.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.3.norm2.weight', 
'evaluator.fragments_backbone.layers.2.blocks.4.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.4.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.bias', 'evaluator.fragments_backbone.layers.2.blocks.4.norm2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.fragment_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.proj.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.qkv.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.2.blocks.5.attn.relative_position_index', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.bias', 'evaluator.fragments_backbone.layers.2.blocks.5.norm1.weight', 'evaluator.fragments_backbone.layers.2.blocks.5.norm2.bias', 
'evaluator.fragments_backbone.layers.2.blocks.5.norm2.weight', 'evaluator.fragments_backbone.layers.2.downsample.norm.bias', 'evaluator.fragments_backbone.layers.2.downsample.norm.weight', 'evaluator.fragments_backbone.layers.2.downsample.reduction.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.0.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm1.weight', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.0.norm2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.proj.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.qkv.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_bias_table', 'evaluator.fragments_backbone.layers.3.blocks.1.attn.relative_position_index', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc1.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.mlp.fc2.weight', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm1.weight', 
'evaluator.fragments_backbone.layers.3.blocks.1.norm2.bias', 'evaluator.fragments_backbone.layers.3.blocks.1.norm2.weight', 'evaluator.fragments_backbone.norm.bias', 'evaluator.fragments_backbone.norm.weight', 'evaluator.fragments_backbone.patch_embed.norm.bias', 'evaluator.fragments_backbone.patch_embed.norm.weight', 'evaluator.fragments_backbone.patch_embed.proj.bias', 'evaluator.fragments_backbone.patch_embed.proj.weight', 'fast_mlp.0.bias', 'fast_mlp.0.weight', 'fast_mlp.1.bias', 'fast_mlp.1.weight', 'fast_mlp.3.bias', 'fast_mlp.3.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.66it/s] Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 4.34it/s] 
04/26/2025 04:08:19 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 49500 04/26/2025 04:08:19 - INFO - __main__ - [Dataset] num_image_token: 256 04/26/2025 04:08:19 - INFO - __main__ - [Dataset] dynamic_image_size: True 04/26/2025 04:08:19 - INFO - __main__ - [Dataset] use_thumbnail: True 04/26/2025 04:08:19 - INFO - __main__ - [Dataset] min_dynamic_patch: 1, max_dynamic_patch: 6 04/26/2025 04:08:19 - INFO - __main__ - Formatting inputs...Skip in lazy mode 04/26/2025 04:08:19 - INFO - __main__ - Add dataset: sharegpt4v_instruct_gpt4-vision_cap100k with length: 9000 eval_dataset eval_dataset eval_dataset trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 6,291,456 || all params: 310,303,744 || trainable%: 2.0275 trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, 
ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=2, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr26_04-08-11_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, 
prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, 
evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=1, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr26_04-08-11_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, 
save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) trainable params: 46,006,272 || all params: 8,847,216,640 || trainable%: 0.5200 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.qkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.attn.proj.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.0.mlp.fc2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.qkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.attn.proj.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - 
vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.1.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.2.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.3.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.4.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.5.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.6.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.7.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.8.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.9.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.10.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.11.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.12.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.13.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.14.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.15.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.16.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.17.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.18.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.19.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.20.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.21.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.22.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.qkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.attn.proj.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - vision_model.base_model.model.encoder.layers.23.mlp.fc2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.0.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.1.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.2.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.3.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.4.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.5.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.6.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.7.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.8.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.9.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.10.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.11.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.12.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.13.feed_forward.w2.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wqkv.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.attention.wo.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w1.lora_B.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_A.default.weight
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w3.lora_B.default.weight
04/26/2025 04:08:20 - INFO
- __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.14.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.15.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO 
- __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.16.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.17.feed_forward.w2.lora_B.default.weight 04/26/2025 
04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.18.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w1.lora_B.default.weight 04/26/2025 
04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.19.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.20.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wqkv.lora_B.default.weight 
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.21.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - 
language_model.base_model.model.model.layers.22.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.22.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.23.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - 
__main__ - language_model.base_model.model.model.layers.24.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.24.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - 
INFO - __main__ - language_model.base_model.model.model.layers.25.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.26.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - 
INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.27.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.28.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_A.default.weight 04/26/2025 
04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.29.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_A.default.weight 
04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.30.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.31.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - 
language_model.base_model.model.model.layers.32.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.32.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ 
- language_model.base_model.model.model.layers.33.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.33.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.34.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ 
- language_model.base_model.model.model.layers.35.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.35.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.36.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - 
__main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.37.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - 
__main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.38.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.39.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - 
INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.40.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 
- INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.41.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.42.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 
- INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.43.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.44.feed_forward.w2.lora_B.default.weight 04/26/2025 
04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.45.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w1.lora_B.default.weight 04/26/2025 
04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.46.feed_forward.w2.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wqkv.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.attention.wo.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w1.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w3.lora_B.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_A.default.weight 04/26/2025 04:08:20 - INFO - __main__ - language_model.base_model.model.model.layers.47.feed_forward.w2.lora_B.default.weight training_args TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': 
False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, average_tokens_across_devices=False, batch_eval_metrics=False, bf16=True, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=4, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=zero_stage1_config.json, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=4125, eval_strategy=steps, eval_use_gather_object=False, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=1, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=None, group_by_length=True, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=None, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_for_metrics=[], include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=4e-05, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/runs/Apr26_04-08-11_amax, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=1.0, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=cosine, 
max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=4000000, save_strategy=steps, save_total_limit=1, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_liger_kernel=False, use_mps_device=False, warmup_ratio=0.03, warmup_steps=0, weight_decay=0.01, ) [INFO|trainer.py:741] 2025-04-26 04:08:20,321 >> Using auto half precision backend [WARNING|trainer.py:803] 2025-04-26 04:08:20,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 04:08:20,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 04:08:20,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 04:08:20,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 04:08:20,522 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 04:08:20,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [2025-04-26 04:08:20,557] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed info: version=0.16.3, git-hash=unknown, git-branch=unknown [2025-04-26 04:08:20,557] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 3 Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... [2025-04-26 04:08:26,890] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False Using /home/wangjiarui/.cache/torch_extensions/py39_cu121 as PyTorch extensions root... Detected CUDA files, patching ldflags Emitting ninja build file /home/wangjiarui/.cache/torch_extensions/py39_cu121/fused_adam/build.ninja... Building extension module fused_adam... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. Loading extension module fused_adam... Time to load fused_adam op: 0.6663095951080322 seconds Loading extension module fused_adam... Time to load fused_adam op: 0.7029051780700684 seconds Loading extension module fused_adam... 
Time to load fused_adam op: 0.7036454677581787 seconds [2025-04-26 04:08:27,950] [INFO] [logging.py:128:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2025-04-26 04:08:27,950] [INFO] [logging.py:128:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2025-04-26 04:08:28,037] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2025-04-26 04:08:28,037] [INFO] [utils.py:59:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2025-04-26 04:08:28,037] [INFO] [logging.py:128:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 1 optimizer [2025-04-26 04:08:28,037] [INFO] [stage_1_and_2.py:149:__init__] Reduce bucket size 1000000000 [2025-04-26 04:08:28,037] [INFO] [stage_1_and_2.py:150:__init__] Allgather bucket size 1000000000 [2025-04-26 04:08:28,037] [INFO] [stage_1_and_2.py:151:__init__] CPU Offload: False [2025-04-26 04:08:28,038] [INFO] [stage_1_and_2.py:152:__init__] Round robin gradient partitioning: False [2025-04-26 04:08:28,448] [INFO] [utils.py:781:see_memory_usage] Before initializing optimizer states [2025-04-26 04:08:28,449] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.08 GB CA 18.33 GB Max_CA 18 GB [2025-04-26 04:08:28,449] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 25.93 GB, percent = 10.3% [2025-04-26 04:08:28,622] [INFO] [utils.py:781:see_memory_usage] After initializing optimizer states [2025-04-26 04:08:28,623] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 18.11 GB CA 18.39 GB Max_CA 18 GB [2025-04-26 04:08:28,623] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 25.94 GB, percent = 10.3% [2025-04-26 04:08:28,623] [INFO] [stage_1_and_2.py:545:__init__] optimizer state initialized [2025-04-26 04:08:28,796] [INFO] [utils.py:781:see_memory_usage] After initializing ZeRO optimizer [2025-04-26 04:08:28,797] [INFO] [utils.py:782:see_memory_usage] MA 18.05 GB Max_MA 
18.05 GB CA 18.39 GB Max_CA 18 GB [2025-04-26 04:08:28,797] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 26.03 GB, percent = 10.3% [2025-04-26 04:08:28,803] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer [2025-04-26 04:08:28,803] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed using client callable to create LR scheduler [2025-04-26 04:08:28,804] [INFO] [logging.py:128:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2025-04-26 04:08:28,804] [INFO] [logging.py:128:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0], mom=[[0.9, 0.999]] [2025-04-26 04:08:28,814] [INFO] [config.py:999:print] DeepSpeedEngine configuration: [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True, 'use_gds': False} [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] amp_enabled .................. False [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] amp_params ................... False [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] autotuning_config ............ 
{ "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] bfloat16_enabled ............. True [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] bfloat16_immediate_grad_update False [2025-04-26 04:08:28,814] [INFO] [config.py:1003:print] checkpoint_parallel_write_pipeline False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] checkpoint_tag_validation_enabled True [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] checkpoint_tag_validation_fail False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] comms_config ................. [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] communication_data_type ...... None [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] curriculum_enabled_legacy .... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] curriculum_params_legacy ..... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] data_efficiency_enabled ...... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] dataloader_drop_last ......... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] disable_allgather ............ False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] dump_state ................... 
False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] dynamic_loss_scale_args ...... None [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_enabled ........... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_gas_boundary_resolution 1 [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_layer_name ........ bert.encoder.layer [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_layer_num ......... 0 [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_max_iter .......... 100 [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_stability ......... 1e-06 [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_tol ............... 0.01 [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] eigenvalue_verbose ........... False [2025-04-26 04:08:28,815] [INFO] [config.py:1003:print] elasticity_enabled ........... False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] flops_profiler_config ........ { "enabled": false, "recompute_fwd_factor": 0.0, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] fp16_auto_cast ............... None [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] fp16_enabled ................. False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] fp16_master_weights_and_gradients False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] global_rank .................. 0 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] grad_accum_dtype ............. None [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] gradient_accumulation_steps .. 1 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] gradient_clipping ............ 1.0 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] gradient_predivide_factor .... 1.0 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] graph_harvesting ............. 
False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] initial_dynamic_scale ........ 1 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] load_universal_checkpoint .... False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] loss_scale ................... 1.0 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] memory_breakdown ............. False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] mics_hierarchial_params_gather False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] mics_shard_size .............. -1 [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') comet=CometConfig(enabled=False, samples_log_interval=100, project=None, workspace=None, api_key=None, experiment_name=None, experiment_key=None, online=None, mode=None) wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] optimizer_legacy_fusion ...... False [2025-04-26 04:08:28,816] [INFO] [config.py:1003:print] optimizer_name ............... adamw [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] optimizer_params ............. {'lr': 4e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.01} [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] pipeline ..................... 
{'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0, 'pipe_partitioned': True, 'grad_partitioned': True} [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] pld_enabled .................. False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] pld_params ................... False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] prescale_gradients ........... False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] scheduler_name ............... None [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] scheduler_params ............. None [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] seq_parallel_communication_data_type torch.float32 [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] sparse_attention ............. None [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] sparse_gradients_enabled ..... False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] steps_per_print .............. inf [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] timers_config ................ enabled=True synchronized=True [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] train_batch_size ............. 12 [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] train_micro_batch_size_per_gpu 4 [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] use_data_before_expert_parallel_ False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] use_node_local_storage ....... False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] wall_clock_breakdown ......... True [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] weight_quantization_config ... None [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] world_size ................... 3 [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] zero_allow_untested_optimizer False [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] zero_config .................. 
stage=1 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=1000000000 use_multi_rank_bucket_allreduce=True allgather_partitions=True allgather_bucket_size=1000000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False module_granularity_threshold=0 use_all_reduce_for_fetch_params=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False zero_hpz_partition_size=1 zero_quantized_weights=False zero_quantized_nontrainable_weights=False zero_quantized_gradients=False zeropp_loco_param=None mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True pipeline_loading_checkpoint=False override_module_apply=True [2025-04-26 04:08:28,817] [INFO] [config.py:1003:print] zero_enabled ................. True [2025-04-26 04:08:28,818] [INFO] [config.py:1003:print] zero_force_ds_cpu_optimizer .. True [2025-04-26 04:08:28,818] [INFO] [config.py:1003:print] zero_optimization_stage ...... 
1 [2025-04-26 04:08:28,818] [INFO] [config.py:989:print_user_config] json = { "zero_optimization": { "stage": 1, "allgather_partitions": true, "allgather_bucket_size": 1.000000e+09, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 1.000000e+09, "contiguous_gradients": true }, "fp16": { "enabled": false, "auto_cast": true, "loss_scale": 0, "initial_scale_power": 32, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled": true }, "optimizer": { "type": "AdamW", "params": { "lr": 4e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.01 } }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "steps_per_print": inf, "train_batch_size": 12, "train_micro_batch_size_per_gpu": 4, "wall_clock_breakdown": true } [INFO|trainer.py:2369] 2025-04-26 04:08:28,819 >> ***** Running training ***** [INFO|trainer.py:2370] 2025-04-26 04:08:28,819 >> Num examples = 49,500 [INFO|trainer.py:2371] 2025-04-26 04:08:28,819 >> Num Epochs = 10 [INFO|trainer.py:2372] 2025-04-26 04:08:28,819 >> Instantaneous batch size per device = 4 [INFO|trainer.py:2375] 2025-04-26 04:08:28,819 >> Total train batch size (w. parallel, distributed & accumulation) = 12 [INFO|trainer.py:2376] 2025-04-26 04:08:28,819 >> Gradient Accumulation steps = 1 [INFO|trainer.py:2377] 2025-04-26 04:08:28,819 >> Total optimization steps = 41,250 [INFO|trainer.py:2378] 2025-04-26 04:08:28,826 >> Number of trainable parameters = 52,297,728 0%| | 0/41250 [00:00> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:05:45,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:05:45,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2 2 2 [WARNING|trainer.py:803] 2025-04-26 14:05:47,825 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:47,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:47,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 3 3 [WARNING|trainer.py:803] 2025-04-26 14:05:50,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:50,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:50,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4 4 4 [WARNING|trainer.py:803] 2025-04-26 14:05:52,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:05:52,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:05:52,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5 5 5 [WARNING|trainer.py:803] 2025-04-26 14:05:55,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:05:55,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:05:55,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6 6 6 [WARNING|trainer.py:803] 2025-04-26 14:05:57,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:57,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:05:57,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7 7 7 [WARNING|trainer.py:803] 2025-04-26 14:06:00,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:00,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:00,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8 8 8 [WARNING|trainer.py:803] 2025-04-26 14:06:02,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:06:02,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:06:02,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 9 9 9 [WARNING|trainer.py:803] 2025-04-26 14:06:05,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:05,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:05,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 10 10 10 [WARNING|trainer.py:803] 2025-04-26 14:06:07,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:07,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:07,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 11 11 11 [WARNING|trainer.py:803] 2025-04-26 14:06:10,211 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:10,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:10,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 12 12 12 [WARNING|trainer.py:803] 2025-04-26 14:06:12,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:12,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:06:12,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 13 13 13 [WARNING|trainer.py:803] 2025-04-26 14:06:15,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:15,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:15,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 14 14 14 [WARNING|trainer.py:803] 2025-04-26 14:06:17,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:17,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:17,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 15 15 15 [WARNING|trainer.py:803] 2025-04-26 14:06:20,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:20,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:20,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 16 16 16 [WARNING|trainer.py:803] 2025-04-26 14:06:22,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:22,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:22,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 17 17 17 [WARNING|trainer.py:803] 2025-04-26 14:06:25,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:25,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:25,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 18 18 18 [WARNING|trainer.py:803] 2025-04-26 14:06:27,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:27,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:27,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 19 19 19 [WARNING|trainer.py:803] 2025-04-26 14:06:29,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:06:29,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:06:29,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 20 20 20 [WARNING|trainer.py:803] 2025-04-26 14:06:32,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:32,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:32,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 21 21 21 [WARNING|trainer.py:803] 2025-04-26 14:06:34,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:34,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:34,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 22 22 22 [WARNING|trainer.py:803] 2025-04-26 14:06:37,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:37,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:37,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 23 23 23 [WARNING|trainer.py:803] 2025-04-26 14:06:39,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:39,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:06:39,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 24 24 24 [WARNING|trainer.py:803] 2025-04-26 14:06:42,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:06:42,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:42,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 25 25 25 [WARNING|trainer.py:803] 2025-04-26 14:06:45,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:45,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:45,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 26 26 26 [WARNING|trainer.py:803] 2025-04-26 14:06:47,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:47,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:47,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 27 27 27 [WARNING|trainer.py:803] 2025-04-26 14:06:50,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:50,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:50,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 28 28 28 [WARNING|trainer.py:803] 2025-04-26 14:06:52,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:52,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:06:53,009 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. YesYes 29 29 29 [WARNING|trainer.py:803] 2025-04-26 14:06:55,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:55,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:55,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 30 30 30 [WARNING|trainer.py:803] 2025-04-26 14:06:57,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:57,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:06:57,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 31 31 31 [WARNING|trainer.py:803] 2025-04-26 14:07:00,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:07:00,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:07:00,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 32 32 32 [WARNING|trainer.py:803] 2025-04-26 14:07:02,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:03,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:03,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 33 33 33 [WARNING|trainer.py:803] 2025-04-26 14:07:05,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:05,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:05,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 34 34 34 [WARNING|trainer.py:803] 2025-04-26 14:07:07,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:07,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:08,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 35 35 35 [WARNING|trainer.py:803] 2025-04-26 14:07:10,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:10,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:10,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 36 36 36 [WARNING|trainer.py:803] 2025-04-26 14:07:13,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:13,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:13,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 37 37 37 [WARNING|trainer.py:803] 2025-04-26 14:07:15,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:15,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:15,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 38 38 38 [WARNING|trainer.py:803] 2025-04-26 14:07:18,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:18,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:18,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 39 39 39 [WARNING|trainer.py:803] 2025-04-26 14:07:20,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:20,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:20,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 40 40 40 [WARNING|trainer.py:803] 2025-04-26 14:07:22,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:23,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:23,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 41 41 41 [WARNING|trainer.py:803] 2025-04-26 14:07:25,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:25,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:25,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 42 42 42 [WARNING|trainer.py:803] 2025-04-26 14:07:27,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:07:28,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:07:28,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 43 43 43 [WARNING|trainer.py:803] 2025-04-26 14:07:30,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:30,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:30,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 44 44 44 [WARNING|trainer.py:803] 2025-04-26 14:07:33,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:33,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:33,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 45 45 45 [WARNING|trainer.py:803] 2025-04-26 14:07:35,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:35,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:36,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 46 46 46 [WARNING|trainer.py:803] 2025-04-26 14:07:38,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:38,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:38,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 47 47 47 [WARNING|trainer.py:803] 2025-04-26 14:07:41,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:41,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:41,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 48 48 48 [WARNING|trainer.py:803] 2025-04-26 14:07:43,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:43,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:43,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 49 49 49 [WARNING|trainer.py:803] 2025-04-26 14:07:46,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:46,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:46,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 50 50 50 [WARNING|trainer.py:803] 2025-04-26 14:07:49,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:49,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:07:49,427 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. YesYes 51 51 51 [WARNING|trainer.py:803] 2025-04-26 14:07:51,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:51,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:52,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 52 52 52 [WARNING|trainer.py:803] 2025-04-26 14:07:54,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:54,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:54,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 53 53 53 [WARNING|trainer.py:803] 2025-04-26 14:07:57,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:57,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:07:57,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 54 54 54 [WARNING|trainer.py:803] 2025-04-26 14:07:59,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:59,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:07:59,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 55 55 55 [WARNING|trainer.py:803] 2025-04-26 14:08:02,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:02,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:02,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 56 56 56 [WARNING|trainer.py:803] 2025-04-26 14:08:04,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:04,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:04,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 57 57 57 [WARNING|trainer.py:803] 2025-04-26 14:08:06,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:06,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:06,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 58 58 58 [WARNING|trainer.py:803] 2025-04-26 14:08:08,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:08:09,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:08:09,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 59 59 59 [WARNING|trainer.py:803] 2025-04-26 14:08:11,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:08:11,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:08:11,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 60 60 60 [WARNING|trainer.py:803] 2025-04-26 14:08:13,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:08:14,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:08:14,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 61 61 61 [WARNING|trainer.py:803] 2025-04-26 14:08:16,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:16,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:16,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 62 62 62 [WARNING|trainer.py:803] 2025-04-26 14:08:19,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:19,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:19,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 63 63 63 [WARNING|trainer.py:803] 2025-04-26 14:08:21,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:21,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:21,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 64 64 64 [WARNING|trainer.py:803] 2025-04-26 14:08:23,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:08:23,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:08:23,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 65 65 65 [WARNING|trainer.py:803] 2025-04-26 14:08:26,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:26,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:26,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 66 66 66 [WARNING|trainer.py:803] 2025-04-26 14:08:28,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:28,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:28,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 67 67 67 [WARNING|trainer.py:803] 2025-04-26 14:08:31,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:31,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:31,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 68 68 68 [WARNING|trainer.py:803] 2025-04-26 14:08:33,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:34,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:34,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 69 69 69 [WARNING|trainer.py:803] 2025-04-26 14:08:36,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:36,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:36,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 70 70 70 [WARNING|trainer.py:803] 2025-04-26 14:08:39,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:39,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:08:39,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 71 71 71 [WARNING|trainer.py:803] 2025-04-26 14:08:41,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:08:42,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:08:42,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 72 72 72 [WARNING|trainer.py:803] 2025-04-26 14:08:44,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:44,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:44,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 73 73 73 [WARNING|trainer.py:803] 2025-04-26 14:08:47,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:08:47,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:08:47,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 74 74 74 [WARNING|trainer.py:803] 2025-04-26 14:08:49,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:49,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:49,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 75 75 75 [WARNING|trainer.py:803] 2025-04-26 14:08:52,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:53,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:53,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 76 76 76 [WARNING|trainer.py:803] 2025-04-26 14:08:55,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:08:55,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:08:55,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 77 77 77 [WARNING|trainer.py:803] 2025-04-26 14:08:58,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:58,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:08:58,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 78 78 78 [WARNING|trainer.py:803] 2025-04-26 14:09:00,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:00,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:00,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 79 79 79 [WARNING|trainer.py:803] 2025-04-26 14:09:02,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:03,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:03,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 80 80 80 [WARNING|trainer.py:803] 2025-04-26 14:09:05,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:05,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:05,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 81 81 81 [WARNING|trainer.py:803] 2025-04-26 14:09:08,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:09:08,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:08,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 82 82 82 [WARNING|trainer.py:803] 2025-04-26 14:09:10,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:10,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:10,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 83 83 83 [WARNING|trainer.py:803] 2025-04-26 14:09:13,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:13,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:13,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 84 84 84 [WARNING|trainer.py:803] 2025-04-26 14:09:16,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:16,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:16,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 85 85 85 [WARNING|trainer.py:803] 2025-04-26 14:09:19,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:19,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:19,255 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :Yes 86 86 86 [WARNING|trainer.py:803] 2025-04-26 14:09:21,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:21,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:21,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 87 87 87 [WARNING|trainer.py:803] 2025-04-26 14:09:24,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:24,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:24,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 88 88 88 [WARNING|trainer.py:803] 2025-04-26 14:09:26,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:26,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:26,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 89 89 89 [WARNING|trainer.py:803] 2025-04-26 14:09:28,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:28,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:28,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 90 90 90 [WARNING|trainer.py:803] 2025-04-26 14:09:30,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 14:09:30,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:31,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 91 91 91 [WARNING|trainer.py:803] 2025-04-26 14:09:33,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:09:33,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:09:33,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 92 92 92 [WARNING|trainer.py:803] 2025-04-26 14:09:36,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:36,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:09:36,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 93 93 93 [WARNING|trainer.py:803] 2025-04-26 14:09:38,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:39,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:39,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 94 94 94 [WARNING|trainer.py:803] 2025-04-26 14:09:41,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:09:41,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:09:41,583 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. NoNo 95 95 95 [WARNING|trainer.py:803] 2025-04-26 14:09:43,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:43,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:44,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 96 96 96 [WARNING|trainer.py:803] 2025-04-26 14:09:46,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:46,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:09:46,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 97 97 97 [WARNING|trainer.py:803] 2025-04-26 14:09:48,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:49,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:49,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 98 98 98 [WARNING|trainer.py:803] 2025-04-26 14:09:51,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:51,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:52,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 99 99 99 [WARNING|trainer.py:803] 2025-04-26 14:09:53,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:54,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:54,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 100 100 100 [WARNING|trainer.py:803] 2025-04-26 14:09:56,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:56,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:09:56,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 101 101 101 [WARNING|trainer.py:803] 2025-04-26 14:09:59,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:59,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:09:59,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 102 102 102 [WARNING|trainer.py:803] 2025-04-26 14:10:01,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:02,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:02,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 103 103 103 [WARNING|trainer.py:803] 2025-04-26 14:10:04,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:04,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 14:10:05,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 104 104 104 [WARNING|trainer.py:803] 2025-04-26 14:10:07,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:07,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:07,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 105 105 105 [WARNING|trainer.py:803] 2025-04-26 14:10:10,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:10,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:10,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 106 106 106 [WARNING|trainer.py:803] 2025-04-26 14:10:12,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:13,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:13,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 107 107 107 [WARNING|trainer.py:803] 2025-04-26 14:10:15,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:16,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:16,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 108 108 108 [WARNING|trainer.py:803] 2025-04-26 14:10:18,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:18,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:18,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 109 109 109 [WARNING|trainer.py:803] 2025-04-26 14:10:20,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:20,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:20,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 110 110 110 [WARNING|trainer.py:803] 2025-04-26 14:10:23,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:24,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:24,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 111 111 111 [WARNING|trainer.py:803] 2025-04-26 14:10:26,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:26,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:10:26,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 112 112 112 [WARNING|trainer.py:803] 2025-04-26 14:10:28,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:29,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:29,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 113 113 113 [WARNING|trainer.py:803] 2025-04-26 14:10:31,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:31,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:31,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 114 114 114 [WARNING|trainer.py:803] 2025-04-26 14:10:33,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:10:34,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:10:34,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 115 115 115 [WARNING|trainer.py:803] 2025-04-26 14:10:36,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:36,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:36,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 116 116 116 [WARNING|trainer.py:803] 2025-04-26 14:10:39,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:39,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:39,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 117 117 117 [WARNING|trainer.py:803] 2025-04-26 14:10:41,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:41,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:41,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 118 118 118 [WARNING|trainer.py:803] 2025-04-26 14:10:44,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:45,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:45,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 119 119 119 [WARNING|trainer.py:803] 2025-04-26 14:10:47,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:47,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:47,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 120 120 120 [WARNING|trainer.py:803] 2025-04-26 14:10:50,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:50,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:10:50,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 121 121 121 [WARNING|trainer.py:803] 2025-04-26 14:10:53,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:53,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:10:53,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 122 122 122 [WARNING|trainer.py:803] 2025-04-26 14:10:56,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:56,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:10:56,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 123 123 123 [WARNING|trainer.py:803] 2025-04-26 14:10:59,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:10:59,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:10:59,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 124 124 124 [WARNING|trainer.py:803] 2025-04-26 14:11:02,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:02,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:02,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 125 125 125 [WARNING|trainer.py:803] 2025-04-26 14:11:04,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:04,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:04,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 126 126 126 [WARNING|trainer.py:803] 2025-04-26 14:11:07,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:07,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:07,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 127 127 127 [WARNING|trainer.py:803] 2025-04-26 14:11:10,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:10,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:10,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 128 128 128 [WARNING|trainer.py:803] 2025-04-26 14:11:12,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:12,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:12,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 129 129 129 [WARNING|trainer.py:803] 2025-04-26 14:11:15,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:15,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:15,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 130 130 130 [WARNING|trainer.py:803] 2025-04-26 14:11:17,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:17,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:17,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 131 131 131 [WARNING|trainer.py:803] 2025-04-26 14:11:20,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:20,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:20,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 132 132 132 [WARNING|trainer.py:803] 2025-04-26 14:11:22,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:11:22,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:11:23,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 133 133 133 [WARNING|trainer.py:803] 2025-04-26 14:11:25,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:25,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:25,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 134 134 134 [WARNING|trainer.py:803] 2025-04-26 14:11:28,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:28,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:28,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 135 135 135 [WARNING|trainer.py:803] 2025-04-26 14:11:30,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:11:30,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:11:30,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 136 136 136 [WARNING|trainer.py:803] 2025-04-26 14:11:33,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:11:33,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:11:33,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 137 137 137 [WARNING|trainer.py:803] 2025-04-26 14:11:36,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:36,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:11:36,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 138 138 138 [WARNING|trainer.py:803] 2025-04-26 14:11:39,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:39,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:39,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 139 139 139 [WARNING|trainer.py:803] 2025-04-26 14:11:41,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:41,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:41,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 140 140 140 [WARNING|trainer.py:803] 2025-04-26 14:11:44,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:44,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:44,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 141 141 141 [WARNING|trainer.py:803] 2025-04-26 14:11:46,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:46,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:46,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 142 142 142 [WARNING|trainer.py:803] 2025-04-26 14:11:49,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:11:49,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:11:49,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 143 143 143 [WARNING|trainer.py:803] 2025-04-26 14:11:52,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:52,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:52,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 144 144 144 [WARNING|trainer.py:803] 2025-04-26 14:11:54,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:54,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:55,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 145 145 145 [WARNING|trainer.py:803] 2025-04-26 14:11:58,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:58,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:11:58,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 146 146 146 [WARNING|trainer.py:803] 2025-04-26 14:12:00,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:00,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:00,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 147 147 147 [WARNING|trainer.py:803] 2025-04-26 14:12:03,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:12:03,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:12:03,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 148 148 148 [WARNING|trainer.py:803] 2025-04-26 14:12:06,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:12:06,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:12:06,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 149 149 149 [WARNING|trainer.py:803] 2025-04-26 14:12:09,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:09,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:09,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 150 150 150 [WARNING|trainer.py:803] 2025-04-26 14:12:11,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:11,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:11,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 151 151 151 [WARNING|trainer.py:803] 2025-04-26 14:12:14,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:12:14,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:12:14,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 152 152 152 [WARNING|trainer.py:803] 2025-04-26 14:12:17,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:12:17,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:12:17,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 153 153 153 [WARNING|trainer.py:803] 2025-04-26 14:12:20,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:12:20,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:12:20,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 154 154 154 [WARNING|trainer.py:803] 2025-04-26 14:12:22,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:22,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:12:22,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 155 155 155 [WARNING|trainer.py:803] 2025-04-26 14:12:25,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:12:25,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes NoYes 156 156 156 :Yes :Yes :Yes 157 157 157 Yes Yes Yes 158 158 158 :Yes :Yes :Yes 159 159 159 :Yes :Yes
:Yes 160 160 160 :Yes :Yes :Yes 161 161 161 :Yes :Yes :Yes 162 162 162 :Yes :Yes :Yes 163 163 163 YesYes YesYes YesYes 164 164 164
YesYes YesYes YesYes 165 165 165 YesYes YesYes YesYes 166 166 166 NoYes NoYes NoYes 167 167 167 :Yes :Yes :Yes 168 168 168 NoYes
NoYes NoYes 169 169 169 YesYes YesYes YesYes 170 170 170 NoNo NoNo NoNo 171 171 171 NoNo NoNo NoNo 172 172 172 :Yes :Yes
:Yes 173 173 173 NoNo NoNo NoNo 174 174 174 Yes Yes Yes 175 175 175 NoYes NoYes NoYes 176 176 176 :Yes :Yes :Yes 177 177 177
YesYes YesYes YesYes 178 178 178 NoNo NoNo NoNo 179 179 179 NoYes NoYes NoYes 180 180 180 YesYes YesYes YesYes 181 181 181 YesYes
YesYes YesYes 182 182 182 NoNo NoNo NoNo 183 183 183 NoNo NoNo NoNo 184 184 184 NoYes NoYes NoYes 185 185 185 NoYes NoYes
NoYes 186 186 186 NoYes NoYes NoYes 187 187 187 YesYes YesYes YesYes 188 188 188 :Yes :Yes :Yes 189 189 189 :Yes :Yes :Yes 190 190 190
:Yes :Yes :Yes 191 191 191 :Yes :Yes :Yes 192 192 192 NoYes NoYes NoYes 193 193 193 YesYes YesYes YesYes 194 194 194 :Yes
:Yes :Yes 195 195 195 NoYes NoYes NoYes 196 196 196 NoYes NoYes NoYes 197 197 197 NoYes NoYes NoYes 198 198 198 Yes Yes
Yes 199 199 199 NoYes NoYes NoYes 200 200 200 NoYes NoYes NoYes 201 201 201 NoNo NoNo NoNo 202 202 202 NoNo NoNo NoNo 203 203 203
YesYes YesYes YesYes 204 204 204 NoNo NoNo NoNo 205 205 205 NoYes NoYes NoYes 206 206 206 NoNo NoNo NoNo 207 207 207 NoYes
NoYes NoYes 208 208 208 YesYes YesYes YesYes 209 209 209 NoYes NoYes NoYes 210 210 210 NoYes NoYes NoYes 211 211 211 NoNo NoNo
NoNo 212 212 212 :Yes :Yes :Yes 213 213 213 NoYes NoYes NoYes 214 214 214 NoNo NoNo NoNo 215 215 215 :No :No :No 216 216 216
:Yes :Yes :Yes 217 217 217 :Yes :Yes :Yes 218 218 218 NoNo NoNo NoNo 219 219 219 :Yes :Yes :Yes 220 220 220 YesYes
YesYes YesYes 221 221 221 NoNo NoNo NoNo 222 222 222 NoNo NoNo NoNo 223 223 223 NoNo NoNo NoNo 224 224 224 NoNo NoNo
NoNo 225 225 225 YesYes YesYes YesYes 226 226 226 NoNo NoNo NoNo 227 227 227 NoYes NoYes NoYes 228 228 228 YesYes YesYes YesYes 229 229 229
NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:29,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:29,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 230 230 230 [WARNING|trainer.py:803] 2025-04-26 14:15:30,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:31,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:31,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 231 231 231 [WARNING|trainer.py:803] 2025-04-26 14:15:32,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:33,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:33,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 232 232 232 [WARNING|trainer.py:803] 2025-04-26 14:15:34,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:15:35,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:15:35,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 233 233 233 [WARNING|trainer.py:803] 2025-04-26 14:15:36,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:37,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:37,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 234 234 [WARNING|trainer.py:803] 2025-04-26 14:15:38,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 234 [WARNING|trainer.py:803] 2025-04-26 14:15:39,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:39,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 235 [WARNING|trainer.py:803] 2025-04-26 14:15:40,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 235 235 [WARNING|trainer.py:803] 2025-04-26 14:15:41,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:41,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 236 [WARNING|trainer.py:803] 2025-04-26 14:15:42,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 236 236 [WARNING|trainer.py:803] 2025-04-26 14:15:43,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:43,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 237 [WARNING|trainer.py:803] 2025-04-26 14:15:44,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 237 237 [WARNING|trainer.py:803] 2025-04-26 14:15:45,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:45,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 238 [WARNING|trainer.py:803] 2025-04-26 14:15:46,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 238 238 [WARNING|trainer.py:803] 2025-04-26 14:15:47,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:15:47,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 239 [WARNING|trainer.py:803] 2025-04-26 14:15:48,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 239 239 [WARNING|trainer.py:803] 2025-04-26 14:15:49,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:15:49,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 240 [WARNING|trainer.py:803] 2025-04-26 14:15:50,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 240 240 [WARNING|trainer.py:803] 2025-04-26 14:15:51,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:15:51,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 241 [WARNING|trainer.py:803] 2025-04-26 14:15:52,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 241 241 [WARNING|trainer.py:803] 2025-04-26 14:15:53,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:53,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 242 [WARNING|trainer.py:803] 2025-04-26 14:15:54,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 242 242 [WARNING|trainer.py:803] 2025-04-26 14:15:55,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 243 [WARNING|trainer.py:803] 2025-04-26 14:15:55,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:15:56,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 243 243 244 [WARNING|trainer.py:803] 2025-04-26 14:15:57,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:15:57,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:15:58,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 244 244 245 [WARNING|trainer.py:803] 2025-04-26 14:15:59,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:15:59,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:00,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 245 245 246 [WARNING|trainer.py:803] 2025-04-26 14:16:01,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:01,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:02,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 246 246 247 [WARNING|trainer.py:803] 2025-04-26 14:16:03,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:03,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:03,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 247 247 248 [WARNING|trainer.py:803] 2025-04-26 14:16:05,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:05,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:05,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 248 248 249 [WARNING|trainer.py:803] 2025-04-26 14:16:07,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:07,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:07,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 249 249 [WARNING|trainer.py:803] 2025-04-26 14:16:09,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 250 [WARNING|trainer.py:803] 2025-04-26 14:16:09,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:10,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 250 250 251 [WARNING|trainer.py:803] 2025-04-26 14:16:11,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:11,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:16:12,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 251 251 252 [WARNING|trainer.py:803] 2025-04-26 14:16:13,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:13,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:14,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 252 252 253 [WARNING|trainer.py:803] 2025-04-26 14:16:15,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:15,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:15,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 253 253 254 [WARNING|trainer.py:803] 2025-04-26 14:16:17,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:17,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:17,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 254 254 255 [WARNING|trainer.py:803] 2025-04-26 14:16:19,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:19,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:19,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 255 255 256 [WARNING|trainer.py:803] 2025-04-26 14:16:21,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:21,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:22,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 256 256 257 [WARNING|trainer.py:803] 2025-04-26 14:16:23,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:24,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:24,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 257 257 [WARNING|trainer.py:803] 2025-04-26 14:16:26,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 258 [WARNING|trainer.py:803] 2025-04-26 14:16:26,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:27,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 258 258 259 [WARNING|trainer.py:803] 2025-04-26 14:16:28,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:28,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:29,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 259 260 259 [WARNING|trainer.py:803] 2025-04-26 14:16:31,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:31,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:31,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 260 260 261 [WARNING|trainer.py:803] 2025-04-26 14:16:33,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:33,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:33,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 261 262 261 [WARNING|trainer.py:803] 2025-04-26 14:16:35,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:16:35,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:35,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 263 262 262 [WARNING|trainer.py:803] 2025-04-26 14:16:37,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:37,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:37,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 263 264 263 [WARNING|trainer.py:803] 2025-04-26 14:16:39,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:39,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:16:39,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 264 264 265 [WARNING|trainer.py:803] 2025-04-26 14:16:41,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:41,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:42,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 265 265 266 [WARNING|trainer.py:803] 2025-04-26 14:16:43,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:44,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:44,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 266 266 267 [WARNING|trainer.py:803] 2025-04-26 14:16:46,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:46,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:46,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 267 268 267 [WARNING|trainer.py:803] 2025-04-26 14:16:48,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:48,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:48,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 268 268 269 [WARNING|trainer.py:803] 2025-04-26 14:16:50,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:51,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:51,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 269 269 270 [WARNING|trainer.py:803] 2025-04-26 14:16:53,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:53,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:16:53,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 270 271 270 [WARNING|trainer.py:803] 2025-04-26 14:16:55,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:16:55,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:55,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 271 271 272 [WARNING|trainer.py:803] 2025-04-26 14:16:58,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:58,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:16:58,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 272 273 272 [WARNING|trainer.py:803] 2025-04-26 14:17:00,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:00,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:00,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 273 273 274 [WARNING|trainer.py:803] 2025-04-26 14:17:03,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:03,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:03,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 274 274 275 [WARNING|trainer.py:803] 2025-04-26 14:17:05,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:05,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:05,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 275 276 275 [WARNING|trainer.py:803] 2025-04-26 14:17:08,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:08,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:08,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 276 277 276 [WARNING|trainer.py:803] 2025-04-26 14:17:10,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:10,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:10,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 277 278 277 [WARNING|trainer.py:803] 2025-04-26 14:17:13,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:13,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:13,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 278 279 278 [WARNING|trainer.py:803] 2025-04-26 14:17:15,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:15,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:15,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 279 280 279 [WARNING|trainer.py:803] 2025-04-26 14:17:17,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:17,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:17:18,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 280 281 280 [WARNING|trainer.py:803] 2025-04-26 14:17:20,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:17:20,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:20,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 281 282 281 [WARNING|trainer.py:803] 2025-04-26 14:17:22,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:22,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:22,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 282 283 282 [WARNING|trainer.py:803] 2025-04-26 14:17:25,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:25,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:25,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 283 284 283 [WARNING|trainer.py:803] 2025-04-26 14:17:27,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:27,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:17:27,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 284 285 284 [WARNING|trainer.py:803] 2025-04-26 14:17:30,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:17:30,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:30,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 285 285 286 [WARNING|trainer.py:803] 2025-04-26 14:17:32,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:33,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:17:33,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 286 286 [WARNING|trainer.py:803] 2025-04-26 14:17:35,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 287 [WARNING|trainer.py:803] 2025-04-26 14:17:35,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:36,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 287 288 287 [WARNING|trainer.py:803] 2025-04-26 14:17:38,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:38,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:38,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 288 289 288 [WARNING|trainer.py:803] 2025-04-26 14:17:40,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:40,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:17:41,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 289 289 290 [WARNING|trainer.py:803] 2025-04-26 14:17:43,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:17:43,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 14:17:43,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 290 291 290 [WARNING|trainer.py:803] 2025-04-26 14:17:45,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:46,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:46,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 291 292 291 [WARNING|trainer.py:803] 2025-04-26 14:17:48,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:48,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:48,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 292 293 292 [WARNING|trainer.py:803] 2025-04-26 14:17:50,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:50,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:51,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 293 293 294 [WARNING|trainer.py:803] 2025-04-26 14:17:53,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:53,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:53,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 294 294 295 [WARNING|trainer.py:803] 2025-04-26 14:17:56,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:56,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:56,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 296 295 295 [WARNING|trainer.py:803] 2025-04-26 14:17:58,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:17:58,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:17:59,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 296 297 296 [WARNING|trainer.py:803] 2025-04-26 14:18:00,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:18:01,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:18:01,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 297 298 297 [WARNING|trainer.py:803] 2025-04-26 14:18:03,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:18:03,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:03,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 298 298 299 [WARNING|trainer.py:803] 2025-04-26 14:18:05,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:06,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:06,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 299 300 299 [WARNING|trainer.py:803] 2025-04-26 14:18:08,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:08,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:08,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 301 [WARNING|trainer.py:803] 2025-04-26 14:18:09,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 300 300 302 [WARNING|trainer.py:803] 2025-04-26 14:18:10,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:11,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:11,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 301 301 303 [WARNING|trainer.py:803] 2025-04-26 14:18:12,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:12,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:12,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 302 302 304 [WARNING|trainer.py:803] 2025-04-26 14:18:13,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:13,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:13,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 303 305 303 [WARNING|trainer.py:803] 2025-04-26 14:18:14,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:15,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:15,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 304 306 304 [WARNING|trainer.py:803] 2025-04-26 14:18:16,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:18:16,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:16,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 305 307 305 [WARNING|trainer.py:803] 2025-04-26 14:18:17,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:18,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 306 [WARNING|trainer.py:803] 2025-04-26 14:18:18,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 308 306 [WARNING|trainer.py:803] 2025-04-26 14:18:18,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:19,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 307 [WARNING|trainer.py:803] 2025-04-26 14:18:19,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 309 307 [WARNING|trainer.py:803] 2025-04-26 14:18:20,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:20,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 308 [WARNING|trainer.py:803] 2025-04-26 14:18:20,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 310 308 [WARNING|trainer.py:803] 2025-04-26 14:18:21,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:22,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 309 [WARNING|trainer.py:803] 2025-04-26 14:18:22,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 311 309 [WARNING|trainer.py:803] 2025-04-26 14:18:22,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:23,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 310 [WARNING|trainer.py:803] 2025-04-26 14:18:23,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 312 310 [WARNING|trainer.py:803] 2025-04-26 14:18:24,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:24,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 311 [WARNING|trainer.py:803] 2025-04-26 14:18:25,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 313 311 [WARNING|trainer.py:803] 2025-04-26 14:18:25,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:26,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 312 [WARNING|trainer.py:803] 2025-04-26 14:18:26,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 314 312 [WARNING|trainer.py:803] 2025-04-26 14:18:26,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:27,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 313 [WARNING|trainer.py:803] 2025-04-26 14:18:27,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 315 313 [WARNING|trainer.py:803] 2025-04-26 14:18:28,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:18:28,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 314 [WARNING|trainer.py:803] 2025-04-26 14:18:29,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 316 [WARNING|trainer.py:803] 2025-04-26 14:18:29,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 314 [WARNING|trainer.py:803] 2025-04-26 14:18:29,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 315 [WARNING|trainer.py:803] 2025-04-26 14:18:30,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 317 [WARNING|trainer.py:803] 2025-04-26 14:18:30,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 315 [WARNING|trainer.py:803] 2025-04-26 14:18:31,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 316 [WARNING|trainer.py:803] 2025-04-26 14:18:31,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 318 [WARNING|trainer.py:803] 2025-04-26 14:18:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 316 [WARNING|trainer.py:803] 2025-04-26 14:18:32,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 317 [WARNING|trainer.py:803] 2025-04-26 14:18:33,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 319 [WARNING|trainer.py:803] 2025-04-26 14:18:33,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 317 [WARNING|trainer.py:803] 2025-04-26 14:18:34,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 318 [WARNING|trainer.py:803] 2025-04-26 14:18:34,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 320 [WARNING|trainer.py:803] 2025-04-26 14:18:34,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 318 [WARNING|trainer.py:803] 2025-04-26 14:18:35,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 319 [WARNING|trainer.py:803] 2025-04-26 14:18:35,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 321 [WARNING|trainer.py:803] 2025-04-26 14:18:36,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 319 [WARNING|trainer.py:803] 2025-04-26 14:18:36,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 320 [WARNING|trainer.py:803] 2025-04-26 14:18:37,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 322 [WARNING|trainer.py:803] 2025-04-26 14:18:37,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 320 [WARNING|trainer.py:803] 2025-04-26 14:18:38,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 321 [WARNING|trainer.py:803] 2025-04-26 14:18:38,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 323 [WARNING|trainer.py:803] 2025-04-26 14:18:38,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 321 322 [WARNING|trainer.py:803] 2025-04-26 14:18:39,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:18:39,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 324 [WARNING|trainer.py:803] 2025-04-26 14:18:40,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 322 323 [WARNING|trainer.py:803] 2025-04-26 14:18:40,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:41,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 325 [WARNING|trainer.py:803] 2025-04-26 14:18:41,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 323 [WARNING|trainer.py:803] 2025-04-26 14:18:42,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 324 [WARNING|trainer.py:803] 2025-04-26 14:18:42,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 326 [WARNING|trainer.py:803] 2025-04-26 14:18:43,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 324 [WARNING|trainer.py:803] 2025-04-26 14:18:43,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 325 [WARNING|trainer.py:803] 2025-04-26 14:18:44,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 327 [WARNING|trainer.py:803] 2025-04-26 14:18:44,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 325 [WARNING|trainer.py:803] 2025-04-26 14:18:45,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 326 [WARNING|trainer.py:803] 2025-04-26 14:18:45,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 328 [WARNING|trainer.py:803] 2025-04-26 14:18:45,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 326 327 [WARNING|trainer.py:803] 2025-04-26 14:18:46,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:46,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 329 [WARNING|trainer.py:803] 2025-04-26 14:18:47,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 327 328 [WARNING|trainer.py:803] 2025-04-26 14:18:47,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:18:48,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 330 [WARNING|trainer.py:803] 2025-04-26 14:18:48,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 328 329 [WARNING|trainer.py:803] 2025-04-26 14:18:49,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:18:49,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 331 [WARNING|trainer.py:803] 2025-04-26 14:18:49,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 329 330 [WARNING|trainer.py:803] 2025-04-26 14:18:50,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:51,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 332 [WARNING|trainer.py:803] 2025-04-26 14:18:51,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 330 331 [WARNING|trainer.py:803] 2025-04-26 14:18:52,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:52,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 333 [WARNING|trainer.py:803] 2025-04-26 14:18:52,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 331 332 [WARNING|trainer.py:803] 2025-04-26 14:18:53,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:18:53,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 334 [WARNING|trainer.py:803] 2025-04-26 14:18:54,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 332 333 [WARNING|trainer.py:803] 2025-04-26 14:18:54,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:18:55,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 335 [WARNING|trainer.py:803] 2025-04-26 14:18:55,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 333 334 [WARNING|trainer.py:803] 2025-04-26 14:18:56,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:18:56,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 336 [WARNING|trainer.py:803] 2025-04-26 14:18:56,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 334 335 [WARNING|trainer.py:803] 2025-04-26 14:18:57,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 337 [WARNING|trainer.py:803] 2025-04-26 14:18:58,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:18:58,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 335 336 [WARNING|trainer.py:803] 2025-04-26 14:18:58,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 338 [WARNING|trainer.py:803] 2025-04-26 14:18:59,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:18:59,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 336 337 [WARNING|trainer.py:803] 2025-04-26 14:19:00,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 339 [WARNING|trainer.py:803] 2025-04-26 14:19:00,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:00,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 337 338 [WARNING|trainer.py:803] 2025-04-26 14:19:01,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 340 [WARNING|trainer.py:803] 2025-04-26 14:19:02,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:02,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 338 339 [WARNING|trainer.py:803] 2025-04-26 14:19:02,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 341 [WARNING|trainer.py:803] 2025-04-26 14:19:03,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:03,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 339 340 [WARNING|trainer.py:803] 2025-04-26 14:19:04,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 342 [WARNING|trainer.py:803] 2025-04-26 14:19:04,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:04,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 340 341 [WARNING|trainer.py:803] 2025-04-26 14:19:05,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 343 [WARNING|trainer.py:803] 2025-04-26 14:19:06,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:06,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 341 342 [WARNING|trainer.py:803] 2025-04-26 14:19:06,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 344 [WARNING|trainer.py:803] 2025-04-26 14:19:07,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:07,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 343 [WARNING|trainer.py:803] 2025-04-26 14:19:08,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 342 345 [WARNING|trainer.py:803] 2025-04-26 14:19:09,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:09,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 344 [WARNING|trainer.py:803] 2025-04-26 14:19:09,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 343 346 [WARNING|trainer.py:803] 2025-04-26 14:19:10,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:10,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 345 344 [WARNING|trainer.py:803] 2025-04-26 14:19:11,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 347 [WARNING|trainer.py:803] 2025-04-26 14:19:11,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:11,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 346 [WARNING|trainer.py:803] 2025-04-26 14:19:12,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 345 348 [WARNING|trainer.py:803] 2025-04-26 14:19:13,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:13,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 347 [WARNING|trainer.py:803] 2025-04-26 14:19:13,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 346 349 [WARNING|trainer.py:803] 2025-04-26 14:19:14,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:14,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 348 [WARNING|trainer.py:803] 2025-04-26 14:19:15,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 347 350 [WARNING|trainer.py:803] 2025-04-26 14:19:15,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:16,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 349 [WARNING|trainer.py:803] 2025-04-26 14:19:16,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 348 351 [WARNING|trainer.py:803] 2025-04-26 14:19:17,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:17,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 350 [WARNING|trainer.py:803] 2025-04-26 14:19:17,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 349 352 [WARNING|trainer.py:803] 2025-04-26 14:19:18,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:18,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:19,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 351 350 353 [WARNING|trainer.py:803] 2025-04-26 14:19:20,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:20,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:20,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 352 351 354 [WARNING|trainer.py:803] 2025-04-26 14:19:21,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:21,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:21,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 353 352 355 [WARNING|trainer.py:803] 2025-04-26 14:19:22,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:23,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 354 [WARNING|trainer.py:803] 2025-04-26 14:19:23,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 353 356 [WARNING|trainer.py:803] 2025-04-26 14:19:24,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:24,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:24,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 355 354 357 [WARNING|trainer.py:803] 2025-04-26 14:19:25,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:25,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:26,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 356 355 358 [WARNING|trainer.py:803] 2025-04-26 14:19:26,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:27,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:27,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 357 356 359 [WARNING|trainer.py:803] 2025-04-26 14:19:28,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:28,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:28,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 358 357 360 [WARNING|trainer.py:803] 2025-04-26 14:19:29,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:29,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:29,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 359 358 361 [WARNING|trainer.py:803] 2025-04-26 14:19:30,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:19:31,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:31,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 360 362 359 [WARNING|trainer.py:803] 2025-04-26 14:19:32,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:19:32,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:32,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 361 363 360 [WARNING|trainer.py:803] 2025-04-26 14:19:33,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:33,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:33,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 362 364 361 [WARNING|trainer.py:803] 2025-04-26 14:19:34,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:35,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 363 [WARNING|trainer.py:803] 2025-04-26 14:19:35,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 365 362 [WARNING|trainer.py:803] 2025-04-26 14:19:36,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:36,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 364 [WARNING|trainer.py:803] 2025-04-26 14:19:36,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 366 363 [WARNING|trainer.py:803] 2025-04-26 14:19:37,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:37,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:19:38,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 365 367 364 [WARNING|trainer.py:803] 2025-04-26 14:19:38,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 366 [WARNING|trainer.py:803] 2025-04-26 14:19:39,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:39,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 368 365 [WARNING|trainer.py:803] 2025-04-26 14:19:40,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 367 [WARNING|trainer.py:803] 2025-04-26 14:19:40,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:40,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 369 366 [WARNING|trainer.py:803] 2025-04-26 14:19:41,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 368 [WARNING|trainer.py:803] 2025-04-26 14:19:42,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:42,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 370 367 [WARNING|trainer.py:803] 2025-04-26 14:19:42,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 369 [WARNING|trainer.py:803] 2025-04-26 14:19:43,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:43,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 368 371 [WARNING|trainer.py:803] 2025-04-26 14:19:44,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 370 [WARNING|trainer.py:803] 2025-04-26 14:19:44,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:44,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 369 372 [WARNING|trainer.py:803] 2025-04-26 14:19:45,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 371 [WARNING|trainer.py:803] 2025-04-26 14:19:46,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:46,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 370 373 [WARNING|trainer.py:803] 2025-04-26 14:19:46,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 372 [WARNING|trainer.py:803] 2025-04-26 14:19:47,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:19:47,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 371 374 [WARNING|trainer.py:803] 2025-04-26 14:19:48,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 373 [WARNING|trainer.py:803] 2025-04-26 14:19:49,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:49,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 372 [WARNING|trainer.py:803] 2025-04-26 14:19:49,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 375 374 [WARNING|trainer.py:803] 2025-04-26 14:19:50,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:50,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 373 [WARNING|trainer.py:803] 2025-04-26 14:19:51,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 376 375 [WARNING|trainer.py:803] 2025-04-26 14:19:51,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:51,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 374 377 [WARNING|trainer.py:803] 2025-04-26 14:19:52,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 376 [WARNING|trainer.py:803] 2025-04-26 14:19:53,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:53,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 378 [WARNING|trainer.py:803] 2025-04-26 14:19:53,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 375 377 [WARNING|trainer.py:803] 2025-04-26 14:19:54,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:54,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:55,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 379 376 378 [WARNING|trainer.py:803] 2025-04-26 14:19:55,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:19:56,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:56,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 380 377 379 [WARNING|trainer.py:803] 2025-04-26 14:19:57,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:57,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:19:57,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 381 378 380 [WARNING|trainer.py:803] 2025-04-26 14:19:58,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:19:58,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 382 379 [WARNING|trainer.py:803] 2025-04-26 14:19:59,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 381 [WARNING|trainer.py:803] 2025-04-26 14:19:59,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:00,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 383 380 [WARNING|trainer.py:803] 2025-04-26 14:20:00,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 382 [WARNING|trainer.py:803] 2025-04-26 14:20:01,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:01,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 384 [WARNING|trainer.py:803] 2025-04-26 14:20:02,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 381 383 [WARNING|trainer.py:803] 2025-04-26 14:20:02,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:02,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 385 [WARNING|trainer.py:803] 2025-04-26 14:20:03,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 382 384 [WARNING|trainer.py:803] 2025-04-26 14:20:04,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:04,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 386 [WARNING|trainer.py:803] 2025-04-26 14:20:04,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 383 385 [WARNING|trainer.py:803] 2025-04-26 14:20:05,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:05,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 387 [WARNING|trainer.py:803] 2025-04-26 14:20:06,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 384 386 [WARNING|trainer.py:803] 2025-04-26 14:20:06,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:07,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 388 [WARNING|trainer.py:803] 2025-04-26 14:20:07,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 385 387 [WARNING|trainer.py:803] 2025-04-26 14:20:08,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:08,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 389 [WARNING|trainer.py:803] 2025-04-26 14:20:08,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 386 388 [WARNING|trainer.py:803] 2025-04-26 14:20:09,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:09,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 390 [WARNING|trainer.py:803] 2025-04-26 14:20:10,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 387 389 [WARNING|trainer.py:803] 2025-04-26 14:20:10,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:11,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 391 [WARNING|trainer.py:803] 2025-04-26 14:20:11,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 388 390 [WARNING|trainer.py:803] 2025-04-26 14:20:12,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:12,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 392 [WARNING|trainer.py:803] 2025-04-26 14:20:12,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 389 391 [WARNING|trainer.py:803] 2025-04-26 14:20:13,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:14,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 393 [WARNING|trainer.py:803] 2025-04-26 14:20:14,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 390 392 [WARNING|trainer.py:803] 2025-04-26 14:20:14,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:15,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 394 [WARNING|trainer.py:803] 2025-04-26 14:20:15,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 391 393 [WARNING|trainer.py:803] 2025-04-26 14:20:16,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:16,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 395 [WARNING|trainer.py:803] 2025-04-26 14:20:17,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 392 394 [WARNING|trainer.py:803] 2025-04-26 14:20:17,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 396 [WARNING|trainer.py:803] 2025-04-26 14:20:18,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:18,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 393 395 [WARNING|trainer.py:803] 2025-04-26 14:20:19,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 397 [WARNING|trainer.py:803] 2025-04-26 14:20:19,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:19,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 394 396 [WARNING|trainer.py:803] 2025-04-26 14:20:20,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:20,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 398 [WARNING|trainer.py:803] 2025-04-26 14:20:21,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 395 397 [WARNING|trainer.py:803] 2025-04-26 14:20:21,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 399 [WARNING|trainer.py:803] 2025-04-26 14:20:22,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:22,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 396 398 [WARNING|trainer.py:803] 2025-04-26 14:20:23,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 400 [WARNING|trainer.py:803] 2025-04-26 14:20:23,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:20:23,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 397 399 [WARNING|trainer.py:803] 2025-04-26 14:20:24,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 401 [WARNING|trainer.py:803] 2025-04-26 14:20:25,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:25,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 398 400 [WARNING|trainer.py:803] 2025-04-26 14:20:25,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 402 [WARNING|trainer.py:803] 2025-04-26 14:20:26,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:26,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 399 401 [WARNING|trainer.py:803] 2025-04-26 14:20:27,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 403 [WARNING|trainer.py:803] 2025-04-26 14:20:27,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:27,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 400 402 [WARNING|trainer.py:803] 2025-04-26 14:20:28,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 404 [WARNING|trainer.py:803] 2025-04-26 14:20:29,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:29,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 401 403 [WARNING|trainer.py:803] 2025-04-26 14:20:29,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 405 [WARNING|trainer.py:803] 2025-04-26 14:20:30,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:30,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 402 404 [WARNING|trainer.py:803] 2025-04-26 14:20:31,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 406 [WARNING|trainer.py:803] 2025-04-26 14:20:31,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:32,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 403 405 [WARNING|trainer.py:803] 2025-04-26 14:20:32,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 407 [WARNING|trainer.py:803] 2025-04-26 14:20:33,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:20:33,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 404 406 [WARNING|trainer.py:803] 2025-04-26 14:20:34,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 408 [WARNING|trainer.py:803] 2025-04-26 14:20:34,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:34,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 405 407 [WARNING|trainer.py:803] 2025-04-26 14:20:35,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 409 [WARNING|trainer.py:803] 2025-04-26 14:20:36,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:36,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:20:36,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 406 408 410 [WARNING|trainer.py:803] 2025-04-26 14:20:37,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:37,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:38,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 409 407 411 [WARNING|trainer.py:803] 2025-04-26 14:20:38,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:38,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:20:39,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 408 410 412 [WARNING|trainer.py:803] 2025-04-26 14:20:40,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:40,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:40,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 411 409 413 [WARNING|trainer.py:803] 2025-04-26 14:20:41,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:20:41,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:42,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 412 410 414 [WARNING|trainer.py:803] 2025-04-26 14:20:43,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:43,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:43,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 413 411 415 [WARNING|trainer.py:803] 2025-04-26 14:20:44,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:44,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:44,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 414 412 416 [WARNING|trainer.py:803] 2025-04-26 14:20:45,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:45,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:46,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 415 413 417 [WARNING|trainer.py:803] 2025-04-26 14:20:47,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:47,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:47,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 416 414 418 [WARNING|trainer.py:803] 2025-04-26 14:20:48,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:48,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:49,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 417 415 419 [WARNING|trainer.py:803] 2025-04-26 14:20:49,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:50,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:50,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 418 416 420 [WARNING|trainer.py:803] 2025-04-26 14:20:51,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:51,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:51,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 419 417 421 [WARNING|trainer.py:803] 2025-04-26 14:20:52,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:52,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:53,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 420 418 422 [WARNING|trainer.py:803] 2025-04-26 14:20:54,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:54,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:20:54,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 421 419 423 [WARNING|trainer.py:803] 2025-04-26 14:20:55,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:55,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:20:55,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 422 420 424 [WARNING|trainer.py:803] 2025-04-26 14:20:56,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:57,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:57,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 423 421 425 [WARNING|trainer.py:803] 2025-04-26 14:20:58,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:20:58,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:20:58,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 424 422 426 [WARNING|trainer.py:803] 2025-04-26 14:20:59,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:00,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:00,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 425 423 427 [WARNING|trainer.py:803] 2025-04-26 14:21:01,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:01,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:01,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 426 424 428 [WARNING|trainer.py:803] 2025-04-26 14:21:02,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:02,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:02,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 427 429 425 [WARNING|trainer.py:803] 2025-04-26 14:21:04,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:04,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:04,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 428 430 426 [WARNING|trainer.py:803] 2025-04-26 14:21:05,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:05,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:21:05,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 429 431 427 [WARNING|trainer.py:803] 2025-04-26 14:21:06,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:06,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 430 [WARNING|trainer.py:803] 2025-04-26 14:21:07,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 432 428 [WARNING|trainer.py:803] 2025-04-26 14:21:08,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:08,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 431 [WARNING|trainer.py:803] 2025-04-26 14:21:08,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 433 429 [WARNING|trainer.py:803] 2025-04-26 14:21:09,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:09,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 432 [WARNING|trainer.py:803] 2025-04-26 14:21:09,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 434 430 [WARNING|trainer.py:803] 2025-04-26 14:21:10,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:10,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 433 [WARNING|trainer.py:803] 2025-04-26 14:21:11,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 435 431 [WARNING|trainer.py:803] 2025-04-26 14:21:12,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:12,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 434 [WARNING|trainer.py:803] 2025-04-26 14:21:12,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 436 432 [WARNING|trainer.py:803] 2025-04-26 14:21:13,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:13,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 435 [WARNING|trainer.py:803] 2025-04-26 14:21:14,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 437 433 [WARNING|trainer.py:803] 2025-04-26 14:21:14,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:15,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 436 [WARNING|trainer.py:803] 2025-04-26 14:21:15,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 438 434 [WARNING|trainer.py:803] 2025-04-26 14:21:16,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:16,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 437 [WARNING|trainer.py:803] 2025-04-26 14:21:16,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 439 435 [WARNING|trainer.py:803] 2025-04-26 14:21:17,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:17,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 438 [WARNING|trainer.py:803] 2025-04-26 14:21:18,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 440 436 [WARNING|trainer.py:803] 2025-04-26 14:21:19,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:19,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 439 441 [WARNING|trainer.py:803] 2025-04-26 14:21:19,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 437 [WARNING|trainer.py:803] 2025-04-26 14:21:20,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:20,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 440 [WARNING|trainer.py:803] 2025-04-26 14:21:21,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 442 438 [WARNING|trainer.py:803] 2025-04-26 14:21:21,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:21:21,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 441 [WARNING|trainer.py:803] 2025-04-26 14:21:22,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 443 439 [WARNING|trainer.py:803] 2025-04-26 14:21:23,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:23,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:23,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 442 444 440 [WARNING|trainer.py:803] 2025-04-26 14:21:24,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:24,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:25,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 443 445 441 [WARNING|trainer.py:803] 2025-04-26 14:21:25,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:26,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:26,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 446 444 442 [WARNING|trainer.py:803] 2025-04-26 14:21:27,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:27,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:27,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 447 445 443 [WARNING|trainer.py:803] 2025-04-26 14:21:28,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:28,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:29,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 446 448 444 [WARNING|trainer.py:803] 2025-04-26 14:21:30,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:30,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:30,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 449 447 445 [WARNING|trainer.py:803] 2025-04-26 14:21:31,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:31,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:31,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 450 448 446 [WARNING|trainer.py:803] 2025-04-26 14:21:32,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:32,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:21:33,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 451 449 447 [WARNING|trainer.py:803] 2025-04-26 14:21:34,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:34,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:34,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 452 450 448 [WARNING|trainer.py:803] 2025-04-26 14:21:35,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:35,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:36,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 453 451 449 [WARNING|trainer.py:803] 2025-04-26 14:21:36,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:37,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 454 [WARNING|trainer.py:803] 2025-04-26 14:21:37,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 452 450 [WARNING|trainer.py:803] 2025-04-26 14:21:38,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:38,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 455 [WARNING|trainer.py:803] 2025-04-26 14:21:38,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 453 451 [WARNING|trainer.py:803] 2025-04-26 14:21:39,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:39,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 456 [WARNING|trainer.py:803] 2025-04-26 14:21:40,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 454 452 [WARNING|trainer.py:803] 2025-04-26 14:21:40,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:41,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 457 [WARNING|trainer.py:803] 2025-04-26 14:21:41,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 455 453 [WARNING|trainer.py:803] 2025-04-26 14:21:42,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:21:42,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 458 [WARNING|trainer.py:803] 2025-04-26 14:21:42,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 456 454 [WARNING|trainer.py:803] 2025-04-26 14:21:43,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:21:43,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 459 [WARNING|trainer.py:803] 2025-04-26 14:21:44,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 457 455 [WARNING|trainer.py:803] 2025-04-26 14:21:44,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:45,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 460 [WARNING|trainer.py:803] 2025-04-26 14:21:45,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 458 456 [WARNING|trainer.py:803] 2025-04-26 14:21:46,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:46,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 461 [WARNING|trainer.py:803] 2025-04-26 14:21:47,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 459 457 [WARNING|trainer.py:803] 2025-04-26 14:21:47,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:48,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 462 [WARNING|trainer.py:803] 2025-04-26 14:21:48,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 460 [WARNING|trainer.py:803] 2025-04-26 14:21:49,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 458 463 [WARNING|trainer.py:803] 2025-04-26 14:21:49,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:21:49,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 461 [WARNING|trainer.py:803] 2025-04-26 14:21:50,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 459 [WARNING|trainer.py:803] 2025-04-26 14:21:50,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 464 [WARNING|trainer.py:803] 2025-04-26 14:21:51,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 462 [WARNING|trainer.py:803] 2025-04-26 14:21:51,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 460 [WARNING|trainer.py:803] 2025-04-26 14:21:52,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 465 [WARNING|trainer.py:803] 2025-04-26 14:21:52,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 463 [WARNING|trainer.py:803] 2025-04-26 14:21:53,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 461 [WARNING|trainer.py:803] 2025-04-26 14:21:53,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 466 [WARNING|trainer.py:803] 2025-04-26 14:21:54,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 464 [WARNING|trainer.py:803] 2025-04-26 14:21:54,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 462 [WARNING|trainer.py:803] 2025-04-26 14:21:55,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 467 [WARNING|trainer.py:803] 2025-04-26 14:21:55,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 465 [WARNING|trainer.py:803] 2025-04-26 14:21:55,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 463 [WARNING|trainer.py:803] 2025-04-26 14:21:56,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 468 [WARNING|trainer.py:803] 2025-04-26 14:21:56,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 466 [WARNING|trainer.py:803] 2025-04-26 14:21:57,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 464 469 [WARNING|trainer.py:803] 2025-04-26 14:21:57,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:21:58,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 467 [WARNING|trainer.py:803] 2025-04-26 14:21:58,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 465 470 [WARNING|trainer.py:803] 2025-04-26 14:21:59,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:21:59,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 468 [WARNING|trainer.py:803] 2025-04-26 14:21:59,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 466 471 [WARNING|trainer.py:803] 2025-04-26 14:22:00,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:01,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 469 [WARNING|trainer.py:803] 2025-04-26 14:22:01,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 467 472 [WARNING|trainer.py:803] 2025-04-26 14:22:01,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:02,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 470 [WARNING|trainer.py:803] 2025-04-26 14:22:02,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 468 473 [WARNING|trainer.py:803] 2025-04-26 14:22:03,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 471 [WARNING|trainer.py:803] 2025-04-26 14:22:03,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:03,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 474 469 [WARNING|trainer.py:803] 2025-04-26 14:22:04,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 472 [WARNING|trainer.py:803] 2025-04-26 14:22:05,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:05,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 475 470 [WARNING|trainer.py:803] 2025-04-26 14:22:05,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:06,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 473 [WARNING|trainer.py:803] 2025-04-26 14:22:06,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 476 471 [WARNING|trainer.py:803] 2025-04-26 14:22:07,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 474 [WARNING|trainer.py:803] 2025-04-26 14:22:07,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:07,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 477 472 [WARNING|trainer.py:803] 2025-04-26 14:22:08,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 475 [WARNING|trainer.py:803] 2025-04-26 14:22:09,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:09,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 478 473 [WARNING|trainer.py:803] 2025-04-26 14:22:09,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:10,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 476 [WARNING|trainer.py:803] 2025-04-26 14:22:10,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 479 474 [WARNING|trainer.py:803] 2025-04-26 14:22:11,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 477 [WARNING|trainer.py:803] 2025-04-26 14:22:11,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:11,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 480 475 [WARNING|trainer.py:803] 2025-04-26 14:22:12,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:13,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 478 [WARNING|trainer.py:803] 2025-04-26 14:22:13,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 481 476 [WARNING|trainer.py:803] 2025-04-26 14:22:14,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:14,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 479 [WARNING|trainer.py:803] 2025-04-26 14:22:14,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 482 477 [WARNING|trainer.py:803] 2025-04-26 14:22:15,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:15,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 480 [WARNING|trainer.py:803] 2025-04-26 14:22:16,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 483 478 [WARNING|trainer.py:803] 2025-04-26 14:22:16,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:17,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 481 [WARNING|trainer.py:803] 2025-04-26 14:22:17,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 484 479 [WARNING|trainer.py:803] 2025-04-26 14:22:18,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:18,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 482 [WARNING|trainer.py:803] 2025-04-26 14:22:18,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 485 480 [WARNING|trainer.py:803] 2025-04-26 14:22:19,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:19,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 483 [WARNING|trainer.py:803] 2025-04-26 14:22:20,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 486 481 [WARNING|trainer.py:803] 2025-04-26 14:22:20,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:21,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 484 [WARNING|trainer.py:803] 2025-04-26 14:22:21,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 487 482 [WARNING|trainer.py:803] 2025-04-26 14:22:22,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:22,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 485 [WARNING|trainer.py:803] 2025-04-26 14:22:22,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 488 483 [WARNING|trainer.py:803] 2025-04-26 14:22:23,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:23,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 486 [WARNING|trainer.py:803] 2025-04-26 14:22:24,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 489 484 [WARNING|trainer.py:803] 2025-04-26 14:22:24,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:25,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 487 [WARNING|trainer.py:803] 2025-04-26 14:22:25,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 490 485 [WARNING|trainer.py:803] 2025-04-26 14:22:26,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:26,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 488 [WARNING|trainer.py:803] 2025-04-26 14:22:27,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 491 486 [WARNING|trainer.py:803] 2025-04-26 14:22:27,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:28,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 489 [WARNING|trainer.py:803] 2025-04-26 14:22:28,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 492 487 [WARNING|trainer.py:803] 2025-04-26 14:22:29,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:29,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 490 [WARNING|trainer.py:803] 2025-04-26 14:22:29,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 493 488 [WARNING|trainer.py:803] 2025-04-26 14:22:30,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:22:30,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 491 [WARNING|trainer.py:803] 2025-04-26 14:22:31,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 494 489 [WARNING|trainer.py:803] 2025-04-26 14:22:31,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:32,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 492 [WARNING|trainer.py:803] 2025-04-26 14:22:32,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 495 490 [WARNING|trainer.py:803] 2025-04-26 14:22:33,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:33,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 493 [WARNING|trainer.py:803] 2025-04-26 14:22:34,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 496 491 [WARNING|trainer.py:803] 2025-04-26 14:22:34,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:35,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 494 [WARNING|trainer.py:803] 2025-04-26 14:22:35,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 497 492 [WARNING|trainer.py:803] 2025-04-26 14:22:36,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:36,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 495 [WARNING|trainer.py:803] 2025-04-26 14:22:36,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 498 493 [WARNING|trainer.py:803] 2025-04-26 14:22:37,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:37,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 496 499 [WARNING|trainer.py:803] 2025-04-26 14:22:38,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 494 [WARNING|trainer.py:803] 2025-04-26 14:22:38,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:39,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 497 500 [WARNING|trainer.py:803] 2025-04-26 14:22:39,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 495 [WARNING|trainer.py:803] 2025-04-26 14:22:40,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:40,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 498 501 [WARNING|trainer.py:803] 2025-04-26 14:22:41,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:41,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 496 [WARNING|trainer.py:803] 2025-04-26 14:22:41,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 499 502 [WARNING|trainer.py:803] 2025-04-26 14:22:42,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:43,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:43,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 497 500 503 [WARNING|trainer.py:803] 2025-04-26 14:22:43,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:44,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:44,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 498 504 501 [WARNING|trainer.py:803] 2025-04-26 14:22:45,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:22:45,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:45,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 499 505 502 [WARNING|trainer.py:803] 2025-04-26 14:22:46,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:47,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 500 [WARNING|trainer.py:803] 2025-04-26 14:22:47,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 506 503 [WARNING|trainer.py:803] 2025-04-26 14:22:48,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:48,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:48,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 501 507 504 [WARNING|trainer.py:803] 2025-04-26 14:22:49,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:49,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:49,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 502 508 505 [WARNING|trainer.py:803] 2025-04-26 14:22:50,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:50,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:51,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 503 509 506 [WARNING|trainer.py:803] 2025-04-26 14:22:52,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:52,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:52,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 504 510 507 [WARNING|trainer.py:803] 2025-04-26 14:22:53,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:53,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:53,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 505 511 508 [WARNING|trainer.py:803] 2025-04-26 14:22:54,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:22:54,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:55,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 506 512 509 [WARNING|trainer.py:803] 2025-04-26 14:22:56,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:56,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:56,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 507 513 510 [WARNING|trainer.py:803] 2025-04-26 14:22:57,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:22:57,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:57,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 508 514 511 [WARNING|trainer.py:803] 2025-04-26 14:22:58,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:58,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:22:59,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 509 515 512 [WARNING|trainer.py:803] 2025-04-26 14:23:00,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:00,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:00,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 516 510 513 [WARNING|trainer.py:803] 2025-04-26 14:23:01,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:01,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:01,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 517 511 514 [WARNING|trainer.py:803] 2025-04-26 14:23:02,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:02,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:03,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 518 512 515 [WARNING|trainer.py:803] 2025-04-26 14:23:04,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:04,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:23:04,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 519 513 516 [WARNING|trainer.py:803] 2025-04-26 14:23:05,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:23:05,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:05,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 520 517 514 [WARNING|trainer.py:803] 2025-04-26 14:23:07,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:07,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:07,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 521 518 515 [WARNING|trainer.py:803] 2025-04-26 14:23:08,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:08,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:08,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 522 519 516 [WARNING|trainer.py:803] 2025-04-26 14:23:09,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:09,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 14:23:09,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 523 517 520 [WARNING|trainer.py:803] 2025-04-26 14:23:11,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:11,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:11,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 518 524 521 [WARNING|trainer.py:803] 2025-04-26 14:23:12,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:12,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:12,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 519 522 525 [WARNING|trainer.py:803] 2025-04-26 14:23:13,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:23:13,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:14,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 520 523 526 [WARNING|trainer.py:803] 2025-04-26 14:23:15,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:15,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:15,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 524 521 527 [WARNING|trainer.py:803] 2025-04-26 14:23:16,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:16,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:16,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 525 522 528 [WARNING|trainer.py:803] 2025-04-26 14:23:17,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:17,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:18,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 526 523 529 [WARNING|trainer.py:803] 2025-04-26 14:23:19,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:23:19,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:19,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 527 524 530 [WARNING|trainer.py:803] 2025-04-26 14:23:20,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:20,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:20,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 528 525 531 [WARNING|trainer.py:803] 2025-04-26 14:23:21,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:22,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:22,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 529 526 532 [WARNING|trainer.py:803] 2025-04-26 14:23:23,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:23,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:23:23,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 530 527 533 [WARNING|trainer.py:803] 2025-04-26 14:23:24,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:24,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:24,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 531 528 534 [WARNING|trainer.py:803] 2025-04-26 14:23:25,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:26,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:26,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 532 529 535 [WARNING|trainer.py:803] 2025-04-26 14:23:27,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:27,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 533 [WARNING|trainer.py:803] 2025-04-26 14:23:27,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 530 536 [WARNING|trainer.py:803] 2025-04-26 14:23:28,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:28,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:28,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 534 NoYes 531 537 [WARNING|trainer.py:803] 2025-04-26 14:23:29,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:23:30,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 535 [WARNING|trainer.py:803] 2025-04-26 14:23:30,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 532 538 [WARNING|trainer.py:803] 2025-04-26 14:23:31,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:31,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 536 [WARNING|trainer.py:803] 2025-04-26 14:23:31,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 533 539 [WARNING|trainer.py:803] 2025-04-26 14:23:32,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:32,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 537 [WARNING|trainer.py:803] 2025-04-26 14:23:32,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 534 540 [WARNING|trainer.py:803] 2025-04-26 14:23:33,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:34,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 538 [WARNING|trainer.py:803] 2025-04-26 14:23:34,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 535 541 [WARNING|trainer.py:803] 2025-04-26 14:23:35,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 539 [WARNING|trainer.py:803] 2025-04-26 14:23:35,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:35,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 536 542 [WARNING|trainer.py:803] 2025-04-26 14:23:36,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:36,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 540 [WARNING|trainer.py:803] 2025-04-26 14:23:36,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 537 543 [WARNING|trainer.py:803] 2025-04-26 14:23:37,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:38,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 541 [WARNING|trainer.py:803] 2025-04-26 14:23:38,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 538 544 [WARNING|trainer.py:803] 2025-04-26 14:23:39,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:39,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 542 [WARNING|trainer.py:803] 2025-04-26 14:23:39,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 539 545 [WARNING|trainer.py:803] 2025-04-26 14:23:40,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:40,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 543 [WARNING|trainer.py:803] 2025-04-26 14:23:40,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 540 546 [WARNING|trainer.py:803] 2025-04-26 14:23:41,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:23:42,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 544 [WARNING|trainer.py:803] 2025-04-26 14:23:42,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 541 547 [WARNING|trainer.py:803] 2025-04-26 14:23:43,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 545 [WARNING|trainer.py:803] 2025-04-26 14:23:43,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:43,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 542 548 [WARNING|trainer.py:803] 2025-04-26 14:23:44,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 546 [WARNING|trainer.py:803] 2025-04-26 14:23:44,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:44,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 543 549 [WARNING|trainer.py:803] 2025-04-26 14:23:45,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 547 [WARNING|trainer.py:803] 2025-04-26 14:23:46,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:23:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 544 550 [WARNING|trainer.py:803] 2025-04-26 14:23:46,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 548 [WARNING|trainer.py:803] 2025-04-26 14:23:47,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:47,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 545 551 [WARNING|trainer.py:803] 2025-04-26 14:23:48,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 549 [WARNING|trainer.py:803] 2025-04-26 14:23:48,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:48,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 546 552 [WARNING|trainer.py:803] 2025-04-26 14:23:49,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 550 [WARNING|trainer.py:803] 2025-04-26 14:23:50,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:50,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 547 553 [WARNING|trainer.py:803] 2025-04-26 14:23:50,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 551 [WARNING|trainer.py:803] 2025-04-26 14:23:51,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:51,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 548 554 [WARNING|trainer.py:803] 2025-04-26 14:23:52,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 552 [WARNING|trainer.py:803] 2025-04-26 14:23:52,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:23:52,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 549 555 [WARNING|trainer.py:803] 2025-04-26 14:23:53,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 553 [WARNING|trainer.py:803] 2025-04-26 14:23:54,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:23:54,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 550 556 [WARNING|trainer.py:803] 2025-04-26 14:23:54,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 554 [WARNING|trainer.py:803] 2025-04-26 14:23:55,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:23:55,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 551 [WARNING|trainer.py:803] 2025-04-26 14:23:55,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 557 555 [WARNING|trainer.py:803] 2025-04-26 14:23:56,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:56,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:57,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 552 558 556 [WARNING|trainer.py:803] 2025-04-26 14:23:58,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:58,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:58,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 553 559 557 [WARNING|trainer.py:803] 2025-04-26 14:23:59,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:59,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:23:59,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 554 560 558 [WARNING|trainer.py:803] 2025-04-26 14:24:00,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:00,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:01,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 555 561 559 [WARNING|trainer.py:803] 2025-04-26 14:24:02,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:02,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:02,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 556 562 560 [WARNING|trainer.py:803] 2025-04-26 14:24:03,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:03,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:03,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 557 563 561 [WARNING|trainer.py:803] 2025-04-26 14:24:04,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:04,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:05,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 558 564 562 [WARNING|trainer.py:803] 2025-04-26 14:24:06,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:06,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:06,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 565 559 563 [WARNING|trainer.py:803] 2025-04-26 14:24:07,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:07,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:07,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 566 560 564 [WARNING|trainer.py:803] 2025-04-26 14:24:08,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:08,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:09,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 567 561 565 [WARNING|trainer.py:803] 2025-04-26 14:24:10,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:10,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:10,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 568 562 566 [WARNING|trainer.py:803] 2025-04-26 14:24:11,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:11,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:11,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 569 563 567 [WARNING|trainer.py:803] 2025-04-26 14:24:12,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:13,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:13,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 570 564 568 [WARNING|trainer.py:803] 2025-04-26 14:24:14,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:14,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:14,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 571 565 569 [WARNING|trainer.py:803] 2025-04-26 14:24:15,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:15,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:15,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 572 566 570 [WARNING|trainer.py:803] 2025-04-26 14:24:17,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:17,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:17,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 573 567 571 [WARNING|trainer.py:803] 2025-04-26 14:24:18,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:18,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:18,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 574 568 572 [WARNING|trainer.py:803] 2025-04-26 14:24:19,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:19,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:20,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 575 569 573 [WARNING|trainer.py:803] 2025-04-26 14:24:21,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:21,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:21,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 576 570 574 [WARNING|trainer.py:803] 2025-04-26 14:24:22,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:22,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:22,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 577 571 575 [WARNING|trainer.py:803] 2025-04-26 14:24:23,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:23,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:24,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 578 572 576 [WARNING|trainer.py:803] 2025-04-26 14:24:25,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:25,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:25,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 579 573 577 [WARNING|trainer.py:803] 2025-04-26 14:24:26,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:26,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:26,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 580 578 574 [WARNING|trainer.py:803] 2025-04-26 14:24:27,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:28,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:28,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 581 579 575 [WARNING|trainer.py:803] 2025-04-26 14:24:29,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:29,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:29,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 582 580 576 [WARNING|trainer.py:803] 2025-04-26 14:24:30,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:30,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:30,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 583 581 577 [WARNING|trainer.py:803] 2025-04-26 14:24:31,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:32,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 584 [WARNING|trainer.py:803] 2025-04-26 14:24:32,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 582 578 [WARNING|trainer.py:803] 2025-04-26 14:24:33,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:33,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 585 [WARNING|trainer.py:803] 2025-04-26 14:24:33,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 583 579 [WARNING|trainer.py:803] 2025-04-26 14:24:34,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:34,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:34,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 586 584 580 [WARNING|trainer.py:803] 2025-04-26 14:24:35,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:36,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:36,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 587 585 581 [WARNING|trainer.py:803] 2025-04-26 14:24:37,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:37,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:37,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 588 586 582 [WARNING|trainer.py:803] 2025-04-26 14:24:38,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:38,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:39,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 589 587 583 [WARNING|trainer.py:803] 2025-04-26 14:24:39,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:40,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:40,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 590 588 584 [WARNING|trainer.py:803] 2025-04-26 14:24:41,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:41,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:41,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 591 589 585 [WARNING|trainer.py:803] 2025-04-26 14:24:42,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:42,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:24:43,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 592 590 586 [WARNING|trainer.py:803] 2025-04-26 14:24:44,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:44,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:24:44,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 593 591 587 [WARNING|trainer.py:803] 2025-04-26 14:24:45,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:45,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 594 [WARNING|trainer.py:803] 2025-04-26 14:24:46,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 592 588 [WARNING|trainer.py:803] 2025-04-26 14:24:46,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:47,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:47,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 595 593 589 [WARNING|trainer.py:803] 2025-04-26 14:24:48,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:48,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:48,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 596 594 590 [WARNING|trainer.py:803] 2025-04-26 14:24:49,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:49,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 597 [WARNING|trainer.py:803] 2025-04-26 14:24:50,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 595 591 [WARNING|trainer.py:803] 2025-04-26 14:24:50,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:51,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 598 [WARNING|trainer.py:803] 2025-04-26 14:24:51,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 596 592 [WARNING|trainer.py:803] 2025-04-26 14:24:52,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:52,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 599 [WARNING|trainer.py:803] 2025-04-26 14:24:52,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 597 593 [WARNING|trainer.py:803] 2025-04-26 14:24:53,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:53,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 600 [WARNING|trainer.py:803] 2025-04-26 14:24:54,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 598 594 [WARNING|trainer.py:803] 2025-04-26 14:24:54,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:55,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:55,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 599 595 601 [WARNING|trainer.py:803] 2025-04-26 14:24:56,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:56,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 600 [WARNING|trainer.py:803] 2025-04-26 14:24:57,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 596 [WARNING|trainer.py:803] 2025-04-26 14:24:57,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:24:58,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 602 597 601 [WARNING|trainer.py:803] 2025-04-26 14:24:59,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:24:59,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 598 [WARNING|trainer.py:803] 2025-04-26 14:25:00,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 603 [WARNING|trainer.py:803] 2025-04-26 14:25:01,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 599 602 [WARNING|trainer.py:803] 2025-04-26 14:25:01,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:02,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:02,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 600 604 [WARNING|trainer.py:803] 2025-04-26 14:25:03,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 603 [WARNING|trainer.py:803] 2025-04-26 14:25:04,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:04,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 601 605 [WARNING|trainer.py:803] 2025-04-26 14:25:06,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 604 [WARNING|trainer.py:803] 2025-04-26 14:25:06,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:07,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 602 606 [WARNING|trainer.py:803] 2025-04-26 14:25:08,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:25:08,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 605 [WARNING|trainer.py:803] 2025-04-26 14:25:09,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 603 607 [WARNING|trainer.py:803] 2025-04-26 14:25:10,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 606 [WARNING|trainer.py:803] 2025-04-26 14:25:10,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:25:11,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 604 608 [WARNING|trainer.py:803] 2025-04-26 14:25:12,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 607 [WARNING|trainer.py:803] 2025-04-26 14:25:13,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:13,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 605 609 [WARNING|trainer.py:803] 2025-04-26 14:25:15,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 608 [WARNING|trainer.py:803] 2025-04-26 14:25:15,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:25:16,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 606 610 [WARNING|trainer.py:803] 2025-04-26 14:25:17,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:17,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 609 607 [WARNING|trainer.py:803] 2025-04-26 14:25:18,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 611 [WARNING|trainer.py:803] 2025-04-26 14:25:19,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:25:19,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 610 [WARNING|trainer.py:803] 2025-04-26 14:25:20,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 608 612 [WARNING|trainer.py:803] 2025-04-26 14:25:22,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 611 [WARNING|trainer.py:803] 2025-04-26 14:25:22,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:23,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 609 613 [WARNING|trainer.py:803] 2025-04-26 14:25:24,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 612 [WARNING|trainer.py:803] 2025-04-26 14:25:25,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 610 [WARNING|trainer.py:803] 2025-04-26 14:25:25,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:26,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 614 613 [WARNING|trainer.py:803] 2025-04-26 14:25:27,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 611 [WARNING|trainer.py:803] 2025-04-26 14:25:28,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 615 [WARNING|trainer.py:803] 2025-04-26 14:25:28,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:25:29,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 614 [WARNING|trainer.py:803] 2025-04-26 14:25:30,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 612 616 [WARNING|trainer.py:803] 2025-04-26 14:25:31,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:31,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 615 [WARNING|trainer.py:803] 2025-04-26 14:25:32,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 617 613 [WARNING|trainer.py:803] 2025-04-26 14:25:33,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 616 [WARNING|trainer.py:803] 2025-04-26 14:25:33,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:34,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 618 614 [WARNING|trainer.py:803] 2025-04-26 14:25:36,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 617 [WARNING|trainer.py:803] 2025-04-26 14:25:36,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:36,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 615 619 [WARNING|trainer.py:803] 2025-04-26 14:25:38,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:25:38,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 618 [WARNING|trainer.py:803] 2025-04-26 14:25:39,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 620 616 [WARNING|trainer.py:803] 2025-04-26 14:25:40,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:25:40,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 619 [WARNING|trainer.py:803] 2025-04-26 14:25:41,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 621 617 [WARNING|trainer.py:803] 2025-04-26 14:25:42,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:25:42,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 620 [WARNING|trainer.py:803] 2025-04-26 14:25:43,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 618 622 621 [WARNING|trainer.py:803] 2025-04-26 14:25:44,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:44,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:45,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 619 623 622 [WARNING|trainer.py:803] 2025-04-26 14:25:47,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:25:47,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:48,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 620 624 [WARNING|trainer.py:803] 2025-04-26 14:25:49,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 623 [WARNING|trainer.py:803] 2025-04-26 14:25:49,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:50,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 621 625 [WARNING|trainer.py:803] 2025-04-26 14:25:51,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 624 [WARNING|trainer.py:803] 2025-04-26 14:25:52,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:53,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 622 626 [WARNING|trainer.py:803] 2025-04-26 14:25:54,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 625 [WARNING|trainer.py:803] 2025-04-26 14:25:54,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:25:55,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 623 627 [WARNING|trainer.py:803] 2025-04-26 14:25:56,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 626 [WARNING|trainer.py:803] 2025-04-26 14:25:57,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:25:57,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 624 628 [WARNING|trainer.py:803] 2025-04-26 14:25:59,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 627 [WARNING|trainer.py:803] 2025-04-26 14:25:59,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:00,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 629 625 628 [WARNING|trainer.py:803] 2025-04-26 14:26:01,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:01,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:02,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 626 630 629 [WARNING|trainer.py:803] 2025-04-26 14:26:03,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:04,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:04,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 631 627 630 [WARNING|trainer.py:803] 2025-04-26 14:26:06,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:06,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:06,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 632 628 631 [WARNING|trainer.py:803] 2025-04-26 14:26:08,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:08,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:08,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 629 633 632 [WARNING|trainer.py:803] 2025-04-26 14:26:10,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:10,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:11,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 633 630 634 [WARNING|trainer.py:803] 2025-04-26 14:26:13,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:13,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:13,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 631 635 634 [WARNING|trainer.py:803] 2025-04-26 14:26:15,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:15,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:16,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 632 636 635 [WARNING|trainer.py:803] 2025-04-26 14:26:17,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:17,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:18,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 633 637 636 [WARNING|trainer.py:803] 2025-04-26 14:26:20,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:20,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:20,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 638 637 634 [WARNING|trainer.py:803] 2025-04-26 14:26:22,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:22,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:22,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 639 638 635 [WARNING|trainer.py:803] 2025-04-26 14:26:24,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:24,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:25,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 640 639 636 [WARNING|trainer.py:803] 2025-04-26 14:26:26,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:27,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:27,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 641 640 637 [WARNING|trainer.py:803] 2025-04-26 14:26:29,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:29,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:29,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 642 641 638 [WARNING|trainer.py:803] 2025-04-26 14:26:31,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:31,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:31,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 642 643 639 [WARNING|trainer.py:803] 2025-04-26 14:26:34,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:34,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:26:34,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 640 643 644 [WARNING|trainer.py:803] 2025-04-26 14:26:36,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:36,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:26:36,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 641 644 645 [WARNING|trainer.py:803] 2025-04-26 14:26:38,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:39,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:39,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 642 646 645 [WARNING|trainer.py:803] 2025-04-26 14:26:40,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:41,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:41,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 643 647 646 [WARNING|trainer.py:803] 2025-04-26 14:26:43,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:26:43,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:43,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 648 644 647 [WARNING|trainer.py:803] 2025-04-26 14:26:46,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:26:46,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:46,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 649 645 648 [WARNING|trainer.py:803] 2025-04-26 14:26:48,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:48,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:48,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 646 649 650 [WARNING|trainer.py:803] 2025-04-26 14:26:50,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:50,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:50,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 651 650 647 [WARNING|trainer.py:803] 2025-04-26 14:26:52,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:26:53,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:53,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 652 651 648 [WARNING|trainer.py:803] 2025-04-26 14:26:55,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:55,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:26:55,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 653 652 649 [WARNING|trainer.py:803] 2025-04-26 14:26:57,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:26:57,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:26:57,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 653 654 650 [WARNING|trainer.py:803] 2025-04-26 14:26:59,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:00,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:00,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 651 655 654 [WARNING|trainer.py:803] 2025-04-26 14:27:02,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:02,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:27:02,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 656 652 655 [WARNING|trainer.py:803] 2025-04-26 14:27:04,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:04,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:27:04,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 657 653 656 [WARNING|trainer.py:803] 2025-04-26 14:27:06,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:07,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:07,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 658 657 654 [WARNING|trainer.py:803] 2025-04-26 14:27:09,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:09,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:09,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 659 658 655 [WARNING|trainer.py:803] 2025-04-26 14:27:11,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:11,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:11,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 660 656 659 [WARNING|trainer.py:803] 2025-04-26 14:27:13,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:13,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:14,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 661 660 657 [WARNING|trainer.py:803] 2025-04-26 14:27:15,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:16,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:16,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 661 658 662 [WARNING|trainer.py:803] 2025-04-26 14:27:18,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:18,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:18,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 659 663 662 [WARNING|trainer.py:803] 2025-04-26 14:27:20,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:21,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:21,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 660 663 664 [WARNING|trainer.py:803] 2025-04-26 14:27:23,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:27:23,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:23,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 661 665 [WARNING|trainer.py:803] 2025-04-26 14:27:25,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 664 [WARNING|trainer.py:803] 2025-04-26 14:27:26,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:26,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 666 662 665 [WARNING|trainer.py:803] 2025-04-26 14:27:28,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:28,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:28,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 667 666 663 [WARNING|trainer.py:803] 2025-04-26 14:27:30,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:30,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:30,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 668 667 664 [WARNING|trainer.py:803] 2025-04-26 14:27:32,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:32,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:27:33,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 669 668 665 [WARNING|trainer.py:803] 2025-04-26 14:27:34,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:35,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:35,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 670 669 666 [WARNING|trainer.py:803] 2025-04-26 14:27:37,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:37,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:37,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 667 670 671 [WARNING|trainer.py:803] 2025-04-26 14:27:39,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:40,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:40,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 668 672 671 [WARNING|trainer.py:803] 2025-04-26 14:27:42,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:42,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:42,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 673 669 672 [WARNING|trainer.py:803] 2025-04-26 14:27:44,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:27:44,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:44,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 674 670 673 [WARNING|trainer.py:803] 2025-04-26 14:27:46,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:46,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:46,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 675 674 671 [WARNING|trainer.py:803] 2025-04-26 14:27:49,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:49,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:49,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 676 672 675 [WARNING|trainer.py:803] 2025-04-26 14:27:51,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:51,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:51,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 677 673 676 [WARNING|trainer.py:803] 2025-04-26 14:27:53,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:27:53,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:27:54,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 678 674 677 [WARNING|trainer.py:803] 2025-04-26 14:27:55,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:56,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:27:56,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 679 678 675 [WARNING|trainer.py:803] 2025-04-26 14:27:58,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:27:58,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:27:58,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 680 679 676 [WARNING|trainer.py:803] 2025-04-26 14:28:00,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:00,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:00,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 681 680 677 [WARNING|trainer.py:803] 2025-04-26 14:28:02,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:03,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:03,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 682 681 678 [WARNING|trainer.py:803] 2025-04-26 14:28:04,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:05,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:05,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 682 683 679 [WARNING|trainer.py:803] 2025-04-26 14:28:07,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:07,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:07,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 680 683 684 [WARNING|trainer.py:803] 2025-04-26 14:28:09,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:10,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:10,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 681 [WARNING|trainer.py:803] 2025-04-26 14:28:12,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 685 684 [WARNING|trainer.py:803] 2025-04-26 14:28:12,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 682 [WARNING|trainer.py:803] 2025-04-26 14:28:13,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:14,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 686 685 [WARNING|trainer.py:803] 2025-04-26 14:28:15,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:15,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 683 [WARNING|trainer.py:803] 2025-04-26 14:28:16,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 687 686 [WARNING|trainer.py:803] 2025-04-26 14:28:17,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:18,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 684 688 687 [WARNING|trainer.py:803] 2025-04-26 14:28:19,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:20,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:20,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 685 689 688 [WARNING|trainer.py:803] 2025-04-26 14:28:22,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:22,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:22,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 690 689 686 [WARNING|trainer.py:803] 2025-04-26 14:28:24,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:24,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:24,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 690 691 687 [WARNING|trainer.py:803] 2025-04-26 14:28:27,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:27,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:28:27,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 688 691 692 [WARNING|trainer.py:803] 2025-04-26 14:28:29,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:29,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:28:29,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 689 693 692 [WARNING|trainer.py:803] 2025-04-26 14:28:31,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:28:31,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:32,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 690 694 693 [WARNING|trainer.py:803] 2025-04-26 14:28:33,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:33,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:34,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 694 691 695 [WARNING|trainer.py:803] 2025-04-26 14:28:36,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:36,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:28:36,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 696 695 692 [WARNING|trainer.py:803] 2025-04-26 14:28:38,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:39,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:39,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 697 693 696 [WARNING|trainer.py:803] 2025-04-26 14:28:41,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:41,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:41,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 694 697 698 [WARNING|trainer.py:803] 2025-04-26 14:28:43,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:43,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:44,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 695 699 698 [WARNING|trainer.py:803] 2025-04-26 14:28:46,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:46,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:46,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 696 700 699 [WARNING|trainer.py:803] 2025-04-26 14:28:48,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:48,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:49,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 697 701 700 [WARNING|trainer.py:803] 2025-04-26 14:28:50,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:28:50,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:28:51,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 702 701 698 [WARNING|trainer.py:803] 2025-04-26 14:28:53,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:53,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:28:53,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 703 702 699 [WARNING|trainer.py:803] 2025-04-26 14:28:55,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No [WARNING|trainer.py:803] 2025-04-26 14:28:56,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:28:56,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 704 703 700 [WARNING|trainer.py:803] 2025-04-26 14:28:57,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:28:58,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 14:28:58,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 705 704 701 [WARNING|trainer.py:803] 2025-04-26 14:29:00,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:00,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:00,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 705 706 702 [WARNING|trainer.py:803] 2025-04-26 14:29:02,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:02,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:03,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 707 703 706 [WARNING|trainer.py:803] 2025-04-26 14:29:05,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:29:05,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 14:29:05,432 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :Yes 704 708 707 [WARNING|trainer.py:803] 2025-04-26 14:29:07,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:07,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:07,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 705 709 708 [WARNING|trainer.py:803] 2025-04-26 14:29:09,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:09,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:10,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 709 710 706 [WARNING|trainer.py:803] 2025-04-26 14:29:12,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:12,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:12,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 707 711 710 [WARNING|trainer.py:803] 2025-04-26 14:29:14,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:29:14,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:14,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 712 711 708 [WARNING|trainer.py:803] 2025-04-26 14:29:17,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:29:17,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:17,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 709 712 713 [WARNING|trainer.py:803] 2025-04-26 14:29:19,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:19,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:19,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 710 713 714 [WARNING|trainer.py:803] 2025-04-26 14:29:22,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:22,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:22,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 711 715 714 [WARNING|trainer.py:803] 2025-04-26 14:29:24,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:24,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:24,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 712 715 716 [WARNING|trainer.py:803] 2025-04-26 14:29:26,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:27,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:29:27,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 716 713 717 [WARNING|trainer.py:803] 2025-04-26 14:29:29,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:29,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:29,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 717 718 714 [WARNING|trainer.py:803] 2025-04-26 14:29:32,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:32,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:32,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 715 718 719 [WARNING|trainer.py:803] 2025-04-26 14:29:34,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:34,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:34,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 716 720 719 [WARNING|trainer.py:803] 2025-04-26 14:29:36,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:37,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:37,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 721 720 717 [WARNING|trainer.py:803] 2025-04-26 14:29:39,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:39,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:39,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 722 721 718 [WARNING|trainer.py:803] 2025-04-26 14:29:41,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:42,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:42,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 723 722 719 [WARNING|trainer.py:803] 2025-04-26 14:29:44,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:29:44,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:44,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 723 720 724 [WARNING|trainer.py:803] 2025-04-26 14:29:46,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:29:47,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:47,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 721 724 725 [WARNING|trainer.py:803] 2025-04-26 14:29:49,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:49,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:29:49,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 722 726 725 [WARNING|trainer.py:803] 2025-04-26 14:29:52,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:52,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:52,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 723 727 726 [WARNING|trainer.py:803] 2025-04-26 14:29:54,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:29:54,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:54,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 728 727 724 [WARNING|trainer.py:803] 2025-04-26 14:29:56,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:29:57,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:29:57,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 728 729 [WARNING|trainer.py:803] 2025-04-26 14:29:59,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 725 [WARNING|trainer.py:803] 2025-04-26 14:29:59,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:30:00,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 730 729 [WARNING|trainer.py:803] 2025-04-26 14:30:01,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 726 [WARNING|trainer.py:803] 2025-04-26 14:30:01,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:02,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 731 730 [WARNING|trainer.py:803] 2025-04-26 14:30:03,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 727 [WARNING|trainer.py:803] 2025-04-26 14:30:03,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:04,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 732 731 [WARNING|trainer.py:803] 2025-04-26 14:30:05,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 728 [WARNING|trainer.py:803] 2025-04-26 14:30:06,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:06,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 732 733 [WARNING|trainer.py:803] 2025-04-26 14:30:08,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:08,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 729 [WARNING|trainer.py:803] 2025-04-26 14:30:09,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 733 734 [WARNING|trainer.py:803] 2025-04-26 14:30:10,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:30:10,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 730 [WARNING|trainer.py:803] 2025-04-26 14:30:11,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 734 735 731 [WARNING|trainer.py:803] 2025-04-26 14:30:13,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:13,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:30:13,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 736 735 732 [WARNING|trainer.py:803] 2025-04-26 14:30:15,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:15,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:30:15,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 736 737 733 [WARNING|trainer.py:803] 2025-04-26 14:30:18,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:18,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:30:18,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 738 737 734 [WARNING|trainer.py:803] 2025-04-26 14:30:20,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:20,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:30:20,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 739 738 735 [WARNING|trainer.py:803] 2025-04-26 14:30:22,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:23,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:23,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 740 739 736 [WARNING|trainer.py:803] 2025-04-26 14:30:25,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:25,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:25,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 741 740 [WARNING|trainer.py:803] 2025-04-26 14:30:27,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 737 [WARNING|trainer.py:803] 2025-04-26 14:30:27,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:28,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 742 741 [WARNING|trainer.py:803] 2025-04-26 14:30:29,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 738 [WARNING|trainer.py:803] 2025-04-26 14:30:29,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:30:30,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 742 743 [WARNING|trainer.py:803] 2025-04-26 14:30:32,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:32,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 739 [WARNING|trainer.py:803] 2025-04-26 14:30:33,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 744 743 [WARNING|trainer.py:803] 2025-04-26 14:30:34,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:34,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 740 [WARNING|trainer.py:803] 2025-04-26 14:30:35,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 744 745 741 [WARNING|trainer.py:803] 2025-04-26 14:30:36,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:36,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:37,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 746 745 742 [WARNING|trainer.py:803] 2025-04-26 14:30:39,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:39,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:39,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 746 747 743 [WARNING|trainer.py:803] 2025-04-26 14:30:41,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:41,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:30:42,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 748 747 744 [WARNING|trainer.py:803] 2025-04-26 14:30:44,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:44,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:30:44,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 748 749 745 [WARNING|trainer.py:803] 2025-04-26 14:30:46,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:46,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:47,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 750 746 749 [WARNING|trainer.py:803] 2025-04-26 14:30:48,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:49,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:49,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 750 751 747 [WARNING|trainer.py:803] 2025-04-26 14:30:51,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:30:51,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:51,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 752 751 748 [WARNING|trainer.py:803] 2025-04-26 14:30:53,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:53,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:54,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 753 752 749 [WARNING|trainer.py:803] 2025-04-26 14:30:56,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:30:56,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:30:56,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 753 754 750 [WARNING|trainer.py:803] 2025-04-26 14:30:58,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:30:58,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:30:59,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 754 755 [WARNING|trainer.py:803] 2025-04-26 14:31:00,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:00,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 751 [WARNING|trainer.py:803] 2025-04-26 14:31:01,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 756 755 [WARNING|trainer.py:803] 2025-04-26 14:31:02,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:31:02,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 752 [WARNING|trainer.py:803] 2025-04-26 14:31:03,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 756 757 [WARNING|trainer.py:803] 2025-04-26 14:31:05,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 753 [WARNING|trainer.py:803] 2025-04-26 14:31:05,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:31:05,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 757 [WARNING|trainer.py:803] 2025-04-26 14:31:07,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 758 754 [WARNING|trainer.py:803] 2025-04-26 14:31:08,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:08,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 758 759 755 [WARNING|trainer.py:803] 2025-04-26 14:31:10,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:10,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:10,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 759 756 760 [WARNING|trainer.py:803] 2025-04-26 14:31:12,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:12,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:31:12,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 757 760 761 [WARNING|trainer.py:803] 2025-04-26 14:31:14,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:31:15,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:15,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 761 762 758 [WARNING|trainer.py:803] 2025-04-26 14:31:17,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:17,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:17,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 762 763 759 [WARNING|trainer.py:803] 2025-04-26 14:31:19,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:19,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:20,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 763 764 760 [WARNING|trainer.py:803] 2025-04-26 14:31:22,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:31:22,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:31:22,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 764 765 761 [WARNING|trainer.py:803] 2025-04-26 14:31:24,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:31:24,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:25,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 765 766 762 [WARNING|trainer.py:803] 2025-04-26 14:31:26,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:27,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:27,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 766 767 763 [WARNING|trainer.py:803] 2025-04-26 14:31:29,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:29,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:29,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 767 768 764 [WARNING|trainer.py:803] 2025-04-26 14:31:31,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:32,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:31:32,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 768 765 769 [WARNING|trainer.py:803] 2025-04-26 14:31:34,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:34,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:35,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 766 770 769 [WARNING|trainer.py:803] 2025-04-26 14:31:37,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:37,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:37,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 771 770 767 [WARNING|trainer.py:803] 2025-04-26 14:31:39,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:31:39,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:39,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 772 771 768 [WARNING|trainer.py:803] 2025-04-26 14:31:41,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:31:41,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:31:42,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 773 772 [WARNING|trainer.py:803] 2025-04-26 14:31:43,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:31:43,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 769 [WARNING|trainer.py:803] 2025-04-26 14:31:45,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 774 773 [WARNING|trainer.py:803] 2025-04-26 14:31:46,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:31:46,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 770 [WARNING|trainer.py:803] 2025-04-26 14:31:47,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 775 774 [WARNING|trainer.py:803] 2025-04-26 14:31:48,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 771 [WARNING|trainer.py:803] 2025-04-26 14:31:48,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:31:49,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 776 775 [WARNING|trainer.py:803] 2025-04-26 14:31:50,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:50,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 772 777 [WARNING|trainer.py:803] 2025-04-26 14:31:51,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 776 [WARNING|trainer.py:803] 2025-04-26 14:31:52,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:31:52,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 773 [WARNING|trainer.py:803] 2025-04-26 14:31:54,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 778 777 [WARNING|trainer.py:803] 2025-04-26 14:31:55,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:55,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 774 [WARNING|trainer.py:803] 2025-04-26 14:31:56,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 779 778 [WARNING|trainer.py:803] 2025-04-26 14:31:57,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 775 [WARNING|trainer.py:803] 2025-04-26 14:31:57,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:31:58,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 779 780 [WARNING|trainer.py:803] 2025-04-26 14:31:59,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:00,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 776 [WARNING|trainer.py:803] 2025-04-26 14:32:00,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 781 780 [WARNING|trainer.py:803] 2025-04-26 14:32:02,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 777 [WARNING|trainer.py:803] 2025-04-26 14:32:02,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:03,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 782 781 [WARNING|trainer.py:803] 2025-04-26 14:32:04,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:04,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 778 [WARNING|trainer.py:803] 2025-04-26 14:32:05,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 783 782 [WARNING|trainer.py:803] 2025-04-26 14:32:07,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:07,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 779 [WARNING|trainer.py:803] 2025-04-26 14:32:08,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 784 783 [WARNING|trainer.py:803] 2025-04-26 14:32:09,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:09,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 780 [WARNING|trainer.py:803] 2025-04-26 14:32:10,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 784 785 [WARNING|trainer.py:803] 2025-04-26 14:32:11,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:11,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 781 [WARNING|trainer.py:803] 2025-04-26 14:32:13,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 786 785 [WARNING|trainer.py:803] 2025-04-26 14:32:14,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:14,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 782 [WARNING|trainer.py:803] 2025-04-26 14:32:15,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 786 787 [WARNING|trainer.py:803] 2025-04-26 14:32:16,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:16,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 783 [WARNING|trainer.py:803] 2025-04-26 14:32:17,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 787 788 [WARNING|trainer.py:803] 2025-04-26 14:32:19,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 784 [WARNING|trainer.py:803] 2025-04-26 14:32:19,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:20,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 789 788 785 [WARNING|trainer.py:803] 2025-04-26 14:32:21,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:21,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:22,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 790 789 [WARNING|trainer.py:803] 2025-04-26 14:32:23,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 786 [WARNING|trainer.py:803] 2025-04-26 14:32:24,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:24,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 791 790 [WARNING|trainer.py:803] 2025-04-26 14:32:25,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:26,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 787 [WARNING|trainer.py:803] 2025-04-26 14:32:27,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 791 792 [WARNING|trainer.py:803] 2025-04-26 14:32:28,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:28,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 788 792 793 [WARNING|trainer.py:803] 2025-04-26 14:32:30,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:32:30,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:31,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 789 793 [WARNING|trainer.py:803] 2025-04-26 14:32:32,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 794 [WARNING|trainer.py:803] 2025-04-26 14:32:33,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:32:33,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 790 795 [WARNING|trainer.py:803] 2025-04-26 14:32:34,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 794 [WARNING|trainer.py:803] 2025-04-26 14:32:35,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:35,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 791 [WARNING|trainer.py:803] 2025-04-26 14:32:36,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 795 796 [WARNING|trainer.py:803] 2025-04-26 14:32:37,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:37,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 792 [WARNING|trainer.py:803] 2025-04-26 14:32:39,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 796 797 [WARNING|trainer.py:803] 2025-04-26 14:32:40,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:32:40,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 793 [WARNING|trainer.py:803] 2025-04-26 14:32:41,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 797 798 [WARNING|trainer.py:803] 2025-04-26 14:32:42,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:42,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 794 799 798 [WARNING|trainer.py:803] 2025-04-26 14:32:44,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:32:44,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:32:45,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 795 [WARNING|trainer.py:803] 2025-04-26 14:32:46,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 799 800 [WARNING|trainer.py:803] 2025-04-26 14:32:47,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:32:47,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 796 [WARNING|trainer.py:803] 2025-04-26 14:32:48,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 800 801 [WARNING|trainer.py:803] 2025-04-26 14:32:49,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:32:49,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 797 [WARNING|trainer.py:803] 2025-04-26 14:32:51,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 801 802 [WARNING|trainer.py:803] 2025-04-26 14:32:52,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:32:52,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 798 [WARNING|trainer.py:803] 2025-04-26 14:32:53,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 803 802 [WARNING|trainer.py:803] 2025-04-26 14:32:54,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:32:54,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 799 [WARNING|trainer.py:803] 2025-04-26 14:32:55,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 804 803 [WARNING|trainer.py:803] 2025-04-26 14:32:56,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:32:57,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 800 805 [WARNING|trainer.py:803] 2025-04-26 14:32:58,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 804 [WARNING|trainer.py:803] 2025-04-26 14:32:59,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:32:59,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 801 806 805 [WARNING|trainer.py:803] 2025-04-26 14:33:00,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:01,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:01,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 802 806 807 [WARNING|trainer.py:803] 2025-04-26 14:33:03,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:03,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:04,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 803 808 807 [WARNING|trainer.py:803] 2025-04-26 14:33:06,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:06,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:06,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 804 809 [WARNING|trainer.py:803] 2025-04-26 14:33:08,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 808 [WARNING|trainer.py:803] 2025-04-26 14:33:08,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:08,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 805 810 809 [WARNING|trainer.py:803] 2025-04-26 14:33:10,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:10,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:11,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 806 811 810 [WARNING|trainer.py:803] 2025-04-26 14:33:12,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:12,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:13,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 812 807 811 [WARNING|trainer.py:803] 2025-04-26 14:33:15,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:33:15,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:15,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 813 808 812 [WARNING|trainer.py:803] 2025-04-26 14:33:17,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:33:17,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:17,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 809 814 813 [WARNING|trainer.py:803] 2025-04-26 14:33:19,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:20,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:33:20,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 815 810 814 [WARNING|trainer.py:803] 2025-04-26 14:33:22,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:22,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:22,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 816 811 815 [WARNING|trainer.py:803] 2025-04-26 14:33:24,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:24,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:33:24,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 817 812 816 [WARNING|trainer.py:803] 2025-04-26 14:33:26,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:33:26,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:27,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 818 813 [WARNING|trainer.py:803] 2025-04-26 14:33:28,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 817 [WARNING|trainer.py:803] 2025-04-26 14:33:28,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:33:29,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 819 814 [WARNING|trainer.py:803] 2025-04-26 14:33:30,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 818 [WARNING|trainer.py:803] 2025-04-26 14:33:31,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:33:31,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 820 815 819 [WARNING|trainer.py:803] 2025-04-26 14:33:33,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:33,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:33,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 821 816 [WARNING|trainer.py:803] 2025-04-26 14:33:35,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 820 [WARNING|trainer.py:803] 2025-04-26 14:33:35,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:36,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 822 817 821 [WARNING|trainer.py:803] 2025-04-26 14:33:37,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:38,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:33:38,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 823 818 822 [WARNING|trainer.py:803] 2025-04-26 14:33:40,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:40,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:40,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 819 824 [WARNING|trainer.py:803] 2025-04-26 14:33:42,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 823 [WARNING|trainer.py:803] 2025-04-26 14:33:42,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:33:43,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 825 820 [WARNING|trainer.py:803] 2025-04-26 14:33:44,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 824 [WARNING|trainer.py:803] 2025-04-26 14:33:44,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:45,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 826 821 825 [WARNING|trainer.py:803] 2025-04-26 14:33:46,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:47,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:47,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 827 822 [WARNING|trainer.py:803] 2025-04-26 14:33:48,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 826 [WARNING|trainer.py:803] 2025-04-26 14:33:49,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:49,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 828 [WARNING|trainer.py:803] 2025-04-26 14:33:50,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 827 823 [WARNING|trainer.py:803] 2025-04-26 14:33:51,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:33:52,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 829 828 [WARNING|trainer.py:803] 2025-04-26 14:33:53,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 824 [WARNING|trainer.py:803] 2025-04-26 14:33:53,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:33:54,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 830 829 [WARNING|trainer.py:803] 2025-04-26 14:33:55,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 825 [WARNING|trainer.py:803] 2025-04-26 14:33:56,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:33:56,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 831 830 826 [WARNING|trainer.py:803] 2025-04-26 14:33:57,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:33:58,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:33:58,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 832 831 827 [WARNING|trainer.py:803] 2025-04-26 14:34:00,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:00,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:34:00,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 833 828 [WARNING|trainer.py:803] 2025-04-26 14:34:02,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 832 [WARNING|trainer.py:803] 2025-04-26 14:34:02,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:34:03,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 834 829 833 [WARNING|trainer.py:803] 2025-04-26 14:34:04,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:05,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:05,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 835 834 830 [WARNING|trainer.py:803] 2025-04-26 14:34:06,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:07,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:07,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 836 [WARNING|trainer.py:803] 2025-04-26 14:34:08,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 835 831 [WARNING|trainer.py:803] 2025-04-26 14:34:09,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:09,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 836 837 832 [WARNING|trainer.py:803] 2025-04-26 14:34:11,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:11,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:12,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 838 833 837 [WARNING|trainer.py:803] 2025-04-26 14:34:13,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:34:14,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:14,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 839 834 838 [WARNING|trainer.py:803] 2025-04-26 14:34:16,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:16,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:16,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 840 835 839 [WARNING|trainer.py:803] 2025-04-26 14:34:18,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:18,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:18,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 841 836 840 [WARNING|trainer.py:803] 2025-04-26 14:34:20,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:20,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:21,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 842 841 [WARNING|trainer.py:803] 2025-04-26 14:34:22,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 837 [WARNING|trainer.py:803] 2025-04-26 14:34:23,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:23,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 843 842 [WARNING|trainer.py:803] 2025-04-26 14:34:25,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 838 [WARNING|trainer.py:803] 2025-04-26 14:34:25,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:34:26,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 844 843 [WARNING|trainer.py:803] 2025-04-26 14:34:27,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 839 [WARNING|trainer.py:803] 2025-04-26 14:34:28,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:28,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 845 844 [WARNING|trainer.py:803] 2025-04-26 14:34:29,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 840 [WARNING|trainer.py:803] 2025-04-26 14:34:30,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:30,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 846 845 [WARNING|trainer.py:803] 2025-04-26 14:34:31,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 841 [WARNING|trainer.py:803] 2025-04-26 14:34:32,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 847 [WARNING|trainer.py:803] 2025-04-26 14:34:32,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:33,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 846 842 [WARNING|trainer.py:803] 2025-04-26 14:34:34,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 848 [WARNING|trainer.py:803] 2025-04-26 14:34:35,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 847 [WARNING|trainer.py:803] 2025-04-26 14:34:35,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:36,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 843 849 [WARNING|trainer.py:803] 2025-04-26 14:34:37,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:38,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 848 844 [WARNING|trainer.py:803] 2025-04-26 14:34:38,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:39,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 850 849 [WARNING|trainer.py:803] 2025-04-26 14:34:40,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 845 [WARNING|trainer.py:803] 2025-04-26 14:34:41,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:41,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 851 [WARNING|trainer.py:803] 2025-04-26 14:34:42,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 850 846 [WARNING|trainer.py:803] 2025-04-26 14:34:43,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:34:43,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 852 [WARNING|trainer.py:803] 2025-04-26 14:34:44,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 851 847 [WARNING|trainer.py:803] 2025-04-26 14:34:45,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:46,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 853 [WARNING|trainer.py:803] 2025-04-26 14:34:47,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 852 848 [WARNING|trainer.py:803] 2025-04-26 14:34:47,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 854 [WARNING|trainer.py:803] 2025-04-26 14:34:48,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:34:49,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 853 849 [WARNING|trainer.py:803] 2025-04-26 14:34:50,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 855 [WARNING|trainer.py:803] 2025-04-26 14:34:50,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:34:51,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 854 [WARNING|trainer.py:803] 2025-04-26 14:34:52,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 856 850 [WARNING|trainer.py:803] 2025-04-26 14:34:53,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 14:34:53,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 855 [WARNING|trainer.py:803] 2025-04-26 14:34:54,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 851 857 [WARNING|trainer.py:803] 2025-04-26 14:34:55,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:34:55,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 856 [WARNING|trainer.py:803] 2025-04-26 14:34:56,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 852 858 [WARNING|trainer.py:803] 2025-04-26 14:34:57,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 857 [WARNING|trainer.py:803] 2025-04-26 14:34:58,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:34:58,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 853 859 [WARNING|trainer.py:803] 2025-04-26 14:34:59,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:35:00,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 858 854 [WARNING|trainer.py:803] 2025-04-26 14:35:01,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 860 [WARNING|trainer.py:803] 2025-04-26 14:35:02,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:35:02,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 859 855 [WARNING|trainer.py:803] 2025-04-26 14:35:03,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 861 [WARNING|trainer.py:803] 2025-04-26 14:35:04,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:35:04,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 860 856 [WARNING|trainer.py:803] 2025-04-26 14:35:05,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 862 [WARNING|trainer.py:803] 2025-04-26 14:35:06,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 14:35:06,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 861 857 [WARNING|trainer.py:803] 2025-04-26 14:35:07,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 863 [WARNING|trainer.py:803] 2025-04-26 14:35:08,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:35:08,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 862 [WARNING|trainer.py:803] 2025-04-26 14:35:10,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 864 858 [WARNING|trainer.py:803] 2025-04-26 14:35:11,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:11,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 863 [WARNING|trainer.py:803] 2025-04-26 14:35:12,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 865 859 [WARNING|trainer.py:803] 2025-04-26 14:35:13,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 864 [WARNING|trainer.py:803] 2025-04-26 14:35:13,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:14,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 866 860 [WARNING|trainer.py:803] 2025-04-26 14:35:15,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:15,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 865 867 [WARNING|trainer.py:803] 2025-04-26 14:35:16,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 861 [WARNING|trainer.py:803] 2025-04-26 14:35:17,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:17,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 866 [WARNING|trainer.py:803] 2025-04-26 14:35:18,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 868 862 [WARNING|trainer.py:803] 2025-04-26 14:35:19,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 867 [WARNING|trainer.py:803] 2025-04-26 14:35:20,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:20,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 869 863 [WARNING|trainer.py:803] 2025-04-26 14:35:21,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 868 [WARNING|trainer.py:803] 2025-04-26 14:35:22,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:22,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 870 864 [WARNING|trainer.py:803] 2025-04-26 14:35:23,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 869 [WARNING|trainer.py:803] 2025-04-26 14:35:24,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:24,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 871 865 [WARNING|trainer.py:803] 2025-04-26 14:35:25,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 870 [WARNING|trainer.py:803] 2025-04-26 14:35:26,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:26,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 866 872 871 [WARNING|trainer.py:803] 2025-04-26 14:35:28,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:28,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:29,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 867 873 [WARNING|trainer.py:803] 2025-04-26 14:35:30,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:35:30,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 872 [WARNING|trainer.py:803] 2025-04-26 14:35:31,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 868 874 [WARNING|trainer.py:803] 2025-04-26 14:35:32,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:35:33,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 873 869 [WARNING|trainer.py:803] 2025-04-26 14:35:34,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 875 [WARNING|trainer.py:803] 2025-04-26 14:35:35,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 874 [WARNING|trainer.py:803] 2025-04-26 14:35:35,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 870 [WARNING|trainer.py:803] 2025-04-26 14:35:36,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 876 [WARNING|trainer.py:803] 2025-04-26 14:35:37,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:37,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 875 871 [WARNING|trainer.py:803] 2025-04-26 14:35:38,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 877 [WARNING|trainer.py:803] 2025-04-26 14:35:39,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:39,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 876 [WARNING|trainer.py:803] 2025-04-26 14:35:41,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 878 872 [WARNING|trainer.py:803] 2025-04-26 14:35:41,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:35:42,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 877 879 [WARNING|trainer.py:803] 2025-04-26 14:35:43,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 873 [WARNING|trainer.py:803] 2025-04-26 14:35:43,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 878 [WARNING|trainer.py:803] 2025-04-26 14:35:44,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:45,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 880 874 [WARNING|trainer.py:803] 2025-04-26 14:35:46,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 879 [WARNING|trainer.py:803] 2025-04-26 14:35:46,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:35:47,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 881 875 [WARNING|trainer.py:803] 2025-04-26 14:35:48,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 880 [WARNING|trainer.py:803] 2025-04-26 14:35:49,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:49,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 882 876 881 [WARNING|trainer.py:803] 2025-04-26 14:35:50,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:51,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:51,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 883 877 [WARNING|trainer.py:803] 2025-04-26 14:35:53,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 882 [WARNING|trainer.py:803] 2025-04-26 14:35:53,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:53,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 884 878 [WARNING|trainer.py:803] 2025-04-26 14:35:55,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 883 [WARNING|trainer.py:803] 2025-04-26 14:35:55,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:35:56,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 885 879 [WARNING|trainer.py:803] 2025-04-26 14:35:57,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 884 [WARNING|trainer.py:803] 2025-04-26 14:35:57,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:35:58,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 886 880 885 [WARNING|trainer.py:803] 2025-04-26 14:35:59,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:00,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:00,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 887 881 886 [WARNING|trainer.py:803] 2025-04-26 14:36:02,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:02,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:36:02,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 888 882 887 [WARNING|trainer.py:803] 2025-04-26 14:36:04,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:04,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:36:05,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 889 883 [WARNING|trainer.py:803] 2025-04-26 14:36:06,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:06,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 888 [WARNING|trainer.py:803] 2025-04-26 14:36:07,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 884 890 889 [WARNING|trainer.py:803] 2025-04-26 14:36:08,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:36:09,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:09,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 885 891 [WARNING|trainer.py:803] 2025-04-26 14:36:11,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 890 [WARNING|trainer.py:803] 2025-04-26 14:36:11,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:12,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 886 892 [WARNING|trainer.py:803] 2025-04-26 14:36:13,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:13,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 891 [WARNING|trainer.py:803] 2025-04-26 14:36:14,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 893 887 [WARNING|trainer.py:803] 2025-04-26 14:36:15,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 892 [WARNING|trainer.py:803] 2025-04-26 14:36:16,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:16,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 894 888 893 [WARNING|trainer.py:803] 2025-04-26 14:36:18,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:18,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:18,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 889 895 894 [WARNING|trainer.py:803] 2025-04-26 14:36:20,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:20,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:21,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 896 890 895 [WARNING|trainer.py:803] 2025-04-26 14:36:23,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:36:23,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:23,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 897 891 896 [WARNING|trainer.py:803] 2025-04-26 14:36:25,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:25,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:26,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 892 898 897 [WARNING|trainer.py:803] 2025-04-26 14:36:27,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:27,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:28,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 893 899 898 [WARNING|trainer.py:803] 2025-04-26 14:36:29,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:30,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:30,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 894 900 899 [WARNING|trainer.py:803] 2025-04-26 14:36:32,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:32,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:33,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 901 895 [WARNING|trainer.py:803] 2025-04-26 14:36:34,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 900 [WARNING|trainer.py:803] 2025-04-26 14:36:34,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 902 [WARNING|trainer.py:803] 2025-04-26 14:36:35,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:35,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 901 896 [WARNING|trainer.py:803] 2025-04-26 14:36:36,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:36:37,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 903 902 [WARNING|trainer.py:803] 2025-04-26 14:36:38,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:38,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 897 904 [WARNING|trainer.py:803] 2025-04-26 14:36:39,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:40,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 903 905 [WARNING|trainer.py:803] 2025-04-26 14:36:41,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 898 [WARNING|trainer.py:803] 2025-04-26 14:36:41,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 904 [WARNING|trainer.py:803] 2025-04-26 14:36:42,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:42,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 906 899 905 [WARNING|trainer.py:803] 2025-04-26 14:36:43,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:44,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:44,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 907 906 [WARNING|trainer.py:803] 2025-04-26 14:36:45,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 900 [WARNING|trainer.py:803] 2025-04-26 14:36:46,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 908 [WARNING|trainer.py:803] 2025-04-26 14:36:46,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 907 [WARNING|trainer.py:803] 2025-04-26 14:36:47,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
AnswerYes 901 [WARNING|trainer.py:803] 2025-04-26 14:36:47,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 909 [WARNING|trainer.py:803] 2025-04-26 14:36:48,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:36:48,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 908 902 910 [WARNING|trainer.py:803] 2025-04-26 14:36:49,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 14:36:50,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:36:50,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 909 911 [WARNING|trainer.py:803] 2025-04-26 14:36:51,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 903 [WARNING|trainer.py:803] 2025-04-26 14:36:52,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 910 [WARNING|trainer.py:803] 2025-04-26 14:36:52,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:53,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 912 904 911 [WARNING|trainer.py:803] 2025-04-26 14:36:54,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:54,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:36:54,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 913 905 912 [WARNING|trainer.py:803] 2025-04-26 14:36:55,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:36:55,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:56,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 914 906 913 [WARNING|trainer.py:803] 2025-04-26 14:36:57,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:57,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:36:58,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 915 907 914 [WARNING|trainer.py:803] 2025-04-26 14:36:59,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:36:59,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:00,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 916 908 915 [WARNING|trainer.py:803] 2025-04-26 14:37:01,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:01,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 14:37:01,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 917 909 916 [WARNING|trainer.py:803] 2025-04-26 14:37:02,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:37:03,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:03,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 918 910 917 [WARNING|trainer.py:803] 2025-04-26 14:37:04,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:04,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:05,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 919 911 918 [WARNING|trainer.py:803] 2025-04-26 14:37:06,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:06,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:07,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 920 912 919 [WARNING|trainer.py:803] 2025-04-26 14:37:08,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:08,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:08,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 921 913 920 [WARNING|trainer.py:803] 2025-04-26 14:37:09,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:10,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:37:10,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 914 922 921 [WARNING|trainer.py:803] 2025-04-26 14:37:11,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:11,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:12,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 915 923 922 [WARNING|trainer.py:803] 2025-04-26 14:37:13,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:13,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:13,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 916 923 924 [WARNING|trainer.py:803] 2025-04-26 14:37:15,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:15,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:15,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 917 925 [WARNING|trainer.py:803] 2025-04-26 14:37:17,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 924 [WARNING|trainer.py:803] 2025-04-26 14:37:17,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 918 [WARNING|trainer.py:803] 2025-04-26 14:37:18,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 926 [WARNING|trainer.py:803] 2025-04-26 14:37:18,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 925 [WARNING|trainer.py:803] 2025-04-26 14:37:19,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 919 [WARNING|trainer.py:803] 2025-04-26 14:37:19,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 927 [WARNING|trainer.py:803] 2025-04-26 14:37:20,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 926 [WARNING|trainer.py:803] 2025-04-26 14:37:21,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 920 [WARNING|trainer.py:803] 2025-04-26 14:37:21,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:22,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 928 927 921 [WARNING|trainer.py:803] 2025-04-26 14:37:23,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:23,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:23,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 929 928 922 [WARNING|trainer.py:803] 2025-04-26 14:37:24,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:25,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:25,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 930 929 [WARNING|trainer.py:803] 2025-04-26 14:37:26,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 923 [WARNING|trainer.py:803] 2025-04-26 14:37:26,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:27,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 931 930 [WARNING|trainer.py:803] 2025-04-26 14:37:28,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:28,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 924 932 931 [WARNING|trainer.py:803] 2025-04-26 14:37:29,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:30,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:30,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 925 933 932 [WARNING|trainer.py:803] 2025-04-26 14:37:31,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:32,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:32,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 926 934 933 [WARNING|trainer.py:803] 2025-04-26 14:37:33,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:33,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:33,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 927 935 934 [WARNING|trainer.py:803] 2025-04-26 14:37:35,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:35,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:35,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 928 936 935 [WARNING|trainer.py:803] 2025-04-26 14:37:36,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:37,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:37,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 929 937 936 [WARNING|trainer.py:803] 2025-04-26 14:37:38,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:39,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:39,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 930 938 937 [WARNING|trainer.py:803] 2025-04-26 14:37:40,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:40,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:41,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 931 939 938 [WARNING|trainer.py:803] 2025-04-26 14:37:42,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:37:42,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:42,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 932 940 939 [WARNING|trainer.py:803] 2025-04-26 14:37:44,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:44,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:44,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 933 941 940 [WARNING|trainer.py:803] 2025-04-26 14:37:46,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:46,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:46,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 934 941 942 [WARNING|trainer.py:803] 2025-04-26 14:37:47,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:48,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:48,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 935 943 942 [WARNING|trainer.py:803] 2025-04-26 14:37:49,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:49,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:50,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 936 944 943 [WARNING|trainer.py:803] 2025-04-26 14:37:51,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:51,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:52,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 937 945 944 [WARNING|trainer.py:803] 2025-04-26 14:37:53,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:53,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:54,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 938 946 945 [WARNING|trainer.py:803] 2025-04-26 14:37:55,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:37:55,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:55,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 939 946 947 [WARNING|trainer.py:803] 2025-04-26 14:37:56,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:37:57,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:37:57,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 940 948 [WARNING|trainer.py:803] 2025-04-26 14:37:58,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 947 [WARNING|trainer.py:803] 2025-04-26 14:37:59,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 941 [WARNING|trainer.py:803] 2025-04-26 14:37:59,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 949 [WARNING|trainer.py:803] 2025-04-26 14:38:00,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 948 [WARNING|trainer.py:803] 2025-04-26 14:38:01,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:01,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 942 950 [WARNING|trainer.py:803] 2025-04-26 14:38:02,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 949 [WARNING|trainer.py:803] 2025-04-26 14:38:03,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:03,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 943 [WARNING|trainer.py:803] 2025-04-26 14:38:04,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 951 950 [WARNING|trainer.py:803] 2025-04-26 14:38:05,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:05,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 944 [WARNING|trainer.py:803] 2025-04-26 14:38:06,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 952 951 [WARNING|trainer.py:803] 2025-04-26 14:38:07,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 945 NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:07,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:07,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 953 952 [WARNING|trainer.py:803] 2025-04-26 14:38:08,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 946 [WARNING|trainer.py:803] 2025-04-26 14:38:09,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:09,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 954 953 [WARNING|trainer.py:803] 2025-04-26 14:38:10,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:10,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 947 955 954 [WARNING|trainer.py:803] 2025-04-26 14:38:12,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:12,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:12,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 948 956 [WARNING|trainer.py:803] 2025-04-26 14:38:13,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 955 [WARNING|trainer.py:803] 2025-04-26 14:38:14,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:14,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 949 957 [WARNING|trainer.py:803] 2025-04-26 14:38:15,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 956 [WARNING|trainer.py:803] 2025-04-26 14:38:15,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:16,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 950 958 957 [WARNING|trainer.py:803] 2025-04-26 14:38:17,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:17,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:18,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 959 951 958 [WARNING|trainer.py:803] 2025-04-26 14:38:19,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:38:19,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:19,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 960 952 959 [WARNING|trainer.py:803] 2025-04-26 14:38:21,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:38:21,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:21,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 961 953 960 [WARNING|trainer.py:803] 2025-04-26 14:38:22,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:38:23,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:23,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 962 961 954 [WARNING|trainer.py:803] 2025-04-26 14:38:24,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:25,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:38:25,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 963 962 955 [WARNING|trainer.py:803] 2025-04-26 14:38:26,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:26,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:26,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 964 963 956 [WARNING|trainer.py:803] 2025-04-26 14:38:27,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:28,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:28,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 965 964 957 [WARNING|trainer.py:803] 2025-04-26 14:38:29,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:30,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:30,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 966 965 958 [WARNING|trainer.py:803] 2025-04-26 14:38:31,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:31,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:32,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 967 966 959 [WARNING|trainer.py:803] 2025-04-26 14:38:33,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:33,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:33,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 968 967 960 [WARNING|trainer.py:803] 2025-04-26 14:38:34,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:35,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:35,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 969 961 968 [WARNING|trainer.py:803] 2025-04-26 14:38:36,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:37,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:38:37,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 970 962 969 [WARNING|trainer.py:803] 2025-04-26 14:38:38,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:38,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:39,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 971 963 970 [WARNING|trainer.py:803] 2025-04-26 14:38:40,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:38:40,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:41,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 972 964 [WARNING|trainer.py:803] 2025-04-26 14:38:42,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 971 [WARNING|trainer.py:803] 2025-04-26 14:38:42,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 973 965 [WARNING|trainer.py:803] 2025-04-26 14:38:43,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:38:44,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:44,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 972 974 966 [WARNING|trainer.py:803] 2025-04-26 14:38:45,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:45,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:45,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 973 967 [WARNING|trainer.py:803] 2025-04-26 14:38:46,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 975 [WARNING|trainer.py:803] 2025-04-26 14:38:47,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 974 [WARNING|trainer.py:803] 2025-04-26 14:38:48,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:48,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 968 976 [WARNING|trainer.py:803] 2025-04-26 14:38:49,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:38:50,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 975 969 [WARNING|trainer.py:803] 2025-04-26 14:38:51,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 977 [WARNING|trainer.py:803] 2025-04-26 14:38:51,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:51,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 976 970 [WARNING|trainer.py:803] 2025-04-26 14:38:52,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 978 [WARNING|trainer.py:803] 2025-04-26 14:38:53,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:53,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 977 971 [WARNING|trainer.py:803] 2025-04-26 14:38:54,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 979 [WARNING|trainer.py:803] 2025-04-26 14:38:55,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 978 [WARNING|trainer.py:803] 2025-04-26 14:38:55,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:38:56,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 972 980 [WARNING|trainer.py:803] 2025-04-26 14:38:57,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 979 [WARNING|trainer.py:803] 2025-04-26 14:38:57,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:57,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 973 981 [WARNING|trainer.py:803] 2025-04-26 14:38:58,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:38:59,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 980 974 [WARNING|trainer.py:803] 2025-04-26 14:39:00,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 982 [WARNING|trainer.py:803] 2025-04-26 14:39:00,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 981 [WARNING|trainer.py:803] 2025-04-26 14:39:00,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:39:01,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 975 983 982 [WARNING|trainer.py:803] 2025-04-26 14:39:03,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:03,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:03,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 976 984 [WARNING|trainer.py:803] 2025-04-26 14:39:04,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 983 [WARNING|trainer.py:803] 2025-04-26 14:39:05,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 977 [WARNING|trainer.py:803] 2025-04-26 14:39:05,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 985 [WARNING|trainer.py:803] 2025-04-26 14:39:06,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 984 [WARNING|trainer.py:803] 2025-04-26 14:39:07,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 978 [WARNING|trainer.py:803] 2025-04-26 14:39:07,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:08,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 985 986 979 [WARNING|trainer.py:803] 2025-04-26 14:39:09,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:09,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:10,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 987 986 980 [WARNING|trainer.py:803] 2025-04-26 14:39:11,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:11,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:12,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 988 987 981 [WARNING|trainer.py:803] 2025-04-26 14:39:13,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:39:13,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:13,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 989 988 [WARNING|trainer.py:803] 2025-04-26 14:39:14,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 982 [WARNING|trainer.py:803] 2025-04-26 14:39:15,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:39:15,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 990 989 [WARNING|trainer.py:803] 2025-04-26 14:39:16,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:17,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 983 991 990 [WARNING|trainer.py:803] 2025-04-26 14:39:18,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:18,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:18,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 984 992 991 [WARNING|trainer.py:803] 2025-04-26 14:39:20,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:20,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:20,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 993 985 992 [WARNING|trainer.py:803] 2025-04-26 14:39:22,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:22,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:22,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 994 993 986 [WARNING|trainer.py:803] 2025-04-26 14:39:23,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:24,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:24,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 995 994 987 [WARNING|trainer.py:803] 2025-04-26 14:39:25,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:25,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:26,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 996 988 995 [WARNING|trainer.py:803] 2025-04-26 14:39:27,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:28,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:39:28,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 997 989 996 [WARNING|trainer.py:803] 2025-04-26 14:39:29,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:29,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:29,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 998 990 997 [WARNING|trainer.py:803] 2025-04-26 14:39:31,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:31,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:31,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 999 991 998 [WARNING|trainer.py:803] 2025-04-26 14:39:32,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:33,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:33,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1000 999 992 [WARNING|trainer.py:803] 2025-04-26 14:39:34,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:35,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:35,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1001 993 1000 [WARNING|trainer.py:803] 2025-04-26 14:39:36,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:36,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:36,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1002 994 1001 [WARNING|trainer.py:803] 2025-04-26 14:39:38,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:38,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:39:39,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1003 995 1002 [WARNING|trainer.py:803] 2025-04-26 14:39:40,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:39:40,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:41,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 996 1004 1003 [WARNING|trainer.py:803] 2025-04-26 14:39:42,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:42,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:42,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1005 997 1004 [WARNING|trainer.py:803] 2025-04-26 14:39:44,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:44,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:45,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1006 998 1005 [WARNING|trainer.py:803] 2025-04-26 14:39:46,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:39:46,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:46,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1007 999 1006 [WARNING|trainer.py:803] 2025-04-26 14:39:47,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:39:47,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:48,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1008 1000 [WARNING|trainer.py:803] 2025-04-26 14:39:49,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1007 [WARNING|trainer.py:803] 2025-04-26 14:39:49,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1009 [WARNING|trainer.py:803] 2025-04-26 14:39:50,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1001 [WARNING|trainer.py:803] 2025-04-26 14:39:51,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1008 [WARNING|trainer.py:803] 2025-04-26 14:39:51,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:52,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1010 1002 1009 [WARNING|trainer.py:803] 2025-04-26 14:39:53,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:53,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:39:54,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1011 1003 [WARNING|trainer.py:803] 2025-04-26 14:39:54,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1010 [WARNING|trainer.py:803] 2025-04-26 14:39:55,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1012 [WARNING|trainer.py:803] 2025-04-26 14:39:55,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:39:56,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1011 1004 1013 [WARNING|trainer.py:803] 2025-04-26 14:39:57,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:39:57,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:39:58,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1012 1005 [WARNING|trainer.py:803] 2025-04-26 14:39:59,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1014 [WARNING|trainer.py:803] 2025-04-26 14:39:59,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1013 [WARNING|trainer.py:803] 2025-04-26 14:40:00,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1006 [WARNING|trainer.py:803] 2025-04-26 14:40:01,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1015 [WARNING|trainer.py:803] 2025-04-26 14:40:01,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:40:02,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1014 1007 1016 [WARNING|trainer.py:803] 2025-04-26 14:40:03,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:40:03,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:40:03,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1015 1008 [WARNING|trainer.py:803] 2025-04-26 14:40:04,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:05,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1017 1016 1009 [WARNING|trainer.py:803] 2025-04-26 14:40:06,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:06,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:40:06,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1018 1017 1010 [WARNING|trainer.py:803] 2025-04-26 14:40:08,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:08,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:08,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1019 1011 [WARNING|trainer.py:803] 2025-04-26 14:40:10,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1018 [WARNING|trainer.py:803] 2025-04-26 14:40:10,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:11,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1012 1020 1019 [WARNING|trainer.py:803] 2025-04-26 14:40:12,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:12,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:12,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1013 1021 [WARNING|trainer.py:803] 2025-04-26 14:40:14,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1020 [WARNING|trainer.py:803] 2025-04-26 14:40:14,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:40:15,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1014 1022 [WARNING|trainer.py:803] 2025-04-26 14:40:16,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1021 [WARNING|trainer.py:803] 2025-04-26 14:40:16,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1015 [WARNING|trainer.py:803] 2025-04-26 14:40:17,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1023 [WARNING|trainer.py:803] 2025-04-26 14:40:17,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:18,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1022 1016 [WARNING|trainer.py:803] 2025-04-26 14:40:19,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:19,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1024 1023 [WARNING|trainer.py:803] 2025-04-26 14:40:20,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1017 [WARNING|trainer.py:803] 2025-04-26 14:40:21,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:21,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1025 [WARNING|trainer.py:803] 2025-04-26 14:40:22,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1024 1018 1026 [WARNING|trainer.py:803] 2025-04-26 14:40:23,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:40:23,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:24,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1025 1019 [WARNING|trainer.py:803] 2025-04-26 14:40:25,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1027 [WARNING|trainer.py:803] 2025-04-26 14:40:25,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:26,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1026 1020 1028 [WARNING|trainer.py:803] 2025-04-26 14:40:27,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:40:28,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:28,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1027 1029 1021 [WARNING|trainer.py:803] 2025-04-26 14:40:29,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:29,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:30,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1028 1030 [WARNING|trainer.py:803] 2025-04-26 14:40:31,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1022 [WARNING|trainer.py:803] 2025-04-26 14:40:31,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1029 [WARNING|trainer.py:803] 2025-04-26 14:40:32,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1031 [WARNING|trainer.py:803] 2025-04-26 14:40:32,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1023 [WARNING|trainer.py:803] 2025-04-26 14:40:33,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1030 [WARNING|trainer.py:803] 2025-04-26 14:40:34,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1032 [WARNING|trainer.py:803] 2025-04-26 14:40:34,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:35,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1031 1024 1033 [WARNING|trainer.py:803] 2025-04-26 14:40:36,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:36,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:40:36,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1032 1025 1034 [WARNING|trainer.py:803] 2025-04-26 14:40:38,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:38,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:40:38,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1033 1026 [WARNING|trainer.py:803] 2025-04-26 14:40:39,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1035 [WARNING|trainer.py:803] 2025-04-26 14:40:40,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:40:40,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1034 1027 [WARNING|trainer.py:803] 2025-04-26 14:40:41,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1036 [WARNING|trainer.py:803] 2025-04-26 14:40:42,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:42,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1035 1028 1037 [WARNING|trainer.py:803] 2025-04-26 14:40:44,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:44,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:44,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1029 1036 1038 [WARNING|trainer.py:803] 2025-04-26 14:40:45,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:45,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:46,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1030 1037 1039 [WARNING|trainer.py:803] 2025-04-26 14:40:47,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:40:47,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:48,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1031 1038 1040 [WARNING|trainer.py:803] 2025-04-26 14:40:49,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:49,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:49,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1032 1039 1041 [WARNING|trainer.py:803] 2025-04-26 14:40:51,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:51,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:40:51,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1033 1040 1042 [WARNING|trainer.py:803] 2025-04-26 14:40:52,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:53,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:53,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1034 1041 [WARNING|trainer.py:803] 2025-04-26 14:40:54,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1043 [WARNING|trainer.py:803] 2025-04-26 14:40:55,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:40:55,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1035 1042 1044 [WARNING|trainer.py:803] 2025-04-26 14:40:57,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:40:57,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:40:57,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1036 1043 [WARNING|trainer.py:803] 2025-04-26 14:40:58,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1045 [WARNING|trainer.py:803] 2025-04-26 14:40:59,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1037 [WARNING|trainer.py:803] 2025-04-26 14:41:00,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1044 [WARNING|trainer.py:803] 2025-04-26 14:41:00,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1046 [WARNING|trainer.py:803] 2025-04-26 14:41:01,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:01,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1038 1047 [WARNING|trainer.py:803] 2025-04-26 14:41:02,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1045 [WARNING|trainer.py:803] 2025-04-26 14:41:03,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:03,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1039 1046 [WARNING|trainer.py:803] 2025-04-26 14:41:04,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1048 [WARNING|trainer.py:803] 2025-04-26 14:41:05,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1040 [WARNING|trainer.py:803] 2025-04-26 14:41:05,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:06,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1047 1049 1041 [WARNING|trainer.py:803] 2025-04-26 14:41:07,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:07,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:08,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1050 1048 1042 [WARNING|trainer.py:803] 2025-04-26 14:41:09,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:41:09,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:10,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1049 1051 [WARNING|trainer.py:803] 2025-04-26 14:41:11,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:11,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1043 [WARNING|trainer.py:803] 2025-04-26 14:41:12,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1052 1050 [WARNING|trainer.py:803] 2025-04-26 14:41:13,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:13,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1044 [WARNING|trainer.py:803] 2025-04-26 14:41:14,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1053 1051 [WARNING|trainer.py:803] 2025-04-26 14:41:15,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:15,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1045 1054 1052 [WARNING|trainer.py:803] 2025-04-26 14:41:16,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:17,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:17,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1046 1055 [WARNING|trainer.py:803] 2025-04-26 14:41:18,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1053 [WARNING|trainer.py:803] 2025-04-26 14:41:18,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:19,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1047 1056 [WARNING|trainer.py:803] 2025-04-26 14:41:20,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1054 [WARNING|trainer.py:803] 2025-04-26 14:41:20,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:20,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1057 1048 1055 [WARNING|trainer.py:803] 2025-04-26 14:41:22,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:41:22,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:22,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1049 1056 1058 [WARNING|trainer.py:803] 2025-04-26 14:41:24,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:24,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:24,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1050 1057 1059 [WARNING|trainer.py:803] 2025-04-26 14:41:26,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:41:26,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:26,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1051 1060 1058 [WARNING|trainer.py:803] 2025-04-26 14:41:28,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:28,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:28,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1052 1059 1061 [WARNING|trainer.py:803] 2025-04-26 14:41:30,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:30,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:30,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1053 1062 1060 [WARNING|trainer.py:803] 2025-04-26 14:41:32,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:32,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:41:33,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1054 1063 1061 [WARNING|trainer.py:803] 2025-04-26 14:41:33,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:34,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:34,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1055 1064 1062 [WARNING|trainer.py:803] 2025-04-26 14:41:35,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:36,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:36,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1056 1065 1063 [WARNING|trainer.py:803] 2025-04-26 14:41:37,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:37,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:38,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1057 1066 [WARNING|trainer.py:803] 2025-04-26 14:41:39,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1064 [WARNING|trainer.py:803] 2025-04-26 14:41:39,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:40,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1058 1065 1067 [WARNING|trainer.py:803] 2025-04-26 14:41:41,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:42,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:42,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1059 1066 1068 [WARNING|trainer.py:803] 2025-04-26 14:41:43,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:43,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:44,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1060 1067 1069 [WARNING|trainer.py:803] 2025-04-26 14:41:46,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:46,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:41:46,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1061 1070 1068 [WARNING|trainer.py:803] 2025-04-26 14:41:48,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:41:48,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:48,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1062 1071 [WARNING|trainer.py:803] 2025-04-26 14:41:49,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:41:49,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1069 1063 1072 [WARNING|trainer.py:803] 2025-04-26 14:41:50,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:41:51,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:41:51,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1070 1073 [WARNING|trainer.py:803] 2025-04-26 14:41:52,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1064 [WARNING|trainer.py:803] 2025-04-26 14:41:53,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1071 [WARNING|trainer.py:803] 2025-04-26 14:41:53,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1074 [WARNING|trainer.py:803] 2025-04-26 14:41:54,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1065 [WARNING|trainer.py:803] 2025-04-26 14:41:54,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1072 [WARNING|trainer.py:803] 2025-04-26 14:41:55,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1075 [WARNING|trainer.py:803] 2025-04-26 14:41:56,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1066 [WARNING|trainer.py:803] 2025-04-26 14:41:56,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1073 [WARNING|trainer.py:803] 2025-04-26 14:41:57,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:41:57,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1076 1074 [WARNING|trainer.py:803] 2025-04-26 14:41:58,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1067 [WARNING|trainer.py:803] 2025-04-26 14:41:59,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1077 [WARNING|trainer.py:803] 2025-04-26 14:41:59,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1075 [WARNING|trainer.py:803] 2025-04-26 14:42:00,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1068 [WARNING|trainer.py:803] 2025-04-26 14:42:01,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1078 [WARNING|trainer.py:803] 2025-04-26 14:42:01,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:02,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1076 1069 1079 [WARNING|trainer.py:803] 2025-04-26 14:42:03,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:04,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:42:04,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1077 1070 [WARNING|trainer.py:803] 2025-04-26 14:42:05,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1080 [WARNING|trainer.py:803] 2025-04-26 14:42:05,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1078 [WARNING|trainer.py:803] 2025-04-26 14:42:06,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1071 [WARNING|trainer.py:803] 2025-04-26 14:42:06,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:07,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1081 1079 [WARNING|trainer.py:803] 2025-04-26 14:42:08,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1072 [WARNING|trainer.py:803] 2025-04-26 14:42:09,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:09,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1082 [WARNING|trainer.py:803] 2025-04-26 14:42:10,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1073 1080 [WARNING|trainer.py:803] 2025-04-26 14:42:11,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1083 [WARNING|trainer.py:803] 2025-04-26 14:42:11,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1074 [WARNING|trainer.py:803] 2025-04-26 14:42:12,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1081 [WARNING|trainer.py:803] 2025-04-26 14:42:12,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1084 [WARNING|trainer.py:803] 2025-04-26 14:42:13,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1075 [WARNING|trainer.py:803] 2025-04-26 14:42:13,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1082 [WARNING|trainer.py:803] 2025-04-26 14:42:14,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:42:14,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1085 1076 [WARNING|trainer.py:803] 2025-04-26 14:42:15,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1083 1086 [WARNING|trainer.py:803] 2025-04-26 14:42:16,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:16,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:17,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1077 1084 [WARNING|trainer.py:803] 2025-04-26 14:42:18,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1087 [WARNING|trainer.py:803] 2025-04-26 14:42:18,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:19,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1078 1085 [WARNING|trainer.py:803] 2025-04-26 14:42:20,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:20,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1088 1086 1079 [WARNING|trainer.py:803] 2025-04-26 14:42:21,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:22,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:22,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1089 1087 [WARNING|trainer.py:803] 2025-04-26 14:42:23,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1080 [WARNING|trainer.py:803] 2025-04-26 14:42:24,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1090 [WARNING|trainer.py:803] 2025-04-26 14:42:24,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:42:25,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1088 1081 1091 [WARNING|trainer.py:803] 2025-04-26 14:42:26,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:26,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:26,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 1089 1082 1092 [WARNING|trainer.py:803] 2025-04-26 14:42:28,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:28,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:29,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1083 1090 [WARNING|trainer.py:803] 2025-04-26 14:42:30,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:30,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1093 1084 1091 [WARNING|trainer.py:803] 2025-04-26 14:42:31,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:31,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:31,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1094 1085 1092 [WARNING|trainer.py:803] 2025-04-26 14:42:33,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:42:33,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:34,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1095 1086 [WARNING|trainer.py:803] 2025-04-26 14:42:35,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:35,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1093 1096 [WARNING|trainer.py:803] 2025-04-26 14:42:36,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1087 [WARNING|trainer.py:803] 2025-04-26 14:42:36,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1094 [WARNING|trainer.py:803] 2025-04-26 14:42:37,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:38,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1097 1088 [WARNING|trainer.py:803] 2025-04-26 14:42:39,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1095 [WARNING|trainer.py:803] 2025-04-26 14:42:39,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1098 [WARNING|trainer.py:803] 2025-04-26 14:42:39,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1089 [WARNING|trainer.py:803] 2025-04-26 14:42:40,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1096 [WARNING|trainer.py:803] 2025-04-26 14:42:41,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1099 [WARNING|trainer.py:803] 2025-04-26 14:42:41,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:42,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1090 1097 [WARNING|trainer.py:803] 2025-04-26 14:42:43,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1100 [WARNING|trainer.py:803] 2025-04-26 14:42:44,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:44,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1091 1098 [WARNING|trainer.py:803] 2025-04-26 14:42:45,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1101 [WARNING|trainer.py:803] 2025-04-26 14:42:45,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:46,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1092 1099 [WARNING|trainer.py:803] 2025-04-26 14:42:47,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:47,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1102 1100 [WARNING|trainer.py:803] 2025-04-26 14:42:48,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1093 [WARNING|trainer.py:803] 2025-04-26 14:42:49,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1103 [WARNING|trainer.py:803] 2025-04-26 14:42:49,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:50,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1101 1094 1104 [WARNING|trainer.py:803] 2025-04-26 14:42:51,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:51,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:42:52,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1095 1102 1105 [WARNING|trainer.py:803] 2025-04-26 14:42:53,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:53,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:53,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1103 1096 1106 [WARNING|trainer.py:803] 2025-04-26 14:42:55,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:55,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:55,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1104 1107 1097 [WARNING|trainer.py:803] 2025-04-26 14:42:57,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:42:57,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:57,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1105 1098 1108 [WARNING|trainer.py:803] 2025-04-26 14:42:59,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:42:59,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:42:59,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1106 1099 [WARNING|trainer.py:803] 2025-04-26 14:43:00,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1109 [WARNING|trainer.py:803] 2025-04-26 14:43:01,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1107 [WARNING|trainer.py:803] 2025-04-26 14:43:01,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1100 [WARNING|trainer.py:803] 2025-04-26 14:43:02,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1110 [WARNING|trainer.py:803] 2025-04-26 14:43:03,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:03,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1108 1111 1101 [WARNING|trainer.py:803] 2025-04-26 14:43:04,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:05,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:05,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1109 1112 [WARNING|trainer.py:803] 2025-04-26 14:43:06,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1102 [WARNING|trainer.py:803] 2025-04-26 14:43:06,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1110 [WARNING|trainer.py:803] 2025-04-26 14:43:07,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1113 [WARNING|trainer.py:803] 2025-04-26 14:43:08,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1103 [WARNING|trainer.py:803] 2025-04-26 14:43:08,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:09,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1111 1114 [WARNING|trainer.py:803] 2025-04-26 14:43:10,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:10,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1104 1112 1115 [WARNING|trainer.py:803] 2025-04-26 14:43:11,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:11,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:12,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1105 1113 1116 [WARNING|trainer.py:803] 2025-04-26 14:43:13,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:43:13,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:13,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1106 1114 [WARNING|trainer.py:803] 2025-04-26 14:43:14,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1117 [WARNING|trainer.py:803] 2025-04-26 14:43:15,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1107 [WARNING|trainer.py:803] 2025-04-26 14:43:15,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1115 [WARNING|trainer.py:803] 2025-04-26 14:43:16,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1118 [WARNING|trainer.py:803] 2025-04-26 14:43:17,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:17,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1116 1108 1119 [WARNING|trainer.py:803] 2025-04-26 14:43:18,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:18,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:19,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1117 1109 1120 [WARNING|trainer.py:803] 2025-04-26 14:43:20,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:43:20,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:43:21,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1118 1110 1121 [WARNING|trainer.py:803] 2025-04-26 14:43:22,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:43:22,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:23,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1119 1111 [WARNING|trainer.py:803] 2025-04-26 14:43:24,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1122 [WARNING|trainer.py:803] 2025-04-26 14:43:24,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:25,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1112 1120 [WARNING|trainer.py:803] 2025-04-26 14:43:26,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:26,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1123 1113 1121 [WARNING|trainer.py:803] 2025-04-26 14:43:27,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:28,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:28,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1124 1114 [WARNING|trainer.py:803] 2025-04-26 14:43:29,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1122 [WARNING|trainer.py:803] 2025-04-26 14:43:29,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1125 [WARNING|trainer.py:803] 2025-04-26 14:43:30,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1115 [WARNING|trainer.py:803] 2025-04-26 14:43:31,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:31,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1123 1126 1116 [WARNING|trainer.py:803] 2025-04-26 14:43:32,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:32,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:33,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1124 1127 [WARNING|trainer.py:803] 2025-04-26 14:43:34,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:34,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1117 1128 1125 [WARNING|trainer.py:803] 2025-04-26 14:43:35,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:43:36,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:36,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1118 1126 1129 [WARNING|trainer.py:803] 2025-04-26 14:43:37,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:43:37,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:37,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1119 1127 1130 [WARNING|trainer.py:803] 2025-04-26 14:43:38,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:39,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:39,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1131 1128 1120 [WARNING|trainer.py:803] 2025-04-26 14:43:41,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:43:41,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:41,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1132 1129 1121 [WARNING|trainer.py:803] 2025-04-26 14:43:42,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:43:42,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:43,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1133 1130 1122 [WARNING|trainer.py:803] 2025-04-26 14:43:44,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:44,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:45,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1134 1131 [WARNING|trainer.py:803] 2025-04-26 14:43:46,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:46,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1123 1135 1132 [WARNING|trainer.py:803] 2025-04-26 14:43:47,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:47,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:48,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1124 1136 1133 [WARNING|trainer.py:803] 2025-04-26 14:43:49,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:49,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:49,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1125 1134 1137 [WARNING|trainer.py:803] 2025-04-26 14:43:50,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:51,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:51,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1126 1135 1138 [WARNING|trainer.py:803] 2025-04-26 14:43:52,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:53,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:43:53,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1127 1136 1139 [WARNING|trainer.py:803] 2025-04-26 14:43:54,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:54,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:43:55,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1128 1140 [WARNING|trainer.py:803] 2025-04-26 14:43:56,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1137 [WARNING|trainer.py:803] 2025-04-26 14:43:56,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:43:56,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1129 1141 [WARNING|trainer.py:803] 2025-04-26 14:43:57,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1138 [WARNING|trainer.py:803] 2025-04-26 14:43:58,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1130 [WARNING|trainer.py:803] 2025-04-26 14:43:58,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1142 [WARNING|trainer.py:803] 2025-04-26 14:43:59,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1139 [WARNING|trainer.py:803] 2025-04-26 14:44:00,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1131 [WARNING|trainer.py:803] 2025-04-26 14:44:00,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1143 [WARNING|trainer.py:803] 2025-04-26 14:44:01,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1140 [WARNING|trainer.py:803] 2025-04-26 14:44:01,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:02,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 1132 YesYes 1144 [WARNING|trainer.py:803] 2025-04-26 14:44:02,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1141 [WARNING|trainer.py:803] 2025-04-26 14:44:03,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:03,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1133 1145 [WARNING|trainer.py:803] 2025-04-26 14:44:04,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1142 [WARNING|trainer.py:803] 2025-04-26 14:44:05,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1134 [WARNING|trainer.py:803] 2025-04-26 14:44:05,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1146 [WARNING|trainer.py:803] 2025-04-26 14:44:06,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1143 [WARNING|trainer.py:803] 2025-04-26 14:44:07,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1135 [WARNING|trainer.py:803] 2025-04-26 14:44:07,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1147 1144 [WARNING|trainer.py:803] 2025-04-26 14:44:08,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:08,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:08,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1136 1145 1148 [WARNING|trainer.py:803] 2025-04-26 14:44:09,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:10,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:10,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1137 1149 1146 [WARNING|trainer.py:803] 2025-04-26 14:44:11,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:12,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:12,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1138 1147 1150 [WARNING|trainer.py:803] 2025-04-26 14:44:13,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:14,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:14,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1139 1151 1148 [WARNING|trainer.py:803] 2025-04-26 14:44:15,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:15,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:15,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1140 1152 1149 [WARNING|trainer.py:803] 2025-04-26 14:44:16,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:17,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:17,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1141 1153 1150 [WARNING|trainer.py:803] 2025-04-26 14:44:18,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:19,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:19,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1142 1154 1151 [WARNING|trainer.py:803] 2025-04-26 14:44:20,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:20,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:21,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1143 1152 1155 [WARNING|trainer.py:803] 2025-04-26 14:44:22,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:22,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:22,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1144 1153 1156 [WARNING|trainer.py:803] 2025-04-26 14:44:24,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:24,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:24,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1145 1154 1157 [WARNING|trainer.py:803] 2025-04-26 14:44:25,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:26,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:26,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1146 1155 1158 [WARNING|trainer.py:803] 2025-04-26 14:44:27,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:27,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:28,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1147 1156 1159 [WARNING|trainer.py:803] 2025-04-26 14:44:29,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:29,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:29,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1157 1148 1160 [WARNING|trainer.py:803] 2025-04-26 14:44:31,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:31,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:31,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1149 1158 1161 [WARNING|trainer.py:803] 2025-04-26 14:44:32,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:33,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:33,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1150 1159 1162 [WARNING|trainer.py:803] 2025-04-26 14:44:34,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:34,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:34,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1151 1160 1163 [WARNING|trainer.py:803] 2025-04-26 14:44:36,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:36,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:36,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1152 1161 1164 [WARNING|trainer.py:803] 2025-04-26 14:44:38,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:38,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:38,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1153 1162 1165 [WARNING|trainer.py:803] 2025-04-26 14:44:39,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:40,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:40,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1154 1163 1166 [WARNING|trainer.py:803] 2025-04-26 14:44:41,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:41,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:41,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1155 1164 1167 [WARNING|trainer.py:803] 2025-04-26 14:44:43,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:43,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:43,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1156 1165 1168 [WARNING|trainer.py:803] 2025-04-26 14:44:45,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:45,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:45,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1157 1166 1169 [WARNING|trainer.py:803] 2025-04-26 14:44:46,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:46,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:47,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1167 1158 1170 [WARNING|trainer.py:803] 2025-04-26 14:44:48,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:48,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:48,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1168 1159 1171 [WARNING|trainer.py:803] 2025-04-26 14:44:50,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:50,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:50,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1169 1160 1172 [WARNING|trainer.py:803] 2025-04-26 14:44:52,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:52,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:52,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1170 1161 1173 [WARNING|trainer.py:803] 2025-04-26 14:44:53,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:53,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:54,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1171 1162 1174 [WARNING|trainer.py:803] 2025-04-26 14:44:55,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:55,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:56,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1163 1172 1175 [WARNING|trainer.py:803] 2025-04-26 14:44:57,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:44:57,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:44:58,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1164 1173 1176 [WARNING|trainer.py:803] 2025-04-26 14:44:59,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:44:59,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:44:59,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1165 1174 1177 [WARNING|trainer.py:803] 2025-04-26 14:45:01,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:45:01,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:01,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1166 1175 1178 [WARNING|trainer.py:803] 2025-04-26 14:45:02,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:45:03,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:03,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1167 1179 1176 [WARNING|trainer.py:803] 2025-04-26 14:45:04,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:05,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:05,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1168 1180 1177 [WARNING|trainer.py:803] 2025-04-26 14:45:06,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:06,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:06,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1169 1178 1181 [WARNING|trainer.py:803] 2025-04-26 14:45:07,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:45:08,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:08,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1170 1179 1182 [WARNING|trainer.py:803] 2025-04-26 14:45:09,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:10,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:10,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1171 1180 1183 [WARNING|trainer.py:803] 2025-04-26 14:45:11,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:45:12,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:12,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1172 1184 1181 [WARNING|trainer.py:803] 2025-04-26 14:45:13,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:13,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:13,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1173 1182 1185 [WARNING|trainer.py:803] 2025-04-26 14:45:15,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:15,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:15,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1174 1183 1186 [WARNING|trainer.py:803] 2025-04-26 14:45:17,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:17,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:17,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1184 1175 1187 [WARNING|trainer.py:803] 2025-04-26 14:45:19,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:19,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:19,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1176 1185 1188 [WARNING|trainer.py:803] 2025-04-26 14:45:21,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:45:21,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:21,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1177 1186 1189 [WARNING|trainer.py:803] 2025-04-26 14:45:22,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:23,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:23,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1178 1187 [WARNING|trainer.py:803] 2025-04-26 14:45:24,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1190 [WARNING|trainer.py:803] 2025-04-26 14:45:25,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1179 [WARNING|trainer.py:803] 2025-04-26 14:45:25,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1188 [WARNING|trainer.py:803] 2025-04-26 14:45:26,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1191 [WARNING|trainer.py:803] 2025-04-26 14:45:26,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1180 [WARNING|trainer.py:803] 2025-04-26 14:45:27,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1189 [WARNING|trainer.py:803] 2025-04-26 14:45:27,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1192 [WARNING|trainer.py:803] 2025-04-26 14:45:28,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1181 [WARNING|trainer.py:803] 2025-04-26 14:45:29,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:29,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1193 1190 1182 [WARNING|trainer.py:803] 2025-04-26 14:45:30,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:31,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:45:31,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1194 1191 1183 [WARNING|trainer.py:803] 2025-04-26 14:45:32,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:32,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:33,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1195 1192 1184 [WARNING|trainer.py:803] 2025-04-26 14:45:34,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:34,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:35,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1196 1193 1185 [WARNING|trainer.py:803] 2025-04-26 14:45:36,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 14:45:36,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:36,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1197 1194 [WARNING|trainer.py:803] 2025-04-26 14:45:37,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:45:37,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1186 1198 1195 [WARNING|trainer.py:803] 2025-04-26 14:45:38,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:39,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:39,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1187 1196 1199 [WARNING|trainer.py:803] 2025-04-26 14:45:40,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:41,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:45:41,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1188 1197 1200 [WARNING|trainer.py:803] 2025-04-26 14:45:42,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:43,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:45:43,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1189 1198 [WARNING|trainer.py:803] 2025-04-26 14:45:44,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:44,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1201 [WARNING|trainer.py:803] 2025-04-26 14:45:45,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1199 1190 [WARNING|trainer.py:803] 2025-04-26 14:45:46,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:46,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1202 1200 1191 [WARNING|trainer.py:803] 2025-04-26 14:45:48,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:48,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:48,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1192 1203 1201 [WARNING|trainer.py:803] 2025-04-26 14:45:50,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:50,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:51,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1193 1204 [WARNING|trainer.py:803] 2025-04-26 14:45:52,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1202 [WARNING|trainer.py:803] 2025-04-26 14:45:52,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1194 [WARNING|trainer.py:803] 2025-04-26 14:45:53,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:45:53,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1205 1195 [WARNING|trainer.py:803] 2025-04-26 14:45:54,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1203 [WARNING|trainer.py:803] 2025-04-26 14:45:55,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:55,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1206 1196 1204 [WARNING|trainer.py:803] 2025-04-26 14:45:57,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:45:57,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:45:58,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1197 1207 [WARNING|trainer.py:803] 2025-04-26 14:45:59,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1205 [WARNING|trainer.py:803] 2025-04-26 14:45:59,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1198 [WARNING|trainer.py:803] 2025-04-26 14:46:00,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1208 [WARNING|trainer.py:803] 2025-04-26 14:46:00,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:01,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1206 1199 [WARNING|trainer.py:803] 2025-04-26 14:46:02,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:02,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1209 1200 1207 [WARNING|trainer.py:803] 2025-04-26 14:46:04,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:04,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:04,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1210 1208 1201 [WARNING|trainer.py:803] 2025-04-26 14:46:06,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:06,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:07,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1211 [WARNING|trainer.py:803] 2025-04-26 14:46:08,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1209 1202 [WARNING|trainer.py:803] 2025-04-26 14:46:09,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:09,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1212 1210 [WARNING|trainer.py:803] 2025-04-26 14:46:10,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1203 [WARNING|trainer.py:803] 2025-04-26 14:46:11,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:46:11,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1213 1211 [WARNING|trainer.py:803] 2025-04-26 14:46:13,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1204 [WARNING|trainer.py:803] 2025-04-26 14:46:13,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:46:14,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1214 1212 [WARNING|trainer.py:803] 2025-04-26 14:46:15,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1205 [WARNING|trainer.py:803] 2025-04-26 14:46:16,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:16,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1215 [WARNING|trainer.py:803] 2025-04-26 14:46:17,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1213 1206 [WARNING|trainer.py:803] 2025-04-26 14:46:18,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:18,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1216 1214 [WARNING|trainer.py:803] 2025-04-26 14:46:19,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1207 [WARNING|trainer.py:803] 2025-04-26 14:46:20,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:20,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1217 [WARNING|trainer.py:803] 2025-04-26 14:46:22,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1215 1208 [WARNING|trainer.py:803] 2025-04-26 14:46:22,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:46:23,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1218 [WARNING|trainer.py:803] 2025-04-26 14:46:24,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1216 1209 [WARNING|trainer.py:803] 2025-04-26 14:46:25,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1219 [WARNING|trainer.py:803] 2025-04-26 14:46:25,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:26,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1217 1210 [WARNING|trainer.py:803] 2025-04-26 14:46:27,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1220 [WARNING|trainer.py:803] 2025-04-26 14:46:27,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:28,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1218 1211 [WARNING|trainer.py:803] 2025-04-26 14:46:29,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1221 [WARNING|trainer.py:803] 2025-04-26 14:46:30,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:46:30,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1219 1212 [WARNING|trainer.py:803] 2025-04-26 14:46:31,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1222 [WARNING|trainer.py:803] 2025-04-26 14:46:32,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:32,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1220 1213 [WARNING|trainer.py:803] 2025-04-26 14:46:34,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1223 [WARNING|trainer.py:803] 2025-04-26 14:46:34,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1221 [WARNING|trainer.py:803] 2025-04-26 14:46:35,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:36,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1214 [WARNING|trainer.py:803] 2025-04-26 14:46:37,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1224 1222 [WARNING|trainer.py:803] 2025-04-26 14:46:38,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:46:38,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1215 [WARNING|trainer.py:803] 2025-04-26 14:46:39,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1225 1223 [WARNING|trainer.py:803] 2025-04-26 14:46:40,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:46:40,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1216 [WARNING|trainer.py:803] 2025-04-26 14:46:41,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1226 1224 [WARNING|trainer.py:803] 2025-04-26 14:46:42,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1217 [WARNING|trainer.py:803] 2025-04-26 14:46:43,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:46:43,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1227 [WARNING|trainer.py:803] 2025-04-26 14:46:44,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1225 1218 [WARNING|trainer.py:803] 2025-04-26 14:46:45,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:46:46,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1228 1226 [WARNING|trainer.py:803] 2025-04-26 14:46:47,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1219 [WARNING|trainer.py:803] 2025-04-26 14:46:48,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:48,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1229 1227 1220 [WARNING|trainer.py:803] 2025-04-26 14:46:49,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:50,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:50,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1230 1221 1228 [WARNING|trainer.py:803] 2025-04-26 14:46:52,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:52,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:52,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1231 1222 1229 [WARNING|trainer.py:803] 2025-04-26 14:46:54,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:46:54,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:46:55,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1232 1223 [WARNING|trainer.py:803] 2025-04-26 14:46:56,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1230 [WARNING|trainer.py:803] 2025-04-26 14:46:57,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:46:57,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1233 [WARNING|trainer.py:803] 2025-04-26 14:46:59,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1231 1224 [WARNING|trainer.py:803] 2025-04-26 14:46:59,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:00,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1234 1232 [WARNING|trainer.py:803] 2025-04-26 14:47:01,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1225 [WARNING|trainer.py:803] 2025-04-26 14:47:02,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:02,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1235 1233 [WARNING|trainer.py:803] 2025-04-26 14:47:03,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1226 [WARNING|trainer.py:803] 2025-04-26 14:47:04,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:47:04,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1236 1234 1227 [WARNING|trainer.py:803] 2025-04-26 14:47:06,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:06,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:06,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1237 1235 [WARNING|trainer.py:803] 2025-04-26 14:47:08,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1228 [WARNING|trainer.py:803] 2025-04-26 14:47:09,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:09,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1238 1236 [WARNING|trainer.py:803] 2025-04-26 14:47:10,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1229 [WARNING|trainer.py:803] 2025-04-26 14:47:11,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:11,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1239 [WARNING|trainer.py:803] 2025-04-26 14:47:13,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1237 1230 [WARNING|trainer.py:803] 2025-04-26 14:47:13,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1240 [WARNING|trainer.py:803] 2025-04-26 14:47:14,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:15,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1238 1231 [WARNING|trainer.py:803] 2025-04-26 14:47:16,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:16,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1241 1239 [WARNING|trainer.py:803] 2025-04-26 14:47:17,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1232 [WARNING|trainer.py:803] 2025-04-26 14:47:18,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:47:18,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1242 1240 1233 [WARNING|trainer.py:803] 2025-04-26 14:47:20,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:20,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:21,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1243 1241 1234 [WARNING|trainer.py:803] 2025-04-26 14:47:22,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:23,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:23,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1244 1242 1235 [WARNING|trainer.py:803] 2025-04-26 14:47:25,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:25,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:25,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1245 1243 1236 [WARNING|trainer.py:803] 2025-04-26 14:47:27,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:28,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:28,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1246 1244 1237 [WARNING|trainer.py:803] 2025-04-26 14:47:29,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:30,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:30,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1247 1238 1245 [WARNING|trainer.py:803] 2025-04-26 14:47:32,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:32,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:33,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1248 1239 1246 [WARNING|trainer.py:803] 2025-04-26 14:47:34,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:35,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:47:35,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1249 1240 [WARNING|trainer.py:803] 2025-04-26 14:47:36,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1247 [WARNING|trainer.py:803] 2025-04-26 14:47:37,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:37,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1250 1241 [WARNING|trainer.py:803] 2025-04-26 14:47:39,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1248 [WARNING|trainer.py:803] 2025-04-26 14:47:39,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:40,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1251 1242 1249 [WARNING|trainer.py:803] 2025-04-26 14:47:41,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:42,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:42,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1252 1243 1250 [WARNING|trainer.py:803] 2025-04-26 14:47:44,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:44,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:44,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1253 1244 1251 [WARNING|trainer.py:803] 2025-04-26 14:47:46,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:46,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:47,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1254 1245 1252 [WARNING|trainer.py:803] 2025-04-26 14:47:48,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:49,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:47:49,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1255 1246 1253 [WARNING|trainer.py:803] 2025-04-26 14:47:51,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:51,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:51,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1256 1254 1247 [WARNING|trainer.py:803] 2025-04-26 14:47:53,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:54,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:47:54,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1257 1255 1248 [WARNING|trainer.py:803] 2025-04-26 14:47:55,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:56,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:47:56,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1258 1256 1249 [WARNING|trainer.py:803] 2025-04-26 14:47:58,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:58,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:47:58,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1259 1257 1250 [WARNING|trainer.py:803] 2025-04-26 14:48:00,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:00,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:00,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1260 1258 1251 [WARNING|trainer.py:803] 2025-04-26 14:48:03,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:48:03,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:03,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1261 1259 [WARNING|trainer.py:803] 2025-04-26 14:48:05,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1252 [WARNING|trainer.py:803] 2025-04-26 14:48:05,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:05,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1262 1260 1253 [WARNING|trainer.py:803] 2025-04-26 14:48:07,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:08,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:08,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1263 1261 [WARNING|trainer.py:803] 2025-04-26 14:48:09,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1254 [WARNING|trainer.py:803] 2025-04-26 14:48:10,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:10,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1264 1262 1255 [WARNING|trainer.py:803] 2025-04-26 14:48:12,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:12,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:48:12,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1265 1263 1256 [WARNING|trainer.py:803] 2025-04-26 14:48:14,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:14,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:15,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1266 1264 1257 [WARNING|trainer.py:803] 2025-04-26 14:48:16,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:17,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:17,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1267 1265 1258 [WARNING|trainer.py:803] 2025-04-26 14:48:19,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:19,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:19,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1268 1259 1266 [WARNING|trainer.py:803] 2025-04-26 14:48:21,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:21,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:21,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1269 1267 1260 [WARNING|trainer.py:803] 2025-04-26 14:48:24,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:24,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:24,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1270 1268 1261 [WARNING|trainer.py:803] 2025-04-26 14:48:26,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:26,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:26,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1271 1269 1262 [WARNING|trainer.py:803] 2025-04-26 14:48:28,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:48:28,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:29,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1272 1263 1270 [WARNING|trainer.py:803] 2025-04-26 14:48:31,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:31,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:31,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1273 1264 1271 [WARNING|trainer.py:803] 2025-04-26 14:48:33,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:48:33,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:33,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1274 1272 1265 [WARNING|trainer.py:803] 2025-04-26 14:48:35,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:35,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:35,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1275 1273 1266 [WARNING|trainer.py:803] 2025-04-26 14:48:38,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:38,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:48:38,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1274 1276 1267 [WARNING|trainer.py:803] 2025-04-26 14:48:40,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:40,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:40,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1277 1268 1275 [WARNING|trainer.py:803] 2025-04-26 14:48:42,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:42,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:43,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1278 1276 1269 [WARNING|trainer.py:803] 2025-04-26 14:48:44,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:45,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:45,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1279 1277 1270 [WARNING|trainer.py:803] 2025-04-26 14:48:46,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:47,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:47,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1278 1280 1271 [WARNING|trainer.py:803] 2025-04-26 14:48:49,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:48:49,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:49,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1281 1279 1272 [WARNING|trainer.py:803] 2025-04-26 14:48:51,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:51,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:52,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1282 1280 1273 [WARNING|trainer.py:803] 2025-04-26 14:48:53,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:48:54,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:48:54,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1281 1274 1283 [WARNING|trainer.py:803] 2025-04-26 14:48:56,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:56,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:56,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1282 1284 1275 [WARNING|trainer.py:803] 2025-04-26 14:48:58,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:48:59,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:48:59,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1276 1283 1285 [WARNING|trainer.py:803] 2025-04-26 14:49:01,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:01,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:01,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1277 1286 1284 [WARNING|trainer.py:803] 2025-04-26 14:49:03,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:49:03,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:04,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1278 1285 [WARNING|trainer.py:803] 2025-04-26 14:49:05,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1287 [WARNING|trainer.py:803] 2025-04-26 14:49:06,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:06,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1279 1288 [WARNING|trainer.py:803] 2025-04-26 14:49:07,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1286 [WARNING|trainer.py:803] 2025-04-26 14:49:08,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:49:08,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1280 1289 [WARNING|trainer.py:803] 2025-04-26 14:49:10,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:10,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1287 [WARNING|trainer.py:803] 2025-04-26 14:49:11,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1281 1290 [WARNING|trainer.py:803] 2025-04-26 14:49:12,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:12,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1288 [WARNING|trainer.py:803] 2025-04-26 14:49:13,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1282 1291 [WARNING|trainer.py:803] 2025-04-26 14:49:14,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1289 [WARNING|trainer.py:803] 2025-04-26 14:49:15,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:49:15,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1292 1283 1290 [WARNING|trainer.py:803] 2025-04-26 14:49:17,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:17,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:17,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1293 1291 1284 [WARNING|trainer.py:803] 2025-04-26 14:49:19,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:20,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:49:20,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1294 1292 [WARNING|trainer.py:803] 2025-04-26 14:49:21,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1285 [WARNING|trainer.py:803] 2025-04-26 14:49:22,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:22,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1295 1293 [WARNING|trainer.py:803] 2025-04-26 14:49:24,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1286 [WARNING|trainer.py:803] 2025-04-26 14:49:24,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:25,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1296 1294 [WARNING|trainer.py:803] 2025-04-26 14:49:26,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:26,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1287 [WARNING|trainer.py:803] 2025-04-26 14:49:27,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1297 1295 [WARNING|trainer.py:803] 2025-04-26 14:49:28,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:29,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1288 [WARNING|trainer.py:803] 2025-04-26 14:49:30,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1298 1296 1289 [WARNING|trainer.py:803] 2025-04-26 14:49:31,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:31,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:32,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1299 1297 1290 [WARNING|trainer.py:803] 2025-04-26 14:49:33,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:33,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:34,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1300 1298 1291 [WARNING|trainer.py:803] 2025-04-26 14:49:35,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:36,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:36,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1299 1301 1292 [WARNING|trainer.py:803] 2025-04-26 14:49:38,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:38,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:38,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1300 1293 1302 [WARNING|trainer.py:803] 2025-04-26 14:49:40,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:40,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:41,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1294 1303 1301 [WARNING|trainer.py:803] 2025-04-26 14:49:43,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:43,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 14:49:43,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1295 1304 1302 [WARNING|trainer.py:803] 2025-04-26 14:49:45,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:45,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:45,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1305 1296 1303 [WARNING|trainer.py:803] 2025-04-26 14:49:47,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:47,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:47,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1297 1306 1304 [WARNING|trainer.py:803] 2025-04-26 14:49:50,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:50,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:50,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1307 1305 1298 [WARNING|trainer.py:803] 2025-04-26 14:49:52,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:49:52,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:52,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1308 1299 1306 [WARNING|trainer.py:803] 2025-04-26 14:49:54,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:49:54,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:49:54,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1309 1307 1300 [WARNING|trainer.py:803] 2025-04-26 14:49:56,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:49:57,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:49:57,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1310 1308 [WARNING|trainer.py:803] 2025-04-26 14:49:59,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1301 [WARNING|trainer.py:803] 2025-04-26 14:49:59,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:00,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1311 1309 [WARNING|trainer.py:803] 2025-04-26 14:50:01,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:01,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1302 [WARNING|trainer.py:803] 2025-04-26 14:50:02,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1312 1310 [WARNING|trainer.py:803] 2025-04-26 14:50:03,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:03,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1303 [WARNING|trainer.py:803] 2025-04-26 14:50:04,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1311 1313 [WARNING|trainer.py:803] 2025-04-26 14:50:06,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1304 [WARNING|trainer.py:803] 2025-04-26 14:50:06,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:07,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1312 1314 1305 [WARNING|trainer.py:803] 2025-04-26 14:50:08,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:08,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:09,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1315 1313 1306 [WARNING|trainer.py:803] 2025-04-26 14:50:10,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:10,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:11,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1314 1316 1307 [WARNING|trainer.py:803] 2025-04-26 14:50:13,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:13,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:13,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1315 1317 1308 [WARNING|trainer.py:803] 2025-04-26 14:50:15,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:15,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:50:15,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1316 1318 1309 [WARNING|trainer.py:803] 2025-04-26 14:50:17,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:17,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:18,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1317 1310 1319 [WARNING|trainer.py:803] 2025-04-26 14:50:20,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:50:20,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:20,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1318 1311 1320 [WARNING|trainer.py:803] 2025-04-26 14:50:22,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:22,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:22,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1312 1319 1321 [WARNING|trainer.py:803] 2025-04-26 14:50:25,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:25,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:25,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1313 1320 1322 [WARNING|trainer.py:803] 2025-04-26 14:50:27,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:27,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:27,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1314 1323 1321 [WARNING|trainer.py:803] 2025-04-26 14:50:29,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:30,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:50:30,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1315 1324 1322 [WARNING|trainer.py:803] 2025-04-26 14:50:32,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:32,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:32,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1316 1323 1325 [WARNING|trainer.py:803] 2025-04-26 14:50:34,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:34,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:50:35,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1317 1324 1326 [WARNING|trainer.py:803] 2025-04-26 14:50:36,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:50:37,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:37,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1318 1327 1325 [WARNING|trainer.py:803] 2025-04-26 14:50:39,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:39,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:39,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1328 1319 1326 [WARNING|trainer.py:803] 2025-04-26 14:50:41,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:41,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:42,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1329 1320 1327 [WARNING|trainer.py:803] 2025-04-26 14:50:43,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:44,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:44,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1330 1328 1321 [WARNING|trainer.py:803] 2025-04-26 14:50:46,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:46,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:46,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1331 1329 [WARNING|trainer.py:803] 2025-04-26 14:50:48,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1322 [WARNING|trainer.py:803] 2025-04-26 14:50:48,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:49,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1332 1330 [WARNING|trainer.py:803] 2025-04-26 14:50:50,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1323 [WARNING|trainer.py:803] 2025-04-26 14:50:51,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:51,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1333 1331 [WARNING|trainer.py:803] 2025-04-26 14:50:53,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1324 [WARNING|trainer.py:803] 2025-04-26 14:50:53,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:50:53,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1334 1332 [WARNING|trainer.py:803] 2025-04-26 14:50:55,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:50:55,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1325 [WARNING|trainer.py:803] 2025-04-26 14:50:56,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1333 1335 [WARNING|trainer.py:803] 2025-04-26 14:50:57,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:50:57,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1326 [WARNING|trainer.py:803] 2025-04-26 14:50:58,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1334 1336 [WARNING|trainer.py:803] 2025-04-26 14:51:00,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:00,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1327 [WARNING|trainer.py:803] 2025-04-26 14:51:01,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1337 1335 [WARNING|trainer.py:803] 2025-04-26 14:51:02,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:02,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1328 [WARNING|trainer.py:803] 2025-04-26 14:51:03,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1338 1336 [WARNING|trainer.py:803] 2025-04-26 14:51:04,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1329 [WARNING|trainer.py:803] 2025-04-26 14:51:05,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:05,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1337 1339 1330 [WARNING|trainer.py:803] 2025-04-26 14:51:07,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:07,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:08,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1338 1340 1331 [WARNING|trainer.py:803] 2025-04-26 14:51:09,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:09,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:10,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1341 1339 1332 [WARNING|trainer.py:803] 2025-04-26 14:51:11,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:12,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:12,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1342 1340 1333 [WARNING|trainer.py:803] 2025-04-26 14:51:14,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:14,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:51:14,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1343 1341 1334 [WARNING|trainer.py:803] 2025-04-26 14:51:16,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:16,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:17,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1344 1342 1335 [WARNING|trainer.py:803] 2025-04-26 14:51:18,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:19,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:19,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1345 1343 [WARNING|trainer.py:803] 2025-04-26 14:51:21,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1336 [WARNING|trainer.py:803] 2025-04-26 14:51:21,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:22,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1346 1344 1337 [WARNING|trainer.py:803] 2025-04-26 14:51:23,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:51:23,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:24,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1345 1347 1338 [WARNING|trainer.py:803] 2025-04-26 14:51:26,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:26,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:26,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1346 1348 1339 [WARNING|trainer.py:803] 2025-04-26 14:51:28,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:51:28,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:29,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1349 1347 [WARNING|trainer.py:803] 2025-04-26 14:51:30,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1340 [WARNING|trainer.py:803] 2025-04-26 14:51:30,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:31,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1348 1341 [WARNING|trainer.py:803] 2025-04-26 14:51:33,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1350 [WARNING|trainer.py:803] 2025-04-26 14:51:33,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:34,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1349 1342 [WARNING|trainer.py:803] 2025-04-26 14:51:35,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:51:36,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1351 1343 [WARNING|trainer.py:803] 2025-04-26 14:51:37,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:38,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1350 [WARNING|trainer.py:803] 2025-04-26 14:51:39,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1344 1352 [WARNING|trainer.py:803] 2025-04-26 14:51:40,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:40,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1351 1345 [WARNING|trainer.py:803] 2025-04-26 14:51:42,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1353 [WARNING|trainer.py:803] 2025-04-26 14:51:43,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:43,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1346 1352 1354 [WARNING|trainer.py:803] 2025-04-26 14:51:45,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:51:45,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1347 [WARNING|trainer.py:803] 2025-04-26 14:51:48,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1353 1355 [WARNING|trainer.py:803] 2025-04-26 14:51:48,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:49,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1348 [WARNING|trainer.py:803] 2025-04-26 14:51:50,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1354 1356 [WARNING|trainer.py:803] 2025-04-26 14:51:51,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1349 [WARNING|trainer.py:803] 2025-04-26 14:51:52,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:52,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1355 1357 [WARNING|trainer.py:803] 2025-04-26 14:51:54,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:51:54,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1350 1356 [WARNING|trainer.py:803] 2025-04-26 14:51:56,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:51:57,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1358 [WARNING|trainer.py:803] 2025-04-26 14:51:58,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1351 1357 1359 [WARNING|trainer.py:803] 2025-04-26 14:51:59,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:51:59,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 14:52:00,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1352 1358 [WARNING|trainer.py:803] 2025-04-26 14:52:02,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:52:03,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1360 [WARNING|trainer.py:803] 2025-04-26 14:52:03,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1359 1353 [WARNING|trainer.py:803] 2025-04-26 14:52:05,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1361 [WARNING|trainer.py:803] 2025-04-26 14:52:05,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:06,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1354 1362 [WARNING|trainer.py:803] 2025-04-26 14:52:08,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1360 [WARNING|trainer.py:803] 2025-04-26 14:52:09,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:52:09,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1355 1361 [WARNING|trainer.py:803] 2025-04-26 14:52:11,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1363 [WARNING|trainer.py:803] 2025-04-26 14:52:11,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:52:12,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1356 1362 [WARNING|trainer.py:803] 2025-04-26 14:52:14,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:14,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1364 [WARNING|trainer.py:803] 2025-04-26 14:52:15,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1357 1363 [WARNING|trainer.py:803] 2025-04-26 14:52:16,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:52:17,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1365 [WARNING|trainer.py:803] 2025-04-26 14:52:18,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1358 1364 [WARNING|trainer.py:803] 2025-04-26 14:52:20,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1366 [WARNING|trainer.py:803] 2025-04-26 14:52:21,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:21,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1359 [WARNING|trainer.py:803] 2025-04-26 14:52:22,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1365 1367 [WARNING|trainer.py:803] 2025-04-26 14:52:23,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:24,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1360 1368 1366 [WARNING|trainer.py:803] 2025-04-26 14:52:26,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:52:26,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:26,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1361 1367 [WARNING|trainer.py:803] 2025-04-26 14:52:28,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1369 [WARNING|trainer.py:803] 2025-04-26 14:52:29,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:29,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1362 1368 [WARNING|trainer.py:803] 2025-04-26 14:52:31,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1370 [WARNING|trainer.py:803] 2025-04-26 14:52:31,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:32,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1363 1371 1369 [WARNING|trainer.py:803] 2025-04-26 14:52:34,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:35,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:35,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1372 1370 1364 [WARNING|trainer.py:803] 2025-04-26 14:52:37,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:37,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:52:38,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1373 1371 [WARNING|trainer.py:803] 2025-04-26 14:52:40,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1365 [WARNING|trainer.py:803] 2025-04-26 14:52:41,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1372 1374 [WARNING|trainer.py:803] 2025-04-26 14:52:42,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:42,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1366 [WARNING|trainer.py:803] 2025-04-26 14:52:44,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1375 1373 [WARNING|trainer.py:803] 2025-04-26 14:52:45,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:45,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1367 [WARNING|trainer.py:803] 2025-04-26 14:52:46,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1374 1376 [WARNING|trainer.py:803] 2025-04-26 14:52:48,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1368 [WARNING|trainer.py:803] 2025-04-26 14:52:48,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:49,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1375 1377 [WARNING|trainer.py:803] 2025-04-26 14:52:50,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:52:51,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1369 [WARNING|trainer.py:803] 2025-04-26 14:52:52,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1376 [WARNING|trainer.py:803] 2025-04-26 14:52:53,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1378 1370 [WARNING|trainer.py:803] 2025-04-26 14:52:54,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:55,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1377 [WARNING|trainer.py:803] 2025-04-26 14:52:56,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1379 1371 [WARNING|trainer.py:803] 2025-04-26 14:52:57,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:52:58,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1378 [WARNING|trainer.py:803] 2025-04-26 14:52:59,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1372 1380 [WARNING|trainer.py:803] 2025-04-26 14:53:00,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:00,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1379 1373 [WARNING|trainer.py:803] 2025-04-26 14:53:02,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1381 [WARNING|trainer.py:803] 2025-04-26 14:53:03,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:03,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1380 1374 [WARNING|trainer.py:803] 2025-04-26 14:53:05,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:05,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1382 [WARNING|trainer.py:803] 2025-04-26 14:53:06,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1375 1381 [WARNING|trainer.py:803] 2025-04-26 14:53:08,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1383 [WARNING|trainer.py:803] 2025-04-26 14:53:08,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:53:09,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1382 1376 1384 [WARNING|trainer.py:803] 2025-04-26 14:53:11,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:53:11,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:11,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1383 1377 [WARNING|trainer.py:803] 2025-04-26 14:53:13,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:53:13,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1385 [WARNING|trainer.py:803] 2025-04-26 14:53:14,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1384 [WARNING|trainer.py:803] 2025-04-26 14:53:16,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1378 1386 [WARNING|trainer.py:803] 2025-04-26 14:53:17,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:17,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1385 1379 1387 [WARNING|trainer.py:803] 2025-04-26 14:53:19,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:20,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:20,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1386 [WARNING|trainer.py:803] 2025-04-26 14:53:22,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1380 [WARNING|trainer.py:803] 2025-04-26 14:53:23,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1388 [WARNING|trainer.py:803] 2025-04-26 14:53:24,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1387 [WARNING|trainer.py:803] 2025-04-26 14:53:25,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1381 [WARNING|trainer.py:803] 2025-04-26 14:53:26,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1389 [WARNING|trainer.py:803] 2025-04-26 14:53:27,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1388 1382 1390 [WARNING|trainer.py:803] 2025-04-26 14:53:29,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:29,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:53:30,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1383 1389 [WARNING|trainer.py:803] 2025-04-26 14:53:32,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1391 [WARNING|trainer.py:803] 2025-04-26 14:53:32,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:53:32,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1384 1390 [WARNING|trainer.py:803] 2025-04-26 14:53:34,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:53:35,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1392 [WARNING|trainer.py:803] 2025-04-26 14:53:35,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1385 1391 [WARNING|trainer.py:803] 2025-04-26 14:53:37,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:37,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1393 [WARNING|trainer.py:803] 2025-04-26 14:53:39,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1386 1392 [WARNING|trainer.py:803] 2025-04-26 14:53:40,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:53:41,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1394 [WARNING|trainer.py:803] 2025-04-26 14:53:42,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1387 1393 [WARNING|trainer.py:803] 2025-04-26 14:53:43,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:53:44,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1395 [WARNING|trainer.py:803] 2025-04-26 14:53:45,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1394 1388 [WARNING|trainer.py:803] 2025-04-26 14:53:47,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:53:47,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1396 [WARNING|trainer.py:803] 2025-04-26 14:53:49,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1389 1395 [WARNING|trainer.py:803] 2025-04-26 14:53:50,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:53:50,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1397 1390 [WARNING|trainer.py:803] 2025-04-26 14:53:53,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:53:53,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1396 [WARNING|trainer.py:803] 2025-04-26 14:53:54,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1391 1398 [WARNING|trainer.py:803] 2025-04-26 14:53:56,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:53:56,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1397 [WARNING|trainer.py:803] 2025-04-26 14:53:58,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1399 1392 [WARNING|trainer.py:803] 2025-04-26 14:53:58,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:53:59,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1398 1400 [WARNING|trainer.py:803] 2025-04-26 14:54:01,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:01,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1393 [WARNING|trainer.py:803] 2025-04-26 14:54:02,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1401 1399 [WARNING|trainer.py:803] 2025-04-26 14:54:03,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:03,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1394 1402 1400 [WARNING|trainer.py:803] 2025-04-26 14:54:05,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:05,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:54:06,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1403 1401 [WARNING|trainer.py:803] 2025-04-26 14:54:08,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1395 [WARNING|trainer.py:803] 2025-04-26 14:54:08,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:09,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1404 1402 [WARNING|trainer.py:803] 2025-04-26 14:54:10,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:54:10,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1405 1403 1396 [WARNING|trainer.py:803] 2025-04-26 14:54:12,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:54:12,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:12,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1406 1404 [WARNING|trainer.py:803] 2025-04-26 14:54:14,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:15,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1397 1407 [WARNING|trainer.py:803] 2025-04-26 14:54:16,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1405 [WARNING|trainer.py:803] 2025-04-26 14:54:17,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:17,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1408 1406 [WARNING|trainer.py:803] 2025-04-26 14:54:19,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1398 [WARNING|trainer.py:803] 2025-04-26 14:54:19,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:20,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1409 1407 1399 [WARNING|trainer.py:803] 2025-04-26 14:54:21,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:21,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:22,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1410 1408 [WARNING|trainer.py:803] 2025-04-26 14:54:23,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1400 [WARNING|trainer.py:803] 2025-04-26 14:54:24,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:24,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1411 1409 [WARNING|trainer.py:803] 2025-04-26 14:54:25,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1401 [WARNING|trainer.py:803] 2025-04-26 14:54:26,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:26,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1412 1410 [WARNING|trainer.py:803] 2025-04-26 14:54:28,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1402 [WARNING|trainer.py:803] 2025-04-26 14:54:28,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:54:29,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1413 1411 [WARNING|trainer.py:803] 2025-04-26 14:54:30,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1403 [WARNING|trainer.py:803] 2025-04-26 14:54:30,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:54:31,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1414 1412 [WARNING|trainer.py:803] 2025-04-26 14:54:32,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1404 [WARNING|trainer.py:803] 2025-04-26 14:54:33,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:33,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1415 1413 [WARNING|trainer.py:803] 2025-04-26 14:54:34,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1405 [WARNING|trainer.py:803] 2025-04-26 14:54:35,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:35,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1416 1414 [WARNING|trainer.py:803] 2025-04-26 14:54:36,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1406 [WARNING|trainer.py:803] 2025-04-26 14:54:37,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:38,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1417 1415 [WARNING|trainer.py:803] 2025-04-26 14:54:39,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1407 [WARNING|trainer.py:803] 2025-04-26 14:54:39,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:54:40,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1418 1416 [WARNING|trainer.py:803] 2025-04-26 14:54:41,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1408 [WARNING|trainer.py:803] 2025-04-26 14:54:41,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:54:42,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1419 1417 [WARNING|trainer.py:803] 2025-04-26 14:54:43,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1409 [WARNING|trainer.py:803] 2025-04-26 14:54:44,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:54:44,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1420 1418 [WARNING|trainer.py:803] 2025-04-26 14:54:45,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1410 [WARNING|trainer.py:803] 2025-04-26 14:54:46,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:46,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1421 1419 [WARNING|trainer.py:803] 2025-04-26 14:54:47,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1411 [WARNING|trainer.py:803] 2025-04-26 14:54:48,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:54:49,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1422 1420 [WARNING|trainer.py:803] 2025-04-26 14:54:50,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1412 [WARNING|trainer.py:803] 2025-04-26 14:54:50,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:54:51,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1423 1421 [WARNING|trainer.py:803] 2025-04-26 14:54:52,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1413 [WARNING|trainer.py:803] 2025-04-26 14:54:52,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:53,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1424 1422 [WARNING|trainer.py:803] 2025-04-26 14:54:54,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:54:54,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1414 1425 [WARNING|trainer.py:803] 2025-04-26 14:54:55,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1423 [WARNING|trainer.py:803] 2025-04-26 14:54:56,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1415 [WARNING|trainer.py:803] 2025-04-26 14:54:57,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1426 [WARNING|trainer.py:803] 2025-04-26 14:54:57,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1424 [WARNING|trainer.py:803] 2025-04-26 14:54:58,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1416 [WARNING|trainer.py:803] 2025-04-26 14:54:59,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:00,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1427 1425 [WARNING|trainer.py:803] 2025-04-26 14:55:00,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:01,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1417 1428 [WARNING|trainer.py:803] 2025-04-26 14:55:02,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1426 [WARNING|trainer.py:803] 2025-04-26 14:55:02,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1418 [WARNING|trainer.py:803] 2025-04-26 14:55:03,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1429 [WARNING|trainer.py:803] 2025-04-26 14:55:04,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1427 [WARNING|trainer.py:803] 2025-04-26 14:55:05,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:05,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1419 1430 [WARNING|trainer.py:803] 2025-04-26 14:55:06,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1428 [WARNING|trainer.py:803] 2025-04-26 14:55:07,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:07,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1420 1431 [WARNING|trainer.py:803] 2025-04-26 14:55:08,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1429 [WARNING|trainer.py:803] 2025-04-26 14:55:09,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1421 [WARNING|trainer.py:803] 2025-04-26 14:55:10,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1432 [WARNING|trainer.py:803] 2025-04-26 14:55:11,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1430 [WARNING|trainer.py:803] 2025-04-26 14:55:11,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:55:12,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1422 1433 [WARNING|trainer.py:803] 2025-04-26 14:55:13,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1431 [WARNING|trainer.py:803] 2025-04-26 14:55:13,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:14,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1423 1434 [WARNING|trainer.py:803] 2025-04-26 14:55:15,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1432 [WARNING|trainer.py:803] 2025-04-26 14:55:16,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:16,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1424 1435 [WARNING|trainer.py:803] 2025-04-26 14:55:17,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:18,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1433 [WARNING|trainer.py:803] 2025-04-26 14:55:19,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1425 1436 [WARNING|trainer.py:803] 2025-04-26 14:55:19,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1434 [WARNING|trainer.py:803] 2025-04-26 14:55:20,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1426 [WARNING|trainer.py:803] 2025-04-26 14:55:21,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:22,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1437 1435 [WARNING|trainer.py:803] 2025-04-26 14:55:23,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1427 [WARNING|trainer.py:803] 2025-04-26 14:55:23,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:24,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1438 1436 [WARNING|trainer.py:803] 2025-04-26 14:55:25,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1428 [WARNING|trainer.py:803] 2025-04-26 14:55:25,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1439 [WARNING|trainer.py:803] 2025-04-26 14:55:26,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:27,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1437 1429 [WARNING|trainer.py:803] 2025-04-26 14:55:28,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:55:28,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1440 1438 [WARNING|trainer.py:803] 2025-04-26 14:55:29,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1430 [WARNING|trainer.py:803] 2025-04-26 14:55:30,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:55:30,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1441 1439 [WARNING|trainer.py:803] 2025-04-26 14:55:31,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1431 [WARNING|trainer.py:803] 2025-04-26 14:55:32,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:32,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1442 1440 [WARNING|trainer.py:803] 2025-04-26 14:55:34,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1432 [WARNING|trainer.py:803] 2025-04-26 14:55:34,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:55:35,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1443 1441 [WARNING|trainer.py:803] 2025-04-26 14:55:36,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1433 [WARNING|trainer.py:803] 2025-04-26 14:55:36,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:37,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1444 [WARNING|trainer.py:803] 2025-04-26 14:55:38,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1442 1434 [WARNING|trainer.py:803] 2025-04-26 14:55:39,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1445 [WARNING|trainer.py:803] 2025-04-26 14:55:39,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:40,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1443 1435 [WARNING|trainer.py:803] 2025-04-26 14:55:41,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:55:41,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1446 [WARNING|trainer.py:803] 2025-04-26 14:55:42,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1444 1436 [WARNING|trainer.py:803] 2025-04-26 14:55:43,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1447 [WARNING|trainer.py:803] 2025-04-26 14:55:44,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:55:44,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1445 1437 [WARNING|trainer.py:803] 2025-04-26 14:55:45,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1448 [WARNING|trainer.py:803] 2025-04-26 14:55:46,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1446 [WARNING|trainer.py:803] 2025-04-26 14:55:47,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1438 [WARNING|trainer.py:803] 2025-04-26 14:55:47,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1449 [WARNING|trainer.py:803] 2025-04-26 14:55:48,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1447 [WARNING|trainer.py:803] 2025-04-26 14:55:49,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1439 [WARNING|trainer.py:803] 2025-04-26 14:55:49,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:50,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1450 1448 [WARNING|trainer.py:803] 2025-04-26 14:55:51,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1440 [WARNING|trainer.py:803] 2025-04-26 14:55:52,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1451 [WARNING|trainer.py:803] 2025-04-26 14:55:53,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1449 [WARNING|trainer.py:803] 2025-04-26 14:55:53,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1441 [WARNING|trainer.py:803] 2025-04-26 14:55:54,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1452 [WARNING|trainer.py:803] 2025-04-26 14:55:55,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:56,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1450 [WARNING|trainer.py:803] 2025-04-26 14:55:56,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1442 1453 [WARNING|trainer.py:803] 2025-04-26 14:55:57,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1451 [WARNING|trainer.py:803] 2025-04-26 14:55:58,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:55:59,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1443 1454 [WARNING|trainer.py:803] 2025-04-26 14:56:00,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1452 [WARNING|trainer.py:803] 2025-04-26 14:56:00,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:01,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1444 1455 [WARNING|trainer.py:803] 2025-04-26 14:56:02,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:56:02,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1453 [WARNING|trainer.py:803] 2025-04-26 14:56:03,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1445 1456 [WARNING|trainer.py:803] 2025-04-26 14:56:04,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:04,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1454 [WARNING|trainer.py:803] 2025-04-26 14:56:05,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1446 1457 [WARNING|trainer.py:803] 2025-04-26 14:56:06,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:56:06,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1455 [WARNING|trainer.py:803] 2025-04-26 14:56:07,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1447 1458 [WARNING|trainer.py:803] 2025-04-26 14:56:08,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1456 [WARNING|trainer.py:803] 2025-04-26 14:56:09,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:56:10,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1448 1459 [WARNING|trainer.py:803] 2025-04-26 14:56:11,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1457 [WARNING|trainer.py:803] 2025-04-26 14:56:11,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:56:12,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1449 1460 [WARNING|trainer.py:803] 2025-04-26 14:56:13,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 14:56:13,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1458 [WARNING|trainer.py:803] 2025-04-26 14:56:14,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1450 1461 [WARNING|trainer.py:803] 2025-04-26 14:56:15,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:15,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1459 [WARNING|trainer.py:803] 2025-04-26 14:56:17,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1451 1462 [WARNING|trainer.py:803] 2025-04-26 14:56:17,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:18,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1460 [WARNING|trainer.py:803] 2025-04-26 14:56:19,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1452 1463 [WARNING|trainer.py:803] 2025-04-26 14:56:20,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:20,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1461 [WARNING|trainer.py:803] 2025-04-26 14:56:21,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1464 1453 [WARNING|trainer.py:803] 2025-04-26 14:56:22,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1462 [WARNING|trainer.py:803] 2025-04-26 14:56:22,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:23,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1465 1454 [WARNING|trainer.py:803] 2025-04-26 14:56:24,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:24,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1463 [WARNING|trainer.py:803] 2025-04-26 14:56:25,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1466 1455 [WARNING|trainer.py:803] 2025-04-26 14:56:26,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:26,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1464 [WARNING|trainer.py:803] 2025-04-26 14:56:27,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1467 1456 [WARNING|trainer.py:803] 2025-04-26 14:56:28,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:56:29,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1465 [WARNING|trainer.py:803] 2025-04-26 14:56:30,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1468 1457 [WARNING|trainer.py:803] 2025-04-26 14:56:31,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:31,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1466 [WARNING|trainer.py:803] 2025-04-26 14:56:32,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1469 1458 [WARNING|trainer.py:803] 2025-04-26 14:56:33,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1467 [WARNING|trainer.py:803] 2025-04-26 14:56:33,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:56:34,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1470 1459 [WARNING|trainer.py:803] 2025-04-26 14:56:35,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1468 [WARNING|trainer.py:803] 2025-04-26 14:56:36,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:56:36,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1471 1460 [WARNING|trainer.py:803] 2025-04-26 14:56:37,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1469 [WARNING|trainer.py:803] 2025-04-26 14:56:38,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:38,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1472 1461 [WARNING|trainer.py:803] 2025-04-26 14:56:39,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1470 [WARNING|trainer.py:803] 2025-04-26 14:56:40,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:40,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1473 1462 [WARNING|trainer.py:803] 2025-04-26 14:56:42,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1471 [WARNING|trainer.py:803] 2025-04-26 14:56:42,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:42,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1474 1463 1472 [WARNING|trainer.py:803] 2025-04-26 14:56:44,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:56:44,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:56:45,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1464 1475 1473 [WARNING|trainer.py:803] 2025-04-26 14:56:46,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:47,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:47,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1465 1476 1474 [WARNING|trainer.py:803] 2025-04-26 14:56:49,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:49,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:56:49,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1466 1477 [WARNING|trainer.py:803] 2025-04-26 14:56:51,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1475 [WARNING|trainer.py:803] 2025-04-26 14:56:51,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:52,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1467 1478 [WARNING|trainer.py:803] 2025-04-26 14:56:53,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:56:53,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1476 [WARNING|trainer.py:803] 2025-04-26 14:56:54,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1479 1468 [WARNING|trainer.py:803] 2025-04-26 14:56:55,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:56:55,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1477 [WARNING|trainer.py:803] 2025-04-26 14:56:56,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1469 1480 [WARNING|trainer.py:803] 2025-04-26 14:56:58,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:56:58,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1478 [WARNING|trainer.py:803] 2025-04-26 14:56:58,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1470 1481 1479 [WARNING|trainer.py:803] 2025-04-26 14:57:00,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:00,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:01,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1471 1482 1480 [WARNING|trainer.py:803] 2025-04-26 14:57:02,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:02,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:03,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1472 1483 [WARNING|trainer.py:803] 2025-04-26 14:57:04,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1481 [WARNING|trainer.py:803] 2025-04-26 14:57:05,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:05,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1473 1484 [WARNING|trainer.py:803] 2025-04-26 14:57:07,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1482 [WARNING|trainer.py:803] 2025-04-26 14:57:07,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:08,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1474 1485 [WARNING|trainer.py:803] 2025-04-26 14:57:09,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1483 [WARNING|trainer.py:803] 2025-04-26 14:57:09,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:10,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1475 1486 1484 [WARNING|trainer.py:803] 2025-04-26 14:57:12,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:57:12,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:12,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1476 1485 1487 [WARNING|trainer.py:803] 2025-04-26 14:57:14,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:14,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:14,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1477 1488 [WARNING|trainer.py:803] 2025-04-26 14:57:16,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1486 [WARNING|trainer.py:803] 2025-04-26 14:57:16,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:57:17,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1478 1489 [WARNING|trainer.py:803] 2025-04-26 14:57:18,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:57:19,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1487 1479 [WARNING|trainer.py:803] 2025-04-26 14:57:20,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:20,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1488 1490 [WARNING|trainer.py:803] 2025-04-26 14:57:22,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:57:22,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1480 [WARNING|trainer.py:803] 2025-04-26 14:57:23,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1491 1489 [WARNING|trainer.py:803] 2025-04-26 14:57:24,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:24,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1481 1492 [WARNING|trainer.py:803] 2025-04-26 14:57:25,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:26,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1490 1482 [WARNING|trainer.py:803] 2025-04-26 14:57:27,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1493 [WARNING|trainer.py:803] 2025-04-26 14:57:28,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1491 [WARNING|trainer.py:803] 2025-04-26 14:57:28,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:29,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1494 1483 [WARNING|trainer.py:803] 2025-04-26 14:57:30,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1492 [WARNING|trainer.py:803] 2025-04-26 14:57:30,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:31,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1495 1484 1493 [WARNING|trainer.py:803] 2025-04-26 14:57:32,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:57:32,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:33,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1496 1485 1494 [WARNING|trainer.py:803] 2025-04-26 14:57:34,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:57:35,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:57:35,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1497 1486 1495 [WARNING|trainer.py:803] 2025-04-26 14:57:37,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:37,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:37,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1498 1496 1487 [WARNING|trainer.py:803] 2025-04-26 14:57:39,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:39,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:57:40,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1499 1497 1488 [WARNING|trainer.py:803] 2025-04-26 14:57:41,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:42,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:42,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1500 1498 1489 [WARNING|trainer.py:803] 2025-04-26 14:57:44,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:57:44,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:44,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1499 1501 [WARNING|trainer.py:803] 2025-04-26 14:57:46,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1490 [WARNING|trainer.py:803] 2025-04-26 14:57:47,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:57:47,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1500 1491 [WARNING|trainer.py:803] 2025-04-26 14:57:48,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 14:57:49,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1502 [WARNING|trainer.py:803] 2025-04-26 14:57:50,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1492 1501 [WARNING|trainer.py:803] 2025-04-26 14:57:51,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:52,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1493 1503 [WARNING|trainer.py:803] 2025-04-26 14:57:53,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:54,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1502 1494 [WARNING|trainer.py:803] 2025-04-26 14:57:55,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:57:56,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1504 1495 [WARNING|trainer.py:803] 2025-04-26 14:57:57,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:57:58,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1503 [WARNING|trainer.py:803] 2025-04-26 14:57:59,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1496 1505 [WARNING|trainer.py:803] 2025-04-26 14:58:00,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:58:00,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1504 1497 [WARNING|trainer.py:803] 2025-04-26 14:58:02,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:02,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1506 [WARNING|trainer.py:803] 2025-04-26 14:58:04,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1498 1505 [WARNING|trainer.py:803] 2025-04-26 14:58:05,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:05,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1499 1507 [WARNING|trainer.py:803] 2025-04-26 14:58:07,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:07,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1506 1500 [WARNING|trainer.py:803] 2025-04-26 14:58:09,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:09,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1508 [WARNING|trainer.py:803] 2025-04-26 14:58:10,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1507 1501 [WARNING|trainer.py:803] 2025-04-26 14:58:12,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:58:13,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1509 [WARNING|trainer.py:803] 2025-04-26 14:58:14,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1508 1502 [WARNING|trainer.py:803] 2025-04-26 14:58:16,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1510 [WARNING|trainer.py:803] 2025-04-26 14:58:16,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:17,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1509 1503 [WARNING|trainer.py:803] 2025-04-26 14:58:19,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1511 [WARNING|trainer.py:803] 2025-04-26 14:58:20,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:20,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1510 1504 [WARNING|trainer.py:803] 2025-04-26 14:58:22,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1512 [WARNING|trainer.py:803] 2025-04-26 14:58:23,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:24,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1511 [WARNING|trainer.py:803] 2025-04-26 14:58:26,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1505 1513 [WARNING|trainer.py:803] 2025-04-26 14:58:26,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:27,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1512 [WARNING|trainer.py:803] 2025-04-26 14:58:29,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1506 1514 [WARNING|trainer.py:803] 2025-04-26 14:58:30,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:30,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1513 [WARNING|trainer.py:803] 2025-04-26 14:58:32,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1507 1515 [WARNING|trainer.py:803] 2025-04-26 14:58:33,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 14:58:34,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1514 [WARNING|trainer.py:803] 2025-04-26 14:58:36,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1508 1516 [WARNING|trainer.py:803] 2025-04-26 14:58:37,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:37,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1515 [WARNING|trainer.py:803] 2025-04-26 14:58:39,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1509 1517 [WARNING|trainer.py:803] 2025-04-26 14:58:40,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:58:41,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1516 [WARNING|trainer.py:803] 2025-04-26 14:58:43,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1510 1518 [WARNING|trainer.py:803] 2025-04-26 14:58:44,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:44,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1517 [WARNING|trainer.py:803] 2025-04-26 14:58:46,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1511 1519 [WARNING|trainer.py:803] 2025-04-26 14:58:47,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 14:58:47,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1518 [WARNING|trainer.py:803] 2025-04-26 14:58:49,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1512 1520 [WARNING|trainer.py:803] 2025-04-26 14:58:50,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:51,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1519 [WARNING|trainer.py:803] 2025-04-26 14:58:53,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1513 1521 [WARNING|trainer.py:803] 2025-04-26 14:58:54,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:54,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1520 [WARNING|trainer.py:803] 2025-04-26 14:58:56,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1514 1522 [WARNING|trainer.py:803] 2025-04-26 14:58:57,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:58:58,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1521 [WARNING|trainer.py:803] 2025-04-26 14:58:59,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1515 [WARNING|trainer.py:803] 2025-04-26 14:59:00,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1523 [WARNING|trainer.py:803] 2025-04-26 14:59:01,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1522 [WARNING|trainer.py:803] 2025-04-26 14:59:03,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1516 [WARNING|trainer.py:803] 2025-04-26 14:59:04,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1524 [WARNING|trainer.py:803] 2025-04-26 14:59:05,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1523 1517 [WARNING|trainer.py:803] 2025-04-26 14:59:07,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:59:07,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1525 [WARNING|trainer.py:803] 2025-04-26 14:59:08,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1524 1518 [WARNING|trainer.py:803] 2025-04-26 14:59:10,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:11,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1526 [WARNING|trainer.py:803] 2025-04-26 14:59:12,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1525 1519 [WARNING|trainer.py:803] 2025-04-26 14:59:14,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 14:59:14,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1527 [WARNING|trainer.py:803] 2025-04-26 14:59:16,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1526 1520 [WARNING|trainer.py:803] 2025-04-26 14:59:17,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:18,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1528 [WARNING|trainer.py:803] 2025-04-26 14:59:19,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1521 1527 [WARNING|trainer.py:803] 2025-04-26 14:59:21,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:21,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1529 [WARNING|trainer.py:803] 2025-04-26 14:59:22,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1522 1528 [WARNING|trainer.py:803] 2025-04-26 14:59:24,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:24,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1530 [WARNING|trainer.py:803] 2025-04-26 14:59:26,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1529 1523 [WARNING|trainer.py:803] 2025-04-26 14:59:28,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:28,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1531 [WARNING|trainer.py:803] 2025-04-26 14:59:30,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1530 1524 [WARNING|trainer.py:803] 2025-04-26 14:59:31,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:31,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1532 [WARNING|trainer.py:803] 2025-04-26 14:59:33,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1531 1525 [WARNING|trainer.py:803] 2025-04-26 14:59:35,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:35,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1533 [WARNING|trainer.py:803] 2025-04-26 14:59:37,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1532 1526 [WARNING|trainer.py:803] 2025-04-26 14:59:38,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 14:59:39,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1534 [WARNING|trainer.py:803] 2025-04-26 14:59:40,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1533 1527 [WARNING|trainer.py:803] 2025-04-26 14:59:42,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 14:59:42,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1535 [WARNING|trainer.py:803] 2025-04-26 14:59:44,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1534 1528 [WARNING|trainer.py:803] 2025-04-26 14:59:46,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:46,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1536 [WARNING|trainer.py:803] 2025-04-26 14:59:48,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1529 1535 [WARNING|trainer.py:803] 2025-04-26 14:59:49,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 14:59:49,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1537 [WARNING|trainer.py:803] 2025-04-26 14:59:51,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1530 [WARNING|trainer.py:803] 2025-04-26 14:59:52,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1536 [WARNING|trainer.py:803] 2025-04-26 14:59:53,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1538 [WARNING|trainer.py:803] 2025-04-26 14:59:54,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1531 1537 [WARNING|trainer.py:803] 2025-04-26 14:59:56,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 14:59:57,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1539 [WARNING|trainer.py:803] 2025-04-26 14:59:58,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1532 1538 [WARNING|trainer.py:803] 2025-04-26 14:59:59,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:00,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1540 [WARNING|trainer.py:803] 2025-04-26 15:00:01,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1533 1539 [WARNING|trainer.py:803] 2025-04-26 15:00:03,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:00:04,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1541 [WARNING|trainer.py:803] 2025-04-26 15:00:05,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1534 1540 [WARNING|trainer.py:803] 2025-04-26 15:00:06,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:00:07,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1542 [WARNING|trainer.py:803] 2025-04-26 15:00:08,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1535 1541 [WARNING|trainer.py:803] 2025-04-26 15:00:10,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:10,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1543 [WARNING|trainer.py:803] 2025-04-26 15:00:11,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1542 1536 [WARNING|trainer.py:803] 2025-04-26 15:00:14,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:00:14,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1544 [WARNING|trainer.py:803] 2025-04-26 15:00:15,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1543 1537 [WARNING|trainer.py:803] 2025-04-26 15:00:17,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:17,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1545 [WARNING|trainer.py:803] 2025-04-26 15:00:18,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1544 1538 1546 [WARNING|trainer.py:803] 2025-04-26 15:00:21,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:21,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:22,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1545 1539 1547 [WARNING|trainer.py:803] 2025-04-26 15:00:24,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:24,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:25,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1546 1540 1548 [WARNING|trainer.py:803] 2025-04-26 15:00:28,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:28,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:28,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1541 1547 1549 [WARNING|trainer.py:803] 2025-04-26 15:00:31,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:31,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:00:31,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1548 1542 1550 [WARNING|trainer.py:803] 2025-04-26 15:00:34,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:35,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:00:35,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1549 1543 [WARNING|trainer.py:803] 2025-04-26 15:00:38,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1551 [WARNING|trainer.py:803] 2025-04-26 15:00:38,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:38,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1550 1544 1552 [WARNING|trainer.py:803] 2025-04-26 15:00:41,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:41,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:42,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1551 1553 1545 [WARNING|trainer.py:803] 2025-04-26 15:00:45,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:00:45,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:45,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1552 1546 1554 [WARNING|trainer.py:803] 2025-04-26 15:00:48,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:48,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:48,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1553 1547 1555 [WARNING|trainer.py:803] 2025-04-26 15:00:51,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:52,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:52,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1554 1548 1556 [WARNING|trainer.py:803] 2025-04-26 15:00:55,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:00:55,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:00:55,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1555 1549 1557 [WARNING|trainer.py:803] 2025-04-26 15:00:58,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:00:58,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:00:58,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1556 1558 1550 [WARNING|trainer.py:803] 2025-04-26 15:01:01,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:02,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:02,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1557 1559 1551 [WARNING|trainer.py:803] 2025-04-26 15:01:05,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:05,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:05,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1558 1560 1552 [WARNING|trainer.py:803] 2025-04-26 15:01:08,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:08,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:09,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1559 1561 1553 [WARNING|trainer.py:803] 2025-04-26 15:01:12,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:12,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:12,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1560 1554 1562 [WARNING|trainer.py:803] 2025-04-26 15:01:15,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:15,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:15,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1561 1555 1563 [WARNING|trainer.py:803] 2025-04-26 15:01:18,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:19,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:19,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1564 1562 1556 [WARNING|trainer.py:803] 2025-04-26 15:01:22,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:01:22,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:22,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1565 1563 1557 [WARNING|trainer.py:803] 2025-04-26 15:01:25,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:25,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:25,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1564 1566 1558 [WARNING|trainer.py:803] 2025-04-26 15:01:28,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:01:29,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:29,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1565 1567 1559 [WARNING|trainer.py:803] 2025-04-26 15:01:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:32,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:32,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1566 1568 1560 [WARNING|trainer.py:803] 2025-04-26 15:01:35,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:36,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:36,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1567 1561 1569 [WARNING|trainer.py:803] 2025-04-26 15:01:39,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:39,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:39,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1568 1570 1562 [WARNING|trainer.py:803] 2025-04-26 15:01:42,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:43,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:43,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1569 1571 1563 [WARNING|trainer.py:803] 2025-04-26 15:01:46,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:01:46,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:01:46,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1570 1564 1572 [WARNING|trainer.py:803] 2025-04-26 15:01:49,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:50,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:01:50,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1571 1573 1565 [WARNING|trainer.py:803] 2025-04-26 15:01:53,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:01:53,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:01:53,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1572 1574 1566 [WARNING|trainer.py:803] 2025-04-26 15:01:56,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:01:56,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:01:57,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1573 1575 1567 [WARNING|trainer.py:803] 2025-04-26 15:02:00,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:02:00,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:00,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1574 1576 [WARNING|trainer.py:803] 2025-04-26 15:02:03,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1568 [WARNING|trainer.py:803] 2025-04-26 15:02:03,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:02:04,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1575 1577 [WARNING|trainer.py:803] 2025-04-26 15:02:06,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:06,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1569 [WARNING|trainer.py:803] 2025-04-26 15:02:07,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1576 1578 [WARNING|trainer.py:803] 2025-04-26 15:02:10,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:02:10,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1570 [WARNING|trainer.py:803] 2025-04-26 15:02:11,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1579 1577 [WARNING|trainer.py:803] 2025-04-26 15:02:13,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:02:13,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1571 [WARNING|trainer.py:803] 2025-04-26 15:02:14,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1578 1580 [WARNING|trainer.py:803] 2025-04-26 15:02:16,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:17,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1572 [WARNING|trainer.py:803] 2025-04-26 15:02:18,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1579 1581 [WARNING|trainer.py:803] 2025-04-26 15:02:20,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:02:20,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1573 [WARNING|trainer.py:803] 2025-04-26 15:02:21,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1582 1580 [WARNING|trainer.py:803] 2025-04-26 15:02:23,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:02:23,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1574 [WARNING|trainer.py:803] 2025-04-26 15:02:25,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1581 1583 [WARNING|trainer.py:803] 2025-04-26 15:02:26,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:27,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1575 [WARNING|trainer.py:803] 2025-04-26 15:02:28,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1582 1584 [WARNING|trainer.py:803] 2025-04-26 15:02:30,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:02:31,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1576 [WARNING|trainer.py:803] 2025-04-26 15:02:32,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1583 1585 [WARNING|trainer.py:803] 2025-04-26 15:02:34,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:34,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1577 [WARNING|trainer.py:803] 2025-04-26 15:02:35,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1584 1586 [WARNING|trainer.py:803] 2025-04-26 15:02:37,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:02:37,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1578 [WARNING|trainer.py:803] 2025-04-26 15:02:38,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1585 1587 [WARNING|trainer.py:803] 2025-04-26 15:02:40,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1579 [WARNING|trainer.py:803] 2025-04-26 15:02:41,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:02:41,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1588 1586 [WARNING|trainer.py:803] 2025-04-26 15:02:44,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:02:44,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1580 [WARNING|trainer.py:803] 2025-04-26 15:02:45,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1589 1587 [WARNING|trainer.py:803] 2025-04-26 15:02:47,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:02:48,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1581 [WARNING|trainer.py:803] 2025-04-26 15:02:49,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1590 1588 [WARNING|trainer.py:803] 2025-04-26 15:02:50,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:02:51,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1582 [WARNING|trainer.py:803] 2025-04-26 15:02:52,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1591 1589 [WARNING|trainer.py:803] 2025-04-26 15:02:54,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:02:54,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1583 [WARNING|trainer.py:803] 2025-04-26 15:02:56,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1590 1592 [WARNING|trainer.py:803] 2025-04-26 15:02:57,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:02:57,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1584 [WARNING|trainer.py:803] 2025-04-26 15:02:59,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1591 1593 [WARNING|trainer.py:803] 2025-04-26 15:03:01,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:03:01,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1585 [WARNING|trainer.py:803] 2025-04-26 15:03:03,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1594 1592 [WARNING|trainer.py:803] 2025-04-26 15:03:04,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:03:04,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1586 1593 [WARNING|trainer.py:803] 2025-04-26 15:03:07,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1595 [WARNING|trainer.py:803] 2025-04-26 15:03:07,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:08,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1587 1596 1594 [WARNING|trainer.py:803] 2025-04-26 15:03:10,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:03:11,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:11,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1588 1597 [WARNING|trainer.py:803] 2025-04-26 15:03:13,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1595 [WARNING|trainer.py:803] 2025-04-26 15:03:14,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:15,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1589 [WARNING|trainer.py:803] 2025-04-26 15:03:17,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1596 1598 [WARNING|trainer.py:803] 2025-04-26 15:03:18,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:18,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1590 [WARNING|trainer.py:803] 2025-04-26 15:03:20,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1597 1599 [WARNING|trainer.py:803] 2025-04-26 15:03:21,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:21,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1591 1600 [WARNING|trainer.py:803] 2025-04-26 15:03:23,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1598 [WARNING|trainer.py:803] 2025-04-26 15:03:24,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:24,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1592 1599 [WARNING|trainer.py:803] 2025-04-26 15:03:27,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1601 [WARNING|trainer.py:803] 2025-04-26 15:03:28,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:28,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1593 1600 1602 [WARNING|trainer.py:803] 2025-04-26 15:03:31,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:31,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:31,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1594 1603 1601 [WARNING|trainer.py:803] 2025-04-26 15:03:34,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:03:35,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:03:35,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1595 1604 1602 [WARNING|trainer.py:803] 2025-04-26 15:03:38,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:03:38,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:38,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1605 1596 1603 [WARNING|trainer.py:803] 2025-04-26 15:03:41,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:41,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:41,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1597 1606 1604 [WARNING|trainer.py:803] 2025-04-26 15:03:44,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:45,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:03:45,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1598 1605 1607 [WARNING|trainer.py:803] 2025-04-26 15:03:48,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:48,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:48,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1599 1606 [WARNING|trainer.py:803] 2025-04-26 15:03:51,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1608 [WARNING|trainer.py:803] 2025-04-26 15:03:51,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:03:52,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1600 1609 1607 [WARNING|trainer.py:803] 2025-04-26 15:03:55,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:03:55,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:55,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1601 1610 1608 [WARNING|trainer.py:803] 2025-04-26 15:03:58,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:03:59,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:03:59,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1602 1609 1611 [WARNING|trainer.py:803] 2025-04-26 15:04:02,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:04:02,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:03,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1603 1612 [WARNING|trainer.py:803] 2025-04-26 15:04:05,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1610 [WARNING|trainer.py:803] 2025-04-26 15:04:05,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:04:06,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1604 1613 1611 [WARNING|trainer.py:803] 2025-04-26 15:04:09,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:09,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:10,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1605 1614 1612 [WARNING|trainer.py:803] 2025-04-26 15:04:12,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:12,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:04:12,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1606 1613 1615 [WARNING|trainer.py:803] 2025-04-26 15:04:16,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:04:16,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:16,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1614 1607 1616 [WARNING|trainer.py:803] 2025-04-26 15:04:19,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:04:19,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:04:20,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1617 1615 1608 [WARNING|trainer.py:803] 2025-04-26 15:04:23,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:23,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:23,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1609 1618 1616 [WARNING|trainer.py:803] 2025-04-26 15:04:27,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:27,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:27,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1619 1617 1610 [WARNING|trainer.py:803] 2025-04-26 15:04:30,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:30,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:30,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1618 1611 1620 [WARNING|trainer.py:803] 2025-04-26 15:04:34,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:34,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:04:34,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1612 1619 1621 [WARNING|trainer.py:803] 2025-04-26 15:04:37,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:37,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:04:38,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1613 1622 1620 [WARNING|trainer.py:803] 2025-04-26 15:04:41,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:41,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:41,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1614 1623 1621 [WARNING|trainer.py:803] 2025-04-26 15:04:44,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:04:44,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:04:45,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1624 1615 1622 [WARNING|trainer.py:803] 2025-04-26 15:04:47,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:48,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:48,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1625 1623 1616 [WARNING|trainer.py:803] 2025-04-26 15:04:51,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:52,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:04:52,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1626 1624 1617 [WARNING|trainer.py:803] 2025-04-26 15:04:55,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:55,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:04:55,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1618 1627 1625 [WARNING|trainer.py:803] 2025-04-26 15:04:59,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:59,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:04:59,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1628 1619 1626 [WARNING|trainer.py:803] 2025-04-26 15:05:02,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:05:02,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:05:02,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1629 [WARNING|trainer.py:803] 2025-04-26 15:05:05,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1620 1627 [WARNING|trainer.py:803] 2025-04-26 15:05:06,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:05:06,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1630 1628 [WARNING|trainer.py:803] 2025-04-26 15:05:08,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:05:09,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1621 [WARNING|trainer.py:803] 2025-04-26 15:05:10,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1631 1629 [WARNING|trainer.py:803] 2025-04-26 15:05:12,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:05:12,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1622 [WARNING|trainer.py:803] 2025-04-26 15:05:13,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1632 1630 [WARNING|trainer.py:803] 2025-04-26 15:05:15,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:05:16,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1623 [WARNING|trainer.py:803] 2025-04-26 15:05:17,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1633 1631 [WARNING|trainer.py:803] 2025-04-26 15:05:19,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:05:19,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1624 [WARNING|trainer.py:803] 2025-04-26 15:05:20,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1634 1632 [WARNING|trainer.py:803] 2025-04-26 15:05:22,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:05:23,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1625 [WARNING|trainer.py:803] 2025-04-26 15:05:24,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1635 1633 [WARNING|trainer.py:803] 2025-04-26 15:05:25,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:05:26,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1626 [WARNING|trainer.py:803] 2025-04-26 15:05:27,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1636 1634 [WARNING|trainer.py:803] 2025-04-26 15:05:29,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:05:29,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1627 [WARNING|trainer.py:803] 2025-04-26 15:05:31,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1637 1635 [WARNING|trainer.py:803] 2025-04-26 15:05:32,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:05:33,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1628 [WARNING|trainer.py:803] 2025-04-26 15:05:34,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1638 [WARNING|trainer.py:803] 2025-04-26 15:05:35,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1636 [WARNING|trainer.py:803] 2025-04-26 15:05:36,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1629 [WARNING|trainer.py:803] 2025-04-26 15:05:37,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1639 [WARNING|trainer.py:803] 2025-04-26 15:05:39,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1637 [WARNING|trainer.py:803] 2025-04-26 15:05:40,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1630 [WARNING|trainer.py:803] 2025-04-26 15:05:41,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1640 [WARNING|trainer.py:803] 2025-04-26 15:05:42,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1638 [WARNING|trainer.py:803] 2025-04-26 15:05:43,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1631 [WARNING|trainer.py:803] 2025-04-26 15:05:44,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1641 [WARNING|trainer.py:803] 2025-04-26 15:05:45,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1639 [WARNING|trainer.py:803] 2025-04-26 15:05:46,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1632 [WARNING|trainer.py:803] 2025-04-26 15:05:48,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1642 [WARNING|trainer.py:803] 2025-04-26 15:05:49,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1640 [WARNING|trainer.py:803] 2025-04-26 15:05:50,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1633 [WARNING|trainer.py:803] 2025-04-26 15:05:51,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1643 [WARNING|trainer.py:803] 2025-04-26 15:05:52,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1641 [WARNING|trainer.py:803] 2025-04-26 15:05:53,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1634 [WARNING|trainer.py:803] 2025-04-26 15:05:54,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1644 [WARNING|trainer.py:803] 2025-04-26 15:05:55,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1642 [WARNING|trainer.py:803] 2025-04-26 15:05:57,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1635 [WARNING|trainer.py:803] 2025-04-26 15:05:58,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1645 [WARNING|trainer.py:803] 2025-04-26 15:05:59,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1643 [WARNING|trainer.py:803] 2025-04-26 15:06:00,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1636 1646 [WARNING|trainer.py:803] 2025-04-26 15:06:01,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:06:02,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1644 [WARNING|trainer.py:803] 2025-04-26 15:06:03,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1637 1647 [WARNING|trainer.py:803] 2025-04-26 15:06:05,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:06:05,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1645 [WARNING|trainer.py:803] 2025-04-26 15:06:07,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1638 1648 [WARNING|trainer.py:803] 2025-04-26 15:06:08,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:09,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1646 [WARNING|trainer.py:803] 2025-04-26 15:06:10,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1649 1639 [WARNING|trainer.py:803] 2025-04-26 15:06:11,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:12,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1647 1650 [WARNING|trainer.py:803] 2025-04-26 15:06:13,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1640 [WARNING|trainer.py:803] 2025-04-26 15:06:14,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:06:15,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1648 1651 [WARNING|trainer.py:803] 2025-04-26 15:06:17,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:17,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1641 [WARNING|trainer.py:803] 2025-04-26 15:06:19,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1649 1652 [WARNING|trainer.py:803] 2025-04-26 15:06:20,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:20,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1642 1650 [WARNING|trainer.py:803] 2025-04-26 15:06:22,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1653 [WARNING|trainer.py:803] 2025-04-26 15:06:22,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:06:23,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1643 1654 1651 [WARNING|trainer.py:803] 2025-04-26 15:06:25,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:25,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:25,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1655 1652 1644 [WARNING|trainer.py:803] 2025-04-26 15:06:28,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:28,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:29,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1656 1653 [WARNING|trainer.py:803] 2025-04-26 15:06:31,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:31,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1645 [WARNING|trainer.py:803] 2025-04-26 15:06:32,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1654 1657 [WARNING|trainer.py:803] 2025-04-26 15:06:34,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:34,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1646 [WARNING|trainer.py:803] 2025-04-26 15:06:35,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1655 1658 [WARNING|trainer.py:803] 2025-04-26 15:06:37,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:37,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1647 1656 1659 [WARNING|trainer.py:803] 2025-04-26 15:06:39,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:06:39,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:40,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1657 1648 1660 [WARNING|trainer.py:803] 2025-04-26 15:06:42,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:06:42,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:42,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1649 1661 1658 [WARNING|trainer.py:803] 2025-04-26 15:06:45,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:45,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:06:45,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1662 1659 1650 [WARNING|trainer.py:803] 2025-04-26 15:06:48,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:06:48,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:06:48,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1663 1660 1651 [WARNING|trainer.py:803] 2025-04-26 15:06:51,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:06:51,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:51,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1664 1661 1652 [WARNING|trainer.py:803] 2025-04-26 15:06:54,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:06:54,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:06:54,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1665 1662 [WARNING|trainer.py:803] 2025-04-26 15:06:56,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1653 [WARNING|trainer.py:803] 2025-04-26 15:06:57,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:06:57,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1666 1663 [WARNING|trainer.py:803] 2025-04-26 15:06:59,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1654 [WARNING|trainer.py:803] 2025-04-26 15:06:59,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:00,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1667 1664 [WARNING|trainer.py:803] 2025-04-26 15:07:02,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1655 [WARNING|trainer.py:803] 2025-04-26 15:07:02,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:07:03,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1668 1665 [WARNING|trainer.py:803] 2025-04-26 15:07:04,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1656 [WARNING|trainer.py:803] 2025-04-26 15:07:05,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:05,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1669 1666 1657 [WARNING|trainer.py:803] 2025-04-26 15:07:08,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:08,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:08,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1670 1667 [WARNING|trainer.py:803] 2025-04-26 15:07:10,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1658 [WARNING|trainer.py:803] 2025-04-26 15:07:10,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:11,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1671 1668 [WARNING|trainer.py:803] 2025-04-26 15:07:13,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1659 [WARNING|trainer.py:803] 2025-04-26 15:07:13,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:14,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1672 [WARNING|trainer.py:803] 2025-04-26 15:07:15,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1669 1660 [WARNING|trainer.py:803] 2025-04-26 15:07:16,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:17,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1673 1670 [WARNING|trainer.py:803] 2025-04-26 15:07:18,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:07:19,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1661 [WARNING|trainer.py:803] 2025-04-26 15:07:20,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1674 1671 [WARNING|trainer.py:803] 2025-04-26 15:07:21,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:07:21,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1662 [WARNING|trainer.py:803] 2025-04-26 15:07:22,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1675 1672 [WARNING|trainer.py:803] 2025-04-26 15:07:23,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:24,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1663 [WARNING|trainer.py:803] 2025-04-26 15:07:25,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1676 1673 [WARNING|trainer.py:803] 2025-04-26 15:07:26,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:07:26,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1664 1677 [WARNING|trainer.py:803] 2025-04-26 15:07:28,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1674 [WARNING|trainer.py:803] 2025-04-26 15:07:29,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:29,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1665 1678 [WARNING|trainer.py:803] 2025-04-26 15:07:31,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1675 [WARNING|trainer.py:803] 2025-04-26 15:07:32,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:32,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1666 [WARNING|trainer.py:803] 2025-04-26 15:07:33,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1676 1679 [WARNING|trainer.py:803] 2025-04-26 15:07:34,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:07:34,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1667 1677 [WARNING|trainer.py:803] 2025-04-26 15:07:36,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1680 [WARNING|trainer.py:803] 2025-04-26 15:07:37,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:37,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1668 1678 1681 [WARNING|trainer.py:803] 2025-04-26 15:07:39,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:40,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:40,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1669 1682 1679 [WARNING|trainer.py:803] 2025-04-26 15:07:42,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:07:43,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:43,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1670 1683 1680 [WARNING|trainer.py:803] 2025-04-26 15:07:45,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:45,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:45,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1671 1684 1681 [WARNING|trainer.py:803] 2025-04-26 15:07:48,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:48,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:48,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1672 1682 [WARNING|trainer.py:803] 2025-04-26 15:07:50,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1685 [WARNING|trainer.py:803] 2025-04-26 15:07:51,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:51,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1673 1683 [WARNING|trainer.py:803] 2025-04-26 15:07:53,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1686 [WARNING|trainer.py:803] 2025-04-26 15:07:54,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:07:54,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1674 [WARNING|trainer.py:803] 2025-04-26 15:07:56,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1684 1687 [WARNING|trainer.py:803] 2025-04-26 15:07:57,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:07:57,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1675 [WARNING|trainer.py:803] 2025-04-26 15:07:58,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1685 1688 [WARNING|trainer.py:803] 2025-04-26 15:08:00,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:08:00,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1676 [WARNING|trainer.py:803] 2025-04-26 15:08:01,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1686 1689 [WARNING|trainer.py:803] 2025-04-26 15:08:02,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:08:02,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1677 [WARNING|trainer.py:803] 2025-04-26 15:08:04,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1687 1690 [WARNING|trainer.py:803] 2025-04-26 15:08:05,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:08:05,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1678 [WARNING|trainer.py:803] 2025-04-26 15:08:07,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1688 1691 [WARNING|trainer.py:803] 2025-04-26 15:08:08,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:08:08,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1679 [WARNING|trainer.py:803] 2025-04-26 15:08:09,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1689 1692 [WARNING|trainer.py:803] 2025-04-26 15:08:11,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:11,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1680 [WARNING|trainer.py:803] 2025-04-26 15:08:12,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1690 1693 [WARNING|trainer.py:803] 2025-04-26 15:08:14,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:14,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1681 [WARNING|trainer.py:803] 2025-04-26 15:08:15,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1691 1694 [WARNING|trainer.py:803] 2025-04-26 15:08:16,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:08:17,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1682 [WARNING|trainer.py:803] 2025-04-26 15:08:18,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1692 1695 [WARNING|trainer.py:803] 2025-04-26 15:08:19,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:08:20,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1683 [WARNING|trainer.py:803] 2025-04-26 15:08:21,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1693 1696 [WARNING|trainer.py:803] 2025-04-26 15:08:22,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:08:22,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1684 [WARNING|trainer.py:803] 2025-04-26 15:08:23,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1694 1697 [WARNING|trainer.py:803] 2025-04-26 15:08:25,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:25,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1685 [WARNING|trainer.py:803] 2025-04-26 15:08:27,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1695 1698 [WARNING|trainer.py:803] 2025-04-26 15:08:28,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:28,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1686 [WARNING|trainer.py:803] 2025-04-26 15:08:29,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1699 1696 [WARNING|trainer.py:803] 2025-04-26 15:08:31,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:08:31,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1687 [WARNING|trainer.py:803] 2025-04-26 15:08:32,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1700 1697 [WARNING|trainer.py:803] 2025-04-26 15:08:34,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:34,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1688 [WARNING|trainer.py:803] 2025-04-26 15:08:35,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1701 1698 [WARNING|trainer.py:803] 2025-04-26 15:08:36,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:37,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1689 [WARNING|trainer.py:803] 2025-04-26 15:08:38,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1702 1699 [WARNING|trainer.py:803] 2025-04-26 15:08:39,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:40,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1690 [WARNING|trainer.py:803] 2025-04-26 15:08:41,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1703 1700 [WARNING|trainer.py:803] 2025-04-26 15:08:42,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:42,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1691 [WARNING|trainer.py:803] 2025-04-26 15:08:44,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1704 1701 [WARNING|trainer.py:803] 2025-04-26 15:08:45,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:45,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1692 [WARNING|trainer.py:803] 2025-04-26 15:08:47,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 1705 Yes 1702 [WARNING|trainer.py:803] 2025-04-26 15:08:47,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:48,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1693 1706 [WARNING|trainer.py:803] 2025-04-26 15:08:50,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1703 [WARNING|trainer.py:803] 2025-04-26 15:08:50,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:51,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1694 1707 [WARNING|trainer.py:803] 2025-04-26 15:08:52,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1704 [WARNING|trainer.py:803] 2025-04-26 15:08:53,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:53,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1695 1708 1705 [WARNING|trainer.py:803] 2025-04-26 15:08:55,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:56,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:08:56,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1696 1709 1706 [WARNING|trainer.py:803] 2025-04-26 15:08:58,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:08:58,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:08:59,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1697 1710 1707 [WARNING|trainer.py:803] 2025-04-26 15:09:01,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:09:01,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:02,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1711 1698 1708 [WARNING|trainer.py:803] 2025-04-26 15:09:04,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:04,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:09:04,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1712 1699 1709 [WARNING|trainer.py:803] 2025-04-26 15:09:07,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:07,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:09:07,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1713 1710 1700 [WARNING|trainer.py:803] 2025-04-26 15:09:09,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:10,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:10,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1714 1711 1701 [WARNING|trainer.py:803] 2025-04-26 15:09:12,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:12,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:13,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1715 1712 1702 [WARNING|trainer.py:803] 2025-04-26 15:09:15,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:15,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:16,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1716 1713 [WARNING|trainer.py:803] 2025-04-26 15:09:17,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:18,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1703 [WARNING|trainer.py:803] 2025-04-26 15:09:19,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1717 1714 [WARNING|trainer.py:803] 2025-04-26 15:09:20,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1704 [WARNING|trainer.py:803] 2025-04-26 15:09:21,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:21,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1718 1715 [WARNING|trainer.py:803] 2025-04-26 15:09:22,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1705 [WARNING|trainer.py:803] 2025-04-26 15:09:23,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:24,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1719 1716 [WARNING|trainer.py:803] 2025-04-26 15:09:25,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1706 [WARNING|trainer.py:803] 2025-04-26 15:09:26,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:26,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1720 1717 [WARNING|trainer.py:803] 2025-04-26 15:09:28,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:28,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1707 [WARNING|trainer.py:803] 2025-04-26 15:09:29,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1721 1718 [WARNING|trainer.py:803] 2025-04-26 15:09:30,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:09:31,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1708 [WARNING|trainer.py:803] 2025-04-26 15:09:32,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1722 1719 [WARNING|trainer.py:803] 2025-04-26 15:09:33,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:09:33,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1709 [WARNING|trainer.py:803] 2025-04-26 15:09:35,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1723 1720 [WARNING|trainer.py:803] 2025-04-26 15:09:36,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:09:36,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1710 [WARNING|trainer.py:803] 2025-04-26 15:09:37,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1724 1721 [WARNING|trainer.py:803] 2025-04-26 15:09:39,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:09:39,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1711 [WARNING|trainer.py:803] 2025-04-26 15:09:40,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1725 1722 [WARNING|trainer.py:803] 2025-04-26 15:09:41,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:41,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1712 [WARNING|trainer.py:803] 2025-04-26 15:09:43,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1726 1723 [WARNING|trainer.py:803] 2025-04-26 15:09:44,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:44,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1713 [WARNING|trainer.py:803] 2025-04-26 15:09:45,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1727 1724 [WARNING|trainer.py:803] 2025-04-26 15:09:47,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:09:47,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1714 [WARNING|trainer.py:803] 2025-04-26 15:09:48,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1728 1725 [WARNING|trainer.py:803] 2025-04-26 15:09:49,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:09:50,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1715 [WARNING|trainer.py:803] 2025-04-26 15:09:51,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1729 1726 [WARNING|trainer.py:803] 2025-04-26 15:09:52,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:52,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1716 1730 [WARNING|trainer.py:803] 2025-04-26 15:09:53,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1727 [WARNING|trainer.py:803] 2025-04-26 15:09:54,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:55,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1717 1731 [WARNING|trainer.py:803] 2025-04-26 15:09:56,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1728 [WARNING|trainer.py:803] 2025-04-26 15:09:57,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:09:57,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1718 1732 [WARNING|trainer.py:803] 2025-04-26 15:09:59,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1729 [WARNING|trainer.py:803] 2025-04-26 15:10:00,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:00,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1719 1730 1733 [WARNING|trainer.py:803] 2025-04-26 15:10:02,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:02,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:02,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1720 1731 1734 [WARNING|trainer.py:803] 2025-04-26 15:10:04,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:10:05,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:05,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1721 1732 1735 [WARNING|trainer.py:803] 2025-04-26 15:10:07,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:07,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:08,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1722 1733 1736 [WARNING|trainer.py:803] 2025-04-26 15:10:10,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:10,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:10,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1723 1734 1737 [WARNING|trainer.py:803] 2025-04-26 15:10:13,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:13,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:13,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1724 1735 1738 [WARNING|trainer.py:803] 2025-04-26 15:10:16,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:10:16,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:16,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1725 1739 1736 [WARNING|trainer.py:803] 2025-04-26 15:10:19,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:19,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:19,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1726 1740 1737 [WARNING|trainer.py:803] 2025-04-26 15:10:21,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:21,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:22,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1727 1741 1738 [WARNING|trainer.py:803] 2025-04-26 15:10:24,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:24,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:24,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1728 1742 1739 [WARNING|trainer.py:803] 2025-04-26 15:10:27,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:10:27,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:27,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1743 1729 1740 [WARNING|trainer.py:803] 2025-04-26 15:10:30,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:30,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:30,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1730 1744 1741 [WARNING|trainer.py:803] 2025-04-26 15:10:32,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:32,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:32,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1731 1745 1742 [WARNING|trainer.py:803] 2025-04-26 15:10:35,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:35,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:35,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1732 1743 1746 [WARNING|trainer.py:803] 2025-04-26 15:10:38,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:38,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:38,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1733 1747 1744 [WARNING|trainer.py:803] 2025-04-26 15:10:41,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:41,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:41,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1745 1734 1748 [WARNING|trainer.py:803] 2025-04-26 15:10:43,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:44,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:44,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1735 1746 1749 [WARNING|trainer.py:803] 2025-04-26 15:10:46,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:46,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:47,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1747 1736 1750 [WARNING|trainer.py:803] 2025-04-26 15:10:49,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:49,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:10:50,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1737 1748 1751 [WARNING|trainer.py:803] 2025-04-26 15:10:52,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:52,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:52,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1738 1749 1752 [WARNING|trainer.py:803] 2025-04-26 15:10:55,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:55,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:55,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1739 1753 1750 [WARNING|trainer.py:803] 2025-04-26 15:10:58,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:10:58,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:10:58,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1754 1740 1751 [WARNING|trainer.py:803] 2025-04-26 15:11:00,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:00,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:11:01,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1755 1741 1752 [WARNING|trainer.py:803] 2025-04-26 15:11:03,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:03,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:03,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1756 1753 1742 [WARNING|trainer.py:803] 2025-04-26 15:11:06,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:11:06,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:11:06,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1757 1754 1743 [WARNING|trainer.py:803] 2025-04-26 15:11:09,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:09,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:09,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1755 1744 1758 [WARNING|trainer.py:803] 2025-04-26 15:11:11,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:12,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:11:12,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1745 1756 1759 [WARNING|trainer.py:803] 2025-04-26 15:11:14,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:11:14,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:11:15,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1760 1757 1746 [WARNING|trainer.py:803] 2025-04-26 15:11:17,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:17,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:17,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1761 1747 1758 [WARNING|trainer.py:803] 2025-04-26 15:11:20,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:20,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:20,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1762 1759 [WARNING|trainer.py:803] 2025-04-26 15:11:22,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1748 [WARNING|trainer.py:803] 2025-04-26 15:11:23,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:11:23,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1763 1760 [WARNING|trainer.py:803] 2025-04-26 15:11:25,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1749 [WARNING|trainer.py:803] 2025-04-26 15:11:25,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:26,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1764 1761 [WARNING|trainer.py:803] 2025-04-26 15:11:27,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:28,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1750 [WARNING|trainer.py:803] 2025-04-26 15:11:29,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1765 1762 [WARNING|trainer.py:803] 2025-04-26 15:11:30,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:30,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1751 [WARNING|trainer.py:803] 2025-04-26 15:11:32,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1766 1763 [WARNING|trainer.py:803] 2025-04-26 15:11:33,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:11:33,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1752 1767 [WARNING|trainer.py:803] 2025-04-26 15:11:35,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1764 [WARNING|trainer.py:803] 2025-04-26 15:11:35,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:11:36,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1753 1768 [WARNING|trainer.py:803] 2025-04-26 15:11:37,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1765 [WARNING|trainer.py:803] 2025-04-26 15:11:38,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:38,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1754 1769 [WARNING|trainer.py:803] 2025-04-26 15:11:40,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1766 [WARNING|trainer.py:803] 2025-04-26 15:11:40,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:11:41,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1755 1770 1767 [WARNING|trainer.py:803] 2025-04-26 15:11:43,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:43,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:43,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1756 1771 1768 [WARNING|trainer.py:803] 2025-04-26 15:11:45,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:11:46,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:46,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1757 1772 1769 [WARNING|trainer.py:803] 2025-04-26 15:11:48,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:49,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:11:49,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1773 1770 1758 [WARNING|trainer.py:803] 2025-04-26 15:11:51,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:11:51,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:52,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1774 1759 1771 [WARNING|trainer.py:803] 2025-04-26 15:11:54,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:11:54,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:11:54,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1760 1772 1775 [WARNING|trainer.py:803] 2025-04-26 15:11:57,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:11:57,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:11:57,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1761 1773 1776 [WARNING|trainer.py:803] 2025-04-26 15:12:00,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:00,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:00,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1762 1777 1774 [WARNING|trainer.py:803] 2025-04-26 15:12:02,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:03,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:03,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1778 1763 1775 [WARNING|trainer.py:803] 2025-04-26 15:12:05,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:05,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:06,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1779 1764 [WARNING|trainer.py:803] 2025-04-26 15:12:08,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1776 [WARNING|trainer.py:803] 2025-04-26 15:12:08,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:09,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1780 1765 1777 [WARNING|trainer.py:803] 2025-04-26 15:12:10,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:11,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:11,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1781 1766 1778 [WARNING|trainer.py:803] 2025-04-26 15:12:13,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:13,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:14,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1782 1767 1779 [WARNING|trainer.py:803] 2025-04-26 15:12:16,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:16,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:16,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1768 1780 1783 [WARNING|trainer.py:803] 2025-04-26 15:12:19,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:19,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:19,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1784 1781 1769 [WARNING|trainer.py:803] 2025-04-26 15:12:21,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:22,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:22,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1785 1782 1770 [WARNING|trainer.py:803] 2025-04-26 15:12:24,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:24,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:25,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1771 1786 1783 [WARNING|trainer.py:803] 2025-04-26 15:12:27,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:27,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:27,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1772 1784 1787 [WARNING|trainer.py:803] 2025-04-26 15:12:30,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:30,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:30,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1773 1788 1785 [WARNING|trainer.py:803] 2025-04-26 15:12:33,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:12:33,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:33,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1789 1774 1786 [WARNING|trainer.py:803] 2025-04-26 15:12:36,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:36,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:36,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1790 1775 1787 [WARNING|trainer.py:803] 2025-04-26 15:12:39,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:12:39,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:39,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1791 1788 [WARNING|trainer.py:803] 2025-04-26 15:12:41,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1776 [WARNING|trainer.py:803] 2025-04-26 15:12:42,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:42,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1792 1789 [WARNING|trainer.py:803] 2025-04-26 15:12:44,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1777 [WARNING|trainer.py:803] 2025-04-26 15:12:45,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:45,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1793 [WARNING|trainer.py:803] 2025-04-26 15:12:46,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1778 1790 [WARNING|trainer.py:803] 2025-04-26 15:12:47,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:47,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1794 [WARNING|trainer.py:803] 2025-04-26 15:12:49,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1779 1791 [WARNING|trainer.py:803] 2025-04-26 15:12:50,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:12:50,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1795 [WARNING|trainer.py:803] 2025-04-26 15:12:51,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1792 1780 [WARNING|trainer.py:803] 2025-04-26 15:12:53,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:53,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1796 [WARNING|trainer.py:803] 2025-04-26 15:12:54,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1793 1781 [WARNING|trainer.py:803] 2025-04-26 15:12:56,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:56,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1797 [WARNING|trainer.py:803] 2025-04-26 15:12:57,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1794 1782 [WARNING|trainer.py:803] 2025-04-26 15:12:58,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:12:58,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1798 [WARNING|trainer.py:803] 2025-04-26 15:12:59,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1795 1783 [WARNING|trainer.py:803] 2025-04-26 15:13:01,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1799 [WARNING|trainer.py:803] 2025-04-26 15:13:01,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:13:02,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1796 1784 [WARNING|trainer.py:803] 2025-04-26 15:13:04,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1800 [WARNING|trainer.py:803] 2025-04-26 15:13:04,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:05,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1797 1801 1785 [WARNING|trainer.py:803] 2025-04-26 15:13:06,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:07,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:07,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1798 1802 1786 [WARNING|trainer.py:803] 2025-04-26 15:13:09,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:09,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:13:10,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1803 1799 [WARNING|trainer.py:803] 2025-04-26 15:13:11,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:12,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1787 [WARNING|trainer.py:803] 2025-04-26 15:13:13,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1804 1800 [WARNING|trainer.py:803] 2025-04-26 15:13:14,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:14,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1788 [WARNING|trainer.py:803] 2025-04-26 15:13:15,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1805 1801 [WARNING|trainer.py:803] 2025-04-26 15:13:16,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:16,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1789 1806 1802 [WARNING|trainer.py:803] 2025-04-26 15:13:18,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:18,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:19,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1807 1790 [WARNING|trainer.py:803] 2025-04-26 15:13:20,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1803 [WARNING|trainer.py:803] 2025-04-26 15:13:21,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:21,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1808 [WARNING|trainer.py:803] 2025-04-26 15:13:23,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1791 1804 [WARNING|trainer.py:803] 2025-04-26 15:13:23,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:24,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1809 [WARNING|trainer.py:803] 2025-04-26 15:13:25,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1805 1792 [WARNING|trainer.py:803] 2025-04-26 15:13:26,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1810 [WARNING|trainer.py:803] 2025-04-26 15:13:26,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:27,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1806 [WARNING|trainer.py:803] 2025-04-26 15:13:28,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1793 1811 [WARNING|trainer.py:803] 2025-04-26 15:13:29,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:29,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1807 [WARNING|trainer.py:803] 2025-04-26 15:13:30,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1812 1794 [WARNING|trainer.py:803] 2025-04-26 15:13:31,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1808 [WARNING|trainer.py:803] 2025-04-26 15:13:31,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:32,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1813 [WARNING|trainer.py:803] 2025-04-26 15:13:33,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1795 1809 [WARNING|trainer.py:803] 2025-04-26 15:13:34,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1814 [WARNING|trainer.py:803] 2025-04-26 15:13:34,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:35,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1810 1796 1815 [WARNING|trainer.py:803] 2025-04-26 15:13:36,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:37,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:13:37,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1811 [WARNING|trainer.py:803] 2025-04-26 15:13:38,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1797 1816 [WARNING|trainer.py:803] 2025-04-26 15:13:39,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:40,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1812 [WARNING|trainer.py:803] 2025-04-26 15:13:41,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1817 1798 [WARNING|trainer.py:803] 2025-04-26 15:13:42,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1813 [WARNING|trainer.py:803] 2025-04-26 15:13:42,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:43,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1818 1799 1814 [WARNING|trainer.py:803] 2025-04-26 15:13:44,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:13:45,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:45,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1819 1815 [WARNING|trainer.py:803] 2025-04-26 15:13:46,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1800 [WARNING|trainer.py:803] 2025-04-26 15:13:47,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:47,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1820 [WARNING|trainer.py:803] 2025-04-26 15:13:49,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1816 1801 [WARNING|trainer.py:803] 2025-04-26 15:13:49,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:13:50,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1821 [WARNING|trainer.py:803] 2025-04-26 15:13:51,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1817 1802 [WARNING|trainer.py:803] 2025-04-26 15:13:51,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:13:52,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1822 1818 [WARNING|trainer.py:803] 2025-04-26 15:13:53,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1803 [WARNING|trainer.py:803] 2025-04-26 15:13:54,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:13:54,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1823 1819 [WARNING|trainer.py:803] 2025-04-26 15:13:56,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:13:56,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1804 [WARNING|trainer.py:803] 2025-04-26 15:13:57,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1820 1824 [WARNING|trainer.py:803] 2025-04-26 15:13:58,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:13:58,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1805 [WARNING|trainer.py:803] 2025-04-26 15:13:59,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1821 1825 [WARNING|trainer.py:803] 2025-04-26 15:14:00,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:00,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1806 [WARNING|trainer.py:803] 2025-04-26 15:14:01,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1822 1826 1807 [WARNING|trainer.py:803] 2025-04-26 15:14:02,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:14:02,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:03,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1827 1823 [WARNING|trainer.py:803] 2025-04-26 15:14:04,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1808 [WARNING|trainer.py:803] 2025-04-26 15:14:05,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:14:05,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1828 1824 [WARNING|trainer.py:803] 2025-04-26 15:14:07,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1809 [WARNING|trainer.py:803] 2025-04-26 15:14:07,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:08,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1829 1825 1810 [WARNING|trainer.py:803] 2025-04-26 15:14:09,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:09,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:10,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1830 1811 1826 [WARNING|trainer.py:803] 2025-04-26 15:14:12,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:12,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:12,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1827 1812 1831 [WARNING|trainer.py:803] 2025-04-26 15:14:14,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:14,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:14:14,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1813 1832 1828 [WARNING|trainer.py:803] 2025-04-26 15:14:16,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:16,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:16,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1814 1833 1829 [WARNING|trainer.py:803] 2025-04-26 15:14:18,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:18,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:14:18,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1815 1834 1830 [WARNING|trainer.py:803] 2025-04-26 15:14:20,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:20,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:21,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1835 1816 1831 [WARNING|trainer.py:803] 2025-04-26 15:14:23,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:23,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:23,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1836 1817 1832 [WARNING|trainer.py:803] 2025-04-26 15:14:25,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:25,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:25,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1837 1818 1833 [WARNING|trainer.py:803] 2025-04-26 15:14:27,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:27,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:14:27,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1838 1819 1834 [WARNING|trainer.py:803] 2025-04-26 15:14:29,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:29,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:14:30,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1839 1820 1835 [WARNING|trainer.py:803] 2025-04-26 15:14:31,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:14:32,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:32,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1840 1836 [WARNING|trainer.py:803] 2025-04-26 15:14:33,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1821 [WARNING|trainer.py:803] 2025-04-26 15:14:34,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:14:34,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1841 1837 [WARNING|trainer.py:803] 2025-04-26 15:14:35,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1822 [WARNING|trainer.py:803] 2025-04-26 15:14:36,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:36,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1842 1838 [WARNING|trainer.py:803] 2025-04-26 15:14:37,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1823 [WARNING|trainer.py:803] 2025-04-26 15:14:38,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:39,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1843 1839 [WARNING|trainer.py:803] 2025-04-26 15:14:40,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:40,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1824 1840 [WARNING|trainer.py:803] 2025-04-26 15:14:41,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1844 [WARNING|trainer.py:803] 2025-04-26 15:14:42,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:42,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1825 1841 [WARNING|trainer.py:803] 2025-04-26 15:14:43,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1845 [WARNING|trainer.py:803] 2025-04-26 15:14:44,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:44,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1826 [WARNING|trainer.py:803] 2025-04-26 15:14:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1842 1846 [WARNING|trainer.py:803] 2025-04-26 15:14:47,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:47,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1827 [WARNING|trainer.py:803] 2025-04-26 15:14:48,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1847 1843 [WARNING|trainer.py:803] 2025-04-26 15:14:49,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:14:49,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1828 1848 [WARNING|trainer.py:803] 2025-04-26 15:14:50,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1844 [WARNING|trainer.py:803] 2025-04-26 15:14:51,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1829 [WARNING|trainer.py:803] 2025-04-26 15:14:51,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1849 [WARNING|trainer.py:803] 2025-04-26 15:14:52,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1845 [WARNING|trainer.py:803] 2025-04-26 15:14:53,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:54,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1830 1850 [WARNING|trainer.py:803] 2025-04-26 15:14:55,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:14:55,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1846 [WARNING|trainer.py:803] 2025-04-26 15:14:56,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1831 1851 [WARNING|trainer.py:803] 2025-04-26 15:14:57,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1847 [WARNING|trainer.py:803] 2025-04-26 15:14:58,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:14:58,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1832 1852 [WARNING|trainer.py:803] 2025-04-26 15:14:59,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1848 [WARNING|trainer.py:803] 2025-04-26 15:15:00,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:15:00,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1833 [WARNING|trainer.py:803] 2025-04-26 15:15:01,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1853 1849 [WARNING|trainer.py:803] 2025-04-26 15:15:02,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:03,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1834 [WARNING|trainer.py:803] 2025-04-26 15:15:04,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1854 1850 [WARNING|trainer.py:803] 2025-04-26 15:15:05,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:05,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1835 [WARNING|trainer.py:803] 2025-04-26 15:15:06,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1855 1851 1836 [WARNING|trainer.py:803] 2025-04-26 15:15:07,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:07,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:08,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1856 1852 1837 [WARNING|trainer.py:803] 2025-04-26 15:15:09,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:15:10,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:15:10,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1857 1853 [WARNING|trainer.py:803] 2025-04-26 15:15:11,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1838 [WARNING|trainer.py:803] 2025-04-26 15:15:12,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:12,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1858 1854 1839 [WARNING|trainer.py:803] 2025-04-26 15:15:13,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:14,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:14,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1859 1840 1855 [WARNING|trainer.py:803] 2025-04-26 15:15:16,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:16,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:16,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1860 1841 1856 [WARNING|trainer.py:803] 2025-04-26 15:15:18,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:15:19,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:19,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1861 1857 [WARNING|trainer.py:803] 2025-04-26 15:15:20,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1842 [WARNING|trainer.py:803] 2025-04-26 15:15:21,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:15:21,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1862 1858 [WARNING|trainer.py:803] 2025-04-26 15:15:22,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1843 [WARNING|trainer.py:803] 2025-04-26 15:15:23,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:23,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1863 1859 [WARNING|trainer.py:803] 2025-04-26 15:15:24,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1844 [WARNING|trainer.py:803] 2025-04-26 15:15:25,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1864 [WARNING|trainer.py:803] 2025-04-26 15:15:26,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1860 [WARNING|trainer.py:803] 2025-04-26 15:15:27,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:15:27,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1845 1865 1861 [WARNING|trainer.py:803] 2025-04-26 15:15:28,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:29,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:29,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1846 1866 1862 [WARNING|trainer.py:803] 2025-04-26 15:15:31,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:31,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:31,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1847 1867 1863 [WARNING|trainer.py:803] 2025-04-26 15:15:33,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:15:33,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:15:33,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1848 1868 1864 [WARNING|trainer.py:803] 2025-04-26 15:15:35,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:15:35,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:36,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1849 1869 1865 [WARNING|trainer.py:803] 2025-04-26 15:15:37,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:37,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:38,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1850 1870 1866 [WARNING|trainer.py:803] 2025-04-26 15:15:40,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:40,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:40,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1871 1867 1851 [WARNING|trainer.py:803] 2025-04-26 15:15:42,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:15:42,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:15:42,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1872 1868 1852 [WARNING|trainer.py:803] 2025-04-26 15:15:44,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:44,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:45,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1873 1869 1853 [WARNING|trainer.py:803] 2025-04-26 15:15:46,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:46,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:47,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1874 1870 1854 [WARNING|trainer.py:803] 2025-04-26 15:15:48,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:49,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:15:49,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1875 1871 [WARNING|trainer.py:803] 2025-04-26 15:15:51,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1855 [WARNING|trainer.py:803] 2025-04-26 15:15:51,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:15:52,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1876 1872 [WARNING|trainer.py:803] 2025-04-26 15:15:53,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1856 [WARNING|trainer.py:803] 2025-04-26 15:15:53,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:54,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1877 1873 [WARNING|trainer.py:803] 2025-04-26 15:15:55,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1857 [WARNING|trainer.py:803] 2025-04-26 15:15:55,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:56,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1878 1874 [WARNING|trainer.py:803] 2025-04-26 15:15:57,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1858 [WARNING|trainer.py:803] 2025-04-26 15:15:57,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:15:58,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1879 1875 [WARNING|trainer.py:803] 2025-04-26 15:15:59,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1859 [WARNING|trainer.py:803] 2025-04-26 15:16:00,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:16:00,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1880 1876 [WARNING|trainer.py:803] 2025-04-26 15:16:01,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:02,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1860 1881 [WARNING|trainer.py:803] 2025-04-26 15:16:03,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1877 [WARNING|trainer.py:803] 2025-04-26 15:16:03,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1861 [WARNING|trainer.py:803] 2025-04-26 15:16:04,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1882 [WARNING|trainer.py:803] 2025-04-26 15:16:04,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1878 [WARNING|trainer.py:803] 2025-04-26 15:16:05,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:06,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1862 [WARNING|trainer.py:803] 2025-04-26 15:16:07,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1883 1879 [WARNING|trainer.py:803] 2025-04-26 15:16:08,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1863 [WARNING|trainer.py:803] 2025-04-26 15:16:08,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:09,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1880 1884 1864 [WARNING|trainer.py:803] 2025-04-26 15:16:10,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:11,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:11,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1881 1885 [WARNING|trainer.py:803] 2025-04-26 15:16:12,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1865 [WARNING|trainer.py:803] 2025-04-26 15:16:13,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:16:13,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1882 1886 [WARNING|trainer.py:803] 2025-04-26 15:16:14,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1866 [WARNING|trainer.py:803] 2025-04-26 15:16:15,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:15,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1883 1867 1887 [WARNING|trainer.py:803] 2025-04-26 15:16:17,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:18,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:18,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1884 1888 1868 [WARNING|trainer.py:803] 2025-04-26 15:16:20,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:20,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:16:20,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1889 1869 1885 [WARNING|trainer.py:803] 2025-04-26 15:16:22,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:22,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:22,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1890 1870 1886 [WARNING|trainer.py:803] 2025-04-26 15:16:24,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:24,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:25,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1891 1871 [WARNING|trainer.py:803] 2025-04-26 15:16:26,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1887 [WARNING|trainer.py:803] 2025-04-26 15:16:26,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:27,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1892 1872 1888 [WARNING|trainer.py:803] 2025-04-26 15:16:29,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:29,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:16:29,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1893 1889 1873 [WARNING|trainer.py:803] 2025-04-26 15:16:31,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:31,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:31,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1894 1890 1874 [WARNING|trainer.py:803] 2025-04-26 15:16:33,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:33,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:33,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1895 1891 1875 [WARNING|trainer.py:803] 2025-04-26 15:16:35,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:35,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:16:35,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1896 1876 1892 [WARNING|trainer.py:803] 2025-04-26 15:16:37,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:38,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:16:38,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1897 1877 1893 [WARNING|trainer.py:803] 2025-04-26 15:16:39,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:40,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:40,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1878 1894 1898 [WARNING|trainer.py:803] 2025-04-26 15:16:42,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:42,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:42,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1879 1895 1899 [WARNING|trainer.py:803] 2025-04-26 15:16:44,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:44,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:45,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1880 1896 1900 [WARNING|trainer.py:803] 2025-04-26 15:16:46,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:47,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:47,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1881 1897 1901 [WARNING|trainer.py:803] 2025-04-26 15:16:48,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:49,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1882 1902 [WARNING|trainer.py:803] 2025-04-26 15:16:50,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1898 [WARNING|trainer.py:803] 2025-04-26 15:16:51,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:52,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1883 1903 [WARNING|trainer.py:803] 2025-04-26 15:16:53,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1899 [WARNING|trainer.py:803] 2025-04-26 15:16:53,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:16:54,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1884 1904 1900 [WARNING|trainer.py:803] 2025-04-26 15:16:56,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:56,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:16:56,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1905 1885 1901 [WARNING|trainer.py:803] 2025-04-26 15:16:58,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:16:58,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:16:59,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1906 1886 1902 [WARNING|trainer.py:803] 2025-04-26 15:17:00,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:01,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:01,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1907 1903 1887 [WARNING|trainer.py:803] 2025-04-26 15:17:02,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:03,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:03,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1908 1888 1904 [WARNING|trainer.py:803] 2025-04-26 15:17:05,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:05,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:05,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1909 1889 1905 [WARNING|trainer.py:803] 2025-04-26 15:17:07,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:07,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:08,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1910 1890 [WARNING|trainer.py:803] 2025-04-26 15:17:09,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1906 [WARNING|trainer.py:803] 2025-04-26 15:17:09,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:10,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1911 1891 [WARNING|trainer.py:803] 2025-04-26 15:17:11,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1907 [WARNING|trainer.py:803] 2025-04-26 15:17:12,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:17:12,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1912 [WARNING|trainer.py:803] 2025-04-26 15:17:13,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1908 1892 [WARNING|trainer.py:803] 2025-04-26 15:17:14,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:14,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1913 1909 [WARNING|trainer.py:803] 2025-04-26 15:17:15,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1893 [WARNING|trainer.py:803] 2025-04-26 15:17:16,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:16,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1914 1910 1894 [WARNING|trainer.py:803] 2025-04-26 15:17:18,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:17:18,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:19,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1915 1911 [WARNING|trainer.py:803] 2025-04-26 15:17:20,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1895 [WARNING|trainer.py:803] 2025-04-26 15:17:20,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:21,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1916 1912 [WARNING|trainer.py:803] 2025-04-26 15:17:22,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1896 [WARNING|trainer.py:803] 2025-04-26 15:17:22,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:23,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1917 1913 1897 [WARNING|trainer.py:803] 2025-04-26 15:17:25,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:25,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:17:25,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1918 1914 [WARNING|trainer.py:803] 2025-04-26 15:17:27,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:27,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1898 [WARNING|trainer.py:803] 2025-04-26 15:17:28,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1919 1915 [WARNING|trainer.py:803] 2025-04-26 15:17:29,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:29,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1899 [WARNING|trainer.py:803] 2025-04-26 15:17:30,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1916 1920 [WARNING|trainer.py:803] 2025-04-26 15:17:31,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:32,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1900 [WARNING|trainer.py:803] 2025-04-26 15:17:33,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1917 1921 [WARNING|trainer.py:803] 2025-04-26 15:17:34,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:34,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1901 [WARNING|trainer.py:803] 2025-04-26 15:17:35,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1918 1922 [WARNING|trainer.py:803] 2025-04-26 15:17:36,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:36,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1902 [WARNING|trainer.py:803] 2025-04-26 15:17:37,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1919 1923 [WARNING|trainer.py:803] 2025-04-26 15:17:38,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:39,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1903 [WARNING|trainer.py:803] 2025-04-26 15:17:40,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1920 1924 [WARNING|trainer.py:803] 2025-04-26 15:17:41,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1904 [WARNING|trainer.py:803] 2025-04-26 15:17:41,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:42,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1921 1925 [WARNING|trainer.py:803] 2025-04-26 15:17:43,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1905 [WARNING|trainer.py:803] 2025-04-26 15:17:44,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:44,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1922 1926 1906 [WARNING|trainer.py:803] 2025-04-26 15:17:46,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:46,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:46,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1923 1927 1907 [WARNING|trainer.py:803] 2025-04-26 15:17:48,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:48,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:17:49,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1924 1928 1908 [WARNING|trainer.py:803] 2025-04-26 15:17:50,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:50,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:17:51,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1925 1929 1909 [WARNING|trainer.py:803] 2025-04-26 15:17:53,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:17:53,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:53,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1926 1930 1910 [WARNING|trainer.py:803] 2025-04-26 15:17:55,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:55,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:55,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1931 1911 1927 [WARNING|trainer.py:803] 2025-04-26 15:17:57,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:57,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:17:57,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1928 1912 1932 [WARNING|trainer.py:803] 2025-04-26 15:17:59,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:17:59,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:17:59,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1929 1933 1913 [WARNING|trainer.py:803] 2025-04-26 15:18:02,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:02,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:02,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1930 1934 1914 [WARNING|trainer.py:803] 2025-04-26 15:18:04,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:04,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:04,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1931 1935 1915 [WARNING|trainer.py:803] 2025-04-26 15:18:06,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:06,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:06,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1932 1936 1916 [WARNING|trainer.py:803] 2025-04-26 15:18:08,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:18:08,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:18:08,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1933 1937 1917 [WARNING|trainer.py:803] 2025-04-26 15:18:10,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:10,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:11,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1934 1938 1918 [WARNING|trainer.py:803] 2025-04-26 15:18:13,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:13,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:13,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1939 1935 1919 [WARNING|trainer.py:803] 2025-04-26 15:18:15,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:15,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:15,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1940 1936 [WARNING|trainer.py:803] 2025-04-26 15:18:17,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:17,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1920 [WARNING|trainer.py:803] 2025-04-26 15:18:18,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1941 1937 [WARNING|trainer.py:803] 2025-04-26 15:18:19,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:19,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1921 [WARNING|trainer.py:803] 2025-04-26 15:18:20,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1938 1942 [WARNING|trainer.py:803] 2025-04-26 15:18:21,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:21,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1922 1939 1943 [WARNING|trainer.py:803] 2025-04-26 15:18:23,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:23,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:24,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1923 1940 1944 [WARNING|trainer.py:803] 2025-04-26 15:18:25,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:18:25,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:26,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1924 1941 1945 [WARNING|trainer.py:803] 2025-04-26 15:18:28,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:28,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:28,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1942 1925 1946 [WARNING|trainer.py:803] 2025-04-26 15:18:30,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:30,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:30,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1943 1926 1947 [WARNING|trainer.py:803] 2025-04-26 15:18:32,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:32,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:32,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1944 1927 1948 [WARNING|trainer.py:803] 2025-04-26 15:18:34,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:35,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:35,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1928 1949 1945 [WARNING|trainer.py:803] 2025-04-26 15:18:37,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:37,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:18:37,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1946 1929 1950 [WARNING|trainer.py:803] 2025-04-26 15:18:39,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:39,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:39,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1947 1930 1951 [WARNING|trainer.py:803] 2025-04-26 15:18:41,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:41,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:41,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1948 1931 1952 [WARNING|trainer.py:803] 2025-04-26 15:18:43,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:44,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:44,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1949 1932 [WARNING|trainer.py:803] 2025-04-26 15:18:45,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1953 [WARNING|trainer.py:803] 2025-04-26 15:18:46,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:46,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1950 1933 1954 [WARNING|trainer.py:803] 2025-04-26 15:18:48,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:18:48,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:48,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1951 1934 1955 [WARNING|trainer.py:803] 2025-04-26 15:18:50,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:50,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:18:51,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1952 1935 1956 [WARNING|trainer.py:803] 2025-04-26 15:18:53,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:53,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:53,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1936 1953 1957 [WARNING|trainer.py:803] 2025-04-26 15:18:55,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:18:55,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:18:55,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1937 1954 [WARNING|trainer.py:803] 2025-04-26 15:18:57,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:18:57,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1958 1938 [WARNING|trainer.py:803] 2025-04-26 15:18:58,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1955 [WARNING|trainer.py:803] 2025-04-26 15:18:59,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:18:59,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1959 [WARNING|trainer.py:803] 2025-04-26 15:19:00,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1939 1956 [WARNING|trainer.py:803] 2025-04-26 15:19:01,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:02,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1960 1940 [WARNING|trainer.py:803] 2025-04-26 15:19:03,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1957 [WARNING|trainer.py:803] 2025-04-26 15:19:03,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:04,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1961 1941 [WARNING|trainer.py:803] 2025-04-26 15:19:05,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:06,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1958 1962 [WARNING|trainer.py:803] 2025-04-26 15:19:07,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1942 [WARNING|trainer.py:803] 2025-04-26 15:19:07,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:19:08,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1959 1963 [WARNING|trainer.py:803] 2025-04-26 15:19:09,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1943 [WARNING|trainer.py:803] 2025-04-26 15:19:09,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:10,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1960 1964 [WARNING|trainer.py:803] 2025-04-26 15:19:11,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1944 [WARNING|trainer.py:803] 2025-04-26 15:19:12,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:12,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1961 1965 [WARNING|trainer.py:803] 2025-04-26 15:19:14,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1945 [WARNING|trainer.py:803] 2025-04-26 15:19:14,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:15,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1962 1966 [WARNING|trainer.py:803] 2025-04-26 15:19:16,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1946 [WARNING|trainer.py:803] 2025-04-26 15:19:16,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:17,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1963 1967 1947 [WARNING|trainer.py:803] 2025-04-26 15:19:18,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:19,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:19,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1964 1968 1948 [WARNING|trainer.py:803] 2025-04-26 15:19:21,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:21,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:21,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1965 1949 [WARNING|trainer.py:803] 2025-04-26 15:19:23,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1969 [WARNING|trainer.py:803] 2025-04-26 15:19:23,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:24,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1966 [WARNING|trainer.py:803] 2025-04-26 15:19:25,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1970 1950 [WARNING|trainer.py:803] 2025-04-26 15:19:26,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:26,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1967 1971 1951 [WARNING|trainer.py:803] 2025-04-26 15:19:27,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:28,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:28,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1968 1972 1952 [WARNING|trainer.py:803] 2025-04-26 15:19:30,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:30,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:31,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1973 1969 [WARNING|trainer.py:803] 2025-04-26 15:19:32,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1953 [WARNING|trainer.py:803] 2025-04-26 15:19:33,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:33,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1974 1970 [WARNING|trainer.py:803] 2025-04-26 15:19:34,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1954 [WARNING|trainer.py:803] 2025-04-26 15:19:35,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:35,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1975 1971 [WARNING|trainer.py:803] 2025-04-26 15:19:36,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:19:37,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1955 [WARNING|trainer.py:803] 2025-04-26 15:19:38,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1976 1972 [WARNING|trainer.py:803] 2025-04-26 15:19:38,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:39,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1956 1977 1973 [WARNING|trainer.py:803] 2025-04-26 15:19:40,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:40,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:41,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1957 1974 [WARNING|trainer.py:803] 2025-04-26 15:19:42,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1978 [WARNING|trainer.py:803] 2025-04-26 15:19:43,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:19:43,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1975 1958 1979 [WARNING|trainer.py:803] 2025-04-26 15:19:45,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:45,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:45,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1976 1959 1980 [WARNING|trainer.py:803] 2025-04-26 15:19:47,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:47,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:48,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1977 1960 1981 [WARNING|trainer.py:803] 2025-04-26 15:19:49,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:50,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:50,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1978 1961 1982 [WARNING|trainer.py:803] 2025-04-26 15:19:52,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:52,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:52,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1979 1962 1983 [WARNING|trainer.py:803] 2025-04-26 15:19:54,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:54,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:19:55,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1980 1963 1984 [WARNING|trainer.py:803] 2025-04-26 15:19:56,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:56,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:19:57,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1981 1964 1985 [WARNING|trainer.py:803] 2025-04-26 15:19:59,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:19:59,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:19:59,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1982 1965 1986 [WARNING|trainer.py:803] 2025-04-26 15:20:01,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:01,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:01,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1983 1966 1987 [WARNING|trainer.py:803] 2025-04-26 15:20:03,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:20:03,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:04,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1984 1967 [WARNING|trainer.py:803] 2025-04-26 15:20:05,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1988 [WARNING|trainer.py:803] 2025-04-26 15:20:06,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:20:06,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1985 1968 1989 [WARNING|trainer.py:803] 2025-04-26 15:20:08,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:08,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:09,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1986 [WARNING|trainer.py:803] 2025-04-26 15:20:10,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1990 1969 [WARNING|trainer.py:803] 2025-04-26 15:20:11,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:11,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1987 1991 1970 [WARNING|trainer.py:803] 2025-04-26 15:20:12,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:13,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:20:13,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1988 1971 1992 [WARNING|trainer.py:803] 2025-04-26 15:20:15,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:15,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:16,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1972 1989 1993 [WARNING|trainer.py:803] 2025-04-26 15:20:17,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:17,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:18,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1973 1990 [WARNING|trainer.py:803] 2025-04-26 15:20:19,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1994 [WARNING|trainer.py:803] 2025-04-26 15:20:20,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:20,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1974 1991 [WARNING|trainer.py:803] 2025-04-26 15:20:22,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:20:22,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1995 [WARNING|trainer.py:803] 2025-04-26 15:20:23,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1975 1992 [WARNING|trainer.py:803] 2025-04-26 15:20:24,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:24,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1996 1976 [WARNING|trainer.py:803] 2025-04-26 15:20:25,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1993 [WARNING|trainer.py:803] 2025-04-26 15:20:26,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:26,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1997 1977 1994 [WARNING|trainer.py:803] 2025-04-26 15:20:28,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:28,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:20:29,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1998 1978 [WARNING|trainer.py:803] 2025-04-26 15:20:30,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1995 [WARNING|trainer.py:803] 2025-04-26 15:20:30,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:31,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1999 1979 [WARNING|trainer.py:803] 2025-04-26 15:20:32,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:33,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1996 [WARNING|trainer.py:803] 2025-04-26 15:20:34,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2000 1980 [WARNING|trainer.py:803] 2025-04-26 15:20:35,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:35,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1997 2001 [WARNING|trainer.py:803] 2025-04-26 15:20:36,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1981 [WARNING|trainer.py:803] 2025-04-26 15:20:37,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:37,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1998 2002 [WARNING|trainer.py:803] 2025-04-26 15:20:39,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1982 [WARNING|trainer.py:803] 2025-04-26 15:20:39,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:40,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1999 [WARNING|trainer.py:803] 2025-04-26 15:20:41,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2003 1983 [WARNING|trainer.py:803] 2025-04-26 15:20:42,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:20:42,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2000 [WARNING|trainer.py:803] 2025-04-26 15:20:43,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2004 1984 [WARNING|trainer.py:803] 2025-04-26 15:20:44,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:20:44,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2001 [WARNING|trainer.py:803] 2025-04-26 15:20:45,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2005 1985 [WARNING|trainer.py:803] 2025-04-26 15:20:47,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:20:47,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2002 [WARNING|trainer.py:803] 2025-04-26 15:20:48,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2006 1986 [WARNING|trainer.py:803] 2025-04-26 15:20:49,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:49,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2003 2007 [WARNING|trainer.py:803] 2025-04-26 15:20:50,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1987 [WARNING|trainer.py:803] 2025-04-26 15:20:51,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:20:51,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2004 2008 [WARNING|trainer.py:803] 2025-04-26 15:20:53,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:20:53,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1988 [WARNING|trainer.py:803] 2025-04-26 15:20:54,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2005 2009 [WARNING|trainer.py:803] 2025-04-26 15:20:55,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:20:55,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1989 2006 [WARNING|trainer.py:803] 2025-04-26 15:20:56,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2010 [WARNING|trainer.py:803] 2025-04-26 15:20:57,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:20:57,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1990 2007 2011 [WARNING|trainer.py:803] 2025-04-26 15:20:59,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:20:59,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:20:59,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1991 2008 2012 [WARNING|trainer.py:803] 2025-04-26 15:21:01,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:21:01,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:01,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1992 2009 2013 [WARNING|trainer.py:803] 2025-04-26 15:21:03,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:03,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:03,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2010 1993 2014 [WARNING|trainer.py:803] 2025-04-26 15:21:05,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:06,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:06,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2011 1994 2015 [WARNING|trainer.py:803] 2025-04-26 15:21:08,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:08,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:08,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2012 2016 1995 [WARNING|trainer.py:803] 2025-04-26 15:21:10,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:10,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:10,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2013 2017 [WARNING|trainer.py:803] 2025-04-26 15:21:12,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1996 [WARNING|trainer.py:803] 2025-04-26 15:21:13,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:13,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2014 2018 [WARNING|trainer.py:803] 2025-04-26 15:21:14,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1997 [WARNING|trainer.py:803] 2025-04-26 15:21:15,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2015 [WARNING|trainer.py:803] 2025-04-26 15:21:16,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2019 [WARNING|trainer.py:803] 2025-04-26 15:21:16,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:17,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1998 2016 [WARNING|trainer.py:803] 2025-04-26 15:21:18,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:21:18,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2020 1999 [WARNING|trainer.py:803] 2025-04-26 15:21:19,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:20,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2017 2021 [WARNING|trainer.py:803] 2025-04-26 15:21:21,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:21,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2000 [WARNING|trainer.py:803] 2025-04-26 15:21:22,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2018 2022 [WARNING|trainer.py:803] 2025-04-26 15:21:23,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:24,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2001 2019 [WARNING|trainer.py:803] 2025-04-26 15:21:25,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:25,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2023 [WARNING|trainer.py:803] 2025-04-26 15:21:26,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2002 2020 [WARNING|trainer.py:803] 2025-04-26 15:21:27,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2024 [WARNING|trainer.py:803] 2025-04-26 15:21:28,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:28,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2003 2021 2025 [WARNING|trainer.py:803] 2025-04-26 15:21:29,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:30,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:30,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2004 2022 2026 [WARNING|trainer.py:803] 2025-04-26 15:21:32,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:32,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:21:32,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2005 2027 2023 [WARNING|trainer.py:803] 2025-04-26 15:21:34,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:35,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:35,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2006 2024 2028 [WARNING|trainer.py:803] 2025-04-26 15:21:36,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:37,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:37,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2007 2025 2029 [WARNING|trainer.py:803] 2025-04-26 15:21:38,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:39,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:39,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2008 2030 2026 [WARNING|trainer.py:803] 2025-04-26 15:21:41,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:41,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:41,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2009 2031 2027 [WARNING|trainer.py:803] 2025-04-26 15:21:43,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:43,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:43,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2010 2032 2028 [WARNING|trainer.py:803] 2025-04-26 15:21:45,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:45,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:45,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2033 2011 2029 [WARNING|trainer.py:803] 2025-04-26 15:21:47,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:47,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:48,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2034 2012 2030 [WARNING|trainer.py:803] 2025-04-26 15:21:49,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:49,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:50,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2013 2035 2031 [WARNING|trainer.py:803] 2025-04-26 15:21:52,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:52,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:52,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2036 2032 2014 [WARNING|trainer.py:803] 2025-04-26 15:21:54,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:54,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:21:54,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2033 2015 2037 [WARNING|trainer.py:803] 2025-04-26 15:21:56,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:21:56,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:56,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2034 2038 2016 [WARNING|trainer.py:803] 2025-04-26 15:21:58,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:21:58,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:21:58,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2035 2039 2017 [WARNING|trainer.py:803] 2025-04-26 15:22:00,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:01,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:01,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2036 2040 [WARNING|trainer.py:803] 2025-04-26 15:22:02,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2018 [WARNING|trainer.py:803] 2025-04-26 15:22:03,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:03,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2037 2041 2019 [WARNING|trainer.py:803] 2025-04-26 15:22:05,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:22:05,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:05,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2038 2042 [WARNING|trainer.py:803] 2025-04-26 15:22:07,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2020 [WARNING|trainer.py:803] 2025-04-26 15:22:07,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:08,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2039 2043 [WARNING|trainer.py:803] 2025-04-26 15:22:09,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:09,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2021 2040 [WARNING|trainer.py:803] 2025-04-26 15:22:10,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2044 [WARNING|trainer.py:803] 2025-04-26 15:22:11,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:12,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2022 2041 [WARNING|trainer.py:803] 2025-04-26 15:22:13,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2045 [WARNING|trainer.py:803] 2025-04-26 15:22:13,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:22:14,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2023 2042 2046 [WARNING|trainer.py:803] 2025-04-26 15:22:15,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:16,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:16,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2024 2043 2047 [WARNING|trainer.py:803] 2025-04-26 15:22:17,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:18,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:18,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2025 2048 [WARNING|trainer.py:803] 2025-04-26 15:22:19,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2044 [WARNING|trainer.py:803] 2025-04-26 15:22:20,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:20,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2026 2049 2045 [WARNING|trainer.py:803] 2025-04-26 15:22:21,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:22,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:22:22,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2027 2050 2046 [WARNING|trainer.py:803] 2025-04-26 15:22:24,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:24,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:24,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2028 2051 2047 [WARNING|trainer.py:803] 2025-04-26 15:22:26,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:26,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:26,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2029 2052 2048 [WARNING|trainer.py:803] 2025-04-26 15:22:28,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:28,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:29,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2030 2049 2053 [WARNING|trainer.py:803] 2025-04-26 15:22:30,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:31,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:22:31,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2031 2054 2050 [WARNING|trainer.py:803] 2025-04-26 15:22:33,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:33,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:33,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2032 2051 2055 [WARNING|trainer.py:803] 2025-04-26 15:22:35,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:35,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:35,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2033 2056 2052 [WARNING|trainer.py:803] 2025-04-26 15:22:37,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:37,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:22:37,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2034 2057 2053 [WARNING|trainer.py:803] 2025-04-26 15:22:39,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:22:39,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:22:39,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2035 2054 2058 [WARNING|trainer.py:803] 2025-04-26 15:22:41,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:42,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:42,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2036 2055 2059 [WARNING|trainer.py:803] 2025-04-26 15:22:43,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:22:44,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:22:44,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2056 2037 2060 [WARNING|trainer.py:803] 2025-04-26 15:22:46,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:22:46,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:22:46,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2038 2061 2057 [WARNING|trainer.py:803] 2025-04-26 15:22:48,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:48,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:48,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2039 2062 2058 [WARNING|trainer.py:803] 2025-04-26 15:22:50,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:50,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:50,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2040 2063 2059 [WARNING|trainer.py:803] 2025-04-26 15:22:52,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:52,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:52,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2041 2064 2060 [WARNING|trainer.py:803] 2025-04-26 15:22:54,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:54,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:54,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2061 2042 2065 [WARNING|trainer.py:803] 2025-04-26 15:22:57,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:57,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:57,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2062 2043 2066 [WARNING|trainer.py:803] 2025-04-26 15:22:59,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:59,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:22:59,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2067 2063 2044 [WARNING|trainer.py:803] 2025-04-26 15:23:01,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:23:01,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:01,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2068 2064 2045 [WARNING|trainer.py:803] 2025-04-26 15:23:03,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:03,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:03,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2069 2065 2046 [WARNING|trainer.py:803] 2025-04-26 15:23:05,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:05,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:06,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2070 2066 2047 [WARNING|trainer.py:803] 2025-04-26 15:23:07,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:08,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:08,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2071 2067 2048 [WARNING|trainer.py:803] 2025-04-26 15:23:09,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:10,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:23:10,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2068 2049 2072 [WARNING|trainer.py:803] 2025-04-26 15:23:12,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:12,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:12,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2069 2050 2073 [WARNING|trainer.py:803] 2025-04-26 15:23:14,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:14,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:14,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2070 2051 2074 [WARNING|trainer.py:803] 2025-04-26 15:23:16,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:16,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:16,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2071 2052 2075 [WARNING|trainer.py:803] 2025-04-26 15:23:18,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:18,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:19,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2072 2053 2076 [WARNING|trainer.py:803] 2025-04-26 15:23:21,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:21,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:21,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2054 2077 2073 [WARNING|trainer.py:803] 2025-04-26 15:23:23,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:23,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:23,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2078 2055 2074 [WARNING|trainer.py:803] 2025-04-26 15:23:25,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:25,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:25,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2056 2079 2075 [WARNING|trainer.py:803] 2025-04-26 15:23:27,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:27,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:23:27,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2080 2057 2076 [WARNING|trainer.py:803] 2025-04-26 15:23:29,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:29,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:23:30,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2081 2058 2077 [WARNING|trainer.py:803] 2025-04-26 15:23:32,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:32,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:23:32,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2078 2059 2082 [WARNING|trainer.py:803] 2025-04-26 15:23:34,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:34,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:23:34,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2060 2079 2083 [WARNING|trainer.py:803] 2025-04-26 15:23:36,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:23:36,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:36,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2061 2080 2084 [WARNING|trainer.py:803] 2025-04-26 15:23:38,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:38,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:39,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2062 2081 2085 [WARNING|trainer.py:803] 2025-04-26 15:23:40,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:41,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:23:41,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2063 2082 [WARNING|trainer.py:803] 2025-04-26 15:23:42,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2086 [WARNING|trainer.py:803] 2025-04-26 15:23:43,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:23:44,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2064 2083 [WARNING|trainer.py:803] 2025-04-26 15:23:44,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:45,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2087 2065 [WARNING|trainer.py:803] 2025-04-26 15:23:46,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2084 [WARNING|trainer.py:803] 2025-04-26 15:23:47,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2088 [WARNING|trainer.py:803] 2025-04-26 15:23:47,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2066 [WARNING|trainer.py:803] 2025-04-26 15:23:48,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:49,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2085 2089 [WARNING|trainer.py:803] 2025-04-26 15:23:50,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2067 [WARNING|trainer.py:803] 2025-04-26 15:23:50,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:51,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2086 2068 2090 [WARNING|trainer.py:803] 2025-04-26 15:23:52,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:53,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:53,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2087 2069 2091 [WARNING|trainer.py:803] 2025-04-26 15:23:55,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:55,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:55,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2088 2070 2092 [WARNING|trainer.py:803] 2025-04-26 15:23:57,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:57,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:57,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2089 2093 2071 [WARNING|trainer.py:803] 2025-04-26 15:23:59,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:23:59,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:23:59,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2094 2072 2090 [WARNING|trainer.py:803] 2025-04-26 15:24:01,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:02,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:02,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2095 2091 2073 [WARNING|trainer.py:803] 2025-04-26 15:24:04,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:04,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:04,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2096 2092 [WARNING|trainer.py:803] 2025-04-26 15:24:06,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2074 [WARNING|trainer.py:803] 2025-04-26 15:24:06,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:07,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2097 2093 [WARNING|trainer.py:803] 2025-04-26 15:24:08,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2075 [WARNING|trainer.py:803] 2025-04-26 15:24:08,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:09,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2098 2094 [WARNING|trainer.py:803] 2025-04-26 15:24:10,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2076 [WARNING|trainer.py:803] 2025-04-26 15:24:10,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:11,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2099 2095 2077 [WARNING|trainer.py:803] 2025-04-26 15:24:13,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:13,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:13,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2100 2096 2078 [WARNING|trainer.py:803] 2025-04-26 15:24:15,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:24:15,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2101 [WARNING|trainer.py:803] 2025-04-26 15:24:15,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:16,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2097 2102 2079 [WARNING|trainer.py:803] 2025-04-26 15:24:17,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:24:17,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:17,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2103 2098 2080 [WARNING|trainer.py:803] 2025-04-26 15:24:19,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2104 [WARNING|trainer.py:803] 2025-04-26 15:24:19,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:19,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:20,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2105 2099 2081 [WARNING|trainer.py:803] 2025-04-26 15:24:21,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2106 [WARNING|trainer.py:803] 2025-04-26 15:24:22,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:22,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:23,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2107 2100 2082 [WARNING|trainer.py:803] 2025-04-26 15:24:24,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:24,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2108 [WARNING|trainer.py:803] 2025-04-26 15:24:24,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2101 [WARNING|trainer.py:803] 2025-04-26 15:24:25,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:25,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2109 2083 2102 [WARNING|trainer.py:803] 2025-04-26 15:24:26,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:27,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:27,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2110 2103 [WARNING|trainer.py:803] 2025-04-26 15:24:28,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2084 [WARNING|trainer.py:803] 2025-04-26 15:24:28,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2111 2104 [WARNING|trainer.py:803] 2025-04-26 15:24:29,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:29,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:29,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2112 2105 2085 [WARNING|trainer.py:803] 2025-04-26 15:24:30,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:31,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2113 [WARNING|trainer.py:803] 2025-04-26 15:24:31,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2106 [WARNING|trainer.py:803] 2025-04-26 15:24:32,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:24:32,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2114 2107 2086 [WARNING|trainer.py:803] 2025-04-26 15:24:33,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:33,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2115 [WARNING|trainer.py:803] 2025-04-26 15:24:33,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2108 [WARNING|trainer.py:803] 2025-04-26 15:24:34,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:35,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2116 2109 2087 [WARNING|trainer.py:803] 2025-04-26 15:24:35,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:36,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2117 [WARNING|trainer.py:803] 2025-04-26 15:24:36,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2110 [WARNING|trainer.py:803] 2025-04-26 15:24:37,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:37,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2118 2088 2111 [WARNING|trainer.py:803] 2025-04-26 15:24:38,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:38,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:38,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2119 2112 [WARNING|trainer.py:803] 2025-04-26 15:24:39,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2089 [WARNING|trainer.py:803] 2025-04-26 15:24:40,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2120 2113 [WARNING|trainer.py:803] 2025-04-26 15:24:40,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:24:41,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:41,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2121 2114 [WARNING|trainer.py:803] 2025-04-26 15:24:42,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2090 [WARNING|trainer.py:803] 2025-04-26 15:24:42,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2122 2115 [WARNING|trainer.py:803] 2025-04-26 15:24:43,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:43,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:44,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2123 2116 2091 [WARNING|trainer.py:803] 2025-04-26 15:24:44,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:45,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2124 [WARNING|trainer.py:803] 2025-04-26 15:24:45,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2117 [WARNING|trainer.py:803] 2025-04-26 15:24:46,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:46,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2125 2092 2118 [WARNING|trainer.py:803] 2025-04-26 15:24:47,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:24:47,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:48,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2126 2119 [WARNING|trainer.py:803] 2025-04-26 15:24:48,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2093 [WARNING|trainer.py:803] 2025-04-26 15:24:49,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2127 2120 [WARNING|trainer.py:803] 2025-04-26 15:24:49,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:50,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:50,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2128 2094 2121 [WARNING|trainer.py:803] 2025-04-26 15:24:51,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:51,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:51,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2129 2122 [WARNING|trainer.py:803] 2025-04-26 15:24:52,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2095 [WARNING|trainer.py:803] 2025-04-26 15:24:53,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2130 2123 [WARNING|trainer.py:803] 2025-04-26 15:24:54,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:24:54,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2131 [WARNING|trainer.py:803] 2025-04-26 15:24:54,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2124 2096 [WARNING|trainer.py:803] 2025-04-26 15:24:55,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2132 [WARNING|trainer.py:803] 2025-04-26 15:24:56,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:24:56,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2125 [WARNING|trainer.py:803] 2025-04-26 15:24:56,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2133 [WARNING|trainer.py:803] 2025-04-26 15:24:57,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2097 2126 [WARNING|trainer.py:803] 2025-04-26 15:24:58,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:24:58,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2134 [WARNING|trainer.py:803] 2025-04-26 15:24:58,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2127 [WARNING|trainer.py:803] 2025-04-26 15:24:59,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2098 2135 [WARNING|trainer.py:803] 2025-04-26 15:25:00,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2128 [WARNING|trainer.py:803] 2025-04-26 15:25:00,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:00,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2136 [WARNING|trainer.py:803] 2025-04-26 15:25:01,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2129 [WARNING|trainer.py:803] 2025-04-26 15:25:02,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2099 2137 [WARNING|trainer.py:803] 2025-04-26 15:25:02,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:25:03,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2130 [WARNING|trainer.py:803] 2025-04-26 15:25:03,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2138 [WARNING|trainer.py:803] 2025-04-26 15:25:03,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2100 2131 [WARNING|trainer.py:803] 2025-04-26 15:25:04,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:25:05,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2139 [WARNING|trainer.py:803] 2025-04-26 15:25:05,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2101 2132 [WARNING|trainer.py:803] 2025-04-26 15:25:06,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:25:06,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2140 [WARNING|trainer.py:803] 2025-04-26 15:25:06,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2102 2133 [WARNING|trainer.py:803] 2025-04-26 15:25:07,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:07,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2141 [WARNING|trainer.py:803] 2025-04-26 15:25:07,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2103 2134 [WARNING|trainer.py:803] 2025-04-26 15:25:08,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:09,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2142 [WARNING|trainer.py:803] 2025-04-26 15:25:09,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2104 2135 [WARNING|trainer.py:803] 2025-04-26 15:25:09,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:25:10,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2143 [WARNING|trainer.py:803] 2025-04-26 15:25:10,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2105 2136 [WARNING|trainer.py:803] 2025-04-26 15:25:11,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:25:11,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2144 [WARNING|trainer.py:803] 2025-04-26 15:25:12,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2106 2137 [WARNING|trainer.py:803] 2025-04-26 15:25:12,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:13,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2145 [WARNING|trainer.py:803] 2025-04-26 15:25:13,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2107 2138 [WARNING|trainer.py:803] 2025-04-26 15:25:13,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:14,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2146 [WARNING|trainer.py:803] 2025-04-26 15:25:14,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2108 2139 [WARNING|trainer.py:803] 2025-04-26 15:25:15,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:15,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2147 [WARNING|trainer.py:803] 2025-04-26 15:25:16,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2109 2140 [WARNING|trainer.py:803] 2025-04-26 15:25:16,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:16,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2148 [WARNING|trainer.py:803] 2025-04-26 15:25:17,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2110 2141 [WARNING|trainer.py:803] 2025-04-26 15:25:17,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:18,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2149 [WARNING|trainer.py:803] 2025-04-26 15:25:18,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2111 2142 [WARNING|trainer.py:803] 2025-04-26 15:25:19,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:19,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2150 [WARNING|trainer.py:803] 2025-04-26 15:25:19,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2112 2143 [WARNING|trainer.py:803] 2025-04-26 15:25:20,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:20,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2151 [WARNING|trainer.py:803] 2025-04-26 15:25:21,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2113 2144 [WARNING|trainer.py:803] 2025-04-26 15:25:21,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:22,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2152 2114 [WARNING|trainer.py:803] 2025-04-26 15:25:22,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2145 [WARNING|trainer.py:803] 2025-04-26 15:25:23,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:23,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2153 2115 [WARNING|trainer.py:803] 2025-04-26 15:25:23,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2146 [WARNING|trainer.py:803] 2025-04-26 15:25:24,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:24,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2154 2116 [WARNING|trainer.py:803] 2025-04-26 15:25:25,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:25,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2147 [WARNING|trainer.py:803] 2025-04-26 15:25:26,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2155 2117 [WARNING|trainer.py:803] 2025-04-26 15:25:26,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:27,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2148 [WARNING|trainer.py:803] 2025-04-26 15:25:27,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2156 2118 [WARNING|trainer.py:803] 2025-04-26 15:25:27,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:28,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2149 [WARNING|trainer.py:803] 2025-04-26 15:25:28,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2157 2119 [WARNING|trainer.py:803] 2025-04-26 15:25:29,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:29,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2150 [WARNING|trainer.py:803] 2025-04-26 15:25:29,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2158 2120 [WARNING|trainer.py:803] 2025-04-26 15:25:30,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:30,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2151 [WARNING|trainer.py:803] 2025-04-26 15:25:31,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2159 2121 [WARNING|trainer.py:803] 2025-04-26 15:25:31,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:32,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2152 [WARNING|trainer.py:803] 2025-04-26 15:25:32,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2160 2122 [WARNING|trainer.py:803] 2025-04-26 15:25:33,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:33,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2153 [WARNING|trainer.py:803] 2025-04-26 15:25:33,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2161 2123 [WARNING|trainer.py:803] 2025-04-26 15:25:34,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:34,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2154 [WARNING|trainer.py:803] 2025-04-26 15:25:35,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2162 2124 [WARNING|trainer.py:803] 2025-04-26 15:25:35,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:25:36,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2155 [WARNING|trainer.py:803] 2025-04-26 15:25:36,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2163 2125 [WARNING|trainer.py:803] 2025-04-26 15:25:37,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:37,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2156 [WARNING|trainer.py:803] 2025-04-26 15:25:37,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2164 2126 [WARNING|trainer.py:803] 2025-04-26 15:25:38,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:38,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2157 [WARNING|trainer.py:803] 2025-04-26 15:25:39,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2165 2127 [WARNING|trainer.py:803] 2025-04-26 15:25:39,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:40,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2158 [WARNING|trainer.py:803] 2025-04-26 15:25:40,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2166 2128 [WARNING|trainer.py:803] 2025-04-26 15:25:41,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:41,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2159 [WARNING|trainer.py:803] 2025-04-26 15:25:41,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2167 2129 [WARNING|trainer.py:803] 2025-04-26 15:25:42,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:25:42,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2160 [WARNING|trainer.py:803] 2025-04-26 15:25:43,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2168 2130 [WARNING|trainer.py:803] 2025-04-26 15:25:43,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:25:43,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2161 [WARNING|trainer.py:803] 2025-04-26 15:25:44,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2169 2131 [WARNING|trainer.py:803] 2025-04-26 15:25:44,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:45,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2162 [WARNING|trainer.py:803] 2025-04-26 15:25:45,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2170 2132 [WARNING|trainer.py:803] 2025-04-26 15:25:46,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:46,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2163 [WARNING|trainer.py:803] 2025-04-26 15:25:47,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2171 2133 [WARNING|trainer.py:803] 2025-04-26 15:25:47,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:25:47,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2164 [WARNING|trainer.py:803] 2025-04-26 15:25:48,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2172 2134 [WARNING|trainer.py:803] 2025-04-26 15:25:48,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:49,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2165 [WARNING|trainer.py:803] 2025-04-26 15:25:49,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2173 2135 [WARNING|trainer.py:803] 2025-04-26 15:25:50,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:25:50,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2166 2174 [WARNING|trainer.py:803] 2025-04-26 15:25:51,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2136 [WARNING|trainer.py:803] 2025-04-26 15:25:51,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:51,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2167 [WARNING|trainer.py:803] 2025-04-26 15:25:52,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2175 2137 [WARNING|trainer.py:803] 2025-04-26 15:25:52,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:25:53,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2168 [WARNING|trainer.py:803] 2025-04-26 15:25:53,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2176 2138 [WARNING|trainer.py:803] 2025-04-26 15:25:54,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:54,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2169 [WARNING|trainer.py:803] 2025-04-26 15:25:54,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2177 2139 [WARNING|trainer.py:803] 2025-04-26 15:25:55,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:55,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2170 [WARNING|trainer.py:803] 2025-04-26 15:25:56,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2178 2140 [WARNING|trainer.py:803] 2025-04-26 15:25:56,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:57,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2171 [WARNING|trainer.py:803] 2025-04-26 15:25:57,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2179 2141 [WARNING|trainer.py:803] 2025-04-26 15:25:58,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:25:58,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2172 [WARNING|trainer.py:803] 2025-04-26 15:25:58,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2180 2142 [WARNING|trainer.py:803] 2025-04-26 15:25:59,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:25:59,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2173 [WARNING|trainer.py:803] 2025-04-26 15:26:00,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2181 2143 [WARNING|trainer.py:803] 2025-04-26 15:26:00,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:00,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2174 2182 [WARNING|trainer.py:803] 2025-04-26 15:26:01,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2144 [WARNING|trainer.py:803] 2025-04-26 15:26:02,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:02,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:02,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2175 2183 2145 [WARNING|trainer.py:803] 2025-04-26 15:26:03,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:03,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:04,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2176 2184 2146 [WARNING|trainer.py:803] 2025-04-26 15:26:04,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:04,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:05,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2177 2185 2147 [WARNING|trainer.py:803] 2025-04-26 15:26:06,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:06,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2178 [WARNING|trainer.py:803] 2025-04-26 15:26:06,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2186 2148 [WARNING|trainer.py:803] 2025-04-26 15:26:07,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:07,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2179 [WARNING|trainer.py:803] 2025-04-26 15:26:07,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2187 2149 [WARNING|trainer.py:803] 2025-04-26 15:26:08,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:08,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:09,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2180 2188 2150 [WARNING|trainer.py:803] 2025-04-26 15:26:10,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:10,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:26:10,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2181 2189 2151 [WARNING|trainer.py:803] 2025-04-26 15:26:11,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:11,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2182 [WARNING|trainer.py:803] 2025-04-26 15:26:11,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2190 2152 [WARNING|trainer.py:803] 2025-04-26 15:26:12,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:12,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:26:13,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2183 2191 2153 [WARNING|trainer.py:803] 2025-04-26 15:26:14,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:14,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:14,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2192 2184 2154 [WARNING|trainer.py:803] 2025-04-26 15:26:15,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:15,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:15,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2193 2185 2155 [WARNING|trainer.py:803] 2025-04-26 15:26:16,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:16,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:17,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2194 2186 2156 [WARNING|trainer.py:803] 2025-04-26 15:26:18,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:18,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:18,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2195 2187 2157 [WARNING|trainer.py:803] 2025-04-26 15:26:19,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:19,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:19,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2196 2188 2158 [WARNING|trainer.py:803] 2025-04-26 15:26:20,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:20,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:26:21,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2197 2189 2159 [WARNING|trainer.py:803] 2025-04-26 15:26:21,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:22,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:22,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2198 2190 2160 [WARNING|trainer.py:803] 2025-04-26 15:26:23,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:23,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:23,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2199 2191 2161 [WARNING|trainer.py:803] 2025-04-26 15:26:24,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:24,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2200 2192 2162 [WARNING|trainer.py:803] 2025-04-26 15:26:25,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:26,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2201 [WARNING|trainer.py:803] 2025-04-26 15:26:26,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2193 2163 [WARNING|trainer.py:803] 2025-04-26 15:26:27,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:26:27,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:27,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2202 2194 2164 [WARNING|trainer.py:803] 2025-04-26 15:26:28,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:28,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:28,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2203 2195 2165 [WARNING|trainer.py:803] 2025-04-26 15:26:29,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:26:30,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:30,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2204 2196 2166 [WARNING|trainer.py:803] 2025-04-26 15:26:31,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:31,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:31,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2205 2197 2167 [WARNING|trainer.py:803] 2025-04-26 15:26:32,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:32,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:32,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2206 2198 2168 [WARNING|trainer.py:803] 2025-04-26 15:26:33,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:34,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:34,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2207 2199 2169 [WARNING|trainer.py:803] 2025-04-26 15:26:35,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:26:35,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:35,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2208 2200 2170 [WARNING|trainer.py:803] 2025-04-26 15:26:36,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:36,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:36,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2209 2201 2171 [WARNING|trainer.py:803] 2025-04-26 15:26:37,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:38,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:26:38,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2210 2202 2172 [WARNING|trainer.py:803] 2025-04-26 15:26:39,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:39,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:39,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2211 2173 2203 [WARNING|trainer.py:803] 2025-04-26 15:26:40,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:40,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:40,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2212 2174 2204 [WARNING|trainer.py:803] 2025-04-26 15:26:41,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:42,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2213 [WARNING|trainer.py:803] 2025-04-26 15:26:42,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2175 2205 [WARNING|trainer.py:803] 2025-04-26 15:26:43,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:43,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2214 [WARNING|trainer.py:803] 2025-04-26 15:26:43,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2176 2206 [WARNING|trainer.py:803] 2025-04-26 15:26:44,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:44,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2215 [WARNING|trainer.py:803] 2025-04-26 15:26:45,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2177 2207 [WARNING|trainer.py:803] 2025-04-26 15:26:45,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2216 [WARNING|trainer.py:803] 2025-04-26 15:26:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:46,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2178 2208 [WARNING|trainer.py:803] 2025-04-26 15:26:47,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:26:47,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2217 [WARNING|trainer.py:803] 2025-04-26 15:26:47,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2179 2209 [WARNING|trainer.py:803] 2025-04-26 15:26:48,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:48,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2218 [WARNING|trainer.py:803] 2025-04-26 15:26:49,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2180 2210 [WARNING|trainer.py:803] 2025-04-26 15:26:49,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:50,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2219 [WARNING|trainer.py:803] 2025-04-26 15:26:50,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2181 2211 [WARNING|trainer.py:803] 2025-04-26 15:26:51,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:51,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2220 2182 [WARNING|trainer.py:803] 2025-04-26 15:26:51,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2212 [WARNING|trainer.py:803] 2025-04-26 15:26:52,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:52,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2221 [WARNING|trainer.py:803] 2025-04-26 15:26:53,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2183 2213 [WARNING|trainer.py:803] 2025-04-26 15:26:53,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:54,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2222 [WARNING|trainer.py:803] 2025-04-26 15:26:54,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2184 2214 [WARNING|trainer.py:803] 2025-04-26 15:26:55,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:26:55,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2223 [WARNING|trainer.py:803] 2025-04-26 15:26:55,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2185 2215 [WARNING|trainer.py:803] 2025-04-26 15:26:56,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:26:56,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2224 [WARNING|trainer.py:803] 2025-04-26 15:26:57,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2186 2216 [WARNING|trainer.py:803] 2025-04-26 15:26:57,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:58,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2225 [WARNING|trainer.py:803] 2025-04-26 15:26:58,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2187 2217 [WARNING|trainer.py:803] 2025-04-26 15:26:59,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:26:59,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2226 [WARNING|trainer.py:803] 2025-04-26 15:26:59,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2188 2218 [WARNING|trainer.py:803] 2025-04-26 15:27:00,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:00,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2227 2189 [WARNING|trainer.py:803] 2025-04-26 15:27:01,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2219 [WARNING|trainer.py:803] 2025-04-26 15:27:01,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:02,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2228 2190 [WARNING|trainer.py:803] 2025-04-26 15:27:02,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2220 [WARNING|trainer.py:803] 2025-04-26 15:27:03,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:03,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2229 2191 [WARNING|trainer.py:803] 2025-04-26 15:27:04,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:04,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 2221 [WARNING|trainer.py:803] 2025-04-26 15:27:04,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2230 2192 [WARNING|trainer.py:803] 2025-04-26 15:27:05,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:05,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2222 [WARNING|trainer.py:803] 2025-04-26 15:27:06,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2231 2193 [WARNING|trainer.py:803] 2025-04-26 15:27:06,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:07,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2223 [WARNING|trainer.py:803] 2025-04-26 15:27:07,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2232 2194 [WARNING|trainer.py:803] 2025-04-26 15:27:08,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:08,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:08,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2224 2233 2195 [WARNING|trainer.py:803] 2025-04-26 15:27:09,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:09,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:10,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2225 2234 2196 [WARNING|trainer.py:803] 2025-04-26 15:27:10,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:11,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:11,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2226 2235 2197 [WARNING|trainer.py:803] 2025-04-26 15:27:12,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:12,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:12,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2227 2236 2198 [WARNING|trainer.py:803] 2025-04-26 15:27:13,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:13,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:13,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2228 2199 2237 [WARNING|trainer.py:803] 2025-04-26 15:27:14,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:15,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:15,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2229 2200 2238 [WARNING|trainer.py:803] 2025-04-26 15:27:16,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 15:27:16,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:16,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2230 2201 2239 [WARNING|trainer.py:803] 2025-04-26 15:27:17,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:27:17,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:27:17,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2231 2240 2202 [WARNING|trainer.py:803] 2025-04-26 15:27:18,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:19,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:19,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2232 2241 2203 [WARNING|trainer.py:803] 2025-04-26 15:27:20,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:20,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:20,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2233 2242 2204 [WARNING|trainer.py:803] 2025-04-26 15:27:21,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:21,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2234 [WARNING|trainer.py:803] 2025-04-26 15:27:22,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2243 2205 [WARNING|trainer.py:803] 2025-04-26 15:27:22,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:23,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:23,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2235 2244 2206 [WARNING|trainer.py:803] 2025-04-26 15:27:24,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:24,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2236 [WARNING|trainer.py:803] 2025-04-26 15:27:24,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2245 2207 [WARNING|trainer.py:803] 2025-04-26 15:27:25,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:26,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2237 [WARNING|trainer.py:803] 2025-04-26 15:27:26,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2246 2208 [WARNING|trainer.py:803] 2025-04-26 15:27:26,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:27,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2238 [WARNING|trainer.py:803] 2025-04-26 15:27:27,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2247 2209 [WARNING|trainer.py:803] 2025-04-26 15:27:28,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:27:28,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:28,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2239 2248 2210 [WARNING|trainer.py:803] 2025-04-26 15:27:29,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:30,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2240 [WARNING|trainer.py:803] 2025-04-26 15:27:30,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2249 2211 [WARNING|trainer.py:803] 2025-04-26 15:27:30,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:31,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2241 [WARNING|trainer.py:803] 2025-04-26 15:27:31,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2250 2212 [WARNING|trainer.py:803] 2025-04-26 15:27:32,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:32,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2242 [WARNING|trainer.py:803] 2025-04-26 15:27:32,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2251 2213 [WARNING|trainer.py:803] 2025-04-26 15:27:33,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:33,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2243 [WARNING|trainer.py:803] 2025-04-26 15:27:34,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2252 2214 [WARNING|trainer.py:803] 2025-04-26 15:27:34,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:35,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2244 [WARNING|trainer.py:803] 2025-04-26 15:27:35,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2253 2215 [WARNING|trainer.py:803] 2025-04-26 15:27:36,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:36,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2245 [WARNING|trainer.py:803] 2025-04-26 15:27:36,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2254 2216 [WARNING|trainer.py:803] 2025-04-26 15:27:37,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:37,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2246 [WARNING|trainer.py:803] 2025-04-26 15:27:38,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2255 2217 [WARNING|trainer.py:803] 2025-04-26 15:27:38,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:38,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2256 2247 [WARNING|trainer.py:803] 2025-04-26 15:27:39,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2218 [WARNING|trainer.py:803] 2025-04-26 15:27:40,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:40,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2257 2248 [WARNING|trainer.py:803] 2025-04-26 15:27:40,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2219 [WARNING|trainer.py:803] 2025-04-26 15:27:41,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:27:41,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2258 2249 [WARNING|trainer.py:803] 2025-04-26 15:27:42,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:42,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2220 [WARNING|trainer.py:803] 2025-04-26 15:27:42,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2259 2250 [WARNING|trainer.py:803] 2025-04-26 15:27:43,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:43,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2221 [WARNING|trainer.py:803] 2025-04-26 15:27:44,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2260 2251 [WARNING|trainer.py:803] 2025-04-26 15:27:45,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:45,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:45,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2222 2261 2252 [WARNING|trainer.py:803] 2025-04-26 15:27:46,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:46,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:46,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2223 2262 2253 [WARNING|trainer.py:803] 2025-04-26 15:27:47,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:47,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:48,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2263 2224 2254 [WARNING|trainer.py:803] 2025-04-26 15:27:49,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:49,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:49,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2264 2225 2255 [WARNING|trainer.py:803] 2025-04-26 15:27:50,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:50,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:50,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2265 2226 2256 [WARNING|trainer.py:803] 2025-04-26 15:27:51,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:51,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2266 [WARNING|trainer.py:803] 2025-04-26 15:27:51,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2227 2257 [WARNING|trainer.py:803] 2025-04-26 15:27:52,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2267 [WARNING|trainer.py:803] 2025-04-26 15:27:53,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:27:53,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2228 2258 [WARNING|trainer.py:803] 2025-04-26 15:27:54,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2268 [WARNING|trainer.py:803] 2025-04-26 15:27:54,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:27:54,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2229 2259 [WARNING|trainer.py:803] 2025-04-26 15:27:55,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2269 [WARNING|trainer.py:803] 2025-04-26 15:27:55,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 15:27:55,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2260 2230 [WARNING|trainer.py:803] 2025-04-26 15:27:56,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2270 [WARNING|trainer.py:803] 2025-04-26 15:27:57,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:27:57,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2261 2231 [WARNING|trainer.py:803] 2025-04-26 15:27:57,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2271 [WARNING|trainer.py:803] 2025-04-26 15:27:58,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:58,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2262 [WARNING|trainer.py:803] 2025-04-26 15:27:59,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2232 2272 [WARNING|trainer.py:803] 2025-04-26 15:27:59,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:27:59,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2263 [WARNING|trainer.py:803] 2025-04-26 15:28:00,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2233 2273 [WARNING|trainer.py:803] 2025-04-26 15:28:01,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:01,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2264 [WARNING|trainer.py:803] 2025-04-26 15:28:01,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2234 2274 [WARNING|trainer.py:803] 2025-04-26 15:28:02,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:02,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2265 [WARNING|trainer.py:803] 2025-04-26 15:28:02,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2235 2275 [WARNING|trainer.py:803] 2025-04-26 15:28:03,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:03,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2266 [WARNING|trainer.py:803] 2025-04-26 15:28:04,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2236 2276 [WARNING|trainer.py:803] 2025-04-26 15:28:04,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:05,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:05,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2267 2277 2237 [WARNING|trainer.py:803] 2025-04-26 15:28:06,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:06,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2268 [WARNING|trainer.py:803] 2025-04-26 15:28:06,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2278 2238 [WARNING|trainer.py:803] 2025-04-26 15:28:07,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:07,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2269 [WARNING|trainer.py:803] 2025-04-26 15:28:07,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2279 2239 [WARNING|trainer.py:803] 2025-04-26 15:28:08,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:09,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2270 [WARNING|trainer.py:803] 2025-04-26 15:28:09,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2280 2240 [WARNING|trainer.py:803] 2025-04-26 15:28:10,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:10,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2271 [WARNING|trainer.py:803] 2025-04-26 15:28:10,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2281 2241 [WARNING|trainer.py:803] 2025-04-26 15:28:11,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:11,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2272 [WARNING|trainer.py:803] 2025-04-26 15:28:11,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2282 2242 [WARNING|trainer.py:803] 2025-04-26 15:28:12,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:12,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2273 [WARNING|trainer.py:803] 2025-04-26 15:28:13,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2283 2243 [WARNING|trainer.py:803] 2025-04-26 15:28:13,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:14,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2274 2284 [WARNING|trainer.py:803] 2025-04-26 15:28:14,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2244 [WARNING|trainer.py:803] 2025-04-26 15:28:15,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:15,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2275 2285 [WARNING|trainer.py:803] 2025-04-26 15:28:15,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2245 [WARNING|trainer.py:803] 2025-04-26 15:28:16,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:16,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2276 2286 [WARNING|trainer.py:803] 2025-04-26 15:28:17,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2246 [WARNING|trainer.py:803] 2025-04-26 15:28:17,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:17,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2277 2287 [WARNING|trainer.py:803] 2025-04-26 15:28:18,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:18,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2247 [WARNING|trainer.py:803] 2025-04-26 15:28:19,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2278 2288 [WARNING|trainer.py:803] 2025-04-26 15:28:19,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:20,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2248 [WARNING|trainer.py:803] 2025-04-26 15:28:20,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2279 2289 [WARNING|trainer.py:803] 2025-04-26 15:28:21,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:28:21,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2249 [WARNING|trainer.py:803] 2025-04-26 15:28:21,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2280 2290 [WARNING|trainer.py:803] 2025-04-26 15:28:22,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:28:22,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2250 [WARNING|trainer.py:803] 2025-04-26 15:28:22,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2281 2291 [WARNING|trainer.py:803] 2025-04-26 15:28:23,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:24,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2251 [WARNING|trainer.py:803] 2025-04-26 15:28:24,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2282 2292 [WARNING|trainer.py:803] 2025-04-26 15:28:24,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:25,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2252 [WARNING|trainer.py:803] 2025-04-26 15:28:25,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2283 2293 [WARNING|trainer.py:803] 2025-04-26 15:28:26,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2253 [WARNING|trainer.py:803] 2025-04-26 15:28:26,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:26,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2284 2294 [WARNING|trainer.py:803] 2025-04-26 15:28:27,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2254 [WARNING|trainer.py:803] 2025-04-26 15:28:27,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:28,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2285 2295 [WARNING|trainer.py:803] 2025-04-26 15:28:28,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2255 [WARNING|trainer.py:803] 2025-04-26 15:28:29,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:29,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2286 2296 [WARNING|trainer.py:803] 2025-04-26 15:28:30,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2256 [WARNING|trainer.py:803] 2025-04-26 15:28:30,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:30,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2287 2297 [WARNING|trainer.py:803] 2025-04-26 15:28:31,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2257 [WARNING|trainer.py:803] 2025-04-26 15:28:31,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:31,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2298 2288 [WARNING|trainer.py:803] 2025-04-26 15:28:32,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2258 [WARNING|trainer.py:803] 2025-04-26 15:28:33,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:33,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2299 2289 [WARNING|trainer.py:803] 2025-04-26 15:28:33,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2259 [WARNING|trainer.py:803] 2025-04-26 15:28:34,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:34,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2300 2290 [WARNING|trainer.py:803] 2025-04-26 15:28:35,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2260 [WARNING|trainer.py:803] 2025-04-26 15:28:35,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:35,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2301 2291 [WARNING|trainer.py:803] 2025-04-26 15:28:36,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2261 [WARNING|trainer.py:803] 2025-04-26 15:28:36,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:36,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2302 2292 [WARNING|trainer.py:803] 2025-04-26 15:28:37,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2262 [WARNING|trainer.py:803] 2025-04-26 15:28:38,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:38,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2303 2293 [WARNING|trainer.py:803] 2025-04-26 15:28:38,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2263 [WARNING|trainer.py:803] 2025-04-26 15:28:39,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:39,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2304 2294 [WARNING|trainer.py:803] 2025-04-26 15:28:40,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:40,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2264 [WARNING|trainer.py:803] 2025-04-26 15:28:40,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2305 2295 [WARNING|trainer.py:803] 2025-04-26 15:28:41,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:41,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2265 [WARNING|trainer.py:803] 2025-04-26 15:28:42,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2306 2296 [WARNING|trainer.py:803] 2025-04-26 15:28:42,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:43,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2266 [WARNING|trainer.py:803] 2025-04-26 15:28:43,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2307 2297 [WARNING|trainer.py:803] 2025-04-26 15:28:44,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:44,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2267 [WARNING|trainer.py:803] 2025-04-26 15:28:44,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2308 2298 [WARNING|trainer.py:803] 2025-04-26 15:28:45,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:45,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2268 [WARNING|trainer.py:803] 2025-04-26 15:28:45,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2309 2299 [WARNING|trainer.py:803] 2025-04-26 15:28:46,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:28:46,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2269 [WARNING|trainer.py:803] 2025-04-26 15:28:47,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2310 2300 [WARNING|trainer.py:803] 2025-04-26 15:28:47,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:48,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2270 [WARNING|trainer.py:803] 2025-04-26 15:28:48,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2311 2301 [WARNING|trainer.py:803] 2025-04-26 15:28:49,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:49,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2271 [WARNING|trainer.py:803] 2025-04-26 15:28:49,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2312 2302 [WARNING|trainer.py:803] 2025-04-26 15:28:50,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:50,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2272 [WARNING|trainer.py:803] 2025-04-26 15:28:51,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2313 2303 [WARNING|trainer.py:803] 2025-04-26 15:28:51,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:52,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2273 [WARNING|trainer.py:803] 2025-04-26 15:28:52,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2314 2304 [WARNING|trainer.py:803] 2025-04-26 15:28:53,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:53,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:53,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2274 2315 2305 [WARNING|trainer.py:803] 2025-04-26 15:28:54,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:54,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:54,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2275 2316 2306 [WARNING|trainer.py:803] 2025-04-26 15:28:55,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:55,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:56,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2276 2317 2307 [WARNING|trainer.py:803] 2025-04-26 15:28:56,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:28:57,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:57,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2277 2318 2308 [WARNING|trainer.py:803] 2025-04-26 15:28:58,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:28:58,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:58,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2278 2319 2309 [WARNING|trainer.py:803] 2025-04-26 15:28:59,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:28:59,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:28:59,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2279 2320 2310 [WARNING|trainer.py:803] 2025-04-26 15:29:00,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:00,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:01,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2321 2280 2311 [WARNING|trainer.py:803] 2025-04-26 15:29:02,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:02,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:02,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2322 2281 2312 [WARNING|trainer.py:803] 2025-04-26 15:29:03,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:03,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:03,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2323 2282 2313 [WARNING|trainer.py:803] 2025-04-26 15:29:04,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:04,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:05,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2324 2283 2314 [WARNING|trainer.py:803] 2025-04-26 15:29:05,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:06,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:06,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2325 2284 2315 [WARNING|trainer.py:803] 2025-04-26 15:29:07,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:07,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:29:07,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2326 2285 2316 [WARNING|trainer.py:803] 2025-04-26 15:29:08,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:08,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:08,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2327 2286 2317 [WARNING|trainer.py:803] 2025-04-26 15:29:09,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:09,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:10,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2328 2287 2318 [WARNING|trainer.py:803] 2025-04-26 15:29:10,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:11,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2329 [WARNING|trainer.py:803] 2025-04-26 15:29:11,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2288 2319 [WARNING|trainer.py:803] 2025-04-26 15:29:12,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:12,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2330 [WARNING|trainer.py:803] 2025-04-26 15:29:12,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2289 2320 [WARNING|trainer.py:803] 2025-04-26 15:29:13,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:13,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2331 [WARNING|trainer.py:803] 2025-04-26 15:29:14,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2290 2321 [WARNING|trainer.py:803] 2025-04-26 15:29:14,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2332 [WARNING|trainer.py:803] 2025-04-26 15:29:15,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:15,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2291 2322 [WARNING|trainer.py:803] 2025-04-26 15:29:15,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2333 [WARNING|trainer.py:803] 2025-04-26 15:29:16,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:16,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2292 2323 [WARNING|trainer.py:803] 2025-04-26 15:29:17,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2334 [WARNING|trainer.py:803] 2025-04-26 15:29:17,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:17,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2293 2324 [WARNING|trainer.py:803] 2025-04-26 15:29:18,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2335 [WARNING|trainer.py:803] 2025-04-26 15:29:19,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:19,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2294 2325 [WARNING|trainer.py:803] 2025-04-26 15:29:19,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2336 [WARNING|trainer.py:803] 2025-04-26 15:29:20,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:29:20,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2295 2326 [WARNING|trainer.py:803] 2025-04-26 15:29:20,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2337 [WARNING|trainer.py:803] 2025-04-26 15:29:21,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:21,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2296 2327 [WARNING|trainer.py:803] 2025-04-26 15:29:22,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2338 [WARNING|trainer.py:803] 2025-04-26 15:29:22,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:22,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2297 [WARNING|trainer.py:803] 2025-04-26 15:29:23,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2328 2339 [WARNING|trainer.py:803] 2025-04-26 15:29:24,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:24,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:24,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2298 2329 2340 [WARNING|trainer.py:803] 2025-04-26 15:29:25,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:25,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:25,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2299 2330 2341 [WARNING|trainer.py:803] 2025-04-26 15:29:26,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:26,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:27,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2300 2331 2342 [WARNING|trainer.py:803] 2025-04-26 15:29:28,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:28,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:28,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2301 2332 2343 [WARNING|trainer.py:803] 2025-04-26 15:29:29,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 15:29:29,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:29,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2302 2333 2344 [WARNING|trainer.py:803] 2025-04-26 15:29:30,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:30,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:30,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2303 2334 2345 [WARNING|trainer.py:803] 2025-04-26 15:29:32,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:32,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:32,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2304 2335 2346 [WARNING|trainer.py:803] 2025-04-26 15:29:33,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:33,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:33,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2305 2336 2347 [WARNING|trainer.py:803] 2025-04-26 15:29:34,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:34,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:34,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2306 2337 2348 [WARNING|trainer.py:803] 2025-04-26 15:29:35,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:35,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:36,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2307 2338 2349 [WARNING|trainer.py:803] 2025-04-26 15:29:37,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:37,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:37,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2339 2308 2350 [WARNING|trainer.py:803] 2025-04-26 15:29:38,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:38,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:38,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2340 2309 2351 [WARNING|trainer.py:803] 2025-04-26 15:29:39,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:39,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:40,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2341 2310 2352 [WARNING|trainer.py:803] 2025-04-26 15:29:41,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:41,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:41,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2342 2311 2353 [WARNING|trainer.py:803] 2025-04-26 15:29:42,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:42,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:42,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2343 2312 2354 [WARNING|trainer.py:803] 2025-04-26 15:29:43,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:43,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:29:43,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2344 2313 2355 [WARNING|trainer.py:803] 2025-04-26 15:29:44,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:45,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:45,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2345 2314 2356 [WARNING|trainer.py:803] 2025-04-26 15:29:46,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:46,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:46,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2346 2315 2357 [WARNING|trainer.py:803] 2025-04-26 15:29:47,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:47,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:47,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2347 2316 2358 [WARNING|trainer.py:803] 2025-04-26 15:29:48,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:48,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:48,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2348 2317 2359 [WARNING|trainer.py:803] 2025-04-26 15:29:50,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:50,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:50,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2349 2318 2360 [WARNING|trainer.py:803] 2025-04-26 15:29:51,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:51,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:51,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2350 2319 2361 [WARNING|trainer.py:803] 2025-04-26 15:29:52,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:52,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:52,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 2351 2362 2320 [WARNING|trainer.py:803] 2025-04-26 15:29:53,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:54,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:54,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2352 2363 2321 [WARNING|trainer.py:803] 2025-04-26 15:29:55,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:55,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:29:55,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2353 2364 2322 [WARNING|trainer.py:803] 2025-04-26 15:29:56,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:56,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:29:56,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2354 2365 2323 [WARNING|trainer.py:803] 2025-04-26 15:29:57,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:57,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:29:58,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2355 2366 2324 [WARNING|trainer.py:803] 2025-04-26 15:29:59,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:59,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:29:59,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2356 2367 2325 [WARNING|trainer.py:803] 2025-04-26 15:30:00,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:30:00,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:00,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2357 2368 2326 [WARNING|trainer.py:803] 2025-04-26 15:30:01,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:01,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:30:02,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2369 2358 2327 [WARNING|trainer.py:803] 2025-04-26 15:30:02,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:03,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:30:03,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2370 2359 2328 [WARNING|trainer.py:803] 2025-04-26 15:30:04,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:30:04,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:30:04,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2371 2360 2329 [WARNING|trainer.py:803] 2025-04-26 15:30:05,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:05,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:05,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2372 2361 2330 [WARNING|trainer.py:803] 2025-04-26 15:30:06,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:06,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:07,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2373 2362 2331 [WARNING|trainer.py:803] 2025-04-26 15:30:08,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:08,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:08,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2374 2363 2332 [WARNING|trainer.py:803] 2025-04-26 15:30:09,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:09,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2375 [WARNING|trainer.py:803] 2025-04-26 15:30:09,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2364 2333 [WARNING|trainer.py:803] 2025-04-26 15:30:10,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:10,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2376 [WARNING|trainer.py:803] 2025-04-26 15:30:11,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2365 2334 [WARNING|trainer.py:803] 2025-04-26 15:30:11,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:12,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2377 [WARNING|trainer.py:803] 2025-04-26 15:30:12,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2366 2335 [WARNING|trainer.py:803] 2025-04-26 15:30:13,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:13,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2378 [WARNING|trainer.py:803] 2025-04-26 15:30:13,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2367 2336 [WARNING|trainer.py:803] 2025-04-26 15:30:14,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:14,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2379 2368 [WARNING|trainer.py:803] 2025-04-26 15:30:15,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2337 [WARNING|trainer.py:803] 2025-04-26 15:30:15,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:15,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2380 2369 [WARNING|trainer.py:803] 2025-04-26 15:30:16,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2338 [WARNING|trainer.py:803] 2025-04-26 15:30:16,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:17,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2381 2370 [WARNING|trainer.py:803] 2025-04-26 15:30:17,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2339 [WARNING|trainer.py:803] 2025-04-26 15:30:18,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:18,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2382 2371 [WARNING|trainer.py:803] 2025-04-26 15:30:18,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:19,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2340 [WARNING|trainer.py:803] 2025-04-26 15:30:19,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2383 2372 [WARNING|trainer.py:803] 2025-04-26 15:30:20,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:20,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2341 [WARNING|trainer.py:803] 2025-04-26 15:30:20,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2384 2373 [WARNING|trainer.py:803] 2025-04-26 15:30:21,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:21,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2342 [WARNING|trainer.py:803] 2025-04-26 15:30:22,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2385 2374 [WARNING|trainer.py:803] 2025-04-26 15:30:22,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:23,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2343 [WARNING|trainer.py:803] 2025-04-26 15:30:23,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2386 2375 [WARNING|trainer.py:803] 2025-04-26 15:30:24,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:24,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2344 [WARNING|trainer.py:803] 2025-04-26 15:30:24,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2387 2376 [WARNING|trainer.py:803] 2025-04-26 15:30:25,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:25,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2345 2388 [WARNING|trainer.py:803] 2025-04-26 15:30:26,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2377 [WARNING|trainer.py:803] 2025-04-26 15:30:26,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:26,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2346 2389 [WARNING|trainer.py:803] 2025-04-26 15:30:27,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2378 [WARNING|trainer.py:803] 2025-04-26 15:30:27,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:28,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2347 2390 [WARNING|trainer.py:803] 2025-04-26 15:30:28,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:29,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2379 [WARNING|trainer.py:803] 2025-04-26 15:30:29,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2348 2391 [WARNING|trainer.py:803] 2025-04-26 15:30:30,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:30,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2380 [WARNING|trainer.py:803] 2025-04-26 15:30:30,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2349 2392 [WARNING|trainer.py:803] 2025-04-26 15:30:31,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:31,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2381 [WARNING|trainer.py:803] 2025-04-26 15:30:31,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2350 2393 [WARNING|trainer.py:803] 2025-04-26 15:30:32,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:32,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2382 [WARNING|trainer.py:803] 2025-04-26 15:30:33,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2351 2394 [WARNING|trainer.py:803] 2025-04-26 15:30:34,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:34,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2383 [WARNING|trainer.py:803] 2025-04-26 15:30:34,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2352 2395 [WARNING|trainer.py:803] 2025-04-26 15:30:35,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:35,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2384 [WARNING|trainer.py:803] 2025-04-26 15:30:35,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2353 2396 [WARNING|trainer.py:803] 2025-04-26 15:30:36,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:36,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2385 [WARNING|trainer.py:803] 2025-04-26 15:30:37,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2354 2397 [WARNING|trainer.py:803] 2025-04-26 15:30:37,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:38,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2386 [WARNING|trainer.py:803] 2025-04-26 15:30:38,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2355 2398 [WARNING|trainer.py:803] 2025-04-26 15:30:39,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:39,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2387 2356 [WARNING|trainer.py:803] 2025-04-26 15:30:39,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2399 [WARNING|trainer.py:803] 2025-04-26 15:30:40,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:40,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2388 2357 [WARNING|trainer.py:803] 2025-04-26 15:30:41,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2400 [WARNING|trainer.py:803] 2025-04-26 15:30:41,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:41,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2389 2358 [WARNING|trainer.py:803] 2025-04-26 15:30:42,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:43,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:43,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2390 2359 2401 [WARNING|trainer.py:803] 2025-04-26 15:30:44,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:44,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:30:44,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2391 2360 [WARNING|trainer.py:803] 2025-04-26 15:30:45,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:45,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2392 2361 2402 [WARNING|trainer.py:803] 2025-04-26 15:30:46,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:46,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:47,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2362 2393 [WARNING|trainer.py:803] 2025-04-26 15:30:48,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:48,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2403 2363 2394 [WARNING|trainer.py:803] 2025-04-26 15:30:49,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:49,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:30:49,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2364 2395 [WARNING|trainer.py:803] 2025-04-26 15:30:50,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:50,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2404 2365 2396 [WARNING|trainer.py:803] 2025-04-26 15:30:51,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:52,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:30:52,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2366 2397 2405 [WARNING|trainer.py:803] 2025-04-26 15:30:53,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:53,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2367 [WARNING|trainer.py:803] 2025-04-26 15:30:54,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2398 [WARNING|trainer.py:803] 2025-04-26 15:30:54,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:54,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2368 2399 2406 [WARNING|trainer.py:803] 2025-04-26 15:30:55,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:30:56,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:30:56,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2369 2400 [WARNING|trainer.py:803] 2025-04-26 15:30:57,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:57,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2370 [WARNING|trainer.py:803] 2025-04-26 15:30:58,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2371 2401 [WARNING|trainer.py:803] 2025-04-26 15:30:59,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:30:59,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2372 [WARNING|trainer.py:803] 2025-04-26 15:31:01,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2407 2373 2402 [WARNING|trainer.py:803] 2025-04-26 15:31:01,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:02,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:02,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2374 2408 [WARNING|trainer.py:803] 2025-04-26 15:31:03,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2403 2375 [WARNING|trainer.py:803] 2025-04-26 15:31:04,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:04,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:04,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2376 2409 [WARNING|trainer.py:803] 2025-04-26 15:31:06,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2404 2377 [WARNING|trainer.py:803] 2025-04-26 15:31:06,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:07,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:07,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2378 2410 2405 [WARNING|trainer.py:803] 2025-04-26 15:31:08,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:08,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2379 [WARNING|trainer.py:803] 2025-04-26 15:31:09,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:10,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2411 2380 [WARNING|trainer.py:803] 2025-04-26 15:31:10,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2406 [WARNING|trainer.py:803] 2025-04-26 15:31:11,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2381 [WARNING|trainer.py:803] 2025-04-26 15:31:11,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:12,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2412 2382 [WARNING|trainer.py:803] 2025-04-26 15:31:13,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 15:31:13,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2383 2413 [WARNING|trainer.py:803] 2025-04-26 15:31:15,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2384 [WARNING|trainer.py:803] 2025-04-26 15:31:15,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:16,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2407 2385 2414 [WARNING|trainer.py:803] 2025-04-26 15:31:17,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:17,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2386 [WARNING|trainer.py:803] 2025-04-26 15:31:18,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2408 [WARNING|trainer.py:803] 2025-04-26 15:31:19,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2387 2415 [WARNING|trainer.py:803] 2025-04-26 15:31:19,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:20,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:20,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2388 2409 [WARNING|trainer.py:803] 2025-04-26 15:31:21,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2389 [WARNING|trainer.py:803] 2025-04-26 15:31:22,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2416 [WARNING|trainer.py:803] 2025-04-26 15:31:22,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:23,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2390 2410 [WARNING|trainer.py:803] 2025-04-26 15:31:24,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:24,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2417 2391 [WARNING|trainer.py:803] 2025-04-26 15:31:25,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:25,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2411 2392 [WARNING|trainer.py:803] 2025-04-26 15:31:26,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2418 [WARNING|trainer.py:803] 2025-04-26 15:31:26,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2393 [WARNING|trainer.py:803] 2025-04-26 15:31:27,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:28,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2412 2394 2419 [WARNING|trainer.py:803] 2025-04-26 15:31:28,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:31:29,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:29,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2395 2413 [WARNING|trainer.py:803] 2025-04-26 15:31:30,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2396 2420 [WARNING|trainer.py:803] 2025-04-26 15:31:31,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:32,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:32,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2397 2414 2421 [WARNING|trainer.py:803] 2025-04-26 15:31:33,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:33,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2398 [WARNING|trainer.py:803] 2025-04-26 15:31:34,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:31:34,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2415 2399 2422 [WARNING|trainer.py:803] 2025-04-26 15:31:36,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:31:36,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2400 [WARNING|trainer.py:803] 2025-04-26 15:31:36,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:31:37,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2416 2423 [WARNING|trainer.py:803] 2025-04-26 15:31:38,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2401 [WARNING|trainer.py:803] 2025-04-26 15:31:38,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:39,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2417 [WARNING|trainer.py:803] 2025-04-26 15:31:41,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2402 2424 [WARNING|trainer.py:803] 2025-04-26 15:31:42,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2418 [WARNING|trainer.py:803] 2025-04-26 15:31:42,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:43,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2425 2403 2419 [WARNING|trainer.py:803] 2025-04-26 15:31:44,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:31:44,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:45,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2404 2426 2420 [WARNING|trainer.py:803] 2025-04-26 15:31:47,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:47,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:31:47,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2427 2405 2421 [WARNING|trainer.py:803] 2025-04-26 15:31:49,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:31:49,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:49,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2406 2428 2422 [WARNING|trainer.py:803] 2025-04-26 15:31:51,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:31:52,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:52,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2429 2423 [WARNING|trainer.py:803] 2025-04-26 15:31:54,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:31:54,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2430 2407 [WARNING|trainer.py:803] 2025-04-26 15:31:56,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2424 [WARNING|trainer.py:803] 2025-04-26 15:31:57,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:31:58,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2431 2408 [WARNING|trainer.py:803] 2025-04-26 15:31:59,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2425 [WARNING|trainer.py:803] 2025-04-26 15:31:59,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:32:00,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2432 2409 [WARNING|trainer.py:803] 2025-04-26 15:32:01,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2426 [WARNING|trainer.py:803] 2025-04-26 15:32:02,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:32:02,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2433 2410 [WARNING|trainer.py:803] 2025-04-26 15:32:03,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2427 [WARNING|trainer.py:803] 2025-04-26 15:32:04,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:04,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2411 2434 [WARNING|trainer.py:803] 2025-04-26 15:32:06,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:06,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2428 [WARNING|trainer.py:803] 2025-04-26 15:32:07,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2435 2412 [WARNING|trainer.py:803] 2025-04-26 15:32:08,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:32:08,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2429 2436 [WARNING|trainer.py:803] 2025-04-26 15:32:10,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2413 [WARNING|trainer.py:803] 2025-04-26 15:32:10,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:32:11,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2430 2437 [WARNING|trainer.py:803] 2025-04-26 15:32:12,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:32:12,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2414 [WARNING|trainer.py:803] 2025-04-26 15:32:13,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2431 2438 [WARNING|trainer.py:803] 2025-04-26 15:32:14,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2415 [WARNING|trainer.py:803] 2025-04-26 15:32:15,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:32:15,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2432 2439 [WARNING|trainer.py:803] 2025-04-26 15:32:17,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:32:17,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2416 2433 [WARNING|trainer.py:803] 2025-04-26 15:32:18,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2440 [WARNING|trainer.py:803] 2025-04-26 15:32:19,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:32:20,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2417 [WARNING|trainer.py:803] 2025-04-26 15:32:20,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2434 2441 [WARNING|trainer.py:803] 2025-04-26 15:32:22,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:32:22,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2418 [WARNING|trainer.py:803] 2025-04-26 15:32:23,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2435 [WARNING|trainer.py:803] 2025-04-26 15:32:24,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2419 [WARNING|trainer.py:803] 2025-04-26 15:32:25,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2436 [WARNING|trainer.py:803] 2025-04-26 15:32:26,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2420 [WARNING|trainer.py:803] 2025-04-26 15:32:27,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2437 2442 [WARNING|trainer.py:803] 2025-04-26 15:32:28,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2421 [WARNING|trainer.py:803] 2025-04-26 15:32:29,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:32:29,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2443 2438 [WARNING|trainer.py:803] 2025-04-26 15:32:31,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:32:31,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2422 [WARNING|trainer.py:803] 2025-04-26 15:32:32,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2439 2444 [WARNING|trainer.py:803] 2025-04-26 15:32:33,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2423 [WARNING|trainer.py:803] 2025-04-26 15:32:34,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 15:32:34,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2445 2440 [WARNING|trainer.py:803] 2025-04-26 15:32:36,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:36,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2424 2441 2446 [WARNING|trainer.py:803] 2025-04-26 15:32:38,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:32:38,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:38,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2425 [WARNING|trainer.py:803] 2025-04-26 15:32:40,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2447 2426 [WARNING|trainer.py:803] 2025-04-26 15:32:42,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:32:42,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2448 2427 2442 [WARNING|trainer.py:803] 2025-04-26 15:32:44,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:32:44,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:32:45,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2443 2428 [WARNING|trainer.py:803] 2025-04-26 15:32:47,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:32:47,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2444 2429 [WARNING|trainer.py:803] 2025-04-26 15:32:50,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 15:32:50,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2449 [WARNING|trainer.py:803] 2025-04-26 15:32:51,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2445 2430 [WARNING|trainer.py:803] 2025-04-26 15:32:52,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:52,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2450 [WARNING|trainer.py:803] 2025-04-26 15:32:54,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2446 2431 [WARNING|trainer.py:803] 2025-04-26 15:32:54,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:32:55,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2451 [WARNING|trainer.py:803] 2025-04-26 15:32:56,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2432 [WARNING|trainer.py:803] 2025-04-26 15:32:57,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2447 2452 [WARNING|trainer.py:803] 2025-04-26 15:32:58,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:32:58,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2433 [WARNING|trainer.py:803] 2025-04-26 15:32:59,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2448 2453 [WARNING|trainer.py:803] 2025-04-26 15:33:00,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:01,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2434 [WARNING|trainer.py:803] 2025-04-26 15:33:02,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2454 [WARNING|trainer.py:803] 2025-04-26 15:33:03,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2435 [WARNING|trainer.py:803] 2025-04-26 15:33:04,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2455 [WARNING|trainer.py:803] 2025-04-26 15:33:05,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2436 [WARNING|trainer.py:803] 2025-04-26 15:33:06,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2449 [WARNING|trainer.py:803] 2025-04-26 15:33:07,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2437 [WARNING|trainer.py:803] 2025-04-26 15:33:08,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2450 2456 [WARNING|trainer.py:803] 2025-04-26 15:33:10,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:10,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2438 [WARNING|trainer.py:803] 2025-04-26 15:33:11,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2457 2451 [WARNING|trainer.py:803] 2025-04-26 15:33:12,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2439 [WARNING|trainer.py:803] 2025-04-26 15:33:12,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:13,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2458 2452 [WARNING|trainer.py:803] 2025-04-26 15:33:14,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:33:15,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2440 [WARNING|trainer.py:803] 2025-04-26 15:33:16,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2459 2453 [WARNING|trainer.py:803] 2025-04-26 15:33:17,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2441 [WARNING|trainer.py:803] 2025-04-26 15:33:17,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:33:18,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2454 2460 [WARNING|trainer.py:803] 2025-04-26 15:33:19,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:19,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2461 2455 [WARNING|trainer.py:803] 2025-04-26 15:33:21,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:22,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2462 2442 [WARNING|trainer.py:803] 2025-04-26 15:33:24,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:25,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2456 2443 [WARNING|trainer.py:803] 2025-04-26 15:33:26,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:27,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2457 2463 [WARNING|trainer.py:803] 2025-04-26 15:33:29,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2444 [WARNING|trainer.py:803] 2025-04-26 15:33:29,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:30,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 2458 [WARNING|trainer.py:803] 2025-04-26 15:33:31,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2445 [WARNING|trainer.py:803] 2025-04-26 15:33:32,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2459 [WARNING|trainer.py:803] 2025-04-26 15:33:33,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2446 2464 [WARNING|trainer.py:803] 2025-04-26 15:33:34,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:34,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2460 [WARNING|trainer.py:803] 2025-04-26 15:33:36,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2461 2447 [WARNING|trainer.py:803] 2025-04-26 15:33:38,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:38,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2448 2465 2462 [WARNING|trainer.py:803] 2025-04-26 15:33:40,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:33:40,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:41,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2466 2463 [WARNING|trainer.py:803] 2025-04-26 15:33:46,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:33:46,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2449 [WARNING|trainer.py:803] 2025-04-26 15:33:47,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2450 [WARNING|trainer.py:803] 2025-04-26 15:33:50,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2464 2467 [WARNING|trainer.py:803] 2025-04-26 15:33:51,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:51,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2451 [WARNING|trainer.py:803] 2025-04-26 15:33:52,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2452 [WARNING|trainer.py:803] 2025-04-26 15:33:55,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2453 2465 2468 [WARNING|trainer.py:803] 2025-04-26 15:33:57,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:33:57,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:33:57,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2454 [WARNING|trainer.py:803] 2025-04-26 15:33:59,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2455 [WARNING|trainer.py:803] 2025-04-26 15:34:02,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2469 2466 [WARNING|trainer.py:803] 2025-04-26 15:34:03,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:03,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2456 [WARNING|trainer.py:803] 2025-04-26 15:34:07,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2470 2467 2457 [WARNING|trainer.py:803] 2025-04-26 15:34:08,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:08,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:34:09,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2458 [WARNING|trainer.py:803] 2025-04-26 15:34:11,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2459 2471 [WARNING|trainer.py:803] 2025-04-26 15:34:14,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2468 [WARNING|trainer.py:803] 2025-04-26 15:34:14,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:34:15,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2460 2472 [WARNING|trainer.py:803] 2025-04-26 15:34:16,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:17,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2461 2473 [WARNING|trainer.py:803] 2025-04-26 15:34:18,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:34:19,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2469 [WARNING|trainer.py:803] 2025-04-26 15:34:20,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2474 2462 [WARNING|trainer.py:803] 2025-04-26 15:34:21,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:21,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2475 2470 [WARNING|trainer.py:803] 2025-04-26 15:34:25,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:25,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2463 2476 [WARNING|trainer.py:803] 2025-04-26 15:34:26,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:34:27,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2477 [WARNING|trainer.py:803] 2025-04-26 15:34:29,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2471 2478 2464 [WARNING|trainer.py:803] 2025-04-26 15:34:31,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:34:31,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:31,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2472 2479 [WARNING|trainer.py:803] 2025-04-26 15:34:33,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:33,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2473 2480 [WARNING|trainer.py:803] 2025-04-26 15:34:35,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:34:36,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2474 2465 2481 [WARNING|trainer.py:803] 2025-04-26 15:34:37,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:38,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:38,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2482 [WARNING|trainer.py:803] 2025-04-26 15:34:40,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2475 [WARNING|trainer.py:803] 2025-04-26 15:34:41,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2466 2476 2483 [WARNING|trainer.py:803] 2025-04-26 15:34:43,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:43,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:43,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2477 [WARNING|trainer.py:803] 2025-04-26 15:34:45,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2484 [WARNING|trainer.py:803] 2025-04-26 15:34:47,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2478 2467 [WARNING|trainer.py:803] 2025-04-26 15:34:48,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2485 [WARNING|trainer.py:803] 2025-04-26 15:34:48,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:34:49,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2479 [WARNING|trainer.py:803] 2025-04-26 15:34:50,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2486 2480 [WARNING|trainer.py:803] 2025-04-26 15:34:51,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:34:52,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2487 2468 2481 [WARNING|trainer.py:803] 2025-04-26 15:34:54,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:54,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:34:54,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2488 2482 [WARNING|trainer.py:803] 2025-04-26 15:34:56,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:34:56,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2489 [WARNING|trainer.py:803] 2025-04-26 15:34:58,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2469 2483 2490 [WARNING|trainer.py:803] 2025-04-26 15:35:00,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:00,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:00,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2491 2484 [WARNING|trainer.py:803] 2025-04-26 15:35:03,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:35:03,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2470 2492 2485 [WARNING|trainer.py:803] 2025-04-26 15:35:05,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:05,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:35:06,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2493 2486 [WARNING|trainer.py:803] 2025-04-26 15:35:08,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:08,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2494 [WARNING|trainer.py:803] 2025-04-26 15:35:10,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2471 2487 [WARNING|trainer.py:803] 2025-04-26 15:35:11,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:35:11,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2488 2495 2472 [WARNING|trainer.py:803] 2025-04-26 15:35:13,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:13,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:13,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2489 2496 2473 [WARNING|trainer.py:803] 2025-04-26 15:35:15,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:15,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:15,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2490 2497 2474 [WARNING|trainer.py:803] 2025-04-26 15:35:17,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:35:17,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:17,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2491 2498 [WARNING|trainer.py:803] 2025-04-26 15:35:20,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:35:20,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2475 2499 [WARNING|trainer.py:803] 2025-04-26 15:35:21,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2492 [WARNING|trainer.py:803] 2025-04-26 15:35:22,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2476 [WARNING|trainer.py:803] 2025-04-26 15:35:22,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:23,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2493 2477 [WARNING|trainer.py:803] 2025-04-26 15:35:25,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:25,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2494 2500 2478 [WARNING|trainer.py:803] 2025-04-26 15:35:27,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:35:27,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:35:28,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2501 2479 2495 [WARNING|trainer.py:803] 2025-04-26 15:35:30,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:30,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:35:30,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2496 2480 2502 [WARNING|trainer.py:803] 2025-04-26 15:35:32,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:32,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:35:32,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2497 2481 2503 [WARNING|trainer.py:803] 2025-04-26 15:35:34,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:34,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:34,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2482 2498 [WARNING|trainer.py:803] 2025-04-26 15:35:36,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2504 [WARNING|trainer.py:803] 2025-04-26 15:35:37,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:35:37,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2499 2505 [WARNING|trainer.py:803] 2025-04-26 15:35:39,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2483 [WARNING|trainer.py:803] 2025-04-26 15:35:40,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:40,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2506 [WARNING|trainer.py:803] 2025-04-26 15:35:42,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2484 [WARNING|trainer.py:803] 2025-04-26 15:35:43,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2500 2507 [WARNING|trainer.py:803] 2025-04-26 15:35:44,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:35:45,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2485 [WARNING|trainer.py:803] 2025-04-26 15:35:45,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2508 2501 [WARNING|trainer.py:803] 2025-04-26 15:35:47,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:47,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2486 2509 [WARNING|trainer.py:803] 2025-04-26 15:35:48,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:49,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2502 [WARNING|trainer.py:803] 2025-04-26 15:35:50,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2487 2510 [WARNING|trainer.py:803] 2025-04-26 15:35:51,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2503 [WARNING|trainer.py:803] 2025-04-26 15:35:51,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:35:52,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2488 2511 [WARNING|trainer.py:803] 2025-04-26 15:35:53,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:53,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2489 2504 2512 [WARNING|trainer.py:803] 2025-04-26 15:35:55,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:35:55,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:35:55,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2490 2505 [WARNING|trainer.py:803] 2025-04-26 15:35:57,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2513 [WARNING|trainer.py:803] 2025-04-26 15:35:57,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:58,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2506 2491 2514 [WARNING|trainer.py:803] 2025-04-26 15:35:59,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:35:59,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:36:00,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2507 2492 2515 [WARNING|trainer.py:803] 2025-04-26 15:36:02,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:36:02,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:03,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2508 2493 2516 [WARNING|trainer.py:803] 2025-04-26 15:36:04,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:04,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:05,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2509 2494 [WARNING|trainer.py:803] 2025-04-26 15:36:06,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:07,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2517 [WARNING|trainer.py:803] 2025-04-26 15:36:08,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2510 [WARNING|trainer.py:803] 2025-04-26 15:36:09,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2495 [WARNING|trainer.py:803] 2025-04-26 15:36:10,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2518 2511 [WARNING|trainer.py:803] 2025-04-26 15:36:11,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2496 [WARNING|trainer.py:803] 2025-04-26 15:36:11,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:36:12,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2512 2519 2497 [WARNING|trainer.py:803] 2025-04-26 15:36:13,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:36:13,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:36:14,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2513 2520 [WARNING|trainer.py:803] 2025-04-26 15:36:16,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2498 [WARNING|trainer.py:803] 2025-04-26 15:36:16,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:17,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2514 2499 [WARNING|trainer.py:803] 2025-04-26 15:36:18,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2521 [WARNING|trainer.py:803] 2025-04-26 15:36:19,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:19,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2515 [WARNING|trainer.py:803] 2025-04-26 15:36:21,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2516 [WARNING|trainer.py:803] 2025-04-26 15:36:23,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2522 2500 [WARNING|trainer.py:803] 2025-04-26 15:36:25,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:36:25,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2517 [WARNING|trainer.py:803] 2025-04-26 15:36:26,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2501 [WARNING|trainer.py:803] 2025-04-26 15:36:27,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2518 2502 [WARNING|trainer.py:803] 2025-04-26 15:36:29,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2523 [WARNING|trainer.py:803] 2025-04-26 15:36:30,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:36:30,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2519 2503 [WARNING|trainer.py:803] 2025-04-26 15:36:32,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:36:32,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2520 2504 [WARNING|trainer.py:803] 2025-04-26 15:36:34,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:35,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2524 [WARNING|trainer.py:803] 2025-04-26 15:36:36,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2521 2505 [WARNING|trainer.py:803] 2025-04-26 15:36:37,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:36:37,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2506 [WARNING|trainer.py:803] 2025-04-26 15:36:40,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2507 2522 2525 [WARNING|trainer.py:803] 2025-04-26 15:36:42,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:36:43,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:43,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2508 [WARNING|trainer.py:803] 2025-04-26 15:36:45,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2509 [WARNING|trainer.py:803] 2025-04-26 15:36:47,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2523 2526 2510 [WARNING|trainer.py:803] 2025-04-26 15:36:48,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:36:49,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:36:49,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2511 [WARNING|trainer.py:803] 2025-04-26 15:36:51,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2512 [WARNING|trainer.py:803] 2025-04-26 15:36:54,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2524 [WARNING|trainer.py:803] 2025-04-26 15:36:55,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2513 [WARNING|trainer.py:803] 2025-04-26 15:36:56,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2514 2527 [WARNING|trainer.py:803] 2025-04-26 15:36:58,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:36:59,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2515 2525 [WARNING|trainer.py:803] 2025-04-26 15:37:01,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:01,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2516 2528 [WARNING|trainer.py:803] 2025-04-26 15:37:03,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:37:04,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2517 [WARNING|trainer.py:803] 2025-04-26 15:37:06,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2526 [WARNING|trainer.py:803] 2025-04-26 15:37:08,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2529 2518 [WARNING|trainer.py:803] 2025-04-26 15:37:09,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:09,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2519 [WARNING|trainer.py:803] 2025-04-26 15:37:12,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2530 2520 [WARNING|trainer.py:803] 2025-04-26 15:37:14,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:15,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2527 2521 2531 [WARNING|trainer.py:803] 2025-04-26 15:37:17,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:18,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:37:18,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2532 [WARNING|trainer.py:803] 2025-04-26 15:37:21,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2528 2522 2533 [WARNING|trainer.py:803] 2025-04-26 15:37:23,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:23,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:37:23,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2534 [WARNING|trainer.py:803] 2025-04-26 15:37:26,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2529 [WARNING|trainer.py:803] 2025-04-26 15:37:28,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2535 2523 [WARNING|trainer.py:803] 2025-04-26 15:37:29,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:37:29,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2536 [WARNING|trainer.py:803] 2025-04-26 15:37:31,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2530 2537 [WARNING|trainer.py:803] 2025-04-26 15:37:33,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:37:33,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2524 2538 [WARNING|trainer.py:803] 2025-04-26 15:37:35,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:37:35,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2531 [WARNING|trainer.py:803] 2025-04-26 15:37:37,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2539 [WARNING|trainer.py:803] 2025-04-26 15:37:38,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2532 [WARNING|trainer.py:803] 2025-04-26 15:37:40,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2540 [WARNING|trainer.py:803] 2025-04-26 15:37:41,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2525 2533 [WARNING|trainer.py:803] 2025-04-26 15:37:42,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2541 [WARNING|trainer.py:803] 2025-04-26 15:37:43,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:37:43,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2534 2542 [WARNING|trainer.py:803] 2025-04-26 15:37:45,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:37:45,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2535 2543 2526 [WARNING|trainer.py:803] 2025-04-26 15:37:48,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:37:48,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:37:48,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2536 2544 [WARNING|trainer.py:803] 2025-04-26 15:37:50,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:37:51,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2537 [WARNING|trainer.py:803] 2025-04-26 15:37:52,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2538 2545 [WARNING|trainer.py:803] 2025-04-26 15:37:54,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:37:54,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2546 2539 [WARNING|trainer.py:803] 2025-04-26 15:37:57,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2527 [WARNING|trainer.py:803] 2025-04-26 15:37:57,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:37:58,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2547 2540 [WARNING|trainer.py:803] 2025-04-26 15:37:59,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:00,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2548 2541 [WARNING|trainer.py:803] 2025-04-26 15:38:01,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:02,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2528 2549 [WARNING|trainer.py:803] 2025-04-26 15:38:03,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2542 [WARNING|trainer.py:803] 2025-04-26 15:38:04,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:04,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2550 2543 [WARNING|trainer.py:803] 2025-04-26 15:38:06,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2529 [WARNING|trainer.py:803] 2025-04-26 15:38:07,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2551 [WARNING|trainer.py:803] 2025-04-26 15:38:08,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:09,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2544 [WARNING|trainer.py:803] 2025-04-26 15:38:10,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2552 [WARNING|trainer.py:803] 2025-04-26 15:38:11,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2530 2545 2553 [WARNING|trainer.py:803] 2025-04-26 15:38:13,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:14,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:14,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2546 [WARNING|trainer.py:803] 2025-04-26 15:38:16,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2554 2531 [WARNING|trainer.py:803] 2025-04-26 15:38:17,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2547 [WARNING|trainer.py:803] 2025-04-26 15:38:17,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2555 [WARNING|trainer.py:803] 2025-04-26 15:38:18,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:19,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2532 2548 [WARNING|trainer.py:803] 2025-04-26 15:38:20,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2556 [WARNING|trainer.py:803] 2025-04-26 15:38:21,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:21,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2549 2533 2557 [WARNING|trainer.py:803] 2025-04-26 15:38:23,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:23,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:23,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2534 2550 [WARNING|trainer.py:803] 2025-04-26 15:38:25,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:26,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2558 2551 2535 [WARNING|trainer.py:803] 2025-04-26 15:38:28,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:28,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:28,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2559 2536 2552 [WARNING|trainer.py:803] 2025-04-26 15:38:30,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:30,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:38:30,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2560 2537 2553 [WARNING|trainer.py:803] 2025-04-26 15:38:33,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:38:33,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:38:33,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2538 2561 [WARNING|trainer.py:803] 2025-04-26 15:38:35,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2554 [WARNING|trainer.py:803] 2025-04-26 15:38:35,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:36,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2562 2539 2555 [WARNING|trainer.py:803] 2025-04-26 15:38:37,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:38:38,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:38,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2563 2540 2556 [WARNING|trainer.py:803] 2025-04-26 15:38:40,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:40,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:40,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2541 2557 [WARNING|trainer.py:803] 2025-04-26 15:38:42,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:43,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2564 [WARNING|trainer.py:803] 2025-04-26 15:38:44,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2542 [WARNING|trainer.py:803] 2025-04-26 15:38:45,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2565 [WARNING|trainer.py:803] 2025-04-26 15:38:46,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2558 2543 [WARNING|trainer.py:803] 2025-04-26 15:38:47,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:48,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2566 2559 [WARNING|trainer.py:803] 2025-04-26 15:38:49,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:49,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2544 [WARNING|trainer.py:803] 2025-04-26 15:38:50,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2560 2567 [WARNING|trainer.py:803] 2025-04-26 15:38:52,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:38:53,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2545 2561 [WARNING|trainer.py:803] 2025-04-26 15:38:54,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:38:54,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2568 2546 2562 [WARNING|trainer.py:803] 2025-04-26 15:38:56,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:38:56,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:38:57,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2569 2547 2563 [WARNING|trainer.py:803] 2025-04-26 15:38:58,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:59,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:38:59,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2570 [WARNING|trainer.py:803] 2025-04-26 15:39:00,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2548 [WARNING|trainer.py:803] 2025-04-26 15:39:01,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2571 2564 [WARNING|trainer.py:803] 2025-04-26 15:39:03,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2549 [WARNING|trainer.py:803] 2025-04-26 15:39:03,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:03,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2572 2565 [WARNING|trainer.py:803] 2025-04-26 15:39:05,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:05,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2550 2573 [WARNING|trainer.py:803] 2025-04-26 15:39:06,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:07,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2566 2551 [WARNING|trainer.py:803] 2025-04-26 15:39:08,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2574 [WARNING|trainer.py:803] 2025-04-26 15:39:09,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:09,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2552 2575 2567 [WARNING|trainer.py:803] 2025-04-26 15:39:11,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:39:11,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:39:12,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2553 2576 [WARNING|trainer.py:803] 2025-04-26 15:39:14,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:14,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2568 [WARNING|trainer.py:803] 2025-04-26 15:39:15,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2577 2554 [WARNING|trainer.py:803] 2025-04-26 15:39:16,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2569 [WARNING|trainer.py:803] 2025-04-26 15:39:17,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:18,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2555 2578 2570 [WARNING|trainer.py:803] 2025-04-26 15:39:19,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:19,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:20,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2556 2571 2579 [WARNING|trainer.py:803] 2025-04-26 15:39:21,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:22,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:22,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2557 2572 [WARNING|trainer.py:803] 2025-04-26 15:39:24,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:39:24,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2580 [WARNING|trainer.py:803] 2025-04-26 15:39:25,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2573 [WARNING|trainer.py:803] 2025-04-26 15:39:26,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2581 2558 2574 [WARNING|trainer.py:803] 2025-04-26 15:39:27,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:28,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:28,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2582 2559 2575 [WARNING|trainer.py:803] 2025-04-26 15:39:30,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:30,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:31,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2583 2560 2576 [WARNING|trainer.py:803] 2025-04-26 15:39:33,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:33,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:39:33,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2584 2577 2561 [WARNING|trainer.py:803] 2025-04-26 15:39:35,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:36,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:39:36,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2562 2585 2578 [WARNING|trainer.py:803] 2025-04-26 15:39:38,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:38,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:39:38,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2586 2563 [WARNING|trainer.py:803] 2025-04-26 15:39:40,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:40,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2579 [WARNING|trainer.py:803] 2025-04-26 15:39:41,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2587 [WARNING|trainer.py:803] 2025-04-26 15:39:43,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2580 2564 [WARNING|trainer.py:803] 2025-04-26 15:39:44,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:39:44,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2588 2565 2581 [WARNING|trainer.py:803] 2025-04-26 15:39:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:46,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:39:46,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2589 2582 [WARNING|trainer.py:803] 2025-04-26 15:39:48,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2566 [WARNING|trainer.py:803] 2025-04-26 15:39:49,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:49,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2590 2583 [WARNING|trainer.py:803] 2025-04-26 15:39:51,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:39:52,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2567 2591 [WARNING|trainer.py:803] 2025-04-26 15:39:53,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2584 [WARNING|trainer.py:803] 2025-04-26 15:39:54,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:39:54,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2592 2568 2585 [WARNING|trainer.py:803] 2025-04-26 15:39:56,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:56,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:39:57,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2569 2586 2593 [WARNING|trainer.py:803] 2025-04-26 15:39:59,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:39:59,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:39:59,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2570 2594 2587 [WARNING|trainer.py:803] 2025-04-26 15:40:01,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:02,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:02,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2571 2595 [WARNING|trainer.py:803] 2025-04-26 15:40:03,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:40:04,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2588 2572 [WARNING|trainer.py:803] 2025-04-26 15:40:05,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:40:05,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2596 [WARNING|trainer.py:803] 2025-04-26 15:40:06,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2589 2573 [WARNING|trainer.py:803] 2025-04-26 15:40:07,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:40:07,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2597 2574 [WARNING|trainer.py:803] 2025-04-26 15:40:09,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:10,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2590 [WARNING|trainer.py:803] 2025-04-26 15:40:10,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2598 2575 [WARNING|trainer.py:803] 2025-04-26 15:40:11,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:40:12,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2591 2599 [WARNING|trainer.py:803] 2025-04-26 15:40:13,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:40:13,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2576 2592 [WARNING|trainer.py:803] 2025-04-26 15:40:15,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2600 [WARNING|trainer.py:803] 2025-04-26 15:40:16,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:16,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2577 [WARNING|trainer.py:803] 2025-04-26 15:40:17,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2593 2601 [WARNING|trainer.py:803] 2025-04-26 15:40:18,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:19,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2578 [WARNING|trainer.py:803] 2025-04-26 15:40:20,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2594 2602 [WARNING|trainer.py:803] 2025-04-26 15:40:21,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:21,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2579 2595 [WARNING|trainer.py:803] 2025-04-26 15:40:22,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:23,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2603 [WARNING|trainer.py:803] 2025-04-26 15:40:24,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2596 2580 2604 [WARNING|trainer.py:803] 2025-04-26 15:40:25,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:26,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:26,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2581 2597 [WARNING|trainer.py:803] 2025-04-26 15:40:28,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2605 [WARNING|trainer.py:803] 2025-04-26 15:40:28,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:29,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2582 2598 2606 [WARNING|trainer.py:803] 2025-04-26 15:40:30,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:40:30,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:40:31,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2599 2607 [WARNING|trainer.py:803] 2025-04-26 15:40:32,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2583 [WARNING|trainer.py:803] 2025-04-26 15:40:33,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:34,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2600 2608 [WARNING|trainer.py:803] 2025-04-26 15:40:35,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2584 [WARNING|trainer.py:803] 2025-04-26 15:40:35,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:36,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2609 2601 [WARNING|trainer.py:803] 2025-04-26 15:40:38,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2585 [WARNING|trainer.py:803] 2025-04-26 15:40:38,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:39,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2602 2586 [WARNING|trainer.py:803] 2025-04-26 15:40:40,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:41,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2610 [WARNING|trainer.py:803] 2025-04-26 15:40:42,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2603 2587 [WARNING|trainer.py:803] 2025-04-26 15:40:43,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:43,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2604 [WARNING|trainer.py:803] 2025-04-26 15:40:45,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2588 [WARNING|trainer.py:803] 2025-04-26 15:40:46,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2611 2605 [WARNING|trainer.py:803] 2025-04-26 15:40:48,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:48,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2589 [WARNING|trainer.py:803] 2025-04-26 15:40:49,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2612 2606 [WARNING|trainer.py:803] 2025-04-26 15:40:50,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:40:50,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2590 2613 2607 [WARNING|trainer.py:803] 2025-04-26 15:40:52,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:40:52,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:52,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2614 2591 2608 [WARNING|trainer.py:803] 2025-04-26 15:40:54,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:40:55,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:40:55,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2615 2609 2592 [WARNING|trainer.py:803] 2025-04-26 15:40:56,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:40:57,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:40:57,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2616 [WARNING|trainer.py:803] 2025-04-26 15:40:59,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2593 2617 2610 [WARNING|trainer.py:803] 2025-04-26 15:41:00,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:01,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:01,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2594 [WARNING|trainer.py:803] 2025-04-26 15:41:03,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2618 [WARNING|trainer.py:803] 2025-04-26 15:41:04,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2595 [WARNING|trainer.py:803] 2025-04-26 15:41:05,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2619 2611 [WARNING|trainer.py:803] 2025-04-26 15:41:06,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:06,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2596 [WARNING|trainer.py:803] 2025-04-26 15:41:07,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2620 2612 [WARNING|trainer.py:803] 2025-04-26 15:41:09,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:09,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2597 2621 2613 [WARNING|trainer.py:803] 2025-04-26 15:41:10,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:11,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:11,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2598 2622 [WARNING|trainer.py:803] 2025-04-26 15:41:12,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2614 [WARNING|trainer.py:803] 2025-04-26 15:41:13,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:41:13,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2599 [WARNING|trainer.py:803] 2025-04-26 15:41:14,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2615 2623 [WARNING|trainer.py:803] 2025-04-26 15:41:15,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:41:16,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2600 2616 [WARNING|trainer.py:803] 2025-04-26 15:41:17,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:17,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2624 2617 2601 [WARNING|trainer.py:803] 2025-04-26 15:41:19,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:20,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:20,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2625 2602 2618 [WARNING|trainer.py:803] 2025-04-26 15:41:22,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:22,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:22,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2619 2626 2603 [WARNING|trainer.py:803] 2025-04-26 15:41:25,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:25,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:25,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2627 2604 2620 [WARNING|trainer.py:803] 2025-04-26 15:41:27,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:27,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:27,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2628 2621 2605 [WARNING|trainer.py:803] 2025-04-26 15:41:29,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:29,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:30,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2629 2622 2606 [WARNING|trainer.py:803] 2025-04-26 15:41:32,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:32,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:41:32,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2630 2607 2623 [WARNING|trainer.py:803] 2025-04-26 15:41:34,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:34,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:35,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2608 2631 [WARNING|trainer.py:803] 2025-04-26 15:41:37,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:41:37,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2624 2609 [WARNING|trainer.py:803] 2025-04-26 15:41:38,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2632 [WARNING|trainer.py:803] 2025-04-26 15:41:39,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:39,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2625 [WARNING|trainer.py:803] 2025-04-26 15:41:40,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2633 [WARNING|trainer.py:803] 2025-04-26 15:41:42,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2626 2610 2634 [WARNING|trainer.py:803] 2025-04-26 15:41:43,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:43,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:44,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2627 2635 [WARNING|trainer.py:803] 2025-04-26 15:41:46,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:46,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2628 2636 [WARNING|trainer.py:803] 2025-04-26 15:41:48,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2611 [WARNING|trainer.py:803] 2025-04-26 15:41:49,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:49,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2629 2637 [WARNING|trainer.py:803] 2025-04-26 15:41:50,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:41:51,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2612 [WARNING|trainer.py:803] 2025-04-26 15:41:52,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2630 2638 [WARNING|trainer.py:803] 2025-04-26 15:41:53,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:53,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2613 [WARNING|trainer.py:803] 2025-04-26 15:41:54,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2639 [WARNING|trainer.py:803] 2025-04-26 15:41:55,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2631 2614 [WARNING|trainer.py:803] 2025-04-26 15:41:56,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2640 [WARNING|trainer.py:803] 2025-04-26 15:41:56,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:41:57,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2632 2615 [WARNING|trainer.py:803] 2025-04-26 15:41:58,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:41:58,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2641 [WARNING|trainer.py:803] 2025-04-26 15:41:59,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2633 2616 [WARNING|trainer.py:803] 2025-04-26 15:42:00,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:00,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2642 [WARNING|trainer.py:803] 2025-04-26 15:42:02,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2634 2617 [WARNING|trainer.py:803] 2025-04-26 15:42:02,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:03,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2635 2618 [WARNING|trainer.py:803] 2025-04-26 15:42:05,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:05,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2643 [WARNING|trainer.py:803] 2025-04-26 15:42:06,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2636 2619 [WARNING|trainer.py:803] 2025-04-26 15:42:07,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:42:07,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2637 2620 [WARNING|trainer.py:803] 2025-04-26 15:42:09,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:42:10,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2644 2638 [WARNING|trainer.py:803] 2025-04-26 15:42:11,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2621 [WARNING|trainer.py:803] 2025-04-26 15:42:11,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:12,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2639 [WARNING|trainer.py:803] 2025-04-26 15:42:13,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2622 [WARNING|trainer.py:803] 2025-04-26 15:42:14,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2640 2645 [WARNING|trainer.py:803] 2025-04-26 15:42:16,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:16,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2623 2641 [WARNING|trainer.py:803] 2025-04-26 15:42:18,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:42:18,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2642 2646 2624 [WARNING|trainer.py:803] 2025-04-26 15:42:20,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:21,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:42:21,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2625 [WARNING|trainer.py:803] 2025-04-26 15:42:23,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2647 2643 [WARNING|trainer.py:803] 2025-04-26 15:42:25,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:42:25,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2626 [WARNING|trainer.py:803] 2025-04-26 15:42:26,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2627 2648 [WARNING|trainer.py:803] 2025-04-26 15:42:29,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2644 [WARNING|trainer.py:803] 2025-04-26 15:42:29,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2628 [WARNING|trainer.py:803] 2025-04-26 15:42:30,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:31,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2629 2649 [WARNING|trainer.py:803] 2025-04-26 15:42:33,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:34,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2645 2630 [WARNING|trainer.py:803] 2025-04-26 15:42:35,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:35,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2631 2650 2646 [WARNING|trainer.py:803] 2025-04-26 15:42:39,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:39,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:40,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2632 2651 [WARNING|trainer.py:803] 2025-04-26 15:42:41,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:41,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2633 2652 [WARNING|trainer.py:803] 2025-04-26 15:42:43,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2647 [WARNING|trainer.py:803] 2025-04-26 15:42:43,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:44,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2653 2634 [WARNING|trainer.py:803] 2025-04-26 15:42:45,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:46,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2654 2635 2648 [WARNING|trainer.py:803] 2025-04-26 15:42:48,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:42:48,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:42:48,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2655 2636 [WARNING|trainer.py:803] 2025-04-26 15:42:50,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:42:50,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2656 2649 2637 [WARNING|trainer.py:803] 2025-04-26 15:42:52,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:42:52,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:42:52,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2657 2638 [WARNING|trainer.py:803] 2025-04-26 15:42:54,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:42:55,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2639 2658 [WARNING|trainer.py:803] 2025-04-26 15:42:57,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:57,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2650 [WARNING|trainer.py:803] 2025-04-26 15:42:58,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2640 2659 [WARNING|trainer.py:803] 2025-04-26 15:42:59,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:42:59,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2651 [WARNING|trainer.py:803] 2025-04-26 15:43:00,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2660 2641 [WARNING|trainer.py:803] 2025-04-26 15:43:01,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:43:01,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2652 [WARNING|trainer.py:803] 2025-04-26 15:43:02,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2661 2642 [WARNING|trainer.py:803] 2025-04-26 15:43:03,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2653 [WARNING|trainer.py:803] 2025-04-26 15:43:04,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:43:04,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2662 [WARNING|trainer.py:803] 2025-04-26 15:43:05,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2654 [WARNING|trainer.py:803] 2025-04-26 15:43:06,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2643 2663 2655 [WARNING|trainer.py:803] 2025-04-26 15:43:08,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:43:08,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:43:09,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2656 2664 [WARNING|trainer.py:803] 2025-04-26 15:43:11,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:43:11,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2657 2644 [WARNING|trainer.py:803] 2025-04-26 15:43:13,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2665 [WARNING|trainer.py:803] 2025-04-26 15:43:13,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:14,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2658 [WARNING|trainer.py:803] 2025-04-26 15:43:16,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2659 2645 [WARNING|trainer.py:803] 2025-04-26 15:43:18,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2666 [WARNING|trainer.py:803] 2025-04-26 15:43:18,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:19,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2660 [WARNING|trainer.py:803] 2025-04-26 15:43:20,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2667 [WARNING|trainer.py:803] 2025-04-26 15:43:21,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2661 [WARNING|trainer.py:803] 2025-04-26 15:43:22,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2646 2668 [WARNING|trainer.py:803] 2025-04-26 15:43:23,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:43:23,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2662 [WARNING|trainer.py:803] 2025-04-26 15:43:24,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2669 [WARNING|trainer.py:803] 2025-04-26 15:43:26,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2663 2647 2670 [WARNING|trainer.py:803] 2025-04-26 15:43:27,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:43:28,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:43:28,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2671 2664 [WARNING|trainer.py:803] 2025-04-26 15:43:30,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:43:30,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2648 2665 [WARNING|trainer.py:803] 2025-04-26 15:43:32,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2672 [WARNING|trainer.py:803] 2025-04-26 15:43:33,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:43:33,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2673 [WARNING|trainer.py:803] 2025-04-26 15:43:35,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2649 [WARNING|trainer.py:803] 2025-04-26 15:43:36,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2666 2674 [WARNING|trainer.py:803] 2025-04-26 15:43:38,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:43:39,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2667 [WARNING|trainer.py:803] 2025-04-26 15:43:40,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2650 2668 [WARNING|trainer.py:803] 2025-04-26 15:43:42,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:43:42,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2651 2669 [WARNING|trainer.py:803] 2025-04-26 15:43:44,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:43:45,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2652 2670 [WARNING|trainer.py:803] 2025-04-26 15:43:46,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:43:47,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2653 2675 2671 [WARNING|trainer.py:803] 2025-04-26 15:43:48,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:49,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:49,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2654 2676 [WARNING|trainer.py:803] 2025-04-26 15:43:51,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2672 [WARNING|trainer.py:803] 2025-04-26 15:43:51,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:43:52,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2655 2677 [WARNING|trainer.py:803] 2025-04-26 15:43:53,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2673 [WARNING|trainer.py:803] 2025-04-26 15:43:53,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:54,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2656 2678 [WARNING|trainer.py:803] 2025-04-26 15:43:55,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:43:55,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2657 2679 2674 [WARNING|trainer.py:803] 2025-04-26 15:43:57,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:43:57,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:43:57,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2680 2658 [WARNING|trainer.py:803] 2025-04-26 15:43:59,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:00,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2681 2659 [WARNING|trainer.py:803] 2025-04-26 15:44:02,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:02,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2660 2682 [WARNING|trainer.py:803] 2025-04-26 15:44:04,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:44:04,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2683 2661 2675 [WARNING|trainer.py:803] 2025-04-26 15:44:06,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:07,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:07,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2684 2662 [WARNING|trainer.py:803] 2025-04-26 15:44:08,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2676 [WARNING|trainer.py:803] 2025-04-26 15:44:09,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:09,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2685 2677 2663 [WARNING|trainer.py:803] 2025-04-26 15:44:11,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:44:11,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:12,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2678 2686 [WARNING|trainer.py:803] 2025-04-26 15:44:13,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:14,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2664 2679 [WARNING|trainer.py:803] 2025-04-26 15:44:15,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2687 [WARNING|trainer.py:803] 2025-04-26 15:44:15,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:16,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2665 2680 [WARNING|trainer.py:803] 2025-04-26 15:44:17,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2688 [WARNING|trainer.py:803] 2025-04-26 15:44:17,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:18,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2681 2689 [WARNING|trainer.py:803] 2025-04-26 15:44:20,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:20,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2666 2682 2690 [WARNING|trainer.py:803] 2025-04-26 15:44:22,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:44:22,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:23,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2667 2683 2691 [WARNING|trainer.py:803] 2025-04-26 15:44:24,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:25,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:25,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2684 2668 2692 [WARNING|trainer.py:803] 2025-04-26 15:44:27,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:27,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:44:27,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2669 2685 2693 [WARNING|trainer.py:803] 2025-04-26 15:44:29,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:44:29,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:44:29,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2670 2694 2686 [WARNING|trainer.py:803] 2025-04-26 15:44:31,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:44:31,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:32,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2671 2695 2687 [WARNING|trainer.py:803] 2025-04-26 15:44:33,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:34,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:34,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2672 2688 2696 [WARNING|trainer.py:803] 2025-04-26 15:44:36,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:36,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:36,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2689 2673 2697 [WARNING|trainer.py:803] 2025-04-26 15:44:38,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:44:38,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:39,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2690 2698 [WARNING|trainer.py:803] 2025-04-26 15:44:41,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:41,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2674 [WARNING|trainer.py:803] 2025-04-26 15:44:42,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2691 2699 [WARNING|trainer.py:803] 2025-04-26 15:44:43,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:43,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2692 2700 [WARNING|trainer.py:803] 2025-04-26 15:44:45,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:45,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2693 2701 [WARNING|trainer.py:803] 2025-04-26 15:44:47,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:48,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2694 [WARNING|trainer.py:803] 2025-04-26 15:44:50,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2702 [WARNING|trainer.py:803] 2025-04-26 15:44:51,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2675 2695 [WARNING|trainer.py:803] 2025-04-26 15:44:52,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:52,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2703 [WARNING|trainer.py:803] 2025-04-26 15:44:53,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2676 2696 [WARNING|trainer.py:803] 2025-04-26 15:44:54,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:44:54,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2704 2677 [WARNING|trainer.py:803] 2025-04-26 15:44:56,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2697 [WARNING|trainer.py:803] 2025-04-26 15:44:56,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:57,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2678 2705 2698 [WARNING|trainer.py:803] 2025-04-26 15:44:58,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:58,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:44:59,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2679 2706 [WARNING|trainer.py:803] 2025-04-26 15:45:00,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2699 [WARNING|trainer.py:803] 2025-04-26 15:45:01,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:01,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2680 [WARNING|trainer.py:803] 2025-04-26 15:45:02,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2707 2700 [WARNING|trainer.py:803] 2025-04-26 15:45:03,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:04,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2681 [WARNING|trainer.py:803] 2025-04-26 15:45:05,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2708 2701 [WARNING|trainer.py:803] 2025-04-26 15:45:06,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:06,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2682 [WARNING|trainer.py:803] 2025-04-26 15:45:08,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2709 2702 [WARNING|trainer.py:803] 2025-04-26 15:45:09,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:09,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2683 [WARNING|trainer.py:803] 2025-04-26 15:45:10,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2710 2703 2684 [WARNING|trainer.py:803] 2025-04-26 15:45:11,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:11,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:12,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2711 2704 2685 [WARNING|trainer.py:803] 2025-04-26 15:45:14,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:14,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:14,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2712 2705 2686 [WARNING|trainer.py:803] 2025-04-26 15:45:16,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:17,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:17,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2713 2706 2687 [WARNING|trainer.py:803] 2025-04-26 15:45:19,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:19,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:19,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2714 2688 2707 [WARNING|trainer.py:803] 2025-04-26 15:45:21,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:21,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:22,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2689 2715 [WARNING|trainer.py:803] 2025-04-26 15:45:23,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2708 [WARNING|trainer.py:803] 2025-04-26 15:45:24,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:45:24,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2690 2716 2709 [WARNING|trainer.py:803] 2025-04-26 15:45:26,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:27,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:27,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2691 2717 [WARNING|trainer.py:803] 2025-04-26 15:45:28,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2710 [WARNING|trainer.py:803] 2025-04-26 15:45:29,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:29,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2692 [WARNING|trainer.py:803] 2025-04-26 15:45:31,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2718 2711 [WARNING|trainer.py:803] 2025-04-26 15:45:32,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2693 [WARNING|trainer.py:803] 2025-04-26 15:45:32,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:33,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2719 2712 2694 [WARNING|trainer.py:803] 2025-04-26 15:45:34,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:34,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:35,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2720 2713 2695 [WARNING|trainer.py:803] 2025-04-26 15:45:37,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:45:37,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:37,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2721 2714 2696 [WARNING|trainer.py:803] 2025-04-26 15:45:39,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:45:40,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:40,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2722 2715 2697 [WARNING|trainer.py:803] 2025-04-26 15:45:42,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:45:42,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:45:42,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2698 2723 2716 [WARNING|trainer.py:803] 2025-04-26 15:45:45,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:45,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:45,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2699 2717 2724 [WARNING|trainer.py:803] 2025-04-26 15:45:47,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:47,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:47,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2700 2718 2725 [WARNING|trainer.py:803] 2025-04-26 15:45:49,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:45:50,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:50,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2701 2719 2726 [WARNING|trainer.py:803] 2025-04-26 15:45:52,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:52,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:45:52,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2702 2727 2720 [WARNING|trainer.py:803] 2025-04-26 15:45:54,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:55,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:45:55,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2703 2721 2728 [WARNING|trainer.py:803] 2025-04-26 15:45:57,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:45:58,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:45:58,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2704 2722 2729 [WARNING|trainer.py:803] 2025-04-26 15:46:00,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:00,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:46:00,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2705 2730 2723 [WARNING|trainer.py:803] 2025-04-26 15:46:02,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:03,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:03,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2706 2731 2724 [WARNING|trainer.py:803] 2025-04-26 15:46:05,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:05,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:05,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2707 2725 2732 [WARNING|trainer.py:803] 2025-04-26 15:46:07,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:08,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:08,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2708 2726 2733 [WARNING|trainer.py:803] 2025-04-26 15:46:10,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:10,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:11,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2709 2727 2734 [WARNING|trainer.py:803] 2025-04-26 15:46:13,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:13,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:13,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2710 2735 2728 [WARNING|trainer.py:803] 2025-04-26 15:46:15,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:16,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:16,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2711 2729 2736 [WARNING|trainer.py:803] 2025-04-26 15:46:18,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:46:18,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:18,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2712 2730 2737 [WARNING|trainer.py:803] 2025-04-26 15:46:20,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:21,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:21,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2713 2738 2731 [WARNING|trainer.py:803] 2025-04-26 15:46:23,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:23,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:23,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2714 2739 2732 [WARNING|trainer.py:803] 2025-04-26 15:46:26,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:26,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:26,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2715 2740 2733 [WARNING|trainer.py:803] 2025-04-26 15:46:28,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:46:28,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:46:29,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2716 2741 2734 [WARNING|trainer.py:803] 2025-04-26 15:46:31,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:31,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:31,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2717 2742 2735 [WARNING|trainer.py:803] 2025-04-26 15:46:33,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:34,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:34,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2718 2743 2736 [WARNING|trainer.py:803] 2025-04-26 15:46:36,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:36,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:36,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2719 2737 2744 [WARNING|trainer.py:803] 2025-04-26 15:46:39,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:39,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:39,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2720 2745 2738 [WARNING|trainer.py:803] 2025-04-26 15:46:41,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:41,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:41,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2721 2739 2746 [WARNING|trainer.py:803] 2025-04-26 15:46:44,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:46:44,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:44,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2740 2722 2747 [WARNING|trainer.py:803] 2025-04-26 15:46:47,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:47,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:46:47,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2741 2748 2723 [WARNING|trainer.py:803] 2025-04-26 15:46:49,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:46:49,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 15:46:49,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2742 2749 2724 [WARNING|trainer.py:803] 2025-04-26 15:46:52,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:46:52,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:52,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2743 2725 2750 [WARNING|trainer.py:803] 2025-04-26 15:46:54,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:54,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:54,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2744 2726 2751 [WARNING|trainer.py:803] 2025-04-26 15:46:57,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:46:57,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:46:57,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2745 2727 2752 [WARNING|trainer.py:803] 2025-04-26 15:46:59,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:00,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:00,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2746 2728 2753 [WARNING|trainer.py:803] 2025-04-26 15:47:02,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:02,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:47:02,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2747 2729 2754 [WARNING|trainer.py:803] 2025-04-26 15:47:05,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:05,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:05,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2748 2730 2755 [WARNING|trainer.py:803] 2025-04-26 15:47:07,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 15:47:07,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:07,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2749 2731 2756 [WARNING|trainer.py:803] 2025-04-26 15:47:10,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:10,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:10,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2750 2732 2757 [WARNING|trainer.py:803] 2025-04-26 15:47:12,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:12,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:13,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2758 2733 2751 [WARNING|trainer.py:803] 2025-04-26 15:47:15,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:15,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:15,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2759 2734 2752 [WARNING|trainer.py:803] 2025-04-26 15:47:18,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:18,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:18,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2760 2735 2753 [WARNING|trainer.py:803] 2025-04-26 15:47:20,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:20,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:20,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2761 2754 2736 [WARNING|trainer.py:803] 2025-04-26 15:47:23,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:23,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:23,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2762 2755 2737 [WARNING|trainer.py:803] 2025-04-26 15:47:25,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:47:26,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:26,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2763 2738 2756 [WARNING|trainer.py:803] 2025-04-26 15:47:28,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:47:28,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:28,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2764 2757 2739 [WARNING|trainer.py:803] 2025-04-26 15:47:31,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:31,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:31,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2758 2740 2765 [WARNING|trainer.py:803] 2025-04-26 15:47:33,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:33,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:33,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2766 2741 2759 [WARNING|trainer.py:803] 2025-04-26 15:47:36,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:36,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:36,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2767 2760 2742 [WARNING|trainer.py:803] 2025-04-26 15:47:38,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:47:38,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:38,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2761 2743 2768 [WARNING|trainer.py:803] 2025-04-26 15:47:41,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:41,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:41,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2762 2769 2744 [WARNING|trainer.py:803] 2025-04-26 15:47:43,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:44,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:44,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2745 2763 2770 [WARNING|trainer.py:803] 2025-04-26 15:47:46,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:46,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:46,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2746 2764 2771 [WARNING|trainer.py:803] 2025-04-26 15:47:49,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:49,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:49,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2747 2772 2765 [WARNING|trainer.py:803] 2025-04-26 15:47:51,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:51,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:51,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2766 2773 2748 [WARNING|trainer.py:803] 2025-04-26 15:47:54,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:54,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:47:54,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2767 2774 2749 [WARNING|trainer.py:803] 2025-04-26 15:47:57,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:47:57,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:57,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2768 2750 2775 [WARNING|trainer.py:803] 2025-04-26 15:47:59,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:47:59,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:47:59,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2769 2751 2776 [WARNING|trainer.py:803] 2025-04-26 15:48:02,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:02,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:02,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2752 2777 2770 [WARNING|trainer.py:803] 2025-04-26 15:48:04,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:48:05,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:05,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2771 2778 2753 [WARNING|trainer.py:803] 2025-04-26 15:48:07,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:48:07,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:07,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2754 2779 2772 [WARNING|trainer.py:803] 2025-04-26 15:48:10,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:10,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:10,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2773 2755 2780 [WARNING|trainer.py:803] 2025-04-26 15:48:12,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:48:12,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:12,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2781 2756 2774 [WARNING|trainer.py:803] 2025-04-26 15:48:15,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:15,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:48:15,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2757 2782 2775 [WARNING|trainer.py:803] 2025-04-26 15:48:18,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:18,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:48:18,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2758 2783 2776 [WARNING|trainer.py:803] 2025-04-26 15:48:20,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:20,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:20,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2759 2784 2777 [WARNING|trainer.py:803] 2025-04-26 15:48:23,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:23,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:23,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2778 2785 2760 [WARNING|trainer.py:803] 2025-04-26 15:48:25,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:25,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:25,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2761 2779 2786 [WARNING|trainer.py:803] 2025-04-26 15:48:28,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:28,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:28,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2762 2780 2787 [WARNING|trainer.py:803] 2025-04-26 15:48:30,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:31,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:48:31,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2781 2763 2788 [WARNING|trainer.py:803] 2025-04-26 15:48:33,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:48:33,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:48:33,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2789 2782 2764 [WARNING|trainer.py:803] 2025-04-26 15:48:36,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:36,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:48:36,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2790 2783 2765 [WARNING|trainer.py:803] 2025-04-26 15:48:38,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:38,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:39,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2766 2784 2791 [WARNING|trainer.py:803] 2025-04-26 15:48:41,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:41,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:48:41,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2785 2767 2792 [WARNING|trainer.py:803] 2025-04-26 15:48:44,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:44,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:48:44,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2786 2768 2793 [WARNING|trainer.py:803] 2025-04-26 15:48:46,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:46,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:46,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2787 2769 2794 [WARNING|trainer.py:803] 2025-04-26 15:48:49,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:49,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:49,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2788 2795 2770 [WARNING|trainer.py:803] 2025-04-26 15:48:52,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:48:52,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:52,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2789 2771 2796 [WARNING|trainer.py:803] 2025-04-26 15:48:54,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:54,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:48:54,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2790 2772 2797 [WARNING|trainer.py:803] 2025-04-26 15:48:57,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:48:57,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:48:57,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2791 2773 2798 [WARNING|trainer.py:803] 2025-04-26 15:48:59,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:48:59,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:48:59,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2792 2774 2799 [WARNING|trainer.py:803] 2025-04-26 15:49:02,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:02,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:02,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2793 2775 2800 [WARNING|trainer.py:803] 2025-04-26 15:49:04,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:05,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:05,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2776 2794 2801 [WARNING|trainer.py:803] 2025-04-26 15:49:07,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:49:07,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:07,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2777 2795 2802 [WARNING|trainer.py:803] 2025-04-26 15:49:10,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:10,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:10,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2778 2796 2803 [WARNING|trainer.py:803] 2025-04-26 15:49:12,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:12,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:12,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2779 2797 2804 [WARNING|trainer.py:803] 2025-04-26 15:49:15,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:15,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:15,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2780 2805 2798 [WARNING|trainer.py:803] 2025-04-26 15:49:17,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:18,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:49:18,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2781 2806 2799 [WARNING|trainer.py:803] 2025-04-26 15:49:20,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:20,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:20,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2782 2807 2800 [WARNING|trainer.py:803] 2025-04-26 15:49:23,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:49:23,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:49:23,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2783 2808 2801 [WARNING|trainer.py:803] 2025-04-26 15:49:25,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:25,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:26,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2784 2809 2802 [WARNING|trainer.py:803] 2025-04-26 15:49:28,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:28,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:28,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2785 2810 2803 [WARNING|trainer.py:803] 2025-04-26 15:49:30,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:30,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:31,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2786 2811 2804 [WARNING|trainer.py:803] 2025-04-26 15:49:33,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:33,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:33,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2812 2787 2805 [WARNING|trainer.py:803] 2025-04-26 15:49:36,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:49:36,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:36,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2788 2813 2806 [WARNING|trainer.py:803] 2025-04-26 15:49:38,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:49:38,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:39,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2789 2814 2807 [WARNING|trainer.py:803] 2025-04-26 15:49:41,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:41,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:41,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2790 2815 2808 [WARNING|trainer.py:803] 2025-04-26 15:49:43,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:44,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:44,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2791 2809 2816 [WARNING|trainer.py:803] 2025-04-26 15:49:46,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:49:46,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:46,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2792 2817 2810 [WARNING|trainer.py:803] 2025-04-26 15:49:49,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:49,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:49,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2793 2811 2818 [WARNING|trainer.py:803] 2025-04-26 15:49:51,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:51,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:49:52,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2794 2812 2819 [WARNING|trainer.py:803] 2025-04-26 15:49:54,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:54,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:49:54,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2795 2813 2820 [WARNING|trainer.py:803] 2025-04-26 15:49:57,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:57,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:49:57,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2796 2814 2821 [WARNING|trainer.py:803] 2025-04-26 15:49:59,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:49:59,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:49:59,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2815 2797 2822 [WARNING|trainer.py:803] 2025-04-26 15:50:02,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:02,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:02,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2816 2798 2823 [WARNING|trainer.py:803] 2025-04-26 15:50:05,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:50:05,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:05,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2817 2799 2824 [WARNING|trainer.py:803] 2025-04-26 15:50:07,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:50:07,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:07,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2800 2825 2818 [WARNING|trainer.py:803] 2025-04-26 15:50:10,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:10,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:10,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2801 2826 2819 [WARNING|trainer.py:803] 2025-04-26 15:50:13,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:13,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:13,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2827 2802 2820 [WARNING|trainer.py:803] 2025-04-26 15:50:15,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:50:15,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:15,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2828 2803 2821 [WARNING|trainer.py:803] 2025-04-26 15:50:18,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:50:18,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:18,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2829 2804 2822 [WARNING|trainer.py:803] 2025-04-26 15:50:20,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:21,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:21,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2830 2805 2823 [WARNING|trainer.py:803] 2025-04-26 15:50:23,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:23,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:23,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2831 2806 2824 [WARNING|trainer.py:803] 2025-04-26 15:50:26,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:26,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:50:26,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2807 2832 2825 [WARNING|trainer.py:803] 2025-04-26 15:50:28,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:50:28,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:29,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2808 2833 2826 [WARNING|trainer.py:803] 2025-04-26 15:50:31,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:31,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:31,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2809 2834 2827 [WARNING|trainer.py:803] 2025-04-26 15:50:33,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:34,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:34,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2810 2835 2828 [WARNING|trainer.py:803] 2025-04-26 15:50:36,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:36,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:50:36,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2811 2836 2829 [WARNING|trainer.py:803] 2025-04-26 15:50:39,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:39,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:50:39,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2812 2837 2830 [WARNING|trainer.py:803] 2025-04-26 15:50:41,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:50:41,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:50:41,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2813 2838 2831 [WARNING|trainer.py:803] 2025-04-26 15:50:44,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:44,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:44,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2814 2839 2832 [WARNING|trainer.py:803] 2025-04-26 15:50:46,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:50:46,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:50:47,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2815 2840 2833 [WARNING|trainer.py:803] 2025-04-26 15:50:49,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:50:49,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:49,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2816 2841 2834 [WARNING|trainer.py:803] 2025-04-26 15:50:52,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:50:52,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:52,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2817 2842 2835 [WARNING|trainer.py:803] 2025-04-26 15:50:54,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:54,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:50:54,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2818 2843 2836 [WARNING|trainer.py:803] 2025-04-26 15:50:57,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:50:57,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:50:57,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2837 2844 2819 [WARNING|trainer.py:803] 2025-04-26 15:50:59,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:00,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:00,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2838 2845 2820 [WARNING|trainer.py:803] 2025-04-26 15:51:02,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:02,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:02,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2839 2846 2821 [WARNING|trainer.py:803] 2025-04-26 15:51:05,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:51:05,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:05,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2840 2847 2822 [WARNING|trainer.py:803] 2025-04-26 15:51:07,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:07,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:08,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2841 2848 2823 [WARNING|trainer.py:803] 2025-04-26 15:51:10,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:10,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:10,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2842 2849 2824 [WARNING|trainer.py:803] 2025-04-26 15:51:13,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:13,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:13,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2850 2843 2825 [WARNING|trainer.py:803] 2025-04-26 15:51:15,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:51:15,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:16,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2844 2851 2826 [WARNING|trainer.py:803] 2025-04-26 15:51:18,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:18,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:18,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2845 2852 2827 [WARNING|trainer.py:803] 2025-04-26 15:51:21,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:21,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:21,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2846 2828 2853 [WARNING|trainer.py:803] 2025-04-26 15:51:23,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:51:23,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:51:23,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2847 2854 2829 [WARNING|trainer.py:803] 2025-04-26 15:51:26,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:26,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:26,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2848 2855 2830 [WARNING|trainer.py:803] 2025-04-26 15:51:29,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:29,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:29,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2849 2831 2856 [WARNING|trainer.py:803] 2025-04-26 15:51:31,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:31,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:31,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2857 2832 2850 [WARNING|trainer.py:803] 2025-04-26 15:51:34,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:51:34,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:51:34,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2858 2833 2851 [WARNING|trainer.py:803] 2025-04-26 15:51:37,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:37,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:37,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2834 2859 2852 [WARNING|trainer.py:803] 2025-04-26 15:51:39,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:39,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:51:39,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2835 2860 2853 [WARNING|trainer.py:803] 2025-04-26 15:51:42,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:42,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:42,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2836 2861 2854 [WARNING|trainer.py:803] 2025-04-26 15:51:44,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:45,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:45,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2837 2862 2855 [WARNING|trainer.py:803] 2025-04-26 15:51:47,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:51:47,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:47,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2838 2863 2856 [WARNING|trainer.py:803] 2025-04-26 15:51:50,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:51:50,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:50,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2839 2857 2864 [WARNING|trainer.py:803] 2025-04-26 15:51:52,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:51:52,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:51:52,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2840 2865 2858 [WARNING|trainer.py:803] 2025-04-26 15:51:55,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:55,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:55,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2841 2866 2859 [WARNING|trainer.py:803] 2025-04-26 15:51:58,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:51:58,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:51:58,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2842 2867 2860 [WARNING|trainer.py:803] 2025-04-26 15:52:00,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:00,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:01,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2843 2861 2868 [WARNING|trainer.py:803] 2025-04-26 15:52:03,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:03,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:03,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2862 2844 2869 [WARNING|trainer.py:803] 2025-04-26 15:52:06,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:06,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:06,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2870 2863 2845 [WARNING|trainer.py:803] 2025-04-26 15:52:08,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:08,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:52:08,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2871 2846 2864 [WARNING|trainer.py:803] 2025-04-26 15:52:11,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:11,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:11,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2872 2865 2847 [WARNING|trainer.py:803] 2025-04-26 15:52:14,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:14,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:14,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2866 2848 2873 [WARNING|trainer.py:803] 2025-04-26 15:52:16,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:16,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:16,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2849 2874 2867 [WARNING|trainer.py:803] 2025-04-26 15:52:19,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:19,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:52:19,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2875 2850 2868 [WARNING|trainer.py:803] 2025-04-26 15:52:22,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:22,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:52:22,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2876 2851 2869 [WARNING|trainer.py:803] 2025-04-26 15:52:24,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:24,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:25,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2877 2852 2870 [WARNING|trainer.py:803] 2025-04-26 15:52:27,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:27,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:27,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2878 2853 2871 [WARNING|trainer.py:803] 2025-04-26 15:52:29,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:30,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:30,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2879 2854 2872 [WARNING|trainer.py:803] 2025-04-26 15:52:32,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:32,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:32,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2880 2873 2855 [WARNING|trainer.py:803] 2025-04-26 15:52:35,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:35,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:35,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2881 2856 2874 [WARNING|trainer.py:803] 2025-04-26 15:52:37,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:38,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 15:52:38,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2882 2857 2875 [WARNING|trainer.py:803] 2025-04-26 15:52:40,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:40,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:52:40,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2883 2858 2876 [WARNING|trainer.py:803] 2025-04-26 15:52:43,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:43,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:52:43,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2884 2877 2859 [WARNING|trainer.py:803] 2025-04-26 15:52:45,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:45,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:45,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2878 2885 2860 [WARNING|trainer.py:803] 2025-04-26 15:52:48,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:48,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:52:48,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2879 2861 2886 [WARNING|trainer.py:803] 2025-04-26 15:52:51,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:52:51,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:51,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2880 2862 2887 [WARNING|trainer.py:803] 2025-04-26 15:52:53,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:53,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:53,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2881 2863 2888 [WARNING|trainer.py:803] 2025-04-26 15:52:56,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:56,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:56,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2882 2864 2889 [WARNING|trainer.py:803] 2025-04-26 15:52:59,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:52:59,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:52:59,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2890 2883 2865 [WARNING|trainer.py:803] 2025-04-26 15:53:01,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:01,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:01,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2884 2866 2891 [WARNING|trainer.py:803] 2025-04-26 15:53:04,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:04,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:04,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2867 2885 2892 [WARNING|trainer.py:803] 2025-04-26 15:53:07,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:53:07,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:07,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2893 2886 2868 [WARNING|trainer.py:803] 2025-04-26 15:53:09,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:09,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:09,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2894 2887 2869 [WARNING|trainer.py:803] 2025-04-26 15:53:12,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:12,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:12,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2870 2895 2888 [WARNING|trainer.py:803] 2025-04-26 15:53:15,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:15,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:15,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2896 2871 2889 [WARNING|trainer.py:803] 2025-04-26 15:53:17,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:17,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:17,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2897 2872 2890 [WARNING|trainer.py:803] 2025-04-26 15:53:20,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:20,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:20,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2898 2891 2873 [WARNING|trainer.py:803] 2025-04-26 15:53:23,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:53:23,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:23,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2899 2892 2874 [WARNING|trainer.py:803] 2025-04-26 15:53:25,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:25,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:53:25,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2900 2893 2875 [WARNING|trainer.py:803] 2025-04-26 15:53:28,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:28,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:28,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2894 2901 2876 [WARNING|trainer.py:803] 2025-04-26 15:53:31,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:31,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:31,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2895 2877 2902 [WARNING|trainer.py:803] 2025-04-26 15:53:33,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:33,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:33,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2903 2878 2896 [WARNING|trainer.py:803] 2025-04-26 15:53:36,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:36,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:36,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2904 2879 2897 [WARNING|trainer.py:803] 2025-04-26 15:53:38,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 15:53:39,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:39,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2905 2898 2880 [WARNING|trainer.py:803] 2025-04-26 15:53:41,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:41,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:53:41,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2906 2899 2881 [WARNING|trainer.py:803] 2025-04-26 15:53:44,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:44,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:44,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2907 2882 2900 [WARNING|trainer.py:803] 2025-04-26 15:53:46,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:46,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:46,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2908 2883 2901 [WARNING|trainer.py:803] 2025-04-26 15:53:49,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:53:49,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:49,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2909 2884 2902 [WARNING|trainer.py:803] 2025-04-26 15:53:52,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:53:52,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:53:52,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2910 2903 2885 [WARNING|trainer.py:803] 2025-04-26 15:53:54,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:54,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:54,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2911 2904 2886 [WARNING|trainer.py:803] 2025-04-26 15:53:57,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:57,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 15:53:57,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2912 2905 2887 [WARNING|trainer.py:803] 2025-04-26 15:53:59,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:53:59,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:00,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2913 2906 2888 [WARNING|trainer.py:803] 2025-04-26 15:54:02,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:02,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:02,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2914 2907 2889 [WARNING|trainer.py:803] 2025-04-26 15:54:05,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:05,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:05,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2915 2908 2890 [WARNING|trainer.py:803] 2025-04-26 15:54:07,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:07,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:08,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2916 2909 2891 [WARNING|trainer.py:803] 2025-04-26 15:54:10,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:54:10,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:10,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2917 2910 2892 [WARNING|trainer.py:803] 2025-04-26 15:54:12,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:13,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:13,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2918 2911 2893 [WARNING|trainer.py:803] 2025-04-26 15:54:15,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:54:15,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:15,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2919 2912 2894 [WARNING|trainer.py:803] 2025-04-26 15:54:17,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:54:18,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:18,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2920 2913 2895 [WARNING|trainer.py:803] 2025-04-26 15:54:20,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:20,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:21,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2921 2914 2896 [WARNING|trainer.py:803] 2025-04-26 15:54:23,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:23,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:23,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2922 2915 2897 [WARNING|trainer.py:803] 2025-04-26 15:54:25,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:26,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:54:26,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2923 2916 2898 [WARNING|trainer.py:803] 2025-04-26 15:54:28,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:28,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:54:28,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2924 2917 2899 [WARNING|trainer.py:803] 2025-04-26 15:54:30,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:31,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:31,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2925 2918 2900 [WARNING|trainer.py:803] 2025-04-26 15:54:33,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:33,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:34,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2926 2919 2901 [WARNING|trainer.py:803] 2025-04-26 15:54:36,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:36,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:54:36,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2927 2920 2902 [WARNING|trainer.py:803] 2025-04-26 15:54:38,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:39,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:39,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2928 2921 2903 [WARNING|trainer.py:803] 2025-04-26 15:54:41,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:41,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:42,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2929 2922 2904 [WARNING|trainer.py:803] 2025-04-26 15:54:43,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:54:44,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:44,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2930 2923 2905 [WARNING|trainer.py:803] 2025-04-26 15:54:46,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:54:46,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:47,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2931 2924 2906 [WARNING|trainer.py:803] 2025-04-26 15:54:49,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:54:49,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:49,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2932 2925 2907 [WARNING|trainer.py:803] 2025-04-26 15:54:51,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:52,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:52,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2933 2926 2908 [WARNING|trainer.py:803] 2025-04-26 15:54:54,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:54,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:55,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2934 2927 [WARNING|trainer.py:803] 2025-04-26 15:54:56,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2909 [WARNING|trainer.py:803] 2025-04-26 15:54:57,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:54:57,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2935 2928 2910 [WARNING|trainer.py:803] 2025-04-26 15:54:59,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:54:59,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:55:00,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2936 2929 [WARNING|trainer.py:803] 2025-04-26 15:55:02,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2911 [WARNING|trainer.py:803] 2025-04-26 15:55:02,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:55:03,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2937 2930 2912 [WARNING|trainer.py:803] 2025-04-26 15:55:04,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:05,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:55:05,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2938 2931 2913 [WARNING|trainer.py:803] 2025-04-26 15:55:07,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:07,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:08,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2939 2932 2914 [WARNING|trainer.py:803] 2025-04-26 15:55:10,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:10,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:10,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2940 2933 2915 [WARNING|trainer.py:803] 2025-04-26 15:55:12,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:12,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:13,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2941 2934 2916 [WARNING|trainer.py:803] 2025-04-26 15:55:15,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:15,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:15,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2942 2935 2917 [WARNING|trainer.py:803] 2025-04-26 15:55:17,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:18,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:18,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2943 2936 2918 [WARNING|trainer.py:803] 2025-04-26 15:55:20,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:20,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:55:20,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2944 2937 2919 [WARNING|trainer.py:803] 2025-04-26 15:55:23,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:23,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:23,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2945 2938 2920 [WARNING|trainer.py:803] 2025-04-26 15:55:25,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:26,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:26,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2946 2939 2921 [WARNING|trainer.py:803] 2025-04-26 15:55:28,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:28,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:28,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2947 2940 2922 [WARNING|trainer.py:803] 2025-04-26 15:55:30,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:55:31,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:31,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2948 2941 2923 [WARNING|trainer.py:803] 2025-04-26 15:55:33,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:55:33,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:55:34,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2949 2942 2924 [WARNING|trainer.py:803] 2025-04-26 15:55:36,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:36,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:36,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2950 2943 2925 [WARNING|trainer.py:803] 2025-04-26 15:55:38,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:39,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:39,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2944 2951 2926 [WARNING|trainer.py:803] 2025-04-26 15:55:41,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:41,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:41,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2952 2927 2945 [WARNING|trainer.py:803] 2025-04-26 15:55:44,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:55:44,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:44,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2953 2946 2928 [WARNING|trainer.py:803] 2025-04-26 15:55:46,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:46,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:46,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2954 2947 2929 [WARNING|trainer.py:803] 2025-04-26 15:55:49,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:49,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:55:49,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2955 2930 2948 [WARNING|trainer.py:803] 2025-04-26 15:55:51,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:55:52,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:55:52,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2956 2931 2949 [WARNING|trainer.py:803] 2025-04-26 15:55:54,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:55:54,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:54,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2957 2932 2950 [WARNING|trainer.py:803] 2025-04-26 15:55:57,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 15:55:57,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:55:57,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2958 2933 2951 [WARNING|trainer.py:803] 2025-04-26 15:55:59,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:55:59,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:00,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2959 2934 2952 [WARNING|trainer.py:803] 2025-04-26 15:56:02,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:56:02,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:02,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2960 2953 2935 [WARNING|trainer.py:803] 2025-04-26 15:56:04,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:56:05,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:05,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2954 2961 2936 [WARNING|trainer.py:803] 2025-04-26 15:56:07,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:07,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:56:07,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2955 2962 2937 [WARNING|trainer.py:803] 2025-04-26 15:56:10,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:10,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:10,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2956 2963 2938 [WARNING|trainer.py:803] 2025-04-26 15:56:12,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:56:12,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:56:13,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2957 2964 2939 [WARNING|trainer.py:803] 2025-04-26 15:56:15,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:56:15,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:15,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2958 2965 2940 [WARNING|trainer.py:803] 2025-04-26 15:56:18,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:56:18,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:18,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2966 2959 2941 [WARNING|trainer.py:803] 2025-04-26 15:56:20,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:56:20,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:56:20,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2960 2967 2942 [WARNING|trainer.py:803] 2025-04-26 15:56:23,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:56:23,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:23,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2968 2943 2961 [WARNING|trainer.py:803] 2025-04-26 15:56:26,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:26,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:26,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2944 2969 2962 [WARNING|trainer.py:803] 2025-04-26 15:56:28,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:28,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:28,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2945 2970 2963 [WARNING|trainer.py:803] 2025-04-26 15:56:31,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 15:56:31,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:31,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2946 2971 2964 [WARNING|trainer.py:803] 2025-04-26 15:56:34,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:34,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:34,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2947 2972 2965 [WARNING|trainer.py:803] 2025-04-26 15:56:36,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:56:36,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:56:36,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2966 2973 2948 [WARNING|trainer.py:803] 2025-04-26 15:56:39,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:56:39,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:56:39,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2967 2974 2949 [WARNING|trainer.py:803] 2025-04-26 15:56:41,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:42,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:42,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2968 2975 2950 [WARNING|trainer.py:803] 2025-04-26 15:56:44,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:44,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:44,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2969 2976 2951 [WARNING|trainer.py:803] 2025-04-26 15:56:47,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:47,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:56:47,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2970 2977 2952 [WARNING|trainer.py:803] 2025-04-26 15:56:49,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:49,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:49,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2953 2971 2978 [WARNING|trainer.py:803] 2025-04-26 15:56:52,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:52,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:56:52,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2954 2979 2972 [WARNING|trainer.py:803] 2025-04-26 15:56:55,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:55,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:56:55,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2955 2980 2973 [WARNING|trainer.py:803] 2025-04-26 15:56:57,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:56:57,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:56:57,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2956 2981 2974 [WARNING|trainer.py:803] 2025-04-26 15:57:00,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:57:00,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:00,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2957 2982 2975 [WARNING|trainer.py:803] 2025-04-26 15:57:02,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:57:03,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:03,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2983 2958 2976 [WARNING|trainer.py:803] 2025-04-26 15:57:05,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:05,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:57:05,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2984 2959 2977 [WARNING|trainer.py:803] 2025-04-26 15:57:08,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:08,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:57:08,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2985 2960 2978 [WARNING|trainer.py:803] 2025-04-26 15:57:10,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:10,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:57:10,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2979 2986 2961 [WARNING|trainer.py:803] 2025-04-26 15:57:13,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:13,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:13,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2980 2962 2987 [WARNING|trainer.py:803] 2025-04-26 15:57:16,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 15:57:16,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:16,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2981 2988 2963 [WARNING|trainer.py:803] 2025-04-26 15:57:18,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:18,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:18,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2989 2982 2964 [WARNING|trainer.py:803] 2025-04-26 15:57:21,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:57:21,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:21,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2990 2983 2965 [WARNING|trainer.py:803] 2025-04-26 15:57:23,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:23,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:24,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2991 2984 2966 [WARNING|trainer.py:803] 2025-04-26 15:57:26,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:26,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:26,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2985 2992 2967 [WARNING|trainer.py:803] 2025-04-26 15:57:28,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:29,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:57:29,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2993 2986 2968 [WARNING|trainer.py:803] 2025-04-26 15:57:31,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:31,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:31,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2994 2987 2969 [WARNING|trainer.py:803] 2025-04-26 15:57:34,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:34,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:34,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2995 2970 2988 [WARNING|trainer.py:803] 2025-04-26 15:57:36,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:37,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:37,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2989 2996 2971 [WARNING|trainer.py:803] 2025-04-26 15:57:39,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:57:39,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:39,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2997 2990 2972 [WARNING|trainer.py:803] 2025-04-26 15:57:42,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:57:42,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:42,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2998 2991 2973 [WARNING|trainer.py:803] 2025-04-26 15:57:44,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:44,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:44,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2992 2999 2974 [WARNING|trainer.py:803] 2025-04-26 15:57:47,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:47,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:47,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2993 3000 2975 [WARNING|trainer.py:803] 2025-04-26 15:57:50,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:50,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:50,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3001 2994 2976 [WARNING|trainer.py:803] 2025-04-26 15:57:52,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:57:52,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:52,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3002 [WARNING|trainer.py:803] 2025-04-26 15:57:54,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2995 2977 [WARNING|trainer.py:803] 2025-04-26 15:57:55,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:57:55,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3003 [WARNING|trainer.py:803] 2025-04-26 15:57:56,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2996 2978 [WARNING|trainer.py:803] 2025-04-26 15:57:57,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3004 [WARNING|trainer.py:803] 2025-04-26 15:57:58,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:57:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2997 2979 3005 [WARNING|trainer.py:803] 2025-04-26 15:58:00,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:58:00,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:58:00,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3006 2998 2980 [WARNING|trainer.py:803] 2025-04-26 15:58:03,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:03,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:03,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3007 2999 [WARNING|trainer.py:803] 2025-04-26 15:58:05,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2981 [WARNING|trainer.py:803] 2025-04-26 15:58:05,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:06,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3008 [WARNING|trainer.py:803] 2025-04-26 15:58:07,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3000 2982 [WARNING|trainer.py:803] 2025-04-26 15:58:08,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:08,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3009 [WARNING|trainer.py:803] 2025-04-26 15:58:09,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3001 2983 [WARNING|trainer.py:803] 2025-04-26 15:58:10,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3010 [WARNING|trainer.py:803] 2025-04-26 15:58:11,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 15:58:11,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3002 [WARNING|trainer.py:803] 2025-04-26 15:58:12,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3011 2984 [WARNING|trainer.py:803] 2025-04-26 15:58:13,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:13,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3003 3012 [WARNING|trainer.py:803] 2025-04-26 15:58:15,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2985 [WARNING|trainer.py:803] 2025-04-26 15:58:15,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:16,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3004 3013 [WARNING|trainer.py:803] 2025-04-26 15:58:17,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:58:17,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2986 3005 [WARNING|trainer.py:803] 2025-04-26 15:58:18,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3014 [WARNING|trainer.py:803] 2025-04-26 15:58:19,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:20,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3006 2987 3015 [WARNING|trainer.py:803] 2025-04-26 15:58:21,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:21,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:22,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3007 2988 3016 [WARNING|trainer.py:803] 2025-04-26 15:58:23,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:24,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:58:24,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3008 3017 [WARNING|trainer.py:803] 2025-04-26 15:58:25,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2989 [WARNING|trainer.py:803] 2025-04-26 15:58:26,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:58:26,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3009 3018 [WARNING|trainer.py:803] 2025-04-26 15:58:27,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2990 [WARNING|trainer.py:803] 2025-04-26 15:58:28,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3010 [WARNING|trainer.py:803] 2025-04-26 15:58:29,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3019 [WARNING|trainer.py:803] 2025-04-26 15:58:30,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:58:30,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2991 3011 [WARNING|trainer.py:803] 2025-04-26 15:58:31,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:32,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3020 [WARNING|trainer.py:803] 2025-04-26 15:58:32,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3012 2992 [WARNING|trainer.py:803] 2025-04-26 15:58:34,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3021 [WARNING|trainer.py:803] 2025-04-26 15:58:34,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:35,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3013 [WARNING|trainer.py:803] 2025-04-26 15:58:36,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2993 3022 [WARNING|trainer.py:803] 2025-04-26 15:58:36,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:37,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3014 [WARNING|trainer.py:803] 2025-04-26 15:58:38,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3023 2994 [WARNING|trainer.py:803] 2025-04-26 15:58:39,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3015 [WARNING|trainer.py:803] 2025-04-26 15:58:39,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:40,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3024 2995 [WARNING|trainer.py:803] 2025-04-26 15:58:41,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3016 [WARNING|trainer.py:803] 2025-04-26 15:58:42,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:42,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3025 [WARNING|trainer.py:803] 2025-04-26 15:58:43,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3017 2996 [WARNING|trainer.py:803] 2025-04-26 15:58:44,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3026 [WARNING|trainer.py:803] 2025-04-26 15:58:44,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:45,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3018 2997 [WARNING|trainer.py:803] 2025-04-26 15:58:46,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3027 [WARNING|trainer.py:803] 2025-04-26 15:58:47,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:58:47,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3019 [WARNING|trainer.py:803] 2025-04-26 15:58:48,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3028 2998 [WARNING|trainer.py:803] 2025-04-26 15:58:50,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:50,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3020 [WARNING|trainer.py:803] 2025-04-26 15:58:51,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3029 2999 [WARNING|trainer.py:803] 2025-04-26 15:58:52,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3021 [WARNING|trainer.py:803] 2025-04-26 15:58:52,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:53,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3030 [WARNING|trainer.py:803] 2025-04-26 15:58:54,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3022 3000 [WARNING|trainer.py:803] 2025-04-26 15:58:55,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:58:55,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3031 [WARNING|trainer.py:803] 2025-04-26 15:58:56,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3023 3001 [WARNING|trainer.py:803] 2025-04-26 15:58:57,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:58:57,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3032 [WARNING|trainer.py:803] 2025-04-26 15:58:58,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3024 3002 [WARNING|trainer.py:803] 2025-04-26 15:58:59,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:58:59,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3033 [WARNING|trainer.py:803] 2025-04-26 15:59:00,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3025 3003 [WARNING|trainer.py:803] 2025-04-26 15:59:01,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:01,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3034 3026 [WARNING|trainer.py:803] 2025-04-26 15:59:03,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3004 [WARNING|trainer.py:803] 2025-04-26 15:59:03,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:04,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3035 3027 [WARNING|trainer.py:803] 2025-04-26 15:59:05,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3005 [WARNING|trainer.py:803] 2025-04-26 15:59:05,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:06,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3036 [WARNING|trainer.py:803] 2025-04-26 15:59:07,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3028 3006 [WARNING|trainer.py:803] 2025-04-26 15:59:08,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:08,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3037 [WARNING|trainer.py:803] 2025-04-26 15:59:09,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3029 3007 [WARNING|trainer.py:803] 2025-04-26 15:59:10,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:10,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3038 [WARNING|trainer.py:803] 2025-04-26 15:59:11,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3030 3008 [WARNING|trainer.py:803] 2025-04-26 15:59:12,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:12,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3039 [WARNING|trainer.py:803] 2025-04-26 15:59:13,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3031 3009 [WARNING|trainer.py:803] 2025-04-26 15:59:14,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:14,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3040 [WARNING|trainer.py:803] 2025-04-26 15:59:15,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3032 3010 [WARNING|trainer.py:803] 2025-04-26 15:59:16,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:17,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3041 [WARNING|trainer.py:803] 2025-04-26 15:59:17,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3011 3033 [WARNING|trainer.py:803] 2025-04-26 15:59:19,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:19,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3042 [WARNING|trainer.py:803] 2025-04-26 15:59:20,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3012 3034 [WARNING|trainer.py:803] 2025-04-26 15:59:21,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:59:21,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3043 [WARNING|trainer.py:803] 2025-04-26 15:59:22,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3013 3035 [WARNING|trainer.py:803] 2025-04-26 15:59:23,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:23,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3044 [WARNING|trainer.py:803] 2025-04-26 15:59:24,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3014 3036 [WARNING|trainer.py:803] 2025-04-26 15:59:25,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:25,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3045 [WARNING|trainer.py:803] 2025-04-26 15:59:26,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3037 3015 [WARNING|trainer.py:803] 2025-04-26 15:59:27,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:27,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3046 [WARNING|trainer.py:803] 2025-04-26 15:59:28,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3038 3016 [WARNING|trainer.py:803] 2025-04-26 15:59:29,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:29,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3047 [WARNING|trainer.py:803] 2025-04-26 15:59:30,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3039 3017 [WARNING|trainer.py:803] 2025-04-26 15:59:31,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 15:59:32,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3048 [WARNING|trainer.py:803] 2025-04-26 15:59:33,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3040 3018 [WARNING|trainer.py:803] 2025-04-26 15:59:34,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:59:34,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3049 [WARNING|trainer.py:803] 2025-04-26 15:59:35,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3041 3019 [WARNING|trainer.py:803] 2025-04-26 15:59:36,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:36,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3050 [WARNING|trainer.py:803] 2025-04-26 15:59:37,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3020 3042 [WARNING|trainer.py:803] 2025-04-26 15:59:38,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:38,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3051 [WARNING|trainer.py:803] 2025-04-26 15:59:39,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3021 3043 [WARNING|trainer.py:803] 2025-04-26 15:59:40,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:40,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3052 [WARNING|trainer.py:803] 2025-04-26 15:59:41,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3022 3044 [WARNING|trainer.py:803] 2025-04-26 15:59:42,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:42,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3053 [WARNING|trainer.py:803] 2025-04-26 15:59:43,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3023 3045 [WARNING|trainer.py:803] 2025-04-26 15:59:44,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 15:59:44,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3054 [WARNING|trainer.py:803] 2025-04-26 15:59:45,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3024 3046 [WARNING|trainer.py:803] 2025-04-26 15:59:47,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:47,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3055 [WARNING|trainer.py:803] 2025-04-26 15:59:47,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3025 3047 [WARNING|trainer.py:803] 2025-04-26 15:59:49,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:49,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3056 [WARNING|trainer.py:803] 2025-04-26 15:59:50,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3026 3048 [WARNING|trainer.py:803] 2025-04-26 15:59:51,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:51,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3057 [WARNING|trainer.py:803] 2025-04-26 15:59:52,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3027 3049 [WARNING|trainer.py:803] 2025-04-26 15:59:53,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:53,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3058 [WARNING|trainer.py:803] 2025-04-26 15:59:54,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3050 3028 3059 [WARNING|trainer.py:803] 2025-04-26 15:59:55,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:55,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 15:59:56,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3051 3029 3060 [WARNING|trainer.py:803] 2025-04-26 15:59:57,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 15:59:57,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 15:59:58,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3052 3030 3061 [WARNING|trainer.py:803] 2025-04-26 15:59:59,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:00:00,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:00,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3031 3053 3062 [WARNING|trainer.py:803] 2025-04-26 16:00:02,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:02,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:02,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3032 3054 3063 [WARNING|trainer.py:803] 2025-04-26 16:00:04,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:04,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:04,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3055 3033 3064 [WARNING|trainer.py:803] 2025-04-26 16:00:06,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:06,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:00:06,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3056 3034 3065 [WARNING|trainer.py:803] 2025-04-26 16:00:08,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:08,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:08,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3057 3035 3066 [WARNING|trainer.py:803] 2025-04-26 16:00:10,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:10,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:11,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3058 3036 3067 [WARNING|trainer.py:803] 2025-04-26 16:00:12,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:13,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:13,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3059 3037 3068 [WARNING|trainer.py:803] 2025-04-26 16:00:15,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:00:15,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:15,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3060 3069 3038 [WARNING|trainer.py:803] 2025-04-26 16:00:17,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:17,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:17,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3061 3070 3039 [WARNING|trainer.py:803] 2025-04-26 16:00:19,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:19,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:19,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3062 3071 3040 [WARNING|trainer.py:803] 2025-04-26 16:00:21,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:21,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:00:21,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3063 3072 3041 [WARNING|trainer.py:803] 2025-04-26 16:00:23,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:23,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:23,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3064 3073 3042 [WARNING|trainer.py:803] 2025-04-26 16:00:25,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:00:25,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:00:26,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3065 3074 3043 [WARNING|trainer.py:803] 2025-04-26 16:00:27,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:28,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:28,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3066 3075 [WARNING|trainer.py:803] 2025-04-26 16:00:29,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3044 [WARNING|trainer.py:803] 2025-04-26 16:00:30,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:30,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3067 3076 3045 [WARNING|trainer.py:803] 2025-04-26 16:00:31,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:32,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:32,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3068 3077 3046 [WARNING|trainer.py:803] 2025-04-26 16:00:34,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:34,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:34,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3069 3078 3047 [WARNING|trainer.py:803] 2025-04-26 16:00:36,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:36,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:36,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3070 3079 3048 [WARNING|trainer.py:803] 2025-04-26 16:00:38,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:38,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:39,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3071 3080 3049 [WARNING|trainer.py:803] 2025-04-26 16:00:40,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:00:41,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:41,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3072 3081 3050 [WARNING|trainer.py:803] 2025-04-26 16:00:42,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:43,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:43,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3073 3082 3051 [WARNING|trainer.py:803] 2025-04-26 16:00:44,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:00:45,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:00:45,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3074 3083 3052 [WARNING|trainer.py:803] 2025-04-26 16:00:46,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:47,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:00:47,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3075 3084 3053 [WARNING|trainer.py:803] 2025-04-26 16:00:49,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:49,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:49,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3076 3085 3054 [WARNING|trainer.py:803] 2025-04-26 16:00:51,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:51,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:51,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3077 3055 3086 [WARNING|trainer.py:803] 2025-04-26 16:00:53,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:54,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:54,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3078 3087 3056 [WARNING|trainer.py:803] 2025-04-26 16:00:55,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:56,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:00:56,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3079 3088 3057 [WARNING|trainer.py:803] 2025-04-26 16:00:57,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:00:58,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:00:58,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3080 3089 3058 [WARNING|trainer.py:803] 2025-04-26 16:00:59,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:00,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:01:00,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3081 3090 3059 [WARNING|trainer.py:803] 2025-04-26 16:01:01,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:02,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:02,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3082 3091 3060 [WARNING|trainer.py:803] 2025-04-26 16:01:04,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:01:04,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:01:04,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3083 3092 3061 [WARNING|trainer.py:803] 2025-04-26 16:01:06,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:06,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:06,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3084 3093 3062 [WARNING|trainer.py:803] 2025-04-26 16:01:08,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:08,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:08,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3085 3094 3063 [WARNING|trainer.py:803] 2025-04-26 16:01:10,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:10,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:11,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3086 3095 3064 [WARNING|trainer.py:803] 2025-04-26 16:01:12,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:12,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:13,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3087 3096 3065 [WARNING|trainer.py:803] 2025-04-26 16:01:14,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:15,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:15,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3088 3097 3066 [WARNING|trainer.py:803] 2025-04-26 16:01:16,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:01:17,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:17,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3089 3098 3067 [WARNING|trainer.py:803] 2025-04-26 16:01:19,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:01:19,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:01:19,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3090 3099 3068 [WARNING|trainer.py:803] 2025-04-26 16:01:21,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:21,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:21,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3091 3100 3069 [WARNING|trainer.py:803] 2025-04-26 16:01:23,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:01:23,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:23,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3092 3070 [WARNING|trainer.py:803] 2025-04-26 16:01:25,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:26,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3101 3093 [WARNING|trainer.py:803] 2025-04-26 16:01:26,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3071 [WARNING|trainer.py:803] 2025-04-26 16:01:27,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:28,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3094 3102 3072 [WARNING|trainer.py:803] 2025-04-26 16:01:29,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:29,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:30,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3095 3103 3073 [WARNING|trainer.py:803] 2025-04-26 16:01:31,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:32,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:01:32,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3096 3074 [WARNING|trainer.py:803] 2025-04-26 16:01:33,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3104 [WARNING|trainer.py:803] 2025-04-26 16:01:34,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:34,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3097 3075 [WARNING|trainer.py:803] 2025-04-26 16:01:36,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3105 [WARNING|trainer.py:803] 2025-04-26 16:01:36,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:37,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3098 3076 [WARNING|trainer.py:803] 2025-04-26 16:01:38,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:38,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3106 3099 3077 [WARNING|trainer.py:803] 2025-04-26 16:01:40,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:40,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3107 3100 3078 [WARNING|trainer.py:803] 2025-04-26 16:01:42,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:01:42,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:43,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3079 3108 3101 [WARNING|trainer.py:803] 2025-04-26 16:01:45,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:01:45,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:45,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3080 3109 [WARNING|trainer.py:803] 2025-04-26 16:01:47,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3102 [WARNING|trainer.py:803] 2025-04-26 16:01:47,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:48,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3081 3110 [WARNING|trainer.py:803] 2025-04-26 16:01:49,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:01:50,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3103 3082 [WARNING|trainer.py:803] 2025-04-26 16:01:51,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:01:51,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3111 [WARNING|trainer.py:803] 2025-04-26 16:01:52,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3083 3104 [WARNING|trainer.py:803] 2025-04-26 16:01:53,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:01:53,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3112 [WARNING|trainer.py:803] 2025-04-26 16:01:54,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3084 3105 [WARNING|trainer.py:803] 2025-04-26 16:01:55,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:56,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3113 3085 [WARNING|trainer.py:803] 2025-04-26 16:01:57,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3106 [WARNING|trainer.py:803] 2025-04-26 16:01:58,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:01:58,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3114 3086 [WARNING|trainer.py:803] 2025-04-26 16:01:59,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:00,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3107 [WARNING|trainer.py:803] 2025-04-26 16:02:01,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3087 3115 [WARNING|trainer.py:803] 2025-04-26 16:02:02,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:02,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3108 3088 [WARNING|trainer.py:803] 2025-04-26 16:02:04,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:04,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3116 [WARNING|trainer.py:803] 2025-04-26 16:02:05,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3109 3089 [WARNING|trainer.py:803] 2025-04-26 16:02:06,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:06,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3117 3090 [WARNING|trainer.py:803] 2025-04-26 16:02:07,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3110 [WARNING|trainer.py:803] 2025-04-26 16:02:08,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:08,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3091 3118 3111 [WARNING|trainer.py:803] 2025-04-26 16:02:10,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:02:10,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:11,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3092 3119 3112 [WARNING|trainer.py:803] 2025-04-26 16:02:12,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:13,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:13,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3093 [WARNING|trainer.py:803] 2025-04-26 16:02:14,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3113 3120 [WARNING|trainer.py:803] 2025-04-26 16:02:15,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3094 [WARNING|trainer.py:803] 2025-04-26 16:02:16,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:02:17,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3114 3121 3095 [WARNING|trainer.py:803] 2025-04-26 16:02:18,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:19,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:02:19,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3115 3096 [WARNING|trainer.py:803] 2025-04-26 16:02:21,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:02:21,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3122 [WARNING|trainer.py:803] 2025-04-26 16:02:22,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3097 3116 [WARNING|trainer.py:803] 2025-04-26 16:02:23,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3123 [WARNING|trainer.py:803] 2025-04-26 16:02:23,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:24,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3098 3117 [WARNING|trainer.py:803] 2025-04-26 16:02:25,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:26,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3124 3099 [WARNING|trainer.py:803] 2025-04-26 16:02:27,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:02:27,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3118 3100 [WARNING|trainer.py:803] 2025-04-26 16:02:29,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3125 [WARNING|trainer.py:803] 2025-04-26 16:02:30,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:02:30,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3119 [WARNING|trainer.py:803] 2025-04-26 16:02:32,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3126 3101 [WARNING|trainer.py:803] 2025-04-26 16:02:33,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:33,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3120 [WARNING|trainer.py:803] 2025-04-26 16:02:34,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3102 3127 [WARNING|trainer.py:803] 2025-04-26 16:02:35,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:36,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3121 [WARNING|trainer.py:803] 2025-04-26 16:02:37,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3103 3128 [WARNING|trainer.py:803] 2025-04-26 16:02:38,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:38,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3122 3104 3129 [WARNING|trainer.py:803] 2025-04-26 16:02:41,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:41,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:41,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3123 3105 3130 [WARNING|trainer.py:803] 2025-04-26 16:02:43,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:02:43,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:02:43,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3131 3124 3106 [WARNING|trainer.py:803] 2025-04-26 16:02:46,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:46,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:02:46,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3132 3107 3125 [WARNING|trainer.py:803] 2025-04-26 16:02:48,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:48,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:02:49,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3133 3108 [WARNING|trainer.py:803] 2025-04-26 16:02:50,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3126 [WARNING|trainer.py:803] 2025-04-26 16:02:51,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:02:51,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3134 3109 [WARNING|trainer.py:803] 2025-04-26 16:02:53,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:53,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3127 [WARNING|trainer.py:803] 2025-04-26 16:02:54,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3135 3110 [WARNING|trainer.py:803] 2025-04-26 16:02:56,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3128 [WARNING|trainer.py:803] 2025-04-26 16:02:56,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:02:57,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3136 3111 [WARNING|trainer.py:803] 2025-04-26 16:02:58,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:02:58,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3129 [WARNING|trainer.py:803] 2025-04-26 16:02:59,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3137 3112 [WARNING|trainer.py:803] 2025-04-26 16:03:01,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:01,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3130 [WARNING|trainer.py:803] 2025-04-26 16:03:02,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3113 3138 [WARNING|trainer.py:803] 2025-04-26 16:03:03,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:03,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3131 [WARNING|trainer.py:803] 2025-04-26 16:03:04,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3114 3139 [WARNING|trainer.py:803] 2025-04-26 16:03:06,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:06,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3132 [WARNING|trainer.py:803] 2025-04-26 16:03:07,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3115 3140 3133 [WARNING|trainer.py:803] 2025-04-26 16:03:09,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:09,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:09,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3116 3141 3134 [WARNING|trainer.py:803] 2025-04-26 16:03:11,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:11,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:12,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3142 3117 [WARNING|trainer.py:803] 2025-04-26 16:03:14,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3135 [WARNING|trainer.py:803] 2025-04-26 16:03:14,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:15,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3143 3136 3118 [WARNING|trainer.py:803] 2025-04-26 16:03:16,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:17,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:03:17,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3144 3137 [WARNING|trainer.py:803] 2025-04-26 16:03:19,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3119 [WARNING|trainer.py:803] 2025-04-26 16:03:19,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:20,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3145 3138 3120 [WARNING|trainer.py:803] 2025-04-26 16:03:22,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:22,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:22,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3146 3139 [WARNING|trainer.py:803] 2025-04-26 16:03:24,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3121 [WARNING|trainer.py:803] 2025-04-26 16:03:25,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:25,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3147 3140 [WARNING|trainer.py:803] 2025-04-26 16:03:27,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3122 [WARNING|trainer.py:803] 2025-04-26 16:03:28,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:28,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3148 3141 [WARNING|trainer.py:803] 2025-04-26 16:03:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3123 [WARNING|trainer.py:803] 2025-04-26 16:03:30,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3149 [WARNING|trainer.py:803] 2025-04-26 16:03:31,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:03:31,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3142 [WARNING|trainer.py:803] 2025-04-26 16:03:33,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3150 3124 [WARNING|trainer.py:803] 2025-04-26 16:03:34,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:03:34,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3143 3151 [WARNING|trainer.py:803] 2025-04-26 16:03:35,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3125 [WARNING|trainer.py:803] 2025-04-26 16:03:36,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:36,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3144 3152 [WARNING|trainer.py:803] 2025-04-26 16:03:38,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:38,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3126 [WARNING|trainer.py:803] 2025-04-26 16:03:39,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3153 [WARNING|trainer.py:803] 2025-04-26 16:03:40,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3145 [WARNING|trainer.py:803] 2025-04-26 16:03:41,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3154 3127 [WARNING|trainer.py:803] 2025-04-26 16:03:42,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:42,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3146 3155 [WARNING|trainer.py:803] 2025-04-26 16:03:43,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3128 [WARNING|trainer.py:803] 2025-04-26 16:03:44,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:45,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3147 3156 [WARNING|trainer.py:803] 2025-04-26 16:03:46,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:03:46,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3129 [WARNING|trainer.py:803] 2025-04-26 16:03:47,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3157 3148 [WARNING|trainer.py:803] 2025-04-26 16:03:48,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:03:49,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3130 3158 [WARNING|trainer.py:803] 2025-04-26 16:03:50,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3149 [WARNING|trainer.py:803] 2025-04-26 16:03:50,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:51,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3131 3159 3150 [WARNING|trainer.py:803] 2025-04-26 16:03:52,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:52,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:03:53,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3160 3151 3132 [WARNING|trainer.py:803] 2025-04-26 16:03:55,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:03:55,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:03:55,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3161 3152 3133 [WARNING|trainer.py:803] 2025-04-26 16:03:57,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:03:57,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:57,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3162 3153 3134 [WARNING|trainer.py:803] 2025-04-26 16:03:59,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:03:59,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:03:59,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3163 3154 [WARNING|trainer.py:803] 2025-04-26 16:04:01,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:01,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3135 3155 3164 [WARNING|trainer.py:803] 2025-04-26 16:04:02,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:03,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:03,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3136 3156 3165 [WARNING|trainer.py:803] 2025-04-26 16:04:04,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:05,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:05,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3166 3137 3157 [WARNING|trainer.py:803] 2025-04-26 16:04:07,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:07,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:07,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3167 3158 3138 [WARNING|trainer.py:803] 2025-04-26 16:04:09,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:04:09,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:10,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3168 3159 [WARNING|trainer.py:803] 2025-04-26 16:04:11,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:11,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3139 [WARNING|trainer.py:803] 2025-04-26 16:04:12,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3169 3160 [WARNING|trainer.py:803] 2025-04-26 16:04:13,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:14,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3140 3170 3161 [WARNING|trainer.py:803] 2025-04-26 16:04:15,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:15,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:16,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3171 3162 3141 [WARNING|trainer.py:803] 2025-04-26 16:04:17,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:18,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:04:18,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3172 3163 3142 [WARNING|trainer.py:803] 2025-04-26 16:04:19,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:20,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:20,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3173 3164 [WARNING|trainer.py:803] 2025-04-26 16:04:21,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3143 [WARNING|trainer.py:803] 2025-04-26 16:04:22,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3174 [WARNING|trainer.py:803] 2025-04-26 16:04:23,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3165 [WARNING|trainer.py:803] 2025-04-26 16:04:24,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:24,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3144 3175 3166 [WARNING|trainer.py:803] 2025-04-26 16:04:25,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:26,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:26,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3176 3167 [WARNING|trainer.py:803] 2025-04-26 16:04:28,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3145 [WARNING|trainer.py:803] 2025-04-26 16:04:28,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:04:29,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3177 3168 [WARNING|trainer.py:803] 2025-04-26 16:04:30,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:30,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3146 [WARNING|trainer.py:803] 2025-04-26 16:04:31,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3178 3169 [WARNING|trainer.py:803] 2025-04-26 16:04:32,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:32,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3147 3170 3179 [WARNING|trainer.py:803] 2025-04-26 16:04:34,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:34,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:34,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3171 3148 3180 [WARNING|trainer.py:803] 2025-04-26 16:04:36,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:36,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:36,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3172 3149 3181 [WARNING|trainer.py:803] 2025-04-26 16:04:38,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:04:38,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:38,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3173 3150 3182 [WARNING|trainer.py:803] 2025-04-26 16:04:40,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:04:40,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 16:04:40,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes NoNo 3174 3183 3151 [WARNING|trainer.py:803] 2025-04-26 16:04:42,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:42,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:43,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3175 3184 3152 [WARNING|trainer.py:803] 2025-04-26 16:04:44,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:45,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:45,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3176 3185 3153 [WARNING|trainer.py:803] 2025-04-26 16:04:47,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:47,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:47,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3186 3177 3154 [WARNING|trainer.py:803] 2025-04-26 16:04:49,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:49,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:49,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3187 3155 3178 [WARNING|trainer.py:803] 2025-04-26 16:04:51,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:51,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:51,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3188 3156 3179 [WARNING|trainer.py:803] 2025-04-26 16:04:53,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:53,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:04:53,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3189 3157 3180 [WARNING|trainer.py:803] 2025-04-26 16:04:55,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:55,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:04:55,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3190 3158 3181 [WARNING|trainer.py:803] 2025-04-26 16:04:57,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:04:57,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:04:57,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3191 3182 3159 [WARNING|trainer.py:803] 2025-04-26 16:04:59,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:04:59,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:04:59,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3183 3192 3160 [WARNING|trainer.py:803] 2025-04-26 16:05:01,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:01,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:05:02,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3184 3193 3161 [WARNING|trainer.py:803] 2025-04-26 16:05:03,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:03,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:04,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3194 3185 3162 [WARNING|trainer.py:803] 2025-04-26 16:05:06,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:05:06,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:06,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3186 3195 3163 [WARNING|trainer.py:803] 2025-04-26 16:05:08,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:08,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:08,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3196 3187 3164 [WARNING|trainer.py:803] 2025-04-26 16:05:10,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:10,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:10,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3197 3188 3165 [WARNING|trainer.py:803] 2025-04-26 16:05:12,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:12,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:12,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3166 3198 3189 [WARNING|trainer.py:803] 2025-04-26 16:05:14,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:14,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:14,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3167 3199 3190 [WARNING|trainer.py:803] 2025-04-26 16:05:16,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:16,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:16,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3191 3168 3200 [WARNING|trainer.py:803] 2025-04-26 16:05:18,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 16:05:18,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:18,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3201 3169 3192 [WARNING|trainer.py:803] 2025-04-26 16:05:20,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:20,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:20,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3202 3170 3193 [WARNING|trainer.py:803] 2025-04-26 16:05:22,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:22,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:23,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3171 3203 3194 [WARNING|trainer.py:803] 2025-04-26 16:05:24,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:24,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:25,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3204 3172 3195 [WARNING|trainer.py:803] 2025-04-26 16:05:26,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:27,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:27,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3205 3173 3196 [WARNING|trainer.py:803] 2025-04-26 16:05:28,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:29,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:05:29,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3206 3174 3197 [WARNING|trainer.py:803] 2025-04-26 16:05:31,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:31,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:31,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3207 3175 3198 [WARNING|trainer.py:803] 2025-04-26 16:05:33,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:33,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:33,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3208 3176 3199 [WARNING|trainer.py:803] 2025-04-26 16:05:35,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:35,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:35,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3209 3177 3200 [WARNING|trainer.py:803] 2025-04-26 16:05:37,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:37,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:37,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3210 3201 3178 [WARNING|trainer.py:803] 2025-04-26 16:05:39,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:39,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:39,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3211 3202 3179 [WARNING|trainer.py:803] 2025-04-26 16:05:41,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:41,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:41,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3212 3203 3180 [WARNING|trainer.py:803] 2025-04-26 16:05:43,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:43,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:44,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3213 3204 3181 [WARNING|trainer.py:803] 2025-04-26 16:05:45,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:45,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:46,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3214 3205 3182 [WARNING|trainer.py:803] 2025-04-26 16:05:47,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:47,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:05:48,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3215 3206 3183 [WARNING|trainer.py:803] 2025-04-26 16:05:49,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:49,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:50,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3216 3207 3184 [WARNING|trainer.py:803] 2025-04-26 16:05:51,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:52,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:52,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3217 3208 3185 [WARNING|trainer.py:803] 2025-04-26 16:05:53,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:05:54,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:05:54,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3218 3209 3186 [WARNING|trainer.py:803] 2025-04-26 16:05:55,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:05:56,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:56,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3219 3210 3187 [WARNING|trainer.py:803] 2025-04-26 16:05:57,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:05:58,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:05:58,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3220 3211 3188 [WARNING|trainer.py:803] 2025-04-26 16:05:59,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:00,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:00,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3221 3212 3189 [WARNING|trainer.py:803] 2025-04-26 16:06:02,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:02,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:02,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3222 3213 [WARNING|trainer.py:803] 2025-04-26 16:06:04,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3190 [WARNING|trainer.py:803] 2025-04-26 16:06:04,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:04,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3223 3214 [WARNING|trainer.py:803] 2025-04-26 16:06:06,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3191 [WARNING|trainer.py:803] 2025-04-26 16:06:06,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:06,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3224 3215 [WARNING|trainer.py:803] 2025-04-26 16:06:08,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3192 [WARNING|trainer.py:803] 2025-04-26 16:06:08,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:06:09,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3225 3216 [WARNING|trainer.py:803] 2025-04-26 16:06:10,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3193 [WARNING|trainer.py:803] 2025-04-26 16:06:10,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:11,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3226 3217 [WARNING|trainer.py:803] 2025-04-26 16:06:12,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3194 [WARNING|trainer.py:803] 2025-04-26 16:06:12,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3227 [WARNING|trainer.py:803] 2025-04-26 16:06:13,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3218 [WARNING|trainer.py:803] 2025-04-26 16:06:14,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3195 [WARNING|trainer.py:803] 2025-04-26 16:06:14,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3228 [WARNING|trainer.py:803] 2025-04-26 16:06:15,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3219 [WARNING|trainer.py:803] 2025-04-26 16:06:16,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:16,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3196 3229 [WARNING|trainer.py:803] 2025-04-26 16:06:17,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3220 [WARNING|trainer.py:803] 2025-04-26 16:06:18,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:18,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3197 3230 [WARNING|trainer.py:803] 2025-04-26 16:06:19,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3221 [WARNING|trainer.py:803] 2025-04-26 16:06:20,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:20,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3198 3231 [WARNING|trainer.py:803] 2025-04-26 16:06:21,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3222 [WARNING|trainer.py:803] 2025-04-26 16:06:22,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:22,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3199 3232 [WARNING|trainer.py:803] 2025-04-26 16:06:24,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3223 [WARNING|trainer.py:803] 2025-04-26 16:06:24,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:24,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3200 3233 3224 [WARNING|trainer.py:803] 2025-04-26 16:06:26,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:26,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:26,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3201 3234 3225 [WARNING|trainer.py:803] 2025-04-26 16:06:28,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:28,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:28,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3235 3202 3226 [WARNING|trainer.py:803] 2025-04-26 16:06:30,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:30,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:30,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3236 3203 3227 [WARNING|trainer.py:803] 2025-04-26 16:06:32,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:32,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:32,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3237 3204 3228 [WARNING|trainer.py:803] 2025-04-26 16:06:34,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:34,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:34,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3205 3238 3229 [WARNING|trainer.py:803] 2025-04-26 16:06:36,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:36,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:36,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3239 3206 3230 [WARNING|trainer.py:803] 2025-04-26 16:06:38,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:38,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:38,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3231 3240 3207 [WARNING|trainer.py:803] 2025-04-26 16:06:40,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:40,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:40,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3241 3232 3208 [WARNING|trainer.py:803] 2025-04-26 16:06:42,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:42,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:42,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3242 3233 3209 [WARNING|trainer.py:803] 2025-04-26 16:06:44,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:44,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:44,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3234 3243 3210 [WARNING|trainer.py:803] 2025-04-26 16:06:46,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:46,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:46,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3244 3235 3211 [WARNING|trainer.py:803] 2025-04-26 16:06:48,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:06:48,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:49,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3236 3245 3212 [WARNING|trainer.py:803] 2025-04-26 16:06:50,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:50,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:51,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3237 3246 3213 [WARNING|trainer.py:803] 2025-04-26 16:06:52,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:52,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:53,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3247 3238 3214 [WARNING|trainer.py:803] 2025-04-26 16:06:54,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:54,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:55,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3239 3248 3215 [WARNING|trainer.py:803] 2025-04-26 16:06:56,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:06:56,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:57,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3249 3240 3216 [WARNING|trainer.py:803] 2025-04-26 16:06:58,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:06:58,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:06:59,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3241 3250 3217 [WARNING|trainer.py:803] 2025-04-26 16:07:00,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:00,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:01,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 3251 3242 3218 [WARNING|trainer.py:803] 2025-04-26 16:07:02,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:02,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:03,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3252 3243 3219 [WARNING|trainer.py:803] 2025-04-26 16:07:04,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:04,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:07:05,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3253 3244 3220 [WARNING|trainer.py:803] 2025-04-26 16:07:06,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:06,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:07,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3254 3245 [WARNING|trainer.py:803] 2025-04-26 16:07:08,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3221 [WARNING|trainer.py:803] 2025-04-26 16:07:09,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:09,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3255 3246 [WARNING|trainer.py:803] 2025-04-26 16:07:10,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3222 [WARNING|trainer.py:803] 2025-04-26 16:07:11,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:07:11,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3256 3247 [WARNING|trainer.py:803] 2025-04-26 16:07:12,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3223 [WARNING|trainer.py:803] 2025-04-26 16:07:12,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3257 [WARNING|trainer.py:803] 2025-04-26 16:07:13,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3248 [WARNING|trainer.py:803] 2025-04-26 16:07:14,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3224 [WARNING|trainer.py:803] 2025-04-26 16:07:14,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:15,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3258 3249 [WARNING|trainer.py:803] 2025-04-26 16:07:16,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3225 [WARNING|trainer.py:803] 2025-04-26 16:07:16,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:17,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3259 3250 [WARNING|trainer.py:803] 2025-04-26 16:07:18,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3226 [WARNING|trainer.py:803] 2025-04-26 16:07:18,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3260 [WARNING|trainer.py:803] 2025-04-26 16:07:19,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3251 [WARNING|trainer.py:803] 2025-04-26 16:07:20,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3227 [WARNING|trainer.py:803] 2025-04-26 16:07:20,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:21,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3261 3252 [WARNING|trainer.py:803] 2025-04-26 16:07:22,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3228 [WARNING|trainer.py:803] 2025-04-26 16:07:22,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3262 [WARNING|trainer.py:803] 2025-04-26 16:07:23,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3253 [WARNING|trainer.py:803] 2025-04-26 16:07:24,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:24,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3229 3263 [WARNING|trainer.py:803] 2025-04-26 16:07:25,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3254 [WARNING|trainer.py:803] 2025-04-26 16:07:26,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:07:26,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3230 3255 3264 [WARNING|trainer.py:803] 2025-04-26 16:07:27,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:28,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:28,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3231 [WARNING|trainer.py:803] 2025-04-26 16:07:29,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3256 3265 [WARNING|trainer.py:803] 2025-04-26 16:07:30,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:07:30,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3232 3257 [WARNING|trainer.py:803] 2025-04-26 16:07:31,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3266 [WARNING|trainer.py:803] 2025-04-26 16:07:32,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:07:32,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3233 3258 [WARNING|trainer.py:803] 2025-04-26 16:07:33,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3267 [WARNING|trainer.py:803] 2025-04-26 16:07:34,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:07:34,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3234 3259 [WARNING|trainer.py:803] 2025-04-26 16:07:35,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3268 [WARNING|trainer.py:803] 2025-04-26 16:07:36,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No [WARNING|trainer.py:803] 2025-04-26 16:07:36,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3235 3260 [WARNING|trainer.py:803] 2025-04-26 16:07:37,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3269 [WARNING|trainer.py:803] 2025-04-26 16:07:38,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:07:38,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3236 3261 [WARNING|trainer.py:803] 2025-04-26 16:07:39,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3270 [WARNING|trainer.py:803] 2025-04-26 16:07:40,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:40,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3237 3262 [WARNING|trainer.py:803] 2025-04-26 16:07:41,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3271 [WARNING|trainer.py:803] 2025-04-26 16:07:42,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:42,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3238 3263 [WARNING|trainer.py:803] 2025-04-26 16:07:43,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3272 [WARNING|trainer.py:803] 2025-04-26 16:07:44,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:07:44,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3239 3264 [WARNING|trainer.py:803] 2025-04-26 16:07:45,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3273 [WARNING|trainer.py:803] 2025-04-26 16:07:46,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:46,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3240 3265 [WARNING|trainer.py:803] 2025-04-26 16:07:47,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3274 [WARNING|trainer.py:803] 2025-04-26 16:07:48,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:48,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3241 3266 [WARNING|trainer.py:803] 2025-04-26 16:07:49,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3275 [WARNING|trainer.py:803] 2025-04-26 16:07:50,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:07:50,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3242 3267 [WARNING|trainer.py:803] 2025-04-26 16:07:51,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3276 [WARNING|trainer.py:803] 2025-04-26 16:07:52,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:52,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 3243 3268 [WARNING|trainer.py:803] 2025-04-26 16:07:53,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3277 [WARNING|trainer.py:803] 2025-04-26 16:07:54,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:54,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3244 [WARNING|trainer.py:803] 2025-04-26 16:07:55,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3269 3278 [WARNING|trainer.py:803] 2025-04-26 16:07:56,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:07:56,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3245 3270 [WARNING|trainer.py:803] 2025-04-26 16:07:57,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3279 [WARNING|trainer.py:803] 2025-04-26 16:07:58,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:07:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3246 3271 [WARNING|trainer.py:803] 2025-04-26 16:08:00,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3280 [WARNING|trainer.py:803] 2025-04-26 16:08:00,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:00,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3247 3272 [WARNING|trainer.py:803] 2025-04-26 16:08:01,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3281 [WARNING|trainer.py:803] 2025-04-26 16:08:02,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:02,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3248 3273 [WARNING|trainer.py:803] 2025-04-26 16:08:04,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3282 [WARNING|trainer.py:803] 2025-04-26 16:08:04,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:05,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3249 [WARNING|trainer.py:803] 2025-04-26 16:08:05,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3274 3283 [WARNING|trainer.py:803] 2025-04-26 16:08:06,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:07,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3250 [WARNING|trainer.py:803] 2025-04-26 16:08:08,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3275 3284 3251 [WARNING|trainer.py:803] 2025-04-26 16:08:08,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:09,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:09,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3276 3285 3252 [WARNING|trainer.py:803] 2025-04-26 16:08:11,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 16:08:11,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:11,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3277 3286 3253 [WARNING|trainer.py:803] 2025-04-26 16:08:13,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:13,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:13,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3278 3254 3287 [WARNING|trainer.py:803] 2025-04-26 16:08:15,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:15,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:15,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3279 3255 3288 [WARNING|trainer.py:803] 2025-04-26 16:08:16,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:17,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:17,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3280 3256 3289 [WARNING|trainer.py:803] 2025-04-26 16:08:19,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:19,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:08:19,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3281 3257 3290 [WARNING|trainer.py:803] 2025-04-26 16:08:21,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:21,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:08:21,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3282 3258 3291 [WARNING|trainer.py:803] 2025-04-26 16:08:23,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:23,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:08:23,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3283 3259 3292 [WARNING|trainer.py:803] 2025-04-26 16:08:25,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:25,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:08:25,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3284 3260 3293 [WARNING|trainer.py:803] 2025-04-26 16:08:27,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:27,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:27,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3285 3261 3294 [WARNING|trainer.py:803] 2025-04-26 16:08:29,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:29,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:29,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3262 3286 3295 [WARNING|trainer.py:803] 2025-04-26 16:08:31,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:31,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:31,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3263 3287 3296 [WARNING|trainer.py:803] 2025-04-26 16:08:33,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:08:33,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:33,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3288 3264 3297 [WARNING|trainer.py:803] 2025-04-26 16:08:35,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:35,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:08:35,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3289 3265 3298 [WARNING|trainer.py:803] 2025-04-26 16:08:37,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:37,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:08:38,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3290 3266 3299 [WARNING|trainer.py:803] 2025-04-26 16:08:39,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:39,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:40,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3291 3267 3300 [WARNING|trainer.py:803] 2025-04-26 16:08:41,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:42,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:08:42,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3301 3292 3268 [WARNING|trainer.py:803] 2025-04-26 16:08:43,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:08:43,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3302 [WARNING|trainer.py:803] 2025-04-26 16:08:44,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:08:44,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3293 3303 3269 [WARNING|trainer.py:803] 2025-04-26 16:08:45,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:46,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:46,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3304 3294 [WARNING|trainer.py:803] 2025-04-26 16:08:47,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3270 3305 [WARNING|trainer.py:803] 2025-04-26 16:08:47,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:48,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:08:48,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3295 3306 3271 [WARNING|trainer.py:803] 2025-04-26 16:08:50,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:50,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:50,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3307 3296 [WARNING|trainer.py:803] 2025-04-26 16:08:51,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3272 3308 [WARNING|trainer.py:803] 2025-04-26 16:08:52,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:08:52,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:08:52,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3309 3297 3273 [WARNING|trainer.py:803] 2025-04-26 16:08:53,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:53,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3310 [WARNING|trainer.py:803] 2025-04-26 16:08:54,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3298 [WARNING|trainer.py:803] 2025-04-26 16:08:55,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3274 3311 [WARNING|trainer.py:803] 2025-04-26 16:08:56,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:56,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:56,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3312 3299 3275 [WARNING|trainer.py:803] 2025-04-26 16:08:57,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:58,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3313 [WARNING|trainer.py:803] 2025-04-26 16:08:58,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:08:59,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3300 3314 3276 [WARNING|trainer.py:803] 2025-04-26 16:09:00,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:09:00,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:00,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesNo 3301 3315 [WARNING|trainer.py:803] 2025-04-26 16:09:01,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3277 [WARNING|trainer.py:803] 2025-04-26 16:09:01,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3302 3316 [WARNING|trainer.py:803] 2025-04-26 16:09:02,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:02,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:03,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3303 3317 3278 [WARNING|trainer.py:803] 2025-04-26 16:09:04,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:04,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:04,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3304 3318 [WARNING|trainer.py:803] 2025-04-26 16:09:05,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3279 [WARNING|trainer.py:803] 2025-04-26 16:09:05,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3305 3319 [WARNING|trainer.py:803] 2025-04-26 16:09:06,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:06,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:06,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3306 3320 3280 [WARNING|trainer.py:803] 2025-04-26 16:09:08,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:08,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:08,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3307 3321 [WARNING|trainer.py:803] 2025-04-26 16:09:09,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:09,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3281 3308 3322 [WARNING|trainer.py:803] 2025-04-26 16:09:10,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:10,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:10,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3309 3323 3282 [WARNING|trainer.py:803] 2025-04-26 16:09:12,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:12,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3310 3324 [WARNING|trainer.py:803] 2025-04-26 16:09:12,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:13,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:13,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3311 3283 3325 [WARNING|trainer.py:803] 2025-04-26 16:09:14,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:14,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:14,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3312 3326 [WARNING|trainer.py:803] 2025-04-26 16:09:15,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3284 [WARNING|trainer.py:803] 2025-04-26 16:09:16,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3313 3327 [WARNING|trainer.py:803] 2025-04-26 16:09:16,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:17,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:17,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3314 3328 3285 [WARNING|trainer.py:803] 2025-04-26 16:09:18,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:18,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3315 [WARNING|trainer.py:803] 2025-04-26 16:09:18,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3329 [WARNING|trainer.py:803] 2025-04-26 16:09:19,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:20,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3316 3286 3330 [WARNING|trainer.py:803] 2025-04-26 16:09:21,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:21,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:21,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3317 3331 3287 [WARNING|trainer.py:803] 2025-04-26 16:09:22,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:22,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3318 3332 [WARNING|trainer.py:803] 2025-04-26 16:09:23,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:23,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:23,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3319 3288 3333 [WARNING|trainer.py:803] 2025-04-26 16:09:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:09:25,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:25,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3320 3334 [WARNING|trainer.py:803] 2025-04-26 16:09:26,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3289 [WARNING|trainer.py:803] 2025-04-26 16:09:26,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3321 3335 [WARNING|trainer.py:803] 2025-04-26 16:09:27,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:27,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:09:27,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3322 3336 3290 [WARNING|trainer.py:803] 2025-04-26 16:09:28,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:29,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3323 [WARNING|trainer.py:803] 2025-04-26 16:09:29,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3337 [WARNING|trainer.py:803] 2025-04-26 16:09:30,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:30,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3291 3324 3338 [WARNING|trainer.py:803] 2025-04-26 16:09:31,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:31,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:31,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3325 3339 3292 [WARNING|trainer.py:803] 2025-04-26 16:09:32,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:09:33,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3326 3340 [WARNING|trainer.py:803] 2025-04-26 16:09:33,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:34,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:34,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3327 3293 3341 [WARNING|trainer.py:803] 2025-04-26 16:09:35,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:09:35,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:35,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3328 3342 [WARNING|trainer.py:803] 2025-04-26 16:09:36,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3294 3329 [WARNING|trainer.py:803] 2025-04-26 16:09:37,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:37,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3343 [WARNING|trainer.py:803] 2025-04-26 16:09:37,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3330 [WARNING|trainer.py:803] 2025-04-26 16:09:38,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3295 3344 [WARNING|trainer.py:803] 2025-04-26 16:09:39,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:09:39,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3331 [WARNING|trainer.py:803] 2025-04-26 16:09:39,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3345 [WARNING|trainer.py:803] 2025-04-26 16:09:40,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3296 3332 [WARNING|trainer.py:803] 2025-04-26 16:09:41,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3346 [WARNING|trainer.py:803] 2025-04-26 16:09:41,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:09:41,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3333 [WARNING|trainer.py:803] 2025-04-26 16:09:42,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3297 3347 [WARNING|trainer.py:803] 2025-04-26 16:09:43,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3334 [WARNING|trainer.py:803] 2025-04-26 16:09:43,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:43,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3348 [WARNING|trainer.py:803] 2025-04-26 16:09:44,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3335 3298 [WARNING|trainer.py:803] 2025-04-26 16:09:44,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3349 [WARNING|trainer.py:803] 2025-04-26 16:09:45,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:45,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3336 [WARNING|trainer.py:803] 2025-04-26 16:09:46,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3350 [WARNING|trainer.py:803] 2025-04-26 16:09:46,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3299 3337 [WARNING|trainer.py:803] 2025-04-26 16:09:47,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:47,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3351 [WARNING|trainer.py:803] 2025-04-26 16:09:48,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3338 [WARNING|trainer.py:803] 2025-04-26 16:09:48,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3300 3352 [WARNING|trainer.py:803] 2025-04-26 16:09:49,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:49,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3339 [WARNING|trainer.py:803] 2025-04-26 16:09:50,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3301 3353 [WARNING|trainer.py:803] 2025-04-26 16:09:50,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:09:51,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3340 [WARNING|trainer.py:803] 2025-04-26 16:09:51,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3302 3354 [WARNING|trainer.py:803] 2025-04-26 16:09:52,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:52,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3341 [WARNING|trainer.py:803] 2025-04-26 16:09:52,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3303 3355 [WARNING|trainer.py:803] 2025-04-26 16:09:53,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:53,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3342 [WARNING|trainer.py:803] 2025-04-26 16:09:54,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3304 3356 [WARNING|trainer.py:803] 2025-04-26 16:09:54,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:09:55,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3343 [WARNING|trainer.py:803] 2025-04-26 16:09:55,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3305 3357 [WARNING|trainer.py:803] 2025-04-26 16:09:55,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:56,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3344 [WARNING|trainer.py:803] 2025-04-26 16:09:56,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3306 3358 [WARNING|trainer.py:803] 2025-04-26 16:09:57,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:57,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3345 [WARNING|trainer.py:803] 2025-04-26 16:09:57,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3307 3359 [WARNING|trainer.py:803] 2025-04-26 16:09:58,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:09:58,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3346 [WARNING|trainer.py:803] 2025-04-26 16:09:59,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3308 3360 [WARNING|trainer.py:803] 2025-04-26 16:09:59,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:00,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3347 [WARNING|trainer.py:803] 2025-04-26 16:10:00,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3309 3361 [WARNING|trainer.py:803] 2025-04-26 16:10:01,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:01,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3348 [WARNING|trainer.py:803] 2025-04-26 16:10:01,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3310 3362 [WARNING|trainer.py:803] 2025-04-26 16:10:02,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:02,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3349 [WARNING|trainer.py:803] 2025-04-26 16:10:03,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3311 3363 [WARNING|trainer.py:803] 2025-04-26 16:10:03,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:04,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3350 [WARNING|trainer.py:803] 2025-04-26 16:10:04,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3312 3364 [WARNING|trainer.py:803] 2025-04-26 16:10:05,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:05,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3351 [WARNING|trainer.py:803] 2025-04-26 16:10:05,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3313 3365 [WARNING|trainer.py:803] 2025-04-26 16:10:06,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:06,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3352 [WARNING|trainer.py:803] 2025-04-26 16:10:06,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3314 3366 [WARNING|trainer.py:803] 2025-04-26 16:10:07,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:07,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3353 [WARNING|trainer.py:803] 2025-04-26 16:10:08,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3315 3367 [WARNING|trainer.py:803] 2025-04-26 16:10:09,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:09,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3354 [WARNING|trainer.py:803] 2025-04-26 16:10:09,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3316 3368 [WARNING|trainer.py:803] 2025-04-26 16:10:10,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:10,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3355 [WARNING|trainer.py:803] 2025-04-26 16:10:10,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3317 3369 [WARNING|trainer.py:803] 2025-04-26 16:10:11,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:11,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3356 [WARNING|trainer.py:803] 2025-04-26 16:10:12,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3318 3370 [WARNING|trainer.py:803] 2025-04-26 16:10:12,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:13,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3357 [WARNING|trainer.py:803] 2025-04-26 16:10:13,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3319 3371 [WARNING|trainer.py:803] 2025-04-26 16:10:14,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:14,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3358 [WARNING|trainer.py:803] 2025-04-26 16:10:14,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3320 3372 [WARNING|trainer.py:803] 2025-04-26 16:10:15,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:10:15,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3359 [WARNING|trainer.py:803] 2025-04-26 16:10:15,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3321 3373 [WARNING|trainer.py:803] 2025-04-26 16:10:16,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:17,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 3360 :Yes [WARNING|trainer.py:803] 2025-04-26 16:10:17,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3322 3374 [WARNING|trainer.py:803] 2025-04-26 16:10:18,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3361 [WARNING|trainer.py:803] 2025-04-26 16:10:18,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:18,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3323 3375 [WARNING|trainer.py:803] 2025-04-26 16:10:19,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3362 [WARNING|trainer.py:803] 2025-04-26 16:10:19,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:19,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3376 3324 [WARNING|trainer.py:803] 2025-04-26 16:10:20,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3363 [WARNING|trainer.py:803] 2025-04-26 16:10:21,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:21,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3377 3325 [WARNING|trainer.py:803] 2025-04-26 16:10:21,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3364 [WARNING|trainer.py:803] 2025-04-26 16:10:22,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:22,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3378 3326 [WARNING|trainer.py:803] 2025-04-26 16:10:23,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3365 [WARNING|trainer.py:803] 2025-04-26 16:10:23,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:23,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3379 3327 [WARNING|trainer.py:803] 2025-04-26 16:10:24,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3366 [WARNING|trainer.py:803] 2025-04-26 16:10:25,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:25,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3380 3328 [WARNING|trainer.py:803] 2025-04-26 16:10:25,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3367 [WARNING|trainer.py:803] 2025-04-26 16:10:26,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:26,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3381 3329 [WARNING|trainer.py:803] 2025-04-26 16:10:27,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3368 [WARNING|trainer.py:803] 2025-04-26 16:10:27,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:27,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3382 3330 [WARNING|trainer.py:803] 2025-04-26 16:10:28,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3369 [WARNING|trainer.py:803] 2025-04-26 16:10:28,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:10:29,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3383 3331 [WARNING|trainer.py:803] 2025-04-26 16:10:29,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3370 [WARNING|trainer.py:803] 2025-04-26 16:10:30,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:30,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3384 3332 [WARNING|trainer.py:803] 2025-04-26 16:10:30,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3371 [WARNING|trainer.py:803] 2025-04-26 16:10:31,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:31,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3333 3385 [WARNING|trainer.py:803] 2025-04-26 16:10:32,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3372 [WARNING|trainer.py:803] 2025-04-26 16:10:32,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:32,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3334 3386 [WARNING|trainer.py:803] 2025-04-26 16:10:33,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3373 [WARNING|trainer.py:803] 2025-04-26 16:10:34,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:34,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3335 3387 [WARNING|trainer.py:803] 2025-04-26 16:10:34,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3374 [WARNING|trainer.py:803] 2025-04-26 16:10:35,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:35,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3336 3388 [WARNING|trainer.py:803] 2025-04-26 16:10:36,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3375 [WARNING|trainer.py:803] 2025-04-26 16:10:36,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:36,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3389 3337 [WARNING|trainer.py:803] 2025-04-26 16:10:37,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3376 [WARNING|trainer.py:803] 2025-04-26 16:10:38,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:38,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3390 3338 [WARNING|trainer.py:803] 2025-04-26 16:10:38,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3377 [WARNING|trainer.py:803] 2025-04-26 16:10:39,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:39,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3391 3339 [WARNING|trainer.py:803] 2025-04-26 16:10:40,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3378 [WARNING|trainer.py:803] 2025-04-26 16:10:40,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:40,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3392 3340 [WARNING|trainer.py:803] 2025-04-26 16:10:41,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3379 [WARNING|trainer.py:803] 2025-04-26 16:10:41,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:10:42,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3393 3341 [WARNING|trainer.py:803] 2025-04-26 16:10:42,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3380 [WARNING|trainer.py:803] 2025-04-26 16:10:43,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:43,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3394 3342 [WARNING|trainer.py:803] 2025-04-26 16:10:43,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3381 [WARNING|trainer.py:803] 2025-04-26 16:10:44,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:44,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3395 3343 [WARNING|trainer.py:803] 2025-04-26 16:10:45,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3382 [WARNING|trainer.py:803] 2025-04-26 16:10:45,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:45,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3396 3344 [WARNING|trainer.py:803] 2025-04-26 16:10:46,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3383 [WARNING|trainer.py:803] 2025-04-26 16:10:47,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:10:47,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3397 3345 [WARNING|trainer.py:803] 2025-04-26 16:10:47,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3384 [WARNING|trainer.py:803] 2025-04-26 16:10:48,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:10:48,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3398 3346 [WARNING|trainer.py:803] 2025-04-26 16:10:49,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3385 [WARNING|trainer.py:803] 2025-04-26 16:10:49,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:49,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3399 3347 [WARNING|trainer.py:803] 2025-04-26 16:10:50,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3386 [WARNING|trainer.py:803] 2025-04-26 16:10:51,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:51,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3400 3348 [WARNING|trainer.py:803] 2025-04-26 16:10:51,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3387 [WARNING|trainer.py:803] 2025-04-26 16:10:52,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:10:52,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3401 3349 [WARNING|trainer.py:803] 2025-04-26 16:10:52,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3388 [WARNING|trainer.py:803] 2025-04-26 16:10:53,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:10:53,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3402 3350 [WARNING|trainer.py:803] 2025-04-26 16:10:54,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3389 [WARNING|trainer.py:803] 2025-04-26 16:10:54,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:10:55,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3403 [WARNING|trainer.py:803] 2025-04-26 16:10:55,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3351 3390 [WARNING|trainer.py:803] 2025-04-26 16:10:56,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:10:56,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3404 [WARNING|trainer.py:803] 2025-04-26 16:10:56,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3352 3391 [WARNING|trainer.py:803] 2025-04-26 16:10:57,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:10:57,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3405 [WARNING|trainer.py:803] 2025-04-26 16:10:58,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3353 3392 [WARNING|trainer.py:803] 2025-04-26 16:10:58,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:10:58,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3406 [WARNING|trainer.py:803] 2025-04-26 16:10:59,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3354 3393 [WARNING|trainer.py:803] 2025-04-26 16:11:00,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:00,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:00,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3407 3355 3394 [WARNING|trainer.py:803] 2025-04-26 16:11:01,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:01,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:02,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3356 3408 3395 [WARNING|trainer.py:803] 2025-04-26 16:11:02,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:02,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:03,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3357 3409 3396 [WARNING|trainer.py:803] 2025-04-26 16:11:04,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:04,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:04,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3358 3410 3397 [WARNING|trainer.py:803] 2025-04-26 16:11:05,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:11:05,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:05,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3359 3411 3398 [WARNING|trainer.py:803] 2025-04-26 16:11:06,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:06,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:11:07,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3360 3412 3399 [WARNING|trainer.py:803] 2025-04-26 16:11:08,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:08,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:08,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3361 3413 3400 [WARNING|trainer.py:803] 2025-04-26 16:11:09,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:09,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3362 [WARNING|trainer.py:803] 2025-04-26 16:11:09,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3414 3401 [WARNING|trainer.py:803] 2025-04-26 16:11:10,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:10,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3363 [WARNING|trainer.py:803] 2025-04-26 16:11:11,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3415 3402 [WARNING|trainer.py:803] 2025-04-26 16:11:11,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:11,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3364 [WARNING|trainer.py:803] 2025-04-26 16:11:12,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3416 3403 [WARNING|trainer.py:803] 2025-04-26 16:11:13,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:13,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3365 [WARNING|trainer.py:803] 2025-04-26 16:11:13,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3417 3404 [WARNING|trainer.py:803] 2025-04-26 16:11:14,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:14,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3366 [WARNING|trainer.py:803] 2025-04-26 16:11:15,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3418 3405 [WARNING|trainer.py:803] 2025-04-26 16:11:15,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:15,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3367 [WARNING|trainer.py:803] 2025-04-26 16:11:16,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3419 3406 [WARNING|trainer.py:803] 2025-04-26 16:11:17,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:17,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3368 [WARNING|trainer.py:803] 2025-04-26 16:11:17,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3420 3407 [WARNING|trainer.py:803] 2025-04-26 16:11:18,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:18,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3369 [WARNING|trainer.py:803] 2025-04-26 16:11:19,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3421 3408 [WARNING|trainer.py:803] 2025-04-26 16:11:19,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:19,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3370 [WARNING|trainer.py:803] 2025-04-26 16:11:20,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3422 3409 [WARNING|trainer.py:803] 2025-04-26 16:11:21,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:21,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3371 [WARNING|trainer.py:803] 2025-04-26 16:11:21,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3423 3410 [WARNING|trainer.py:803] 2025-04-26 16:11:22,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:11:22,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3372 [WARNING|trainer.py:803] 2025-04-26 16:11:22,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3424 3411 [WARNING|trainer.py:803] 2025-04-26 16:11:23,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:23,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3373 [WARNING|trainer.py:803] 2025-04-26 16:11:24,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3425 3412 [WARNING|trainer.py:803] 2025-04-26 16:11:25,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:25,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3374 3426 [WARNING|trainer.py:803] 2025-04-26 16:11:25,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3413 [WARNING|trainer.py:803] 2025-04-26 16:11:26,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:26,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3375 3427 [WARNING|trainer.py:803] 2025-04-26 16:11:27,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3414 [WARNING|trainer.py:803] 2025-04-26 16:11:27,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:27,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3376 3428 [WARNING|trainer.py:803] 2025-04-26 16:11:28,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3415 [WARNING|trainer.py:803] 2025-04-26 16:11:28,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:28,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3377 3429 [WARNING|trainer.py:803] 2025-04-26 16:11:29,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3416 [WARNING|trainer.py:803] 2025-04-26 16:11:30,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:30,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3378 3430 [WARNING|trainer.py:803] 2025-04-26 16:11:30,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3417 [WARNING|trainer.py:803] 2025-04-26 16:11:31,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:31,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3379 3431 [WARNING|trainer.py:803] 2025-04-26 16:11:32,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3418 [WARNING|trainer.py:803] 2025-04-26 16:11:32,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:32,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3380 3432 [WARNING|trainer.py:803] 2025-04-26 16:11:33,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:11:34,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3419 [WARNING|trainer.py:803] 2025-04-26 16:11:34,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3381 3433 [WARNING|trainer.py:803] 2025-04-26 16:11:35,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:35,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:35,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3420 3434 3382 [WARNING|trainer.py:803] 2025-04-26 16:11:36,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:36,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:36,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3421 3435 3383 [WARNING|trainer.py:803] 2025-04-26 16:11:37,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:38,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:38,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3422 3436 3384 [WARNING|trainer.py:803] 2025-04-26 16:11:38,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:39,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3423 [WARNING|trainer.py:803] 2025-04-26 16:11:39,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3437 3385 [WARNING|trainer.py:803] 2025-04-26 16:11:40,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:40,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3424 [WARNING|trainer.py:803] 2025-04-26 16:11:40,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3438 3386 [WARNING|trainer.py:803] 2025-04-26 16:11:41,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:41,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3425 [WARNING|trainer.py:803] 2025-04-26 16:11:42,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3439 3387 [WARNING|trainer.py:803] 2025-04-26 16:11:42,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:43,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3426 [WARNING|trainer.py:803] 2025-04-26 16:11:43,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3440 3388 [WARNING|trainer.py:803] 2025-04-26 16:11:44,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:44,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3427 [WARNING|trainer.py:803] 2025-04-26 16:11:44,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3441 3389 [WARNING|trainer.py:803] 2025-04-26 16:11:45,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:45,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3428 [WARNING|trainer.py:803] 2025-04-26 16:11:45,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3442 3390 [WARNING|trainer.py:803] 2025-04-26 16:11:46,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:47,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3429 [WARNING|trainer.py:803] 2025-04-26 16:11:47,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3443 3391 [WARNING|trainer.py:803] 2025-04-26 16:11:48,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:48,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:48,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3430 3444 3392 [WARNING|trainer.py:803] 2025-04-26 16:11:49,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:11:49,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:49,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3431 3445 3393 [WARNING|trainer.py:803] 2025-04-26 16:11:50,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:51,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:51,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3432 3446 3394 [WARNING|trainer.py:803] 2025-04-26 16:11:52,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:11:52,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:52,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3433 3447 3395 [WARNING|trainer.py:803] 2025-04-26 16:11:53,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:11:53,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:53,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3434 3396 3448 [WARNING|trainer.py:803] 2025-04-26 16:11:54,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:55,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:11:55,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3435 3397 3449 [WARNING|trainer.py:803] 2025-04-26 16:11:56,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:56,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:11:56,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3436 3398 3450 [WARNING|trainer.py:803] 2025-04-26 16:11:57,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:57,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:11:57,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3437 3399 3451 [WARNING|trainer.py:803] 2025-04-26 16:11:58,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:11:59,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:11:59,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3438 3400 3452 [WARNING|trainer.py:803] 2025-04-26 16:11:59,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:00,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:00,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3439 3401 3453 [WARNING|trainer.py:803] 2025-04-26 16:12:01,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:01,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3440 [WARNING|trainer.py:803] 2025-04-26 16:12:01,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3402 3454 [WARNING|trainer.py:803] 2025-04-26 16:12:02,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:02,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3441 [WARNING|trainer.py:803] 2025-04-26 16:12:03,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3403 3455 [WARNING|trainer.py:803] 2025-04-26 16:12:03,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3442 [WARNING|trainer.py:803] 2025-04-26 16:12:04,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:12:04,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3404 3456 [WARNING|trainer.py:803] 2025-04-26 16:12:05,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3443 [WARNING|trainer.py:803] 2025-04-26 16:12:05,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:05,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3457 3405 [WARNING|trainer.py:803] 2025-04-26 16:12:06,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3444 [WARNING|trainer.py:803] 2025-04-26 16:12:06,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:07,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3458 3406 [WARNING|trainer.py:803] 2025-04-26 16:12:07,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3445 [WARNING|trainer.py:803] 2025-04-26 16:12:08,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:08,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3459 3407 [WARNING|trainer.py:803] 2025-04-26 16:12:09,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3446 [WARNING|trainer.py:803] 2025-04-26 16:12:09,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:12:09,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3460 3408 [WARNING|trainer.py:803] 2025-04-26 16:12:10,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3447 [WARNING|trainer.py:803] 2025-04-26 16:12:10,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:11,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3461 3409 [WARNING|trainer.py:803] 2025-04-26 16:12:11,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3448 [WARNING|trainer.py:803] 2025-04-26 16:12:12,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:12,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3462 3410 [WARNING|trainer.py:803] 2025-04-26 16:12:12,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3449 [WARNING|trainer.py:803] 2025-04-26 16:12:13,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:13,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3463 [WARNING|trainer.py:803] 2025-04-26 16:12:14,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3411 3450 [WARNING|trainer.py:803] 2025-04-26 16:12:14,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:15,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3464 [WARNING|trainer.py:803] 2025-04-26 16:12:15,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3412 3451 [WARNING|trainer.py:803] 2025-04-26 16:12:16,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:16,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3465 [WARNING|trainer.py:803] 2025-04-26 16:12:16,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3413 3452 [WARNING|trainer.py:803] 2025-04-26 16:12:17,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:17,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3466 [WARNING|trainer.py:803] 2025-04-26 16:12:18,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3414 3453 [WARNING|trainer.py:803] 2025-04-26 16:12:18,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:18,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3467 [WARNING|trainer.py:803] 2025-04-26 16:12:19,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3415 3454 [WARNING|trainer.py:803] 2025-04-26 16:12:20,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:12:20,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3468 [WARNING|trainer.py:803] 2025-04-26 16:12:20,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3416 3455 [WARNING|trainer.py:803] 2025-04-26 16:12:21,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:21,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3469 [WARNING|trainer.py:803] 2025-04-26 16:12:21,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3417 3456 [WARNING|trainer.py:803] 2025-04-26 16:12:22,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:22,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3470 [WARNING|trainer.py:803] 2025-04-26 16:12:23,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3418 3457 [WARNING|trainer.py:803] 2025-04-26 16:12:23,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:24,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3471 [WARNING|trainer.py:803] 2025-04-26 16:12:24,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3419 3458 [WARNING|trainer.py:803] 2025-04-26 16:12:25,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:25,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3472 [WARNING|trainer.py:803] 2025-04-26 16:12:25,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3420 3459 [WARNING|trainer.py:803] 2025-04-26 16:12:26,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:26,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3473 [WARNING|trainer.py:803] 2025-04-26 16:12:27,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3421 3460 [WARNING|trainer.py:803] 2025-04-26 16:12:27,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3474 [WARNING|trainer.py:803] 2025-04-26 16:12:28,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:28,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3422 3461 [WARNING|trainer.py:803] 2025-04-26 16:12:29,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3475 [WARNING|trainer.py:803] 2025-04-26 16:12:29,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:29,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3423 3462 [WARNING|trainer.py:803] 2025-04-26 16:12:30,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3476 [WARNING|trainer.py:803] 2025-04-26 16:12:30,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:31,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3424 3463 [WARNING|trainer.py:803] 2025-04-26 16:12:31,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3477 [WARNING|trainer.py:803] 2025-04-26 16:12:32,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:32,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3425 3464 [WARNING|trainer.py:803] 2025-04-26 16:12:32,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3478 [WARNING|trainer.py:803] 2025-04-26 16:12:33,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:33,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3465 3426 [WARNING|trainer.py:803] 2025-04-26 16:12:34,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3479 [WARNING|trainer.py:803] 2025-04-26 16:12:34,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:34,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3466 3427 [WARNING|trainer.py:803] 2025-04-26 16:12:35,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3480 [WARNING|trainer.py:803] 2025-04-26 16:12:36,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:36,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3467 3428 [WARNING|trainer.py:803] 2025-04-26 16:12:36,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3481 [WARNING|trainer.py:803] 2025-04-26 16:12:37,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:12:37,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3468 3429 [WARNING|trainer.py:803] 2025-04-26 16:12:38,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3482 [WARNING|trainer.py:803] 2025-04-26 16:12:38,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:38,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3469 3430 [WARNING|trainer.py:803] 2025-04-26 16:12:39,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3483 [WARNING|trainer.py:803] 2025-04-26 16:12:40,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:40,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3470 [WARNING|trainer.py:803] 2025-04-26 16:12:40,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3431 3484 [WARNING|trainer.py:803] 2025-04-26 16:12:41,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:41,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3471 [WARNING|trainer.py:803] 2025-04-26 16:12:42,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3432 3485 [WARNING|trainer.py:803] 2025-04-26 16:12:42,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:42,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3472 [WARNING|trainer.py:803] 2025-04-26 16:12:43,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3433 3486 [WARNING|trainer.py:803] 2025-04-26 16:12:43,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:44,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3473 [WARNING|trainer.py:803] 2025-04-26 16:12:44,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3434 3487 [WARNING|trainer.py:803] 2025-04-26 16:12:45,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:45,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3474 [WARNING|trainer.py:803] 2025-04-26 16:12:45,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3435 3488 [WARNING|trainer.py:803] 2025-04-26 16:12:46,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:12:46,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3475 [WARNING|trainer.py:803] 2025-04-26 16:12:47,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3436 3489 [WARNING|trainer.py:803] 2025-04-26 16:12:47,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:48,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3476 [WARNING|trainer.py:803] 2025-04-26 16:12:48,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3437 3490 [WARNING|trainer.py:803] 2025-04-26 16:12:49,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:49,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3477 [WARNING|trainer.py:803] 2025-04-26 16:12:49,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3438 3491 [WARNING|trainer.py:803] 2025-04-26 16:12:50,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:12:50,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3478 [WARNING|trainer.py:803] 2025-04-26 16:12:51,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3439 3492 [WARNING|trainer.py:803] 2025-04-26 16:12:51,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:52,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3479 [WARNING|trainer.py:803] 2025-04-26 16:12:52,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3440 3493 [WARNING|trainer.py:803] 2025-04-26 16:12:52,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:53,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3480 [WARNING|trainer.py:803] 2025-04-26 16:12:53,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3441 3494 [WARNING|trainer.py:803] 2025-04-26 16:12:54,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:12:54,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3481 [WARNING|trainer.py:803] 2025-04-26 16:12:54,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3442 3495 [WARNING|trainer.py:803] 2025-04-26 16:12:55,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:55,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3482 [WARNING|trainer.py:803] 2025-04-26 16:12:56,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3443 3496 [WARNING|trainer.py:803] 2025-04-26 16:12:56,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:57,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3483 [WARNING|trainer.py:803] 2025-04-26 16:12:57,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3444 3497 [WARNING|trainer.py:803] 2025-04-26 16:12:58,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:12:58,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3484 [WARNING|trainer.py:803] 2025-04-26 16:12:58,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3445 3498 [WARNING|trainer.py:803] 2025-04-26 16:12:59,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:12:59,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3485 [WARNING|trainer.py:803] 2025-04-26 16:13:00,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3446 3499 [WARNING|trainer.py:803] 2025-04-26 16:13:00,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:01,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3486 [WARNING|trainer.py:803] 2025-04-26 16:13:01,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3447 3500 [WARNING|trainer.py:803] 2025-04-26 16:13:02,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:02,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3487 [WARNING|trainer.py:803] 2025-04-26 16:13:02,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3448 3501 [WARNING|trainer.py:803] 2025-04-26 16:13:03,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:03,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3488 [WARNING|trainer.py:803] 2025-04-26 16:13:04,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3449 3502 [WARNING|trainer.py:803] 2025-04-26 16:13:04,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:05,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3489 [WARNING|trainer.py:803] 2025-04-26 16:13:05,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3450 3503 [WARNING|trainer.py:803] 2025-04-26 16:13:06,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3490 [WARNING|trainer.py:803] 2025-04-26 16:13:06,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:13:06,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3504 3451 [WARNING|trainer.py:803] 2025-04-26 16:13:07,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3491 [WARNING|trainer.py:803] 2025-04-26 16:13:07,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:07,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3505 3452 [WARNING|trainer.py:803] 2025-04-26 16:13:08,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:09,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3492 [WARNING|trainer.py:803] 2025-04-26 16:13:09,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3506 3453 [WARNING|trainer.py:803] 2025-04-26 16:13:09,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:10,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3493 [WARNING|trainer.py:803] 2025-04-26 16:13:10,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3507 3454 [WARNING|trainer.py:803] 2025-04-26 16:13:11,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:11,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3494 [WARNING|trainer.py:803] 2025-04-26 16:13:11,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3508 3455 [WARNING|trainer.py:803] 2025-04-26 16:13:12,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:13:12,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3495 [WARNING|trainer.py:803] 2025-04-26 16:13:13,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3509 3456 [WARNING|trainer.py:803] 2025-04-26 16:13:13,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:14,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3496 [WARNING|trainer.py:803] 2025-04-26 16:13:14,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3510 3457 [WARNING|trainer.py:803] 2025-04-26 16:13:15,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:15,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3497 [WARNING|trainer.py:803] 2025-04-26 16:13:15,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3511 3458 [WARNING|trainer.py:803] 2025-04-26 16:13:16,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:16,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3498 [WARNING|trainer.py:803] 2025-04-26 16:13:17,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3512 3459 [WARNING|trainer.py:803] 2025-04-26 16:13:17,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:13:18,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3499 [WARNING|trainer.py:803] 2025-04-26 16:13:18,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3513 3460 [WARNING|trainer.py:803] 2025-04-26 16:13:19,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:19,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3500 [WARNING|trainer.py:803] 2025-04-26 16:13:19,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3514 3461 [WARNING|trainer.py:803] 2025-04-26 16:13:20,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:20,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3501 [WARNING|trainer.py:803] 2025-04-26 16:13:21,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3515 3462 [WARNING|trainer.py:803] 2025-04-26 16:13:21,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:21,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3502 3516 [WARNING|trainer.py:803] 2025-04-26 16:13:22,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3463 [WARNING|trainer.py:803] 2025-04-26 16:13:23,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:23,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3503 3517 [WARNING|trainer.py:803] 2025-04-26 16:13:23,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3464 [WARNING|trainer.py:803] 2025-04-26 16:13:24,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:24,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3504 3518 [WARNING|trainer.py:803] 2025-04-26 16:13:25,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3465 [WARNING|trainer.py:803] 2025-04-26 16:13:25,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:25,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3505 3519 [WARNING|trainer.py:803] 2025-04-26 16:13:26,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3466 [WARNING|trainer.py:803] 2025-04-26 16:13:26,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:27,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3506 3520 [WARNING|trainer.py:803] 2025-04-26 16:13:27,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3467 [WARNING|trainer.py:803] 2025-04-26 16:13:28,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:28,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3507 3521 [WARNING|trainer.py:803] 2025-04-26 16:13:28,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:13:29,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3468 [WARNING|trainer.py:803] 2025-04-26 16:13:29,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3508 3522 [WARNING|trainer.py:803] 2025-04-26 16:13:30,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:30,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3469 [WARNING|trainer.py:803] 2025-04-26 16:13:30,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3509 3523 [WARNING|trainer.py:803] 2025-04-26 16:13:31,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:13:32,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3470 [WARNING|trainer.py:803] 2025-04-26 16:13:32,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3510 3524 [WARNING|trainer.py:803] 2025-04-26 16:13:32,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:33,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3471 [WARNING|trainer.py:803] 2025-04-26 16:13:33,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3511 3525 [WARNING|trainer.py:803] 2025-04-26 16:13:34,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:34,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3472 [WARNING|trainer.py:803] 2025-04-26 16:13:34,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3512 3526 [WARNING|trainer.py:803] 2025-04-26 16:13:35,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:13:35,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3473 [WARNING|trainer.py:803] 2025-04-26 16:13:36,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3513 3527 [WARNING|trainer.py:803] 2025-04-26 16:13:36,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:37,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3474 [WARNING|trainer.py:803] 2025-04-26 16:13:37,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3514 3528 [WARNING|trainer.py:803] 2025-04-26 16:13:38,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 3475 [WARNING|trainer.py:803] 2025-04-26 16:13:38,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:38,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3515 3529 [WARNING|trainer.py:803] 2025-04-26 16:13:39,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3476 [WARNING|trainer.py:803] 2025-04-26 16:13:39,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:39,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3516 3530 [WARNING|trainer.py:803] 2025-04-26 16:13:40,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3477 [WARNING|trainer.py:803] 2025-04-26 16:13:41,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:13:41,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [mov,mp4,m4a,3gp,3g2,mj2 @ 0x89737980] moov atom not found [16:13:41] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... 
sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
3517 [WARNING|trainer.py:803] 2025-04-26 16:13:41,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3531 3478 [WARNING|trainer.py:803] 2025-04-26 16:13:42,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3518 [WARNING|trainer.py:803] 2025-04-26 16:13:43,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:13:43,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3532 3479 [WARNING|trainer.py:803] 2025-04-26 16:13:43,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3519 [WARNING|trainer.py:803] 2025-04-26 16:13:44,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:44,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3533 [WARNING|trainer.py:803] 2025-04-26 16:13:45,029 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. NoNo 3480 3520 [WARNING|trainer.py:803] 2025-04-26 16:13:45,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:46,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3534 [WARNING|trainer.py:803] 2025-04-26 16:13:46,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3481 3521 [WARNING|trainer.py:803] 2025-04-26 16:13:47,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:47,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3535 [WARNING|trainer.py:803] 2025-04-26 16:13:47,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3482 3522 [WARNING|trainer.py:803] 2025-04-26 16:13:48,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:48,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3536 [WARNING|trainer.py:803] 2025-04-26 16:13:49,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3483 3523 [WARNING|trainer.py:803] 2025-04-26 16:13:49,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:13:50,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3537 [WARNING|trainer.py:803] 2025-04-26 16:13:50,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3484 3524 [WARNING|trainer.py:803] 2025-04-26 16:13:50,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:51,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3538 [WARNING|trainer.py:803] 2025-04-26 16:13:51,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3485 3525 [WARNING|trainer.py:803] 2025-04-26 16:13:52,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:52,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3539 [WARNING|trainer.py:803] 2025-04-26 16:13:52,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3486 3526 [WARNING|trainer.py:803] 2025-04-26 16:13:53,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:13:53,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3540 [WARNING|trainer.py:803] 2025-04-26 16:13:54,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3487 3527 [WARNING|trainer.py:803] 2025-04-26 16:13:54,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:13:55,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3541 [WARNING|trainer.py:803] 2025-04-26 16:13:55,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3488 3528 [WARNING|trainer.py:803] 2025-04-26 16:13:56,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:13:56,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3542 [WARNING|trainer.py:803] 2025-04-26 16:13:56,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3489 3529 [WARNING|trainer.py:803] 2025-04-26 16:13:57,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:13:57,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3543 [WARNING|trainer.py:803] 2025-04-26 16:13:58,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3490 3530 [WARNING|trainer.py:803] 2025-04-26 16:13:58,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:13:59,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3544 [WARNING|trainer.py:803] 2025-04-26 16:13:59,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [mov,mp4,m4a,3gp,3g2,mj2 @ 0x3b8651c0] moov atom not found [16:13:59] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... 
sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
3491 [WARNING|trainer.py:803] 2025-04-26 16:14:00,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:00,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3545 3531 3492 [WARNING|trainer.py:803] 2025-04-26 16:14:01,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:01,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:01,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3546 3532 3493 [WARNING|trainer.py:803] 2025-04-26 16:14:02,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:02,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:03,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
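Note on the "moov atom not found" failures above: this message usually means the MP4 container lacks its `moov` (movie metadata) box, for example because the file was truncated while being written or copied. The same file, Latte/10231.mp4, fails at 16:13:41 and again at 16:13:59, so it may be worth scanning the dataset once before training rather than hitting the error in `__getitem__`. The following is a minimal sketch under stated assumptions, not the training script's own logic: the helper names `top_level_boxes`, `has_moov` and `find_missing_moov` are hypothetical, and the check only looks at the top-level container structure using the standard library. A file that passes may still fail to decode in decord for other reasons.

```python
import struct
from pathlib import Path


def top_level_boxes(path):
    """Yield the four-character type of each top-level box in an MP4/MOV file."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset + 8 <= file_size:
            fh.seek(offset)
            header = fh.read(8)
            if len(header) < 8:
                break
            # Each box starts with a 32-bit big-endian size and a 4-byte type.
            size, box_type = struct.unpack(">I4s", header)
            min_size = 8
            if size == 1:
                # A size of 1 means a 64-bit size follows the type field.
                ext = fh.read(8)
                if len(ext) < 8:
                    break
                size = struct.unpack(">Q", ext)[0]
                min_size = 16
            elif size == 0:
                # A size of 0 means the box extends to the end of the file.
                size = file_size - offset
            if size < min_size:
                break  # corrupt header; stop instead of looping forever
            yield box_type.decode("latin-1")
            offset += size


def has_moov(path):
    """Return True if the file has a top-level 'moov' box, False otherwise."""
    try:
        return "moov" in top_level_boxes(path)
    except OSError:
        return False


def find_missing_moov(paths):
    """Return the subset of paths that have no top-level 'moov' box."""
    return [str(p) for p in paths if not has_moov(p)]
```

As a usage sketch, `find_missing_moov(Path(video_root).rglob("*.mp4"))` (with `video_root` standing in for the dataset directory) would list candidate files to re-generate or drop from the annotation file before the next run.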
NoYes 3547 3533 3494 [WARNING|trainer.py:803] 2025-04-26 16:14:03,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:04,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3548 [WARNING|trainer.py:803] 2025-04-26 16:14:04,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3534 3495 [WARNING|trainer.py:803] 2025-04-26 16:14:05,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:05,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3549 [WARNING|trainer.py:803] 2025-04-26 16:14:05,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3535 3496 [WARNING|trainer.py:803] 2025-04-26 16:14:06,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:06,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3550 [WARNING|trainer.py:803] 2025-04-26 16:14:07,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3536 3497 [WARNING|trainer.py:803] 2025-04-26 16:14:07,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:07,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3551 [WARNING|trainer.py:803] 2025-04-26 16:14:08,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3537 3498 [WARNING|trainer.py:803] 2025-04-26 16:14:09,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:09,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3552 [WARNING|trainer.py:803] 2025-04-26 16:14:09,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3538 3499 [WARNING|trainer.py:803] 2025-04-26 16:14:10,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:10,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3553 [WARNING|trainer.py:803] 2025-04-26 16:14:10,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3539 3500 [WARNING|trainer.py:803] 2025-04-26 16:14:11,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:11,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3554 [WARNING|trainer.py:803] 2025-04-26 16:14:12,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3540 3501 [WARNING|trainer.py:803] 2025-04-26 16:14:12,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:13,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3555 [WARNING|trainer.py:803] 2025-04-26 16:14:13,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3541 3502 [WARNING|trainer.py:803] 2025-04-26 16:14:14,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:14,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3556 [WARNING|trainer.py:803] 2025-04-26 16:14:14,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3542 3503 [WARNING|trainer.py:803] 2025-04-26 16:14:15,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:15,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3557 [WARNING|trainer.py:803] 2025-04-26 16:14:16,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3543 3504 [WARNING|trainer.py:803] 2025-04-26 16:14:16,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:17,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3558 [WARNING|trainer.py:803] 2025-04-26 16:14:17,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3544 3505 [WARNING|trainer.py:803] 2025-04-26 16:14:18,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:18,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3559 [WARNING|trainer.py:803] 2025-04-26 16:14:18,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3545 3506 [WARNING|trainer.py:803] 2025-04-26 16:14:19,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3560 [WARNING|trainer.py:803] 2025-04-26 16:14:19,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:19,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3546 3507 [WARNING|trainer.py:803] 2025-04-26 16:14:20,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3561 [WARNING|trainer.py:803] 2025-04-26 16:14:21,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:21,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3547 3508 [WARNING|trainer.py:803] 2025-04-26 16:14:21,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3562 [WARNING|trainer.py:803] 2025-04-26 16:14:22,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:22,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3548 3509 [WARNING|trainer.py:803] 2025-04-26 16:14:23,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3563 [WARNING|trainer.py:803] 2025-04-26 16:14:23,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:23,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3549 3510 [WARNING|trainer.py:803] 2025-04-26 16:14:24,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3564 [WARNING|trainer.py:803] 2025-04-26 16:14:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:25,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3550 3511 [WARNING|trainer.py:803] 2025-04-26 16:14:25,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3565 [WARNING|trainer.py:803] 2025-04-26 16:14:26,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:26,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3551 3512 [WARNING|trainer.py:803] 2025-04-26 16:14:26,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3566 [WARNING|trainer.py:803] 2025-04-26 16:14:27,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:27,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3552 3513 [WARNING|trainer.py:803] 2025-04-26 16:14:28,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3567 [WARNING|trainer.py:803] 2025-04-26 16:14:28,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:28,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3553 3514 [WARNING|trainer.py:803] 2025-04-26 16:14:29,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 3568 [WARNING|trainer.py:803] 2025-04-26 16:14:30,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:30,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3554 3515 [WARNING|trainer.py:803] 2025-04-26 16:14:30,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3569 [WARNING|trainer.py:803] 2025-04-26 16:14:31,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:31,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3555 3516 [WARNING|trainer.py:803] 2025-04-26 16:14:32,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3570 [WARNING|trainer.py:803] 2025-04-26 16:14:32,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:32,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3556 3517 [WARNING|trainer.py:803] 2025-04-26 16:14:33,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3571 [WARNING|trainer.py:803] 2025-04-26 16:14:34,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:34,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3557 3518 [WARNING|trainer.py:803] 2025-04-26 16:14:34,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3572 [WARNING|trainer.py:803] 2025-04-26 16:14:35,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:35,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3558 3519 [WARNING|trainer.py:803] 2025-04-26 16:14:36,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3573 [WARNING|trainer.py:803] 2025-04-26 16:14:36,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:36,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3559 3520 [WARNING|trainer.py:803] 2025-04-26 16:14:37,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3574 [WARNING|trainer.py:803] 2025-04-26 16:14:37,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:38,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3560 3521 [WARNING|trainer.py:803] 2025-04-26 16:14:38,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3575 [WARNING|trainer.py:803] 2025-04-26 16:14:39,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:39,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3561 3522 [WARNING|trainer.py:803] 2025-04-26 16:14:39,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3576 [WARNING|trainer.py:803] 2025-04-26 16:14:40,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:14:40,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3562 3523 [WARNING|trainer.py:803] 2025-04-26 16:14:41,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3577 [WARNING|trainer.py:803] 2025-04-26 16:14:41,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:41,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3563 3524 [WARNING|trainer.py:803] 2025-04-26 16:14:42,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3578 [WARNING|trainer.py:803] 2025-04-26 16:14:43,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:43,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3564 3525 [WARNING|trainer.py:803] 2025-04-26 16:14:43,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3579 [WARNING|trainer.py:803] 2025-04-26 16:14:44,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:14:44,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3565 3526 [WARNING|trainer.py:803] 2025-04-26 16:14:45,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3580 [WARNING|trainer.py:803] 2025-04-26 16:14:45,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:45,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3527 3566 [WARNING|trainer.py:803] 2025-04-26 16:14:46,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3581 [WARNING|trainer.py:803] 2025-04-26 16:14:47,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:14:47,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3528 3567 [WARNING|trainer.py:803] 2025-04-26 16:14:47,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3582 [WARNING|trainer.py:803] 2025-04-26 16:14:48,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:48,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 3568 3529 [WARNING|trainer.py:803] 2025-04-26 16:14:49,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3583 [WARNING|trainer.py:803] 2025-04-26 16:14:49,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:49,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3530 3569 [WARNING|trainer.py:803] 2025-04-26 16:14:50,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3584 [WARNING|trainer.py:803] 2025-04-26 16:14:51,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x589b8940] moov atom not found
[16:14:51] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
[WARNING|trainer.py:803] 2025-04-26 16:14:51,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3570 [WARNING|trainer.py:803] 2025-04-26 16:14:51,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3585 3531 [WARNING|trainer.py:803] 2025-04-26 16:14:52,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3571 [WARNING|trainer.py:803] 2025-04-26 16:14:52,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:14:53,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3586 3532 [WARNING|trainer.py:803] 2025-04-26 16:14:53,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3572 [WARNING|trainer.py:803] 2025-04-26 16:14:54,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:54,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3587 3533 [WARNING|trainer.py:803] 2025-04-26 16:14:54,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3573 [WARNING|trainer.py:803] 2025-04-26 16:14:55,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:55,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3588 3534 [WARNING|trainer.py:803] 2025-04-26 16:14:56,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
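Note on the failure above: "moov atom not found" typically means the MP4 file is truncated or was never finalized, so decord cannot open it, and the dataset's __getitem__ logs the error and moves on. A minimal sketch for finding such files before training follows. It uses only the Python standard library and walks the top-level MP4 boxes looking for a 'moov' box. The function names has_moov_atom and find_unreadable_videos are hypothetical and are not part of train_qa.py or decord. A file that passes this check may still fail to decode for other reasons.

```python
import struct
from pathlib import Path


def has_moov_atom(path):
    """Return True if a top-level 'moov' box is present in an MP4/MOV file.

    Hypothetical helper: it only walks top-level box headers and does not
    decode any video data.
    """
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + 8 <= file_size:
            f.seek(offset)
            header = f.read(8)
            if len(header) < 8:
                return False
            box_size, box_type = struct.unpack(">I4s", header)
            if box_type == b"moov":
                return True
            min_size = 8
            if box_size == 1:
                # A size of 1 means a 64-bit size follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    return False
                box_size = struct.unpack(">Q", ext)[0]
                min_size = 16
            elif box_size == 0:
                # A size of 0 means the box runs to end of file,
                # so no later top-level box (including moov) can exist.
                return False
            if box_size < min_size:
                # Corrupt header; stop rather than loop forever.
                return False
            offset += box_size
    return False


def find_unreadable_videos(root):
    """List .mp4 files under root that lack a top-level 'moov' box."""
    return sorted(
        str(p) for p in Path(root).rglob("*.mp4") if not has_moov_atom(p)
    )
```

Running find_unreadable_videos on the video directory (for example the Videos300 folder named in the log) would give a list of candidates to re-download or drop from the annotation file; whether 10231.mp4 is the only affected file is not known from this log.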
NoNo 3574 [WARNING|trainer.py:803] 2025-04-26 16:14:56,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:56,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3589 3535 [WARNING|trainer.py:803] 2025-04-26 16:14:57,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3575 [WARNING|trainer.py:803] 2025-04-26 16:14:58,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:58,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3590 3536 [WARNING|trainer.py:803] 2025-04-26 16:14:58,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3576 [WARNING|trainer.py:803] 2025-04-26 16:14:59,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:14:59,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3591 3537 [WARNING|trainer.py:803] 2025-04-26 16:15:00,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3577 [WARNING|trainer.py:803] 2025-04-26 16:15:00,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:00,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3592 3538 [WARNING|trainer.py:803] 2025-04-26 16:15:01,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3578 [WARNING|trainer.py:803] 2025-04-26 16:15:02,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:02,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3593 3539 [WARNING|trainer.py:803] 2025-04-26 16:15:02,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3579 [WARNING|trainer.py:803] 2025-04-26 16:15:03,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:03,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3594 3540 [WARNING|trainer.py:803] 2025-04-26 16:15:03,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3580 [WARNING|trainer.py:803] 2025-04-26 16:15:04,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:04,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3595 3541 [WARNING|trainer.py:803] 2025-04-26 16:15:05,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3581 [WARNING|trainer.py:803] 2025-04-26 16:15:06,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:06,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3596 [WARNING|trainer.py:803] 2025-04-26 16:15:06,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3542 3582 [WARNING|trainer.py:803] 2025-04-26 16:15:07,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:07,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3597 [WARNING|trainer.py:803] 2025-04-26 16:15:07,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3543 3583 [WARNING|trainer.py:803] 2025-04-26 16:15:08,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:08,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3598 [WARNING|trainer.py:803] 2025-04-26 16:15:09,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3544 3584 [WARNING|trainer.py:803] 2025-04-26 16:15:09,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:10,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:10,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3599 3545 3585 [WARNING|trainer.py:803] 2025-04-26 16:15:11,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:11,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:11,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3600 3546 3586 [WARNING|trainer.py:803] 2025-04-26 16:15:12,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:12,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:13,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3601 3547 3587 [WARNING|trainer.py:803] 2025-04-26 16:15:13,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:13,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:14,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3548 3602 3588 [WARNING|trainer.py:803] 2025-04-26 16:15:15,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:15,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:15,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3549 3603 3589 [WARNING|trainer.py:803] 2025-04-26 16:15:16,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:16,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:16,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3550 3604 3590 [WARNING|trainer.py:803] 2025-04-26 16:15:17,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:17,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:18,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3551 3605 3591 [WARNING|trainer.py:803] 2025-04-26 16:15:19,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:19,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:19,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3552 3606 3592 [WARNING|trainer.py:803] 2025-04-26 16:15:20,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:20,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:20,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3553 3607 3593 [WARNING|trainer.py:803] 2025-04-26 16:15:21,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:21,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:22,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 3554 NoNo 3608 3594 [WARNING|trainer.py:803] 2025-04-26 16:15:22,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:23,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3555 [WARNING|trainer.py:803] 2025-04-26 16:15:23,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3609 3595 [WARNING|trainer.py:803] 2025-04-26 16:15:24,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:24,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3556 [WARNING|trainer.py:803] 2025-04-26 16:15:24,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3610 3596 [WARNING|trainer.py:803] 2025-04-26 16:15:25,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:25,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3557 [WARNING|trainer.py:803] 2025-04-26 16:15:26,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3611 3597 [WARNING|trainer.py:803] 2025-04-26 16:15:26,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:27,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3558 [WARNING|trainer.py:803] 2025-04-26 16:15:27,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3612 3598 [WARNING|trainer.py:803] 2025-04-26 16:15:28,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:28,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3559 [WARNING|trainer.py:803] 2025-04-26 16:15:28,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3613 3599 [WARNING|trainer.py:803] 2025-04-26 16:15:29,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3560 [WARNING|trainer.py:803] 2025-04-26 16:15:29,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:29,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3614 3600 [WARNING|trainer.py:803] 2025-04-26 16:15:30,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3561 [WARNING|trainer.py:803] 2025-04-26 16:15:31,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:31,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3615 3601 [WARNING|trainer.py:803] 2025-04-26 16:15:32,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3562 [WARNING|trainer.py:803] 2025-04-26 16:15:32,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:32,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3602 3616 [WARNING|trainer.py:803] 2025-04-26 16:15:33,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3563 [WARNING|trainer.py:803] 2025-04-26 16:15:33,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:33,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3603 3617 [WARNING|trainer.py:803] 2025-04-26 16:15:34,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3564 [WARNING|trainer.py:803] 2025-04-26 16:15:35,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:35,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3604 [WARNING|trainer.py:803] 2025-04-26 16:15:35,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3618 3565 [WARNING|trainer.py:803] 2025-04-26 16:15:36,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:36,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3605 [WARNING|trainer.py:803] 2025-04-26 16:15:37,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3619 3566 [WARNING|trainer.py:803] 2025-04-26 16:15:37,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:37,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3606 [WARNING|trainer.py:803] 2025-04-26 16:15:38,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3620 3567 [WARNING|trainer.py:803] 2025-04-26 16:15:39,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:39,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3607 [WARNING|trainer.py:803] 2025-04-26 16:15:39,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 3621 3568 [WARNING|trainer.py:803] 2025-04-26 16:15:40,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:40,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3608 [WARNING|trainer.py:803] 2025-04-26 16:15:41,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3622 3569 [WARNING|trainer.py:803] 2025-04-26 16:15:41,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:41,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:42,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3609 3623 3570 [WARNING|trainer.py:803] 2025-04-26 16:15:43,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:43,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:43,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3610 3624 3571 [WARNING|trainer.py:803] 2025-04-26 16:15:44,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:44,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:44,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3611 3625 3572 [WARNING|trainer.py:803] 2025-04-26 16:15:45,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:46,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3612 3626 3573 [WARNING|trainer.py:803] 2025-04-26 16:15:47,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:47,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:15:47,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3613 3627 3574 [WARNING|trainer.py:803] 2025-04-26 16:15:48,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:48,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:48,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3614 3575 3628 [WARNING|trainer.py:803] 2025-04-26 16:15:49,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:50,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:50,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3615 3576 3629 [WARNING|trainer.py:803] 2025-04-26 16:15:51,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:51,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:51,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3577 3616 3630 [WARNING|trainer.py:803] 2025-04-26 16:15:52,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 16:15:52,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo :Yes [WARNING|trainer.py:803] 2025-04-26 16:15:53,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3578 3617 3631 [WARNING|trainer.py:803] 2025-04-26 16:15:54,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:54,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:54,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3579 3618 3632 [WARNING|trainer.py:803] 2025-04-26 16:15:55,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:55,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:55,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3580 3619 3633 [WARNING|trainer.py:803] 2025-04-26 16:15:56,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:56,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:56,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3581 3620 3634 [WARNING|trainer.py:803] 2025-04-26 16:15:57,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:15:58,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:15:58,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3582 3621 3635 [WARNING|trainer.py:803] 2025-04-26 16:15:59,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:59,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:15:59,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3583 3622 3636 [WARNING|trainer.py:803] 2025-04-26 16:16:00,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:00,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3584 [WARNING|trainer.py:803] 2025-04-26 16:16:01,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3623 3637 [WARNING|trainer.py:803] 2025-04-26 16:16:01,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:02,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3585 [WARNING|trainer.py:803] 2025-04-26 16:16:02,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3624 3638 [WARNING|trainer.py:803] 2025-04-26 16:16:03,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3586 [WARNING|trainer.py:803] 2025-04-26 16:16:03,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:03,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3625 [WARNING|trainer.py:803] 2025-04-26 16:16:04,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3639 3587 [WARNING|trainer.py:803] 2025-04-26 16:16:05,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:05,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3626 [WARNING|trainer.py:803] 2025-04-26 16:16:05,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3640 3588 [WARNING|trainer.py:803] 2025-04-26 16:16:06,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:06,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3627 [WARNING|trainer.py:803] 2025-04-26 16:16:06,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3641 3589 [WARNING|trainer.py:803] 2025-04-26 16:16:07,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:07,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:08,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3628 3642 3590 [WARNING|trainer.py:803] 2025-04-26 16:16:09,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:09,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:09,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3629 3643 3591 [WARNING|trainer.py:803] 2025-04-26 16:16:10,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:10,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:10,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3630 3644 3592 [WARNING|trainer.py:803] 2025-04-26 16:16:11,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:12,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:12,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3631 3645 3593 [WARNING|trainer.py:803] 2025-04-26 16:16:13,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:13,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:13,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3632 3594 3646 [WARNING|trainer.py:803] 2025-04-26 16:16:14,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:14,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:14,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3633 3595 3647 [WARNING|trainer.py:803] 2025-04-26 16:16:15,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:16,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:16,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3634 3596 3648 [WARNING|trainer.py:803] 2025-04-26 16:16:17,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:17,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:17,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3635 3597 3649 [WARNING|trainer.py:803] 2025-04-26 16:16:18,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:18,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:18,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3636 3598 3650 [WARNING|trainer.py:803] 2025-04-26 16:16:20,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:20,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:20,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3599 3637 3651 [WARNING|trainer.py:803] 2025-04-26 16:16:21,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:21,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:21,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3600 3638 3652 [WARNING|trainer.py:803] 2025-04-26 16:16:22,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:22,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:23,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3601 3639 3653 [WARNING|trainer.py:803] 2025-04-26 16:16:23,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:24,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3602 [WARNING|trainer.py:803] 2025-04-26 16:16:24,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3640 3654 [WARNING|trainer.py:803] 2025-04-26 16:16:25,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:25,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:25,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3603 3641 3655 [WARNING|trainer.py:803] 2025-04-26 16:16:26,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:26,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3604 [WARNING|trainer.py:803] 2025-04-26 16:16:27,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3642 3656 [WARNING|trainer.py:803] 2025-04-26 16:16:27,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:28,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3605 [WARNING|trainer.py:803] 2025-04-26 16:16:28,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3643 3657 [WARNING|trainer.py:803] 2025-04-26 16:16:29,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3606 [WARNING|trainer.py:803] 2025-04-26 16:16:29,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:29,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3658 3644 [WARNING|trainer.py:803] 2025-04-26 16:16:30,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3607 [WARNING|trainer.py:803] 2025-04-26 16:16:31,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:31,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3659 3645 [WARNING|trainer.py:803] 2025-04-26 16:16:31,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3608 [WARNING|trainer.py:803] 2025-04-26 16:16:32,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:32,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3660 [WARNING|trainer.py:803] 2025-04-26 16:16:33,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3646 3609 [WARNING|trainer.py:803] 2025-04-26 16:16:33,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:33,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:34,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3661 3647 3610 [WARNING|trainer.py:803] 2025-04-26 16:16:35,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:35,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:35,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3662 3648 3611 [WARNING|trainer.py:803] 2025-04-26 16:16:36,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:36,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:37,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3663 3649 3612 [WARNING|trainer.py:803] 2025-04-26 16:16:38,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:38,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:38,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3664 3650 3613 [WARNING|trainer.py:803] 2025-04-26 16:16:39,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:39,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:39,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3665 3651 3614 [WARNING|trainer.py:803] 2025-04-26 16:16:40,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:40,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:41,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3666 3652 3615 [WARNING|trainer.py:803] 2025-04-26 16:16:42,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:42,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:42,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3667 3653 3616 [WARNING|trainer.py:803] 2025-04-26 16:16:43,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:43,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:43,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3668 3654 3617 [WARNING|trainer.py:803] 2025-04-26 16:16:44,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:45,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:45,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3669 3655 3618 [WARNING|trainer.py:803] 2025-04-26 16:16:46,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:46,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:46,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3670 3656 3619 [WARNING|trainer.py:803] 2025-04-26 16:16:47,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:47,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:47,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3671 3657 3620 [WARNING|trainer.py:803] 2025-04-26 16:16:49,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:16:49,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:49,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3672 3658 3621 [WARNING|trainer.py:803] 2025-04-26 16:16:50,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:50,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:50,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3673 3659 3622 [WARNING|trainer.py:803] 2025-04-26 16:16:51,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:51,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:16:51,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3674 3660 3623 [WARNING|trainer.py:803] 2025-04-26 16:16:52,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:53,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:53,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3675 3661 3624 [WARNING|trainer.py:803] 2025-04-26 16:16:54,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:54,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:54,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3676 3662 3625 [WARNING|trainer.py:803] 2025-04-26 16:16:55,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:56,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:16:56,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3677 3663 3626 [WARNING|trainer.py:803] 2025-04-26 16:16:57,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:57,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:16:57,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3678 3664 3627 [WARNING|trainer.py:803] 2025-04-26 16:16:58,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:16:58,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:16:58,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3679 3665 3628 [WARNING|trainer.py:803] 2025-04-26 16:16:59,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:00,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:00,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3680 3666 3629 [WARNING|trainer.py:803] 2025-04-26 16:17:01,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:01,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:01,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3681 3667 3630 [WARNING|trainer.py:803] 2025-04-26 16:17:02,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:02,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3682 [WARNING|trainer.py:803] 2025-04-26 16:17:03,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3668 3631 [WARNING|trainer.py:803] 2025-04-26 16:17:03,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:17:04,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:04,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3683 3669 3632 [WARNING|trainer.py:803] 2025-04-26 16:17:05,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:05,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:05,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3684 3633 3670 [WARNING|trainer.py:803] 2025-04-26 16:17:06,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:07,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:07,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3685 3634 3671 [WARNING|trainer.py:803] 2025-04-26 16:17:08,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:08,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:08,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3686 3635 3672 [WARNING|trainer.py:803] 2025-04-26 16:17:09,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:09,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:09,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3687 3636 3673 [WARNING|trainer.py:803] 2025-04-26 16:17:10,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:11,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:11,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3688 3674 3637 [WARNING|trainer.py:803] 2025-04-26 16:17:12,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:12,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:12,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3689 3675 3638 [WARNING|trainer.py:803] 2025-04-26 16:17:13,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:14,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3690 [WARNING|trainer.py:803] 2025-04-26 16:17:14,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3676 3639 [WARNING|trainer.py:803] 2025-04-26 16:17:14,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:15,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:15,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 3691 3677 3640 [WARNING|trainer.py:803] 2025-04-26 16:17:16,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:16,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:16,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3692 3678 3641 [WARNING|trainer.py:803] 2025-04-26 16:17:17,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:18,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:18,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3693 3679 3642 [WARNING|trainer.py:803] 2025-04-26 16:17:19,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:19,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:19,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3694 3680 3643 [WARNING|trainer.py:803] 2025-04-26 16:17:20,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:20,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:20,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3695 3681 3644 [WARNING|trainer.py:803] 2025-04-26 16:17:21,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:22,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3696 [WARNING|trainer.py:803] 2025-04-26 16:17:22,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3682 3645 [WARNING|trainer.py:803] 2025-04-26 16:17:23,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3697 [WARNING|trainer.py:803] 2025-04-26 16:17:23,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:17:23,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3646 3683 [WARNING|trainer.py:803] 2025-04-26 16:17:24,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3698 [WARNING|trainer.py:803] 2025-04-26 16:17:25,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:25,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3647 [WARNING|trainer.py:803] 2025-04-26 16:17:25,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3684 3699 [WARNING|trainer.py:803] 2025-04-26 16:17:26,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:26,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3648 [WARNING|trainer.py:803] 2025-04-26 16:17:27,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3685 3700 [WARNING|trainer.py:803] 2025-04-26 16:17:27,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:28,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3649 [WARNING|trainer.py:803] 2025-04-26 16:17:28,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3686 3701 [WARNING|trainer.py:803] 2025-04-26 16:17:29,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:29,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3650 [WARNING|trainer.py:803] 2025-04-26 16:17:29,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3687 3702 [WARNING|trainer.py:803] 2025-04-26 16:17:30,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:30,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3651 [WARNING|trainer.py:803] 2025-04-26 16:17:31,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3688 3703 [WARNING|trainer.py:803] 2025-04-26 16:17:31,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:32,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3652 [WARNING|trainer.py:803] 2025-04-26 16:17:32,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3689 [WARNING|trainer.py:803] 2025-04-26 16:17:33,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3704 [WARNING|trainer.py:803] 2025-04-26 16:17:33,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3653 3690 [WARNING|trainer.py:803] 2025-04-26 16:17:34,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:34,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3705 [WARNING|trainer.py:803] 2025-04-26 16:17:34,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3654 3691 [WARNING|trainer.py:803] 2025-04-26 16:17:35,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:17:36,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3706 [WARNING|trainer.py:803] 2025-04-26 16:17:36,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3655 3692 [WARNING|trainer.py:803] 2025-04-26 16:17:37,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:37,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3707 [WARNING|trainer.py:803] 2025-04-26 16:17:37,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3656 3693 [WARNING|trainer.py:803] 2025-04-26 16:17:38,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:17:38,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3708 [WARNING|trainer.py:803] 2025-04-26 16:17:39,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3657 3694 [WARNING|trainer.py:803] 2025-04-26 16:17:39,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:40,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3709 [WARNING|trainer.py:803] 2025-04-26 16:17:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3658 3695 [WARNING|trainer.py:803] 2025-04-26 16:17:41,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:41,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:41,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3710 3659 3696 [WARNING|trainer.py:803] 2025-04-26 16:17:42,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:42,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:43,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3711 3660 3697 [WARNING|trainer.py:803] 2025-04-26 16:17:44,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:44,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:44,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3712 3661 3698 [WARNING|trainer.py:803] 2025-04-26 16:17:45,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:17:45,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:45,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3713 3662 3699 [WARNING|trainer.py:803] 2025-04-26 16:17:46,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:47,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:47,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3714 3663 3700 [WARNING|trainer.py:803] 2025-04-26 16:17:48,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:48,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:48,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3715 3664 3701 [WARNING|trainer.py:803] 2025-04-26 16:17:49,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:49,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:50,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3716 3665 3702 [WARNING|trainer.py:803] 2025-04-26 16:17:50,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:51,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3717 [WARNING|trainer.py:803] 2025-04-26 16:17:51,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3666 3703 [WARNING|trainer.py:803] 2025-04-26 16:17:52,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:17:52,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:52,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3667 3718 3704 [WARNING|trainer.py:803] 2025-04-26 16:17:53,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:53,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:17:54,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3719 3668 3705 [WARNING|trainer.py:803] 2025-04-26 16:17:55,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:55,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:55,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3669 3720 3706 [WARNING|trainer.py:803] 2025-04-26 16:17:56,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:56,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:57,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3670 3721 3707 [WARNING|trainer.py:803] 2025-04-26 16:17:58,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:17:58,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:17:58,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3671 3722 3708 [WARNING|trainer.py:803] 2025-04-26 16:17:59,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:17:59,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:17:59,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3672 3723 3709 [WARNING|trainer.py:803] 2025-04-26 16:18:00,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:01,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:01,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3673 3724 3710 [WARNING|trainer.py:803] 2025-04-26 16:18:02,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3674 [WARNING|trainer.py:803] 2025-04-26 16:18:02,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:02,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3725 3711 [WARNING|trainer.py:803] 2025-04-26 16:18:03,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:04,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3675 [WARNING|trainer.py:803] 2025-04-26 16:18:04,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3726 3712 [WARNING|trainer.py:803] 2025-04-26 16:18:05,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:05,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3676 [WARNING|trainer.py:803] 2025-04-26 16:18:05,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3727 3713 [WARNING|trainer.py:803] 2025-04-26 16:18:06,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3677 [WARNING|trainer.py:803] 2025-04-26 16:18:06,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:07,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3728 3714 [WARNING|trainer.py:803] 2025-04-26 16:18:07,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3678 [WARNING|trainer.py:803] 2025-04-26 16:18:08,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:08,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3729 3715 [WARNING|trainer.py:803] 2025-04-26 16:18:09,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:09,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3679 [WARNING|trainer.py:803] 2025-04-26 16:18:09,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3730 3716 [WARNING|trainer.py:803] 2025-04-26 16:18:10,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:11,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3680 [WARNING|trainer.py:803] 2025-04-26 16:18:11,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3731 3717 [WARNING|trainer.py:803] 2025-04-26 16:18:11,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:12,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3681 [WARNING|trainer.py:803] 2025-04-26 16:18:12,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3732 3718 [WARNING|trainer.py:803] 2025-04-26 16:18:13,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:13,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3682 [WARNING|trainer.py:803] 2025-04-26 16:18:14,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3733 [WARNING|trainer.py:803] 2025-04-26 16:18:14,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3719 [WARNING|trainer.py:803] 2025-04-26 16:18:15,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3683 3734 [WARNING|trainer.py:803] 2025-04-26 16:18:15,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3720 [WARNING|trainer.py:803] 2025-04-26 16:18:16,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:16,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3684 3735 [WARNING|trainer.py:803] 2025-04-26 16:18:16,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:17,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3721 [WARNING|trainer.py:803] 2025-04-26 16:18:17,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3736 3685 [WARNING|trainer.py:803] 2025-04-26 16:18:18,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:19,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:19,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3722 3686 3737 [WARNING|trainer.py:803] 2025-04-26 16:18:20,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:20,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:20,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3723 3687 3738 [WARNING|trainer.py:803] 2025-04-26 16:18:21,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:21,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:21,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3724 3688 3739 [WARNING|trainer.py:803] 2025-04-26 16:18:23,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:23,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:23,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3725 3689 3740 [WARNING|trainer.py:803] 2025-04-26 16:18:24,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:24,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:24,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3726 3690 3741 [WARNING|trainer.py:803] 2025-04-26 16:18:25,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:25,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:25,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3727 3691 3742 [WARNING|trainer.py:803] 2025-04-26 16:18:27,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:27,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:27,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3728 3692 3743 [WARNING|trainer.py:803] 2025-04-26 16:18:28,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:28,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:28,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3729 3693 3744 [WARNING|trainer.py:803] 2025-04-26 16:18:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:29,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:30,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3694 3730 3745 [WARNING|trainer.py:803] 2025-04-26 16:18:31,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:31,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:18:31,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3695 3731 3746 [WARNING|trainer.py:803] 2025-04-26 16:18:32,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:32,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:32,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3696 3732 3747 [WARNING|trainer.py:803] 2025-04-26 16:18:34,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:34,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:18:34,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3697 3733 3748 [WARNING|trainer.py:803] 2025-04-26 16:18:35,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:35,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:35,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3734 3698 3749 [WARNING|trainer.py:803] 2025-04-26 16:18:36,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:36,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:37,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3699 3735 3750 [WARNING|trainer.py:803] 2025-04-26 16:18:38,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:38,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:38,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3700 3736 3751 [WARNING|trainer.py:803] 2025-04-26 16:18:39,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:39,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:39,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3701 3737 3752 [WARNING|trainer.py:803] 2025-04-26 16:18:40,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:18:41,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:41,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3702 3738 3753 [WARNING|trainer.py:803] 2025-04-26 16:18:42,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:42,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:42,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3703 3739 3754 [WARNING|trainer.py:803] 2025-04-26 16:18:43,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:43,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:43,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3755 3704 3740 [WARNING|trainer.py:803] 2025-04-26 16:18:45,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:45,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:45,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3756 3705 3741 [WARNING|trainer.py:803] 2025-04-26 16:18:46,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:46,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:18:46,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3757 3742 3706 [WARNING|trainer.py:803] 2025-04-26 16:18:48,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:18:48,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:48,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3707 3743 3758 [WARNING|trainer.py:803] 2025-04-26 16:18:49,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:18:49,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:49,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3759 3708 3744 [WARNING|trainer.py:803] 2025-04-26 16:18:51,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:51,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:51,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3709 3745 3760 [WARNING|trainer.py:803] 2025-04-26 16:18:52,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:52,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:52,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3746 3761 3710 [WARNING|trainer.py:803] 2025-04-26 16:18:53,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:18:53,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:54,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3747 3762 3711 [WARNING|trainer.py:803] 2025-04-26 16:18:55,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:55,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:55,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3748 3763 3712 [WARNING|trainer.py:803] 2025-04-26 16:18:56,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:56,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:18:56,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3749 3764 3713 [WARNING|trainer.py:803] 2025-04-26 16:18:57,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:58,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:18:58,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3750 3765 3714 [WARNING|trainer.py:803] 2025-04-26 16:18:59,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:59,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:18:59,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3751 3766 3715 [WARNING|trainer.py:803] 2025-04-26 16:19:00,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:00,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:01,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3767 3752 3716 [WARNING|trainer.py:803] 2025-04-26 16:19:02,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:02,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:02,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3753 3768 3717 [WARNING|trainer.py:803] 2025-04-26 16:19:03,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:03,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:03,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3769 3754 3718 [WARNING|trainer.py:803] 2025-04-26 16:19:04,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:04,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:05,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3770 3755 3719 [WARNING|trainer.py:803] 2025-04-26 16:19:06,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:06,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:06,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3771 3756 3720 [WARNING|trainer.py:803] 2025-04-26 16:19:07,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:07,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3772 3757 [WARNING|trainer.py:803] 2025-04-26 16:19:08,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3721 [WARNING|trainer.py:803] 2025-04-26 16:19:08,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:08,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3773 3758 [WARNING|trainer.py:803] 2025-04-26 16:19:09,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:10,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3722 [WARNING|trainer.py:803] 2025-04-26 16:19:10,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3774 3759 [WARNING|trainer.py:803] 2025-04-26 16:19:11,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:11,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3723 [WARNING|trainer.py:803] 2025-04-26 16:19:11,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3775 [WARNING|trainer.py:803] 2025-04-26 16:19:12,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3760 [WARNING|trainer.py:803] 2025-04-26 16:19:12,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3724 [WARNING|trainer.py:803] 2025-04-26 16:19:13,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3776 3761 [WARNING|trainer.py:803] 2025-04-26 16:19:14,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:14,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3725 [WARNING|trainer.py:803] 2025-04-26 16:19:14,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3777 3762 [WARNING|trainer.py:803] 2025-04-26 16:19:15,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:15,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3726 [WARNING|trainer.py:803] 2025-04-26 16:19:16,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3778 [WARNING|trainer.py:803] 2025-04-26 16:19:16,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3763 [WARNING|trainer.py:803] 2025-04-26 16:19:17,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3727 [WARNING|trainer.py:803] 2025-04-26 16:19:17,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3779 [WARNING|trainer.py:803] 2025-04-26 16:19:18,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3764 [WARNING|trainer.py:803] 2025-04-26 16:19:18,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3728 [WARNING|trainer.py:803] 2025-04-26 16:19:19,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3780 [WARNING|trainer.py:803] 2025-04-26 16:19:19,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3765 3729 [WARNING|trainer.py:803] 2025-04-26 16:19:20,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:20,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3781 [WARNING|trainer.py:803] 2025-04-26 16:19:20,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3766 [WARNING|trainer.py:803] 2025-04-26 16:19:21,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3730 [WARNING|trainer.py:803] 2025-04-26 16:19:21,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3782 3767 [WARNING|trainer.py:803] 2025-04-26 16:19:22,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3731 [WARNING|trainer.py:803] 2025-04-26 16:19:22,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:23,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3783 [WARNING|trainer.py:803] 2025-04-26 16:19:23,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3768 3732 [WARNING|trainer.py:803] 2025-04-26 16:19:24,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:24,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:24,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3769 3784 3733 [WARNING|trainer.py:803] 2025-04-26 16:19:25,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:25,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3770 [WARNING|trainer.py:803] 2025-04-26 16:19:26,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3785 3734 [WARNING|trainer.py:803] 2025-04-26 16:19:27,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:27,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:27,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3771 3786 3735 [WARNING|trainer.py:803] 2025-04-26 16:19:28,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:28,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3772 [WARNING|trainer.py:803] 2025-04-26 16:19:29,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3787 3736 [WARNING|trainer.py:803] 2025-04-26 16:19:29,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:29,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3773 [WARNING|trainer.py:803] 2025-04-26 16:19:30,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3788 3737 [WARNING|trainer.py:803] 2025-04-26 16:19:31,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:31,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3774 [WARNING|trainer.py:803] 2025-04-26 16:19:31,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3789 3738 [WARNING|trainer.py:803] 2025-04-26 16:19:32,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:19:32,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3775 [WARNING|trainer.py:803] 2025-04-26 16:19:33,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3790 3739 [WARNING|trainer.py:803] 2025-04-26 16:19:33,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:34,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3776 [WARNING|trainer.py:803] 2025-04-26 16:19:34,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3791 3740 [WARNING|trainer.py:803] 2025-04-26 16:19:35,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:35,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3777 [WARNING|trainer.py:803] 2025-04-26 16:19:35,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3792 3741 [WARNING|trainer.py:803] 2025-04-26 16:19:36,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:36,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3778 [WARNING|trainer.py:803] 2025-04-26 16:19:37,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3793 [WARNING|trainer.py:803] 2025-04-26 16:19:37,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3742 [WARNING|trainer.py:803] 2025-04-26 16:19:38,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3779 [WARNING|trainer.py:803] 2025-04-26 16:19:38,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3794 [WARNING|trainer.py:803] 2025-04-26 16:19:39,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3743 [WARNING|trainer.py:803] 2025-04-26 16:19:39,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3780 3795 [WARNING|trainer.py:803] 2025-04-26 16:19:40,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:40,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3744 [WARNING|trainer.py:803] 2025-04-26 16:19:41,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3781 3796 [WARNING|trainer.py:803] 2025-04-26 16:19:42,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:42,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:42,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3745 3782 3797 [WARNING|trainer.py:803] 2025-04-26 16:19:43,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:43,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:43,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3746 3783 3798 [WARNING|trainer.py:803] 2025-04-26 16:19:44,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:45,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3747 [WARNING|trainer.py:803] 2025-04-26 16:19:45,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3784 3799 [WARNING|trainer.py:803] 2025-04-26 16:19:46,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:46,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:46,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3748 3785 3800 [WARNING|trainer.py:803] 2025-04-26 16:19:47,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:47,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:47,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3749 3786 3801 [WARNING|trainer.py:803] 2025-04-26 16:19:48,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:49,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:49,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3750 3787 3802 [WARNING|trainer.py:803] 2025-04-26 16:19:50,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:50,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:19:50,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3751 3803 3788 [WARNING|trainer.py:803] 2025-04-26 16:19:51,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:51,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:52,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3752 3804 3789 [WARNING|trainer.py:803] 2025-04-26 16:19:53,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:53,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:53,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3753 3805 3790 [WARNING|trainer.py:803] 2025-04-26 16:19:54,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:19:54,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:54,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3806 3754 3791 [WARNING|trainer.py:803] 2025-04-26 16:19:55,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:55,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:19:56,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3807 3755 3792 [WARNING|trainer.py:803] 2025-04-26 16:19:57,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:57,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3808 [WARNING|trainer.py:803] 2025-04-26 16:19:57,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3756 3793 [WARNING|trainer.py:803] 2025-04-26 16:19:58,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:58,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3809 [WARNING|trainer.py:803] 2025-04-26 16:19:58,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3757 3794 [WARNING|trainer.py:803] 2025-04-26 16:19:59,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:19:59,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3810 [WARNING|trainer.py:803] 2025-04-26 16:20:00,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3758 3795 [WARNING|trainer.py:803] 2025-04-26 16:20:00,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3811 [WARNING|trainer.py:803] 2025-04-26 16:20:01,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:01,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3759 [WARNING|trainer.py:803] 2025-04-26 16:20:02,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3796 3812 [WARNING|trainer.py:803] 2025-04-26 16:20:02,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:03,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3760 [WARNING|trainer.py:803] 2025-04-26 16:20:03,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3797 3813 [WARNING|trainer.py:803] 2025-04-26 16:20:04,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:04,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:04,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3761 3798 3814 [WARNING|trainer.py:803] 2025-04-26 16:20:05,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:05,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:20:06,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3762 3799 3815 [WARNING|trainer.py:803] 2025-04-26 16:20:07,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:07,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:07,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3763 3800 3816 [WARNING|trainer.py:803] 2025-04-26 16:20:08,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:08,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:08,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3817 3764 3801 [WARNING|trainer.py:803] 2025-04-26 16:20:09,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:10,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:10,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3818 3765 3802 [WARNING|trainer.py:803] 2025-04-26 16:20:11,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:11,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:11,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3819 3766 3803 [WARNING|trainer.py:803] 2025-04-26 16:20:12,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:12,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:12,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3820 3804 3767 [WARNING|trainer.py:803] 2025-04-26 16:20:14,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:14,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:14,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3821 3805 3768 [WARNING|trainer.py:803] 2025-04-26 16:20:15,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:15,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:15,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3822 3806 3769 [WARNING|trainer.py:803] 2025-04-26 16:20:16,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:16,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:16,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3823 3807 3770 [WARNING|trainer.py:803] 2025-04-26 16:20:18,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:18,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:18,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3824 3808 3771 [WARNING|trainer.py:803] 2025-04-26 16:20:19,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:19,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:19,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3825 3809 3772 [WARNING|trainer.py:803] 2025-04-26 16:20:20,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:20,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:21,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3826 3810 3773 [WARNING|trainer.py:803] 2025-04-26 16:20:21,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:21,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3827 [WARNING|trainer.py:803] 2025-04-26 16:20:22,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3811 3774 [WARNING|trainer.py:803] 2025-04-26 16:20:23,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:23,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3828 3812 [WARNING|trainer.py:803] 2025-04-26 16:20:23,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3775 [WARNING|trainer.py:803] 2025-04-26 16:20:24,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:24,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3829 3813 [WARNING|trainer.py:803] 2025-04-26 16:20:25,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:25,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3776 [WARNING|trainer.py:803] 2025-04-26 16:20:25,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3830 3814 [WARNING|trainer.py:803] 2025-04-26 16:20:26,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:26,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3777 [WARNING|trainer.py:803] 2025-04-26 16:20:27,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3831 3815 [WARNING|trainer.py:803] 2025-04-26 16:20:27,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:28,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:28,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3778 3832 3816 [WARNING|trainer.py:803] 2025-04-26 16:20:29,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:29,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:29,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3833 3779 3817 [WARNING|trainer.py:803] 2025-04-26 16:20:30,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:30,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:31,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3834 3780 3818 [WARNING|trainer.py:803] 2025-04-26 16:20:31,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:32,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3835 [WARNING|trainer.py:803] 2025-04-26 16:20:32,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3781 3819 [WARNING|trainer.py:803] 2025-04-26 16:20:33,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:33,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3836 [WARNING|trainer.py:803] 2025-04-26 16:20:33,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3782 3820 [WARNING|trainer.py:803] 2025-04-26 16:20:34,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3837 [WARNING|trainer.py:803] 2025-04-26 16:20:35,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:35,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3821 3783 [WARNING|trainer.py:803] 2025-04-26 16:20:35,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3838 [WARNING|trainer.py:803] 2025-04-26 16:20:36,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:36,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3822 [WARNING|trainer.py:803] 2025-04-26 16:20:37,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3784 3839 [WARNING|trainer.py:803] 2025-04-26 16:20:37,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:38,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:38,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3823 3785 3840 [WARNING|trainer.py:803] 2025-04-26 16:20:39,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:39,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:20:39,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3824 3786 3841 [WARNING|trainer.py:803] 2025-04-26 16:20:40,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:40,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3825 [WARNING|trainer.py:803] 2025-04-26 16:20:40,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3842 3787 [WARNING|trainer.py:803] 2025-04-26 16:20:41,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:42,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3826 [WARNING|trainer.py:803] 2025-04-26 16:20:42,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3843 3788 [WARNING|trainer.py:803] 2025-04-26 16:20:43,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3827 [WARNING|trainer.py:803] 2025-04-26 16:20:43,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:43,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3844 [WARNING|trainer.py:803] 2025-04-26 16:20:44,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3789 3828 [WARNING|trainer.py:803] 2025-04-26 16:20:44,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:45,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3845 [WARNING|trainer.py:803] 2025-04-26 16:20:45,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3790 3829 [WARNING|trainer.py:803] 2025-04-26 16:20:46,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:46,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3846 [WARNING|trainer.py:803] 2025-04-26 16:20:46,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3791 3830 [WARNING|trainer.py:803] 2025-04-26 16:20:47,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3847 [WARNING|trainer.py:803] 2025-04-26 16:20:47,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:20:48,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3792 3831 [WARNING|trainer.py:803] 2025-04-26 16:20:48,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3848 [WARNING|trainer.py:803] 2025-04-26 16:20:49,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:49,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3793 3832 [WARNING|trainer.py:803] 2025-04-26 16:20:49,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3849 [WARNING|trainer.py:803] 2025-04-26 16:20:50,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:20:50,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3833 3794 [WARNING|trainer.py:803] 2025-04-26 16:20:51,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3850 [WARNING|trainer.py:803] 2025-04-26 16:20:51,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:52,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3834 [WARNING|trainer.py:803] 2025-04-26 16:20:52,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3795 3851 [WARNING|trainer.py:803] 2025-04-26 16:20:53,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:53,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3835 [WARNING|trainer.py:803] 2025-04-26 16:20:53,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3796 3852 [WARNING|trainer.py:803] 2025-04-26 16:20:54,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:20:54,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3836 [WARNING|trainer.py:803] 2025-04-26 16:20:55,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3797 3853 [WARNING|trainer.py:803] 2025-04-26 16:20:55,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:56,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3837 [WARNING|trainer.py:803] 2025-04-26 16:20:56,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3854 3798 [WARNING|trainer.py:803] 2025-04-26 16:20:57,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:57,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 3838 NoNo [WARNING|trainer.py:803] 2025-04-26 16:20:57,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3855 3799 [WARNING|trainer.py:803] 2025-04-26 16:20:58,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:20:58,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3839 [WARNING|trainer.py:803] 2025-04-26 16:20:58,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3856 3800 [WARNING|trainer.py:803] 2025-04-26 16:20:59,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:00,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3840 [WARNING|trainer.py:803] 2025-04-26 16:21:00,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3857 3801 [WARNING|trainer.py:803] 2025-04-26 16:21:00,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:01,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3841 [WARNING|trainer.py:803] 2025-04-26 16:21:01,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3858 3802 [WARNING|trainer.py:803] 2025-04-26 16:21:02,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:02,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3842 [WARNING|trainer.py:803] 2025-04-26 16:21:03,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3859 3803 [WARNING|trainer.py:803] 2025-04-26 16:21:03,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:04,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3843 [WARNING|trainer.py:803] 2025-04-26 16:21:04,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3860 3804 [WARNING|trainer.py:803] 2025-04-26 16:21:05,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:05,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3844 [WARNING|trainer.py:803] 2025-04-26 16:21:05,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3861 3805 [WARNING|trainer.py:803] 2025-04-26 16:21:06,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:06,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3845 [WARNING|trainer.py:803] 2025-04-26 16:21:06,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3862 3806 [WARNING|trainer.py:803] 2025-04-26 16:21:07,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:07,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3846 [WARNING|trainer.py:803] 2025-04-26 16:21:08,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3863 3807 [WARNING|trainer.py:803] 2025-04-26 16:21:08,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:09,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3847 [WARNING|trainer.py:803] 2025-04-26 16:21:09,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3864 [WARNING|trainer.py:803] 2025-04-26 16:21:10,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3808 [WARNING|trainer.py:803] 2025-04-26 16:21:10,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3848 [WARNING|trainer.py:803] 2025-04-26 16:21:10,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3865 [WARNING|trainer.py:803] 2025-04-26 16:21:11,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3809 3849 [WARNING|trainer.py:803] 2025-04-26 16:21:11,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:12,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3866 [WARNING|trainer.py:803] 2025-04-26 16:21:12,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3810 [WARNING|trainer.py:803] 2025-04-26 16:21:13,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3850 [WARNING|trainer.py:803] 2025-04-26 16:21:13,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3867 3811 [WARNING|trainer.py:803] 2025-04-26 16:21:13,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3851 [WARNING|trainer.py:803] 2025-04-26 16:21:14,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:14,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3868 [WARNING|trainer.py:803] 2025-04-26 16:21:15,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3812 [WARNING|trainer.py:803] 2025-04-26 16:21:15,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3852 [WARNING|trainer.py:803] 2025-04-26 16:21:16,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3869 3813 [WARNING|trainer.py:803] 2025-04-26 16:21:16,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:16,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3853 [WARNING|trainer.py:803] 2025-04-26 16:21:17,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3870 [WARNING|trainer.py:803] 2025-04-26 16:21:17,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3814 [WARNING|trainer.py:803] 2025-04-26 16:21:18,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3854 [WARNING|trainer.py:803] 2025-04-26 16:21:18,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3871 [WARNING|trainer.py:803] 2025-04-26 16:21:19,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3815 3855 [WARNING|trainer.py:803] 2025-04-26 16:21:19,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:19,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3872 [WARNING|trainer.py:803] 2025-04-26 16:21:20,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3816 3856 [WARNING|trainer.py:803] 2025-04-26 16:21:20,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:21,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3873 [WARNING|trainer.py:803] 2025-04-26 16:21:21,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3817 3857 [WARNING|trainer.py:803] 2025-04-26 16:21:22,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:22,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3874 [WARNING|trainer.py:803] 2025-04-26 16:21:22,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3818 3858 [WARNING|trainer.py:803] 2025-04-26 16:21:23,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:24,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:24,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3875 3819 3859 [WARNING|trainer.py:803] 2025-04-26 16:21:25,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:21:25,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:25,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3876 3860 3820 [WARNING|trainer.py:803] 2025-04-26 16:21:26,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:26,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:26,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3877 3821 3861 [WARNING|trainer.py:803] 2025-04-26 16:21:27,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:28,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:28,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3878 3822 3862 [WARNING|trainer.py:803] 2025-04-26 16:21:29,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:29,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:29,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3879 3863 3823 [WARNING|trainer.py:803] 2025-04-26 16:21:30,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:30,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:30,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3880 3864 3824 [WARNING|trainer.py:803] 2025-04-26 16:21:31,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:32,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:32,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3881 3865 3825 [WARNING|trainer.py:803] 2025-04-26 16:21:33,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:33,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:33,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3882 3866 3826 [WARNING|trainer.py:803] 2025-04-26 16:21:34,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:34,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:34,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3883 3867 3827 [WARNING|trainer.py:803] 2025-04-26 16:21:35,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:35,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:36,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3884 3868 3828 [WARNING|trainer.py:803] 2025-04-26 16:21:37,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:37,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:37,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3885 3869 3829 [WARNING|trainer.py:803] 2025-04-26 16:21:38,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:38,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:38,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3886 3870 3830 [WARNING|trainer.py:803] 2025-04-26 16:21:39,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:39,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:39,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3831 3871 3887 [WARNING|trainer.py:803] 2025-04-26 16:21:41,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:41,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:41,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3832 3888 3872 [WARNING|trainer.py:803] 2025-04-26 16:21:42,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:42,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:42,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3889 3833 3873 [WARNING|trainer.py:803] 2025-04-26 16:21:43,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:43,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:44,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3834 3890 3874 [WARNING|trainer.py:803] 2025-04-26 16:21:45,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:45,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:45,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3835 3891 3875 [WARNING|trainer.py:803] 2025-04-26 16:21:46,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:21:46,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3836 [WARNING|trainer.py:803] 2025-04-26 16:21:46,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3892 3876 [WARNING|trainer.py:803] 2025-04-26 16:21:47,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:48,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3837 [WARNING|trainer.py:803] 2025-04-26 16:21:48,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3893 3877 [WARNING|trainer.py:803] 2025-04-26 16:21:49,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:49,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3838 [WARNING|trainer.py:803] 2025-04-26 16:21:49,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3894 3878 [WARNING|trainer.py:803] 2025-04-26 16:21:50,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:50,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3839 [WARNING|trainer.py:803] 2025-04-26 16:21:51,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3895 3879 [WARNING|trainer.py:803] 2025-04-26 16:21:51,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:52,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3840 [WARNING|trainer.py:803] 2025-04-26 16:21:52,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3896 3880 [WARNING|trainer.py:803] 2025-04-26 16:21:52,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:21:53,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3841 [WARNING|trainer.py:803] 2025-04-26 16:21:53,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3897 3881 [WARNING|trainer.py:803] 2025-04-26 16:21:54,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3842 [WARNING|trainer.py:803] 2025-04-26 16:21:54,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:55,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3898 [WARNING|trainer.py:803] 2025-04-26 16:21:55,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3882 3843 [WARNING|trainer.py:803] 2025-04-26 16:21:56,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:56,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3899 [WARNING|trainer.py:803] 2025-04-26 16:21:56,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3883 3844 [WARNING|trainer.py:803] 2025-04-26 16:21:57,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:57,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3900 [WARNING|trainer.py:803] 2025-04-26 16:21:58,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3884 3845 [WARNING|trainer.py:803] 2025-04-26 16:21:58,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:21:59,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:21:59,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3901 3885 3846 [WARNING|trainer.py:803] 2025-04-26 16:22:00,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:00,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:00,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3902 3886 3847 [WARNING|trainer.py:803] 2025-04-26 16:22:01,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:01,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:02,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3887 3848 3903 [WARNING|trainer.py:803] 2025-04-26 16:22:03,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:03,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:03,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3888 3849 3904 [WARNING|trainer.py:803] 2025-04-26 16:22:04,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:04,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:04,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3889 3850 3905 [WARNING|trainer.py:803] 2025-04-26 16:22:05,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:05,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3851 [WARNING|trainer.py:803] 2025-04-26 16:22:06,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3890 3906 [WARNING|trainer.py:803] 2025-04-26 16:22:07,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:07,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3852 3891 [WARNING|trainer.py:803] 2025-04-26 16:22:07,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:08,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3907 [WARNING|trainer.py:803] 2025-04-26 16:22:08,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3853 3892 [WARNING|trainer.py:803] 2025-04-26 16:22:09,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:09,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:10,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3854 3908 3893 [WARNING|trainer.py:803] 2025-04-26 16:22:10,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:10,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3855 [WARNING|trainer.py:803] 2025-04-26 16:22:11,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3909 3894 [WARNING|trainer.py:803] 2025-04-26 16:22:12,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3856 [WARNING|trainer.py:803] 2025-04-26 16:22:12,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:12,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3895 3910 [WARNING|trainer.py:803] 2025-04-26 16:22:13,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3857 [WARNING|trainer.py:803] 2025-04-26 16:22:14,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:14,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3896 [WARNING|trainer.py:803] 2025-04-26 16:22:14,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3911 3858 [WARNING|trainer.py:803] 2025-04-26 16:22:15,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:15,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3897 [WARNING|trainer.py:803] 2025-04-26 16:22:15,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3912 3859 [WARNING|trainer.py:803] 2025-04-26 16:22:16,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:17,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3898 [WARNING|trainer.py:803] 2025-04-26 16:22:17,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 3860 3913 [WARNING|trainer.py:803] 2025-04-26 16:22:18,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:18,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3899 [WARNING|trainer.py:803] 2025-04-26 16:22:18,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3861 [WARNING|trainer.py:803] 2025-04-26 16:22:19,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3914 [WARNING|trainer.py:803] 2025-04-26 16:22:19,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3900 3862 [WARNING|trainer.py:803] 2025-04-26 16:22:20,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:20,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3915 [WARNING|trainer.py:803] 2025-04-26 16:22:21,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3901 3863 [WARNING|trainer.py:803] 2025-04-26 16:22:22,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:22,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:22,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3902 3864 3916 [WARNING|trainer.py:803] 2025-04-26 16:22:23,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:23,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:22:23,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3865 3917 3903 [WARNING|trainer.py:803] 2025-04-26 16:22:25,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:22:25,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:25,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3866 3904 3918 [WARNING|trainer.py:803] 2025-04-26 16:22:26,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3867 [WARNING|trainer.py:803] 2025-04-26 16:22:26,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:26,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3919 3905 [WARNING|trainer.py:803] 2025-04-26 16:22:27,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3868 [WARNING|trainer.py:803] 2025-04-26 16:22:28,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:28,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:28,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3906 3920 3869 [WARNING|trainer.py:803] 2025-04-26 16:22:29,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:29,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:30,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3907 3921 3870 [WARNING|trainer.py:803] 2025-04-26 16:22:31,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:31,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:31,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3908 3922 3871 [WARNING|trainer.py:803] 2025-04-26 16:22:33,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:33,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:33,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3872 3923 3909 [WARNING|trainer.py:803] 2025-04-26 16:22:34,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:34,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:34,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3873 3910 3924 [WARNING|trainer.py:803] 2025-04-26 16:22:35,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:36,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:36,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3874 3911 3925 [WARNING|trainer.py:803] 2025-04-26 16:22:37,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:37,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:37,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3875 3912 3926 [WARNING|trainer.py:803] 2025-04-26 16:22:38,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3876 [WARNING|trainer.py:803] 2025-04-26 16:22:39,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:39,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3913 [WARNING|trainer.py:803] 2025-04-26 16:22:40,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3927 3877 [WARNING|trainer.py:803] 2025-04-26 16:22:40,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:40,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:41,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3914 3928 3878 [WARNING|trainer.py:803] 2025-04-26 16:22:42,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:42,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:42,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3915 3879 3929 [WARNING|trainer.py:803] 2025-04-26 16:22:43,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:44,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:44,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3880 3916 3930 [WARNING|trainer.py:803] 2025-04-26 16:22:45,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:22:45,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:45,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3881 3917 3931 [WARNING|trainer.py:803] 2025-04-26 16:22:46,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:46,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:47,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3882 3918 3932 [WARNING|trainer.py:803] 2025-04-26 16:22:48,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:48,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3883 [WARNING|trainer.py:803] 2025-04-26 16:22:48,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3919 3933 [WARNING|trainer.py:803] 2025-04-26 16:22:49,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:50,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3884 [WARNING|trainer.py:803] 2025-04-26 16:22:50,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3920 [WARNING|trainer.py:803] 2025-04-26 16:22:50,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3934 3885 [WARNING|trainer.py:803] 2025-04-26 16:22:51,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:52,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:52,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3921 3886 3935 [WARNING|trainer.py:803] 2025-04-26 16:22:53,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:22:53,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:53,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3922 3887 3936 [WARNING|trainer.py:803] 2025-04-26 16:22:54,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:55,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:55,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3923 3888 3937 [WARNING|trainer.py:803] 2025-04-26 16:22:56,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:56,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:56,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3889 3924 3938 [WARNING|trainer.py:803] 2025-04-26 16:22:57,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:22:58,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:58,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3890 3925 3939 [WARNING|trainer.py:803] 2025-04-26 16:22:59,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:22:59,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3891 [WARNING|trainer.py:803] 2025-04-26 16:22:59,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3926 3940 [WARNING|trainer.py:803] 2025-04-26 16:23:00,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3892 [WARNING|trainer.py:803] 2025-04-26 16:23:01,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:01,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:01,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3927 3941 3893 [WARNING|trainer.py:803] 2025-04-26 16:23:02,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:02,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:03,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3928 3942 3894 [WARNING|trainer.py:803] 2025-04-26 16:23:04,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:04,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:04,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3929 3895 3943 [WARNING|trainer.py:803] 2025-04-26 16:23:05,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:05,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:06,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3896 3930 3944 [WARNING|trainer.py:803] 2025-04-26 16:23:07,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:07,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:23:07,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3897 3931 3945 [WARNING|trainer.py:803] 2025-04-26 16:23:08,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:08,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3898 [WARNING|trainer.py:803] 2025-04-26 16:23:09,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3932 [WARNING|trainer.py:803] 2025-04-26 16:23:09,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3946 3899 [WARNING|trainer.py:803] 2025-04-26 16:23:10,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:10,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3933 [WARNING|trainer.py:803] 2025-04-26 16:23:11,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3947 3900 [WARNING|trainer.py:803] 2025-04-26 16:23:12,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:12,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:12,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3934 3948 3901 [WARNING|trainer.py:803] 2025-04-26 16:23:13,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:13,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:14,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3935 3949 3902 [WARNING|trainer.py:803] 2025-04-26 16:23:15,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:15,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:15,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3936 3950 3903 [WARNING|trainer.py:803] 2025-04-26 16:23:16,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:16,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:17,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3937 3951 3904 [WARNING|trainer.py:803] 2025-04-26 16:23:18,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:18,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:18,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3938 3952 3905 [WARNING|trainer.py:803] 2025-04-26 16:23:19,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:20,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:20,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3939 3953 3906 [WARNING|trainer.py:803] 2025-04-26 16:23:21,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:21,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:21,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3940 3954 3907 [WARNING|trainer.py:803] 2025-04-26 16:23:23,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:23,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:23,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3941 3955 3908 [WARNING|trainer.py:803] 2025-04-26 16:23:24,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:24,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:24,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3942 3956 3909 [WARNING|trainer.py:803] 2025-04-26 16:23:26,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:26,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:26,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3957 3943 3910 [WARNING|trainer.py:803] 2025-04-26 16:23:27,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:27,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:28,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3958 3944 3911 [WARNING|trainer.py:803] 2025-04-26 16:23:29,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:23:29,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:29,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3959 3945 3912 [WARNING|trainer.py:803] 2025-04-26 16:23:30,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:31,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:31,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3960 3913 3946 [WARNING|trainer.py:803] 2025-04-26 16:23:32,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:23:32,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:32,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3961 3914 3947 [WARNING|trainer.py:803] 2025-04-26 16:23:33,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:34,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:34,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3962 3915 3948 [WARNING|trainer.py:803] 2025-04-26 16:23:35,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:35,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3963 [WARNING|trainer.py:803] 2025-04-26 16:23:35,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3916 3949 [WARNING|trainer.py:803] 2025-04-26 16:23:36,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3964 [WARNING|trainer.py:803] 2025-04-26 16:23:37,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:37,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3917 [WARNING|trainer.py:803] 2025-04-26 16:23:38,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3950 3965 [WARNING|trainer.py:803] 2025-04-26 16:23:38,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:38,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3951 3918 [WARNING|trainer.py:803] 2025-04-26 16:23:39,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3966 [WARNING|trainer.py:803] 2025-04-26 16:23:40,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:40,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:41,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3919 3952 3967 [WARNING|trainer.py:803] 2025-04-26 16:23:42,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:42,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:23:42,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3953 3920 3968 [WARNING|trainer.py:803] 2025-04-26 16:23:43,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:43,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3921 [WARNING|trainer.py:803] 2025-04-26 16:23:44,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3954 3969 [WARNING|trainer.py:803] 2025-04-26 16:23:45,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:45,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3922 3955 [WARNING|trainer.py:803] 2025-04-26 16:23:45,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3970 [WARNING|trainer.py:803] 2025-04-26 16:23:46,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:23:46,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3956 [WARNING|trainer.py:803] 2025-04-26 16:23:47,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3923 [WARNING|trainer.py:803] 2025-04-26 16:23:48,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3971 [WARNING|trainer.py:803] 2025-04-26 16:23:48,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3957 [WARNING|trainer.py:803] 2025-04-26 16:23:48,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3924 3972 [WARNING|trainer.py:803] 2025-04-26 16:23:49,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:50,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3958 [WARNING|trainer.py:803] 2025-04-26 16:23:50,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3925 [WARNING|trainer.py:803] 2025-04-26 16:23:51,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3973 [WARNING|trainer.py:803] 2025-04-26 16:23:51,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3959 [WARNING|trainer.py:803] 2025-04-26 16:23:51,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3926 3974 [WARNING|trainer.py:803] 2025-04-26 16:23:52,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:23:53,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3960 [WARNING|trainer.py:803] 2025-04-26 16:23:53,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3927 [WARNING|trainer.py:803] 2025-04-26 16:23:54,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3975 [WARNING|trainer.py:803] 2025-04-26 16:23:54,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3961 [WARNING|trainer.py:803] 2025-04-26 16:23:55,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3928 [WARNING|trainer.py:803] 2025-04-26 16:23:55,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3976 [WARNING|trainer.py:803] 2025-04-26 16:23:56,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3962 [WARNING|trainer.py:803] 2025-04-26 16:23:56,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3929 [WARNING|trainer.py:803] 2025-04-26 16:23:57,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3977 [WARNING|trainer.py:803] 2025-04-26 16:23:57,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3963 [WARNING|trainer.py:803] 2025-04-26 16:23:58,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3930 [WARNING|trainer.py:803] 2025-04-26 16:23:58,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3978 [WARNING|trainer.py:803] 2025-04-26 16:23:59,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3964 [WARNING|trainer.py:803] 2025-04-26 16:23:59,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3931 [WARNING|trainer.py:803] 2025-04-26 16:24:00,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3979 [WARNING|trainer.py:803] 2025-04-26 16:24:00,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3965 [WARNING|trainer.py:803] 2025-04-26 16:24:01,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3932 [WARNING|trainer.py:803] 2025-04-26 16:24:01,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3980 3966 [WARNING|trainer.py:803] 2025-04-26 16:24:02,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:02,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3933 [WARNING|trainer.py:803] 2025-04-26 16:24:03,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3981 3967 [WARNING|trainer.py:803] 2025-04-26 16:24:04,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:04,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:04,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3982 3934 3968 [WARNING|trainer.py:803] 2025-04-26 16:24:05,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:24:05,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:24:06,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3935 3983 3969 [WARNING|trainer.py:803] 2025-04-26 16:24:07,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:07,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:07,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3936 3984 3970 [WARNING|trainer.py:803] 2025-04-26 16:24:08,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:08,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:09,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3937 3985 3971 [WARNING|trainer.py:803] 2025-04-26 16:24:10,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:10,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:10,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3938 3986 3972 [WARNING|trainer.py:803] 2025-04-26 16:24:11,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:24:11,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:12,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3939 3987 3973 [WARNING|trainer.py:803] 2025-04-26 16:24:13,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:24:13,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3988 [WARNING|trainer.py:803] 2025-04-26 16:24:14,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3940 3974 [WARNING|trainer.py:803] 2025-04-26 16:24:14,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:15,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3989 [WARNING|trainer.py:803] 2025-04-26 16:24:15,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3941 [WARNING|trainer.py:803] 2025-04-26 16:24:16,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3975 [WARNING|trainer.py:803] 2025-04-26 16:24:16,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3990 [WARNING|trainer.py:803] 2025-04-26 16:24:17,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3942 [WARNING|trainer.py:803] 2025-04-26 16:24:17,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3976 [WARNING|trainer.py:803] 2025-04-26 16:24:18,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3991 [WARNING|trainer.py:803] 2025-04-26 16:24:18,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3943 [WARNING|trainer.py:803] 2025-04-26 16:24:19,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3977 [WARNING|trainer.py:803] 2025-04-26 16:24:19,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3992 [WARNING|trainer.py:803] 2025-04-26 16:24:20,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3944 3978 [WARNING|trainer.py:803] 2025-04-26 16:24:21,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:21,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3993 [WARNING|trainer.py:803] 2025-04-26 16:24:21,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3945 3979 [WARNING|trainer.py:803] 2025-04-26 16:24:22,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:23,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3994 [WARNING|trainer.py:803] 2025-04-26 16:24:23,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3946 [WARNING|trainer.py:803] 2025-04-26 16:24:23,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3980 [WARNING|trainer.py:803] 2025-04-26 16:24:24,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3995 [WARNING|trainer.py:803] 2025-04-26 16:24:24,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3947 3981 [WARNING|trainer.py:803] 2025-04-26 16:24:25,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:26,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3996 [WARNING|trainer.py:803] 2025-04-26 16:24:26,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3948 3982 [WARNING|trainer.py:803] 2025-04-26 16:24:27,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:27,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:27,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3997 3949 [WARNING|trainer.py:803] 2025-04-26 16:24:28,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3983 [WARNING|trainer.py:803] 2025-04-26 16:24:29,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3998 [WARNING|trainer.py:803] 2025-04-26 16:24:29,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3950 [WARNING|trainer.py:803] 2025-04-26 16:24:30,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3984 [WARNING|trainer.py:803] 2025-04-26 16:24:30,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3999 [WARNING|trainer.py:803] 2025-04-26 16:24:31,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3951 [WARNING|trainer.py:803] 2025-04-26 16:24:31,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3985 4000 [WARNING|trainer.py:803] 2025-04-26 16:24:32,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:32,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3952 [WARNING|trainer.py:803] 2025-04-26 16:24:33,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3986 [WARNING|trainer.py:803] 2025-04-26 16:24:33,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4001 [WARNING|trainer.py:803] 2025-04-26 16:24:34,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3953 [WARNING|trainer.py:803] 2025-04-26 16:24:34,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3987 [WARNING|trainer.py:803] 2025-04-26 16:24:35,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4002 [WARNING|trainer.py:803] 2025-04-26 16:24:35,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3954 [WARNING|trainer.py:803] 2025-04-26 16:24:36,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3988 [WARNING|trainer.py:803] 2025-04-26 16:24:36,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4003 [WARNING|trainer.py:803] 2025-04-26 16:24:37,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3955 [WARNING|trainer.py:803] 2025-04-26 16:24:37,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3989 [WARNING|trainer.py:803] 2025-04-26 16:24:38,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4004 [WARNING|trainer.py:803] 2025-04-26 16:24:38,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3956 3990 [WARNING|trainer.py:803] 2025-04-26 16:24:39,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:39,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4005 [WARNING|trainer.py:803] 2025-04-26 16:24:40,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3957 3991 [WARNING|trainer.py:803] 2025-04-26 16:24:41,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:41,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:41,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4006 3958 3992 [WARNING|trainer.py:803] 2025-04-26 16:24:42,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:42,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4007 [WARNING|trainer.py:803] 2025-04-26 16:24:43,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3959 3993 [WARNING|trainer.py:803] 2025-04-26 16:24:44,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:44,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:44,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4008 3960 3994 [WARNING|trainer.py:803] 2025-04-26 16:24:45,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:45,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:46,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4009 3961 3995 [WARNING|trainer.py:803] 2025-04-26 16:24:47,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:24:47,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4010 [WARNING|trainer.py:803] 2025-04-26 16:24:48,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3962 3996 [WARNING|trainer.py:803] 2025-04-26 16:24:48,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:48,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4011 3963 [WARNING|trainer.py:803] 2025-04-26 16:24:49,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3997 [WARNING|trainer.py:803] 2025-04-26 16:24:50,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:50,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4012 [WARNING|trainer.py:803] 2025-04-26 16:24:51,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3964 3998 [WARNING|trainer.py:803] 2025-04-26 16:24:51,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:51,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4013 [WARNING|trainer.py:803] 2025-04-26 16:24:52,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3965 3999 [WARNING|trainer.py:803] 2025-04-26 16:24:53,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:53,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4014 [WARNING|trainer.py:803] 2025-04-26 16:24:54,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3966 4000 [WARNING|trainer.py:803] 2025-04-26 16:24:54,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:55,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:55,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4015 3967 4001 [WARNING|trainer.py:803] 2025-04-26 16:24:56,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:56,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:24:57,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4016 3968 4002 [WARNING|trainer.py:803] 2025-04-26 16:24:58,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:58,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:24:58,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4017 3969 4003 [WARNING|trainer.py:803] 2025-04-26 16:24:59,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:24:59,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:00,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3970 4018 4004 [WARNING|trainer.py:803] 2025-04-26 16:25:01,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:01,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:01,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3971 4019 4005 [WARNING|trainer.py:803] 2025-04-26 16:25:02,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:02,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:03,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3972 4020 4006 [WARNING|trainer.py:803] 2025-04-26 16:25:04,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:04,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:04,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3973 4021 4007 [WARNING|trainer.py:803] 2025-04-26 16:25:05,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:06,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:06,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3974 4022 4008 [WARNING|trainer.py:803] 2025-04-26 16:25:07,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:07,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:08,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3975 4023 4009 [WARNING|trainer.py:803] 2025-04-26 16:25:09,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:09,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:09,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3976 4024 4010 [WARNING|trainer.py:803] 2025-04-26 16:25:10,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:10,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:11,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3977 4025 4011 [WARNING|trainer.py:803] 2025-04-26 16:25:12,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:12,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:12,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3978 4012 4026 [WARNING|trainer.py:803] 2025-04-26 16:25:13,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:14,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:14,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3979 4013 4027 [WARNING|trainer.py:803] 2025-04-26 16:25:15,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:15,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:15,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3980 4014 4028 [WARNING|trainer.py:803] 2025-04-26 16:25:16,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:17,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:17,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3981 4015 4029 [WARNING|trainer.py:803] 2025-04-26 16:25:18,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:18,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3982 [WARNING|trainer.py:803] 2025-04-26 16:25:18,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4016 4030 [WARNING|trainer.py:803] 2025-04-26 16:25:19,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:25:20,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:20,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3983 4017 4031 [WARNING|trainer.py:803] 2025-04-26 16:25:21,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:21,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:21,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3984 4032 4018 [WARNING|trainer.py:803] 2025-04-26 16:25:22,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:23,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3985 [WARNING|trainer.py:803] 2025-04-26 16:25:23,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4033 4019 [WARNING|trainer.py:803] 2025-04-26 16:25:24,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:24,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3986 [WARNING|trainer.py:803] 2025-04-26 16:25:25,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4034 4020 [WARNING|trainer.py:803] 2025-04-26 16:25:26,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:26,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3987 [WARNING|trainer.py:803] 2025-04-26 16:25:26,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4035 [WARNING|trainer.py:803] 2025-04-26 16:25:27,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4021 [WARNING|trainer.py:803] 2025-04-26 16:25:28,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3988 [WARNING|trainer.py:803] 2025-04-26 16:25:28,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4036 [WARNING|trainer.py:803] 2025-04-26 16:25:28,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4022 3989 [WARNING|trainer.py:803] 2025-04-26 16:25:29,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:30,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4037 [WARNING|trainer.py:803] 2025-04-26 16:25:30,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4023 3990 [WARNING|trainer.py:803] 2025-04-26 16:25:31,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:31,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:31,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4038 4024 3991 [WARNING|trainer.py:803] 2025-04-26 16:25:32,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:33,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:33,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4039 4025 3992 [WARNING|trainer.py:803] 2025-04-26 16:25:34,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:34,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4040 [WARNING|trainer.py:803] 2025-04-26 16:25:35,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3993 4026 [WARNING|trainer.py:803] 2025-04-26 16:25:35,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4041 [WARNING|trainer.py:803] 2025-04-26 16:25:36,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:36,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3994 [WARNING|trainer.py:803] 2025-04-26 16:25:37,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4027 4042 [WARNING|trainer.py:803] 2025-04-26 16:25:38,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:38,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:38,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3995 4028 4043 [WARNING|trainer.py:803] 2025-04-26 16:25:39,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:39,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3996 [WARNING|trainer.py:803] 2025-04-26 16:25:40,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4029 4044 [WARNING|trainer.py:803] 2025-04-26 16:25:41,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:41,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:41,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3997 4030 4045 [WARNING|trainer.py:803] 2025-04-26 16:25:42,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:42,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4031 [WARNING|trainer.py:803] 2025-04-26 16:25:43,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3998 4046 [WARNING|trainer.py:803] 2025-04-26 16:25:44,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:44,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4032 3999 [WARNING|trainer.py:803] 2025-04-26 16:25:45,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4047 [WARNING|trainer.py:803] 2025-04-26 16:25:45,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:45,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4033 4000 [WARNING|trainer.py:803] 2025-04-26 16:25:46,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:47,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:25:47,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4048 4034 4001 [WARNING|trainer.py:803] 2025-04-26 16:25:48,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:48,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4049 [WARNING|trainer.py:803] 2025-04-26 16:25:48,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4035 4002 [WARNING|trainer.py:803] 2025-04-26 16:25:49,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:50,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4050 [WARNING|trainer.py:803] 2025-04-26 16:25:50,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4003 4036 [WARNING|trainer.py:803] 2025-04-26 16:25:51,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4051 [WARNING|trainer.py:803] 2025-04-26 16:25:52,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:52,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4037 4004 [WARNING|trainer.py:803] 2025-04-26 16:25:52,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4052 [WARNING|trainer.py:803] 2025-04-26 16:25:53,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:53,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4005 4038 [WARNING|trainer.py:803] 2025-04-26 16:25:54,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4053 [WARNING|trainer.py:803] 2025-04-26 16:25:55,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:55,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4039 4006 [WARNING|trainer.py:803] 2025-04-26 16:25:55,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4054 [WARNING|trainer.py:803] 2025-04-26 16:25:56,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:25:56,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:25:57,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4040 4007 4055 [WARNING|trainer.py:803] 2025-04-26 16:25:58,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:25:58,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4041 [WARNING|trainer.py:803] 2025-04-26 16:25:58,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4008 4056 [WARNING|trainer.py:803] 2025-04-26 16:25:59,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:25:59,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4042 [WARNING|trainer.py:803] 2025-04-26 16:26:00,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4009 [WARNING|trainer.py:803] 2025-04-26 16:26:01,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4057 [WARNING|trainer.py:803] 2025-04-26 16:26:01,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:02,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4010 4043 4058 [WARNING|trainer.py:803] 2025-04-26 16:26:02,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:02,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4011 4044 [WARNING|trainer.py:803] 2025-04-26 16:26:03,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4059 [WARNING|trainer.py:803] 2025-04-26 16:26:04,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:04,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4012 [WARNING|trainer.py:803] 2025-04-26 16:26:05,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4045 [WARNING|trainer.py:803] 2025-04-26 16:26:05,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4060 [WARNING|trainer.py:803] 2025-04-26 16:26:06,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4013 4046 [WARNING|trainer.py:803] 2025-04-26 16:26:06,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4061 [WARNING|trainer.py:803] 2025-04-26 16:26:07,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:07,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4014 [WARNING|trainer.py:803] 2025-04-26 16:26:08,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4047 [WARNING|trainer.py:803] 2025-04-26 16:26:08,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4062 [WARNING|trainer.py:803] 2025-04-26 16:26:09,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4015 [WARNING|trainer.py:803] 2025-04-26 16:26:09,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4048 [WARNING|trainer.py:803] 2025-04-26 16:26:10,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4063 [WARNING|trainer.py:803] 2025-04-26 16:26:10,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4016 [WARNING|trainer.py:803] 2025-04-26 16:26:11,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4049 [WARNING|trainer.py:803] 2025-04-26 16:26:12,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4064 [WARNING|trainer.py:803] 2025-04-26 16:26:12,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4017 4050 [WARNING|trainer.py:803] 2025-04-26 16:26:13,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:13,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4065 [WARNING|trainer.py:803] 2025-04-26 16:26:13,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4018 4051 [WARNING|trainer.py:803] 2025-04-26 16:26:14,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4066 [WARNING|trainer.py:803] 2025-04-26 16:26:15,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:15,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4052 [WARNING|trainer.py:803] 2025-04-26 16:26:16,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4019 [WARNING|trainer.py:803] 2025-04-26 16:26:16,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4067 [WARNING|trainer.py:803] 2025-04-26 16:26:16,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4053 4020 [WARNING|trainer.py:803] 2025-04-26 16:26:17,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4068 [WARNING|trainer.py:803] 2025-04-26 16:26:18,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:18,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4054 [WARNING|trainer.py:803] 2025-04-26 16:26:19,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4021 [WARNING|trainer.py:803] 2025-04-26 16:26:19,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4069 [WARNING|trainer.py:803] 2025-04-26 16:26:20,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4055 [WARNING|trainer.py:803] 2025-04-26 16:26:20,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4022 4070 [WARNING|trainer.py:803] 2025-04-26 16:26:21,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:21,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4056 [WARNING|trainer.py:803] 2025-04-26 16:26:22,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4023 4071 [WARNING|trainer.py:803] 2025-04-26 16:26:23,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:23,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:23,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4057 4024 4072 [WARNING|trainer.py:803] 2025-04-26 16:26:24,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:25,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:25,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4058 4073 4025 [WARNING|trainer.py:803] 2025-04-26 16:26:26,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:26,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:26,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4059 4074 4026 [WARNING|trainer.py:803] 2025-04-26 16:26:27,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:28,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:28,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4060 4075 4027 [WARNING|trainer.py:803] 2025-04-26 16:26:29,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:29,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:29,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4061 4076 4028 [WARNING|trainer.py:803] 2025-04-26 16:26:31,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:31,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:31,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4062 4077 4029 [WARNING|trainer.py:803] 2025-04-26 16:26:32,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:32,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:33,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4063 4078 4030 [WARNING|trainer.py:803] 2025-04-26 16:26:34,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:34,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:34,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4064 4079 4031 [WARNING|trainer.py:803] 2025-04-26 16:26:35,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:36,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:36,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4065 4032 4080 [WARNING|trainer.py:803] 2025-04-26 16:26:37,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:37,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:37,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4066 4033 4081 [WARNING|trainer.py:803] 2025-04-26 16:26:38,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:39,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:39,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4067 4034 4082 [WARNING|trainer.py:803] 2025-04-26 16:26:40,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:40,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:40,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4068 4035 4083 [WARNING|trainer.py:803] 2025-04-26 16:26:42,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:42,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:42,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4069 4036 4084 [WARNING|trainer.py:803] 2025-04-26 16:26:43,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:44,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:44,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4070 4037 4085 [WARNING|trainer.py:803] 2025-04-26 16:26:45,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:45,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:45,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4071 4038 4086 [WARNING|trainer.py:803] 2025-04-26 16:26:46,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:26:47,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:47,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4072 4039 4087 [WARNING|trainer.py:803] 2025-04-26 16:26:48,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:48,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4073 [WARNING|trainer.py:803] 2025-04-26 16:26:48,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4040 [WARNING|trainer.py:803] 2025-04-26 16:26:49,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4088 4074 [WARNING|trainer.py:803] 2025-04-26 16:26:50,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:26:50,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4041 [WARNING|trainer.py:803] 2025-04-26 16:26:50,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4089 4075 [WARNING|trainer.py:803] 2025-04-26 16:26:51,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:26:51,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4042 [WARNING|trainer.py:803] 2025-04-26 16:26:52,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4090 [WARNING|trainer.py:803] 2025-04-26 16:26:53,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4076 [WARNING|trainer.py:803] 2025-04-26 16:26:53,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4043 [WARNING|trainer.py:803] 2025-04-26 16:26:54,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4091 4077 [WARNING|trainer.py:803] 2025-04-26 16:26:54,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:26:55,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4044 [WARNING|trainer.py:803] 2025-04-26 16:26:55,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4092 4078 [WARNING|trainer.py:803] 2025-04-26 16:26:56,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:56,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4045 [WARNING|trainer.py:803] 2025-04-26 16:26:57,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4093 4079 [WARNING|trainer.py:803] 2025-04-26 16:26:58,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:26:58,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4046 [WARNING|trainer.py:803] 2025-04-26 16:26:58,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4094 [WARNING|trainer.py:803] 2025-04-26 16:26:59,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4080 [WARNING|trainer.py:803] 2025-04-26 16:26:59,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4047 [WARNING|trainer.py:803] 2025-04-26 16:27:00,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4095 [WARNING|trainer.py:803] 2025-04-26 16:27:01,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4081 [WARNING|trainer.py:803] 2025-04-26 16:27:01,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4048 [WARNING|trainer.py:803] 2025-04-26 16:27:01,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4096 4082 [WARNING|trainer.py:803] 2025-04-26 16:27:02,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:02,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4049 [WARNING|trainer.py:803] 2025-04-26 16:27:03,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4097 4083 [WARNING|trainer.py:803] 2025-04-26 16:27:04,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:04,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:05,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4050 4098 4084 [WARNING|trainer.py:803] 2025-04-26 16:27:05,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:27:06,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:27:06,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4051 4099 4085 [WARNING|trainer.py:803] 2025-04-26 16:27:07,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:07,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4052 [WARNING|trainer.py:803] 2025-04-26 16:27:08,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4100 4086 [WARNING|trainer.py:803] 2025-04-26 16:27:08,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:09,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:09,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4053 4101 4087 [WARNING|trainer.py:803] 2025-04-26 16:27:10,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:10,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4054 [WARNING|trainer.py:803] 2025-04-26 16:27:11,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4102 [WARNING|trainer.py:803] 2025-04-26 16:27:11,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4088 [WARNING|trainer.py:803] 2025-04-26 16:27:12,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4055 [WARNING|trainer.py:803] 2025-04-26 16:27:12,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4103 4089 [WARNING|trainer.py:803] 2025-04-26 16:27:13,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:13,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4056 [WARNING|trainer.py:803] 2025-04-26 16:27:14,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4104 4090 [WARNING|trainer.py:803] 2025-04-26 16:27:15,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:15,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4057 [WARNING|trainer.py:803] 2025-04-26 16:27:15,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4105 4091 [WARNING|trainer.py:803] 2025-04-26 16:27:16,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:16,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:17,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4058 4106 4092 [WARNING|trainer.py:803] 2025-04-26 16:27:18,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:18,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4059 [WARNING|trainer.py:803] 2025-04-26 16:27:18,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4107 4093 [WARNING|trainer.py:803] 2025-04-26 16:27:19,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:19,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4108 [WARNING|trainer.py:803] 2025-04-26 16:27:20,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4060 4094 [WARNING|trainer.py:803] 2025-04-26 16:27:21,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:27:21,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4109 [WARNING|trainer.py:803] 2025-04-26 16:27:22,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4061 4095 [WARNING|trainer.py:803] 2025-04-26 16:27:22,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:22,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4110 4062 [WARNING|trainer.py:803] 2025-04-26 16:27:23,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:24,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:24,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4096 4111 4063 [WARNING|trainer.py:803] 2025-04-26 16:27:25,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:25,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:26,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4097 4112 4064 [WARNING|trainer.py:803] 2025-04-26 16:27:27,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:27,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:27,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4098 4113 4065 [WARNING|trainer.py:803] 2025-04-26 16:27:28,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:27:28,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:29,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4099 4114 4066 [WARNING|trainer.py:803] 2025-04-26 16:27:30,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:30,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:31,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4100 4115 4067 [WARNING|trainer.py:803] 2025-04-26 16:27:31,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:32,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:32,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4116 4101 4068 [WARNING|trainer.py:803] 2025-04-26 16:27:33,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:33,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4117 [WARNING|trainer.py:803] 2025-04-26 16:27:34,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4102 [WARNING|trainer.py:803] 2025-04-26 16:27:35,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4069 [WARNING|trainer.py:803] 2025-04-26 16:27:35,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4118 4103 [WARNING|trainer.py:803] 2025-04-26 16:27:35,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4070 [WARNING|trainer.py:803] 2025-04-26 16:27:36,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:36,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4119 [WARNING|trainer.py:803] 2025-04-26 16:27:37,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4104 4071 [WARNING|trainer.py:803] 2025-04-26 16:27:38,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:38,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:38,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4105 4120 4072 [WARNING|trainer.py:803] 2025-04-26 16:27:39,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:39,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:40,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4106 4121 4073 [WARNING|trainer.py:803] 2025-04-26 16:27:41,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:41,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:41,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4107 4122 4074 [WARNING|trainer.py:803] 2025-04-26 16:27:42,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:42,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:43,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4108 4123 4075 [WARNING|trainer.py:803] 2025-04-26 16:27:44,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:27:44,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:44,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4109 4124 4076 [WARNING|trainer.py:803] 2025-04-26 16:27:45,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:45,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:46,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4110 4125 4077 [WARNING|trainer.py:803] 2025-04-26 16:27:47,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:47,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:47,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4111 4126 4078 [WARNING|trainer.py:803] 2025-04-26 16:27:48,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:48,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:49,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4112 4127 4079 [WARNING|trainer.py:803] 2025-04-26 16:27:50,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:50,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4113 [WARNING|trainer.py:803] 2025-04-26 16:27:51,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4128 4080 [WARNING|trainer.py:803] 2025-04-26 16:27:51,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:51,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:52,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4114 4129 4081 [WARNING|trainer.py:803] 2025-04-26 16:27:53,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:53,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4115 4130 [WARNING|trainer.py:803] 2025-04-26 16:27:54,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4082 [WARNING|trainer.py:803] 2025-04-26 16:27:54,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:54,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4116 [WARNING|trainer.py:803] 2025-04-26 16:27:55,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4131 4083 [WARNING|trainer.py:803] 2025-04-26 16:27:56,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:56,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4117 4132 [WARNING|trainer.py:803] 2025-04-26 16:27:57,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:57,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4084 [WARNING|trainer.py:803] 2025-04-26 16:27:57,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4118 4133 [WARNING|trainer.py:803] 2025-04-26 16:27:58,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:27:59,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:27:59,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4085 4119 4134 [WARNING|trainer.py:803] 2025-04-26 16:28:00,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:00,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:00,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4086 4135 4120 [WARNING|trainer.py:803] 2025-04-26 16:28:01,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:02,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4087 [WARNING|trainer.py:803] 2025-04-26 16:28:02,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4136 4121 [WARNING|trainer.py:803] 2025-04-26 16:28:03,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:03,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:04,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4088 4137 4122 [WARNING|trainer.py:803] 2025-04-26 16:28:05,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:05,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:05,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4089 4138 4123 [WARNING|trainer.py:803] 2025-04-26 16:28:06,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:07,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:07,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4090 4139 4124 [WARNING|trainer.py:803] 2025-04-26 16:28:08,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:08,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:08,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4091 4140 4125 [WARNING|trainer.py:803] 2025-04-26 16:28:09,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:10,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:10,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4092 4141 4126 [WARNING|trainer.py:803] 2025-04-26 16:28:11,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:11,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:11,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4093 4142 4127 [WARNING|trainer.py:803] 2025-04-26 16:28:12,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:13,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:13,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4094 4143 4128 [WARNING|trainer.py:803] 2025-04-26 16:28:14,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:28:14,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:14,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4095 4144 4129 [WARNING|trainer.py:803] 2025-04-26 16:28:16,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:16,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:16,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4145 4096 4130 [WARNING|trainer.py:803] 2025-04-26 16:28:17,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:17,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:17,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4146 4097 4131 [WARNING|trainer.py:803] 2025-04-26 16:28:19,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:19,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:19,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4147 4132 4098 [WARNING|trainer.py:803] 2025-04-26 16:28:20,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:20,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:20,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4148 4133 4099 [WARNING|trainer.py:803] 2025-04-26 16:28:22,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:22,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:22,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4134 4149 4100 [WARNING|trainer.py:803] 2025-04-26 16:28:23,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:23,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:24,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4135 4150 4101 [WARNING|trainer.py:803] 2025-04-26 16:28:25,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:25,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:25,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4151 4136 4102 [WARNING|trainer.py:803] 2025-04-26 16:28:26,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:26,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:27,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4152 4137 4103 [WARNING|trainer.py:803] 2025-04-26 16:28:28,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:28,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:28,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4153 4138 4104 [WARNING|trainer.py:803] 2025-04-26 16:28:29,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:29,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:30,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4154 4139 4105 [WARNING|trainer.py:803] 2025-04-26 16:28:31,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:31,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:31,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4155 4140 4106 [WARNING|trainer.py:803] 2025-04-26 16:28:32,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:32,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:33,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4156 4141 4107 [WARNING|trainer.py:803] 2025-04-26 16:28:34,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:34,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:34,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4157 4142 4108 [WARNING|trainer.py:803] 2025-04-26 16:28:35,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:35,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:36,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4143 4158 4109 [WARNING|trainer.py:803] 2025-04-26 16:28:37,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:37,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:37,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4144 4159 4110 [WARNING|trainer.py:803] 2025-04-26 16:28:38,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:38,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:39,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4160 4145 4111 [WARNING|trainer.py:803] 2025-04-26 16:28:40,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:40,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:40,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4161 4146 4112 [WARNING|trainer.py:803] 2025-04-26 16:28:41,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:41,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:42,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4147 4162 4113 [WARNING|trainer.py:803] 2025-04-26 16:28:43,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:43,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:43,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4148 4163 4114 [WARNING|trainer.py:803] 2025-04-26 16:28:45,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:45,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:45,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4164 4149 4115 [WARNING|trainer.py:803] 2025-04-26 16:28:46,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:46,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:46,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4165 4150 4116 [WARNING|trainer.py:803] 2025-04-26 16:28:48,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:48,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:28:48,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4151 4166 4117 [WARNING|trainer.py:803] 2025-04-26 16:28:49,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:49,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:49,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4152 4167 4118 [WARNING|trainer.py:803] 2025-04-26 16:28:50,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:51,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:51,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4153 4168 4119 [WARNING|trainer.py:803] 2025-04-26 16:28:52,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:52,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:28:53,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4154 4169 [WARNING|trainer.py:803] 2025-04-26 16:28:53,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4120 [WARNING|trainer.py:803] 2025-04-26 16:28:54,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4155 [WARNING|trainer.py:803] 2025-04-26 16:28:54,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4170 4121 [WARNING|trainer.py:803] 2025-04-26 16:28:55,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:55,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4156 [WARNING|trainer.py:803] 2025-04-26 16:28:56,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4171 [WARNING|trainer.py:803] 2025-04-26 16:28:56,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4122 [WARNING|trainer.py:803] 2025-04-26 16:28:57,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4157 4172 [WARNING|trainer.py:803] 2025-04-26 16:28:57,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:28:58,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4123 [WARNING|trainer.py:803] 2025-04-26 16:28:58,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4158 4173 [WARNING|trainer.py:803] 2025-04-26 16:28:59,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4124 [WARNING|trainer.py:803] 2025-04-26 16:29:00,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:00,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4159 4174 [WARNING|trainer.py:803] 2025-04-26 16:29:00,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4125 [WARNING|trainer.py:803] 2025-04-26 16:29:01,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:01,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4160 [WARNING|trainer.py:803] 2025-04-26 16:29:02,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4175 [WARNING|trainer.py:803] 2025-04-26 16:29:03,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4126 [WARNING|trainer.py:803] 2025-04-26 16:29:03,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4161 [WARNING|trainer.py:803] 2025-04-26 16:29:03,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4176 [WARNING|trainer.py:803] 2025-04-26 16:29:04,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4127 [WARNING|trainer.py:803] 2025-04-26 16:29:04,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4162 4177 [WARNING|trainer.py:803] 2025-04-26 16:29:05,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:06,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4128 [WARNING|trainer.py:803] 2025-04-26 16:29:06,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4163 4178 [WARNING|trainer.py:803] 2025-04-26 16:29:06,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:07,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:07,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4129 4179 4164 [WARNING|trainer.py:803] 2025-04-26 16:29:08,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:09,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4130 [WARNING|trainer.py:803] 2025-04-26 16:29:09,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4180 4165 [WARNING|trainer.py:803] 2025-04-26 16:29:09,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:10,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4131 [WARNING|trainer.py:803] 2025-04-26 16:29:10,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4181 4166 [WARNING|trainer.py:803] 2025-04-26 16:29:11,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:12,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4132 [WARNING|trainer.py:803] 2025-04-26 16:29:12,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4182 4167 [WARNING|trainer.py:803] 2025-04-26 16:29:13,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:13,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4133 [WARNING|trainer.py:803] 2025-04-26 16:29:13,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4183 4168 [WARNING|trainer.py:803] 2025-04-26 16:29:14,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:15,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4134 [WARNING|trainer.py:803] 2025-04-26 16:29:15,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4184 4169 [WARNING|trainer.py:803] 2025-04-26 16:29:16,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:16,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4135 [WARNING|trainer.py:803] 2025-04-26 16:29:16,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4185 4170 [WARNING|trainer.py:803] 2025-04-26 16:29:17,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:18,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4136 [WARNING|trainer.py:803] 2025-04-26 16:29:18,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4186 4171 [WARNING|trainer.py:803] 2025-04-26 16:29:19,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:19,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:19,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4137 4172 4187 [WARNING|trainer.py:803] 2025-04-26 16:29:20,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4138 [WARNING|trainer.py:803] 2025-04-26 16:29:21,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:21,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4173 4188 [WARNING|trainer.py:803] 2025-04-26 16:29:22,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:22,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4139 [WARNING|trainer.py:803] 2025-04-26 16:29:22,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4174 4189 [WARNING|trainer.py:803] 2025-04-26 16:29:23,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4140 [WARNING|trainer.py:803] 2025-04-26 16:29:24,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:24,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4175 [WARNING|trainer.py:803] 2025-04-26 16:29:25,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4190 4141 [WARNING|trainer.py:803] 2025-04-26 16:29:25,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:26,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4176 [WARNING|trainer.py:803] 2025-04-26 16:29:26,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4191 4142 [WARNING|trainer.py:803] 2025-04-26 16:29:27,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:27,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4177 [WARNING|trainer.py:803] 2025-04-26 16:29:28,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4192 [WARNING|trainer.py:803] 2025-04-26 16:29:28,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4143 [WARNING|trainer.py:803] 2025-04-26 16:29:29,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4178 [WARNING|trainer.py:803] 2025-04-26 16:29:29,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4193 [WARNING|trainer.py:803] 2025-04-26 16:29:30,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4144 [WARNING|trainer.py:803] 2025-04-26 16:29:30,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4179 [WARNING|trainer.py:803] 2025-04-26 16:29:31,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4194 [WARNING|trainer.py:803] 2025-04-26 16:29:31,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4145 [WARNING|trainer.py:803] 2025-04-26 16:29:32,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4180 [WARNING|trainer.py:803] 2025-04-26 16:29:32,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4195 [WARNING|trainer.py:803] 2025-04-26 16:29:33,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4146 [WARNING|trainer.py:803] 2025-04-26 16:29:33,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4181 4196 [WARNING|trainer.py:803] 2025-04-26 16:29:34,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:34,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4147 [WARNING|trainer.py:803] 2025-04-26 16:29:35,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4182 4197 [WARNING|trainer.py:803] 2025-04-26 16:29:35,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:36,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4148 [WARNING|trainer.py:803] 2025-04-26 16:29:36,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4183 4198 [WARNING|trainer.py:803] 2025-04-26 16:29:37,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:37,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4149 [WARNING|trainer.py:803] 2025-04-26 16:29:38,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4184 4199 [WARNING|trainer.py:803] 2025-04-26 16:29:38,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:39,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4150 [WARNING|trainer.py:803] 2025-04-26 16:29:39,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4185 4200 [WARNING|trainer.py:803] 2025-04-26 16:29:40,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:40,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4151 [WARNING|trainer.py:803] 2025-04-26 16:29:41,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4186 4201 [WARNING|trainer.py:803] 2025-04-26 16:29:41,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:42,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:42,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4152 4202 4187 [WARNING|trainer.py:803] 2025-04-26 16:29:43,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:43,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:43,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4153 4203 4188 [WARNING|trainer.py:803] 2025-04-26 16:29:44,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:44,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:45,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4204 4154 4189 [WARNING|trainer.py:803] 2025-04-26 16:29:46,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:29:46,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4205 [WARNING|trainer.py:803] 2025-04-26 16:29:46,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4155 [WARNING|trainer.py:803] 2025-04-26 16:29:47,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4190 [WARNING|trainer.py:803] 2025-04-26 16:29:47,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4206 4156 [WARNING|trainer.py:803] 2025-04-26 16:29:48,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:48,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4207 4191 [WARNING|trainer.py:803] 2025-04-26 16:29:49,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4157 [WARNING|trainer.py:803] 2025-04-26 16:29:49,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:50,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4208 4192 [WARNING|trainer.py:803] 2025-04-26 16:29:50,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:51,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:29:51,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4158 4209 4193 [WARNING|trainer.py:803] 2025-04-26 16:29:52,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:52,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4210 [WARNING|trainer.py:803] 2025-04-26 16:29:53,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4159 [WARNING|trainer.py:803] 2025-04-26 16:29:53,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4194 [WARNING|trainer.py:803] 2025-04-26 16:29:53,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4211 4160 [WARNING|trainer.py:803] 2025-04-26 16:29:54,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:54,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4195 [WARNING|trainer.py:803] 2025-04-26 16:29:55,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4212 4161 [WARNING|trainer.py:803] 2025-04-26 16:29:56,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:56,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4213 4196 [WARNING|trainer.py:803] 2025-04-26 16:29:56,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:57,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4162 [WARNING|trainer.py:803] 2025-04-26 16:29:57,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4214 4197 [WARNING|trainer.py:803] 2025-04-26 16:29:58,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:29:58,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4215 4163 [WARNING|trainer.py:803] 2025-04-26 16:29:59,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4198 [WARNING|trainer.py:803] 2025-04-26 16:29:59,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:29:59,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4216 4164 [WARNING|trainer.py:803] 2025-04-26 16:30:00,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:01,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4199 [WARNING|trainer.py:803] 2025-04-26 16:30:01,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4217 4165 [WARNING|trainer.py:803] 2025-04-26 16:30:02,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:02,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4218 4200 [WARNING|trainer.py:803] 2025-04-26 16:30:02,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4166 [WARNING|trainer.py:803] 2025-04-26 16:30:03,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:03,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4219 4201 [WARNING|trainer.py:803] 2025-04-26 16:30:04,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:04,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:04,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4167 4220 4202 [WARNING|trainer.py:803] 2025-04-26 16:30:05,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:06,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:06,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4221 4203 4168 [WARNING|trainer.py:803] 2025-04-26 16:30:07,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:07,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:30:07,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4222 4204 4169 [WARNING|trainer.py:803] 2025-04-26 16:30:08,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:08,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4223 [WARNING|trainer.py:803] 2025-04-26 16:30:09,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4205 [WARNING|trainer.py:803] 2025-04-26 16:30:09,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4170 [WARNING|trainer.py:803] 2025-04-26 16:30:10,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4224 4206 [WARNING|trainer.py:803] 2025-04-26 16:30:10,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:11,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4171 [WARNING|trainer.py:803] 2025-04-26 16:30:11,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4225 4207 [WARNING|trainer.py:803] 2025-04-26 16:30:12,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:12,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:30:12,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4226 4172 4208 [WARNING|trainer.py:803] 2025-04-26 16:30:13,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:13,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:13,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4227 4209 4173 [WARNING|trainer.py:803] 2025-04-26 16:30:14,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:15,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4228 [WARNING|trainer.py:803] 2025-04-26 16:30:15,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4210 4174 [WARNING|trainer.py:803] 2025-04-26 16:30:15,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4229 [WARNING|trainer.py:803] 2025-04-26 16:30:16,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4211 [WARNING|trainer.py:803] 2025-04-26 16:30:16,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:17,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4175 4230 [WARNING|trainer.py:803] 2025-04-26 16:30:17,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4212 [WARNING|trainer.py:803] 2025-04-26 16:30:18,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:18,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4231 [WARNING|trainer.py:803] 2025-04-26 16:30:18,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4176 4213 [WARNING|trainer.py:803] 2025-04-26 16:30:19,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:19,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4232 [WARNING|trainer.py:803] 2025-04-26 16:30:20,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4177 4214 [WARNING|trainer.py:803] 2025-04-26 16:30:20,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4233 [WARNING|trainer.py:803] 2025-04-26 16:30:21,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:21,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4215 4178 [WARNING|trainer.py:803] 2025-04-26 16:30:22,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4234 [WARNING|trainer.py:803] 2025-04-26 16:30:22,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:22,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4216 [WARNING|trainer.py:803] 2025-04-26 16:30:23,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4179 4235 [WARNING|trainer.py:803] 2025-04-26 16:30:23,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:24,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4217 [WARNING|trainer.py:803] 2025-04-26 16:30:24,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4180 4236 [WARNING|trainer.py:803] 2025-04-26 16:30:25,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4218 [WARNING|trainer.py:803] 2025-04-26 16:30:25,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:25,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4237 [WARNING|trainer.py:803] 2025-04-26 16:30:26,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4181 4219 [WARNING|trainer.py:803] 2025-04-26 16:30:27,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:27,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4238 [WARNING|trainer.py:803] 2025-04-26 16:30:27,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4182 4220 [WARNING|trainer.py:803] 2025-04-26 16:30:28,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4239 [WARNING|trainer.py:803] 2025-04-26 16:30:28,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:30:28,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4221 4183 [WARNING|trainer.py:803] 2025-04-26 16:30:29,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4240 [WARNING|trainer.py:803] 2025-04-26 16:30:30,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:30,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4222 [WARNING|trainer.py:803] 2025-04-26 16:30:30,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4184 4241 [WARNING|trainer.py:803] 2025-04-26 16:30:31,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4223 [WARNING|trainer.py:803] 2025-04-26 16:30:31,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:31,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4242 4185 [WARNING|trainer.py:803] 2025-04-26 16:30:32,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4224 [WARNING|trainer.py:803] 2025-04-26 16:30:33,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:33,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4243 [WARNING|trainer.py:803] 2025-04-26 16:30:33,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4186 4225 [WARNING|trainer.py:803] 2025-04-26 16:30:34,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4244 [WARNING|trainer.py:803] 2025-04-26 16:30:35,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:35,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4226 [WARNING|trainer.py:803] 2025-04-26 16:30:35,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4187 4245 [WARNING|trainer.py:803] 2025-04-26 16:30:36,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:36,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4227 [WARNING|trainer.py:803] 2025-04-26 16:30:36,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4188 4246 [WARNING|trainer.py:803] 2025-04-26 16:30:37,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4228 [WARNING|trainer.py:803] 2025-04-26 16:30:38,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:38,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4247 4189 [WARNING|trainer.py:803] 2025-04-26 16:30:38,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4229 [WARNING|trainer.py:803] 2025-04-26 16:30:39,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:39,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4248 [WARNING|trainer.py:803] 2025-04-26 16:30:40,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4230 4190 [WARNING|trainer.py:803] 2025-04-26 16:30:40,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4249 [WARNING|trainer.py:803] 2025-04-26 16:30:41,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:41,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4231 [WARNING|trainer.py:803] 2025-04-26 16:30:41,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4191 4250 [WARNING|trainer.py:803] 2025-04-26 16:30:42,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:42,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4232 [WARNING|trainer.py:803] 2025-04-26 16:30:43,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4192 4251 [WARNING|trainer.py:803] 2025-04-26 16:30:43,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4233 [WARNING|trainer.py:803] 2025-04-26 16:30:44,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:44,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4252 4193 [WARNING|trainer.py:803] 2025-04-26 16:30:45,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4234 [WARNING|trainer.py:803] 2025-04-26 16:30:45,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:45,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4253 [WARNING|trainer.py:803] 2025-04-26 16:30:46,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4194 4235 [WARNING|trainer.py:803] 2025-04-26 16:30:46,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4254 [WARNING|trainer.py:803] 2025-04-26 16:30:47,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:47,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4236 4195 [WARNING|trainer.py:803] 2025-04-26 16:30:48,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4255 [WARNING|trainer.py:803] 2025-04-26 16:30:48,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:48,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4237 [WARNING|trainer.py:803] 2025-04-26 16:30:49,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4196 4256 [WARNING|trainer.py:803] 2025-04-26 16:30:50,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:50,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4238 [WARNING|trainer.py:803] 2025-04-26 16:30:50,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4257 4197 [WARNING|trainer.py:803] 2025-04-26 16:30:51,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4239 [WARNING|trainer.py:803] 2025-04-26 16:30:51,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:52,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4258 [WARNING|trainer.py:803] 2025-04-26 16:30:52,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4198 4240 [WARNING|trainer.py:803] 2025-04-26 16:30:53,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4259 [WARNING|trainer.py:803] 2025-04-26 16:30:53,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:53,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:54,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4199 4241 4260 [WARNING|trainer.py:803] 2025-04-26 16:30:55,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:55,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:55,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4242 4200 4261 [WARNING|trainer.py:803] 2025-04-26 16:30:56,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:56,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:56,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4243 4201 4262 [WARNING|trainer.py:803] 2025-04-26 16:30:57,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:57,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:57,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4244 4202 4263 [WARNING|trainer.py:803] 2025-04-26 16:30:58,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:30:59,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:30:59,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4245 4203 4264 [WARNING|trainer.py:803] 2025-04-26 16:31:00,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:00,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:00,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4246 4204 4265 [WARNING|trainer.py:803] 2025-04-26 16:31:01,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:01,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:31:01,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4247 4205 4266 [WARNING|trainer.py:803] 2025-04-26 16:31:02,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:02,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:02,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4248 4206 4267 [WARNING|trainer.py:803] 2025-04-26 16:31:03,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:04,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:04,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4249 4207 4268 [WARNING|trainer.py:803] 2025-04-26 16:31:05,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:05,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:05,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4250 4208 4269 [WARNING|trainer.py:803] 2025-04-26 16:31:06,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:06,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:31:06,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4251 4209 4270 [WARNING|trainer.py:803] 2025-04-26 16:31:07,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:07,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:07,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4252 4210 4271 [WARNING|trainer.py:803] 2025-04-26 16:31:08,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:09,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:09,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4253 4211 4272 [WARNING|trainer.py:803] 2025-04-26 16:31:10,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:10,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:10,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4254 4212 4273 [WARNING|trainer.py:803] 2025-04-26 16:31:11,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:11,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:11,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4255 4213 4274 [WARNING|trainer.py:803] 2025-04-26 16:31:12,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:12,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:12,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4256 4214 4275 [WARNING|trainer.py:803] 2025-04-26 16:31:13,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:14,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:14,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4257 4215 4276 [WARNING|trainer.py:803] 2025-04-26 16:31:15,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:15,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:15,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4258 4216 4277 [WARNING|trainer.py:803] 2025-04-26 16:31:16,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:16,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:16,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4259 4278 4217 [WARNING|trainer.py:803] 2025-04-26 16:31:17,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:17,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:17,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4260 4279 4218 [WARNING|trainer.py:803] 2025-04-26 16:31:18,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:19,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:19,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4261 4280 4219 [WARNING|trainer.py:803] 2025-04-26 16:31:20,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:20,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:20,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4262 4281 4220 [WARNING|trainer.py:803] 2025-04-26 16:31:21,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:21,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:21,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4263 4282 4221 [WARNING|trainer.py:803] 2025-04-26 16:31:22,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:22,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:22,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4264 4283 4222 [WARNING|trainer.py:803] 2025-04-26 16:31:23,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:24,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:24,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4265 4284 4223 [WARNING|trainer.py:803] 2025-04-26 16:31:24,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:25,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:25,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4266 4285 4224 [WARNING|trainer.py:803] 2025-04-26 16:31:26,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:26,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:26,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4267 4286 4225 [WARNING|trainer.py:803] 2025-04-26 16:31:27,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:27,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:31:27,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4268 4287 4226 [WARNING|trainer.py:803] 2025-04-26 16:31:28,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:28,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:29,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4269 4288 4227 [WARNING|trainer.py:803] 2025-04-26 16:31:29,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:30,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:31:30,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4270 4289 4228 [WARNING|trainer.py:803] 2025-04-26 16:31:31,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:31,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:31,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4271 4290 4229 [WARNING|trainer.py:803] 2025-04-26 16:31:32,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:32,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:32,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4272 4291 4230 [WARNING|trainer.py:803] 2025-04-26 16:31:33,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:33,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:34,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4273 4292 4231 [WARNING|trainer.py:803] 2025-04-26 16:31:34,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:35,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:35,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4274 4293 4232 [WARNING|trainer.py:803] 2025-04-26 16:31:36,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:36,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:36,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4275 4294 4233 [WARNING|trainer.py:803] 2025-04-26 16:31:37,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:37,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:37,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4276 4295 4234 [WARNING|trainer.py:803] 2025-04-26 16:31:38,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:38,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4277 [WARNING|trainer.py:803] 2025-04-26 16:31:39,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4296 4235 [WARNING|trainer.py:803] 2025-04-26 16:31:39,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:40,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4278 [WARNING|trainer.py:803] 2025-04-26 16:31:40,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4297 4236 [WARNING|trainer.py:803] 2025-04-26 16:31:41,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:41,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4279 [WARNING|trainer.py:803] 2025-04-26 16:31:41,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4298 4237 [WARNING|trainer.py:803] 2025-04-26 16:31:42,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:42,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4280 [WARNING|trainer.py:803] 2025-04-26 16:31:42,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4299 4238 [WARNING|trainer.py:803] 2025-04-26 16:31:43,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:43,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:44,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4281 4300 4239 [WARNING|trainer.py:803] 2025-04-26 16:31:44,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:45,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:45,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4282 4301 4240 [WARNING|trainer.py:803] 2025-04-26 16:31:46,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:46,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:46,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4283 4302 4241 [WARNING|trainer.py:803] 2025-04-26 16:31:47,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:47,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:47,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4284 4303 4242 [WARNING|trainer.py:803] 2025-04-26 16:31:48,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:48,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:48,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4285 4304 4243 [WARNING|trainer.py:803] 2025-04-26 16:31:49,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:50,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:50,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4286 4305 4244 [WARNING|trainer.py:803] 2025-04-26 16:31:51,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:31:51,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:51,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4287 4306 4245 [WARNING|trainer.py:803] 2025-04-26 16:31:52,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:31:52,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4288 [WARNING|trainer.py:803] 2025-04-26 16:31:52,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4307 4246 [WARNING|trainer.py:803] 2025-04-26 16:31:53,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:31:53,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4289 [WARNING|trainer.py:803] 2025-04-26 16:31:53,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4308 4247 [WARNING|trainer.py:803] 2025-04-26 16:31:54,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:55,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4290 [WARNING|trainer.py:803] 2025-04-26 16:31:55,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4309 4248 [WARNING|trainer.py:803] 2025-04-26 16:31:55,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:56,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4291 [WARNING|trainer.py:803] 2025-04-26 16:31:56,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4310 4249 [WARNING|trainer.py:803] 2025-04-26 16:31:57,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:57,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4292 [WARNING|trainer.py:803] 2025-04-26 16:31:57,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4311 4250 [WARNING|trainer.py:803] 2025-04-26 16:31:58,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:31:58,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4293 [WARNING|trainer.py:803] 2025-04-26 16:31:58,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4312 4251 [WARNING|trainer.py:803] 2025-04-26 16:31:59,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:00,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4294 [WARNING|trainer.py:803] 2025-04-26 16:32:00,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4313 4252 [WARNING|trainer.py:803] 2025-04-26 16:32:00,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:01,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4295 [WARNING|trainer.py:803] 2025-04-26 16:32:01,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4314 4253 [WARNING|trainer.py:803] 2025-04-26 16:32:02,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4296 [WARNING|trainer.py:803] 2025-04-26 16:32:02,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:02,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4315 4254 [WARNING|trainer.py:803] 2025-04-26 16:32:03,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:03,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4297 [WARNING|trainer.py:803] 2025-04-26 16:32:03,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4316 4255 [WARNING|trainer.py:803] 2025-04-26 16:32:04,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:32:04,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4298 [WARNING|trainer.py:803] 2025-04-26 16:32:05,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4317 4256 [WARNING|trainer.py:803] 2025-04-26 16:32:05,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4299 [WARNING|trainer.py:803] 2025-04-26 16:32:06,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:06,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4318 4257 [WARNING|trainer.py:803] 2025-04-26 16:32:07,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4300 [WARNING|trainer.py:803] 2025-04-26 16:32:07,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:07,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4319 4258 [WARNING|trainer.py:803] 2025-04-26 16:32:08,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4301 [WARNING|trainer.py:803] 2025-04-26 16:32:08,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:08,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4320 4259 [WARNING|trainer.py:803] 2025-04-26 16:32:09,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4302 [WARNING|trainer.py:803] 2025-04-26 16:32:10,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:10,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4260 4321 [WARNING|trainer.py:803] 2025-04-26 16:32:10,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4303 [WARNING|trainer.py:803] 2025-04-26 16:32:11,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:11,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4261 4322 [WARNING|trainer.py:803] 2025-04-26 16:32:12,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:12,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4304 [WARNING|trainer.py:803] 2025-04-26 16:32:12,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4262 4323 [WARNING|trainer.py:803] 2025-04-26 16:32:13,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4305 [WARNING|trainer.py:803] 2025-04-26 16:32:13,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:13,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4263 4324 [WARNING|trainer.py:803] 2025-04-26 16:32:14,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:14,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:15,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4306 4264 4325 [WARNING|trainer.py:803] 2025-04-26 16:32:15,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:16,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:16,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4307 4265 4326 [WARNING|trainer.py:803] 2025-04-26 16:32:17,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:17,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:17,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4308 4266 4327 [WARNING|trainer.py:803] 2025-04-26 16:32:18,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:18,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:18,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4309 4267 4328 [WARNING|trainer.py:803] 2025-04-26 16:32:19,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:32:19,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:20,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4310 4268 4329 [WARNING|trainer.py:803] 2025-04-26 16:32:20,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:21,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:21,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4311 4269 4330 [WARNING|trainer.py:803] 2025-04-26 16:32:22,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:22,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:22,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4312 4270 4331 [WARNING|trainer.py:803] 2025-04-26 16:32:23,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:23,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:23,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4313 4271 4332 [WARNING|trainer.py:803] 2025-04-26 16:32:24,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:24,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:24,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4314 4272 4333 [WARNING|trainer.py:803] 2025-04-26 16:32:25,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:26,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:26,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4315 4273 4334 [WARNING|trainer.py:803] 2025-04-26 16:32:27,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:27,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:27,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4316 4274 4335 [WARNING|trainer.py:803] 2025-04-26 16:32:28,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:28,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:28,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4317 4275 4336 [WARNING|trainer.py:803] 2025-04-26 16:32:29,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:29,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:29,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4318 4276 4337 [WARNING|trainer.py:803] 2025-04-26 16:32:30,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:31,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:31,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4319 4338 4277 [WARNING|trainer.py:803] 2025-04-26 16:32:32,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:32,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:32:32,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4320 4339 4278 [WARNING|trainer.py:803] 2025-04-26 16:32:33,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:33,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:33,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4321 4340 4279 [WARNING|trainer.py:803] 2025-04-26 16:32:34,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:34,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:34,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4322 4341 4280 [WARNING|trainer.py:803] 2025-04-26 16:32:35,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:36,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:36,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4323 4342 4281 [WARNING|trainer.py:803] 2025-04-26 16:32:37,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:37,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:37,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4324 4343 4282 [WARNING|trainer.py:803] 2025-04-26 16:32:38,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:38,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:38,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4325 4344 4283 [WARNING|trainer.py:803] 2025-04-26 16:32:39,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:39,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:39,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4326 4345 4284 [WARNING|trainer.py:803] 2025-04-26 16:32:40,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:32:41,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:32:41,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4327 4346 4285 [WARNING|trainer.py:803] 2025-04-26 16:32:42,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:42,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:42,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4328 4347 4286 [WARNING|trainer.py:803] 2025-04-26 16:32:43,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:43,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:43,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4329 4348 4287 [WARNING|trainer.py:803] 2025-04-26 16:32:44,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:44,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4330 [WARNING|trainer.py:803] 2025-04-26 16:32:45,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4288 4349 [WARNING|trainer.py:803] 2025-04-26 16:32:45,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:46,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:32:46,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4331 4289 4350 [WARNING|trainer.py:803] 2025-04-26 16:32:47,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:47,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:47,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4332 4351 4290 [WARNING|trainer.py:803] 2025-04-26 16:32:48,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:32:48,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:48,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4333 4352 4291 [WARNING|trainer.py:803] 2025-04-26 16:32:49,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:49,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:49,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4334 4353 4292 [WARNING|trainer.py:803] 2025-04-26 16:32:50,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:51,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:51,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4335 4354 4293 [WARNING|trainer.py:803] 2025-04-26 16:32:52,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:52,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:52,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4336 4355 4294 [WARNING|trainer.py:803] 2025-04-26 16:32:53,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:53,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:53,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4337 4356 4295 [WARNING|trainer.py:803] 2025-04-26 16:32:54,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:54,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:54,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4338 4357 4296 [WARNING|trainer.py:803] 2025-04-26 16:32:55,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:55,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:32:56,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4358 4339 4297 [WARNING|trainer.py:803] 2025-04-26 16:32:56,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:32:57,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4359 [WARNING|trainer.py:803] 2025-04-26 16:32:57,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4340 4298 [WARNING|trainer.py:803] 2025-04-26 16:32:58,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:58,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4360 [WARNING|trainer.py:803] 2025-04-26 16:32:58,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4341 4299 [WARNING|trainer.py:803] 2025-04-26 16:32:59,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:32:59,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4361 [WARNING|trainer.py:803] 2025-04-26 16:32:59,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4342 4300 [WARNING|trainer.py:803] 2025-04-26 16:33:00,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:00,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4362 [WARNING|trainer.py:803] 2025-04-26 16:33:01,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4343 4301 [WARNING|trainer.py:803] 2025-04-26 16:33:01,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:02,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4363 [WARNING|trainer.py:803] 2025-04-26 16:33:02,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4344 4302 [WARNING|trainer.py:803] 2025-04-26 16:33:02,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4364 [WARNING|trainer.py:803] 2025-04-26 16:33:03,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:03,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4345 4303 [WARNING|trainer.py:803] 2025-04-26 16:33:04,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4365 [WARNING|trainer.py:803] 2025-04-26 16:33:04,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:33:04,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4346 4304 [WARNING|trainer.py:803] 2025-04-26 16:33:05,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4366 [WARNING|trainer.py:803] 2025-04-26 16:33:05,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:06,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4347 [WARNING|trainer.py:803] 2025-04-26 16:33:06,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4305 4367 [WARNING|trainer.py:803] 2025-04-26 16:33:07,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:07,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4348 [WARNING|trainer.py:803] 2025-04-26 16:33:07,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4306 4368 [WARNING|trainer.py:803] 2025-04-26 16:33:08,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4349 [WARNING|trainer.py:803] 2025-04-26 16:33:08,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:08,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4307 4369 [WARNING|trainer.py:803] 2025-04-26 16:33:09,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4350 [WARNING|trainer.py:803] 2025-04-26 16:33:09,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:10,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4308 4370 [WARNING|trainer.py:803] 2025-04-26 16:33:10,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4351 [WARNING|trainer.py:803] 2025-04-26 16:33:11,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:11,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4309 4371 [WARNING|trainer.py:803] 2025-04-26 16:33:11,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4352 [WARNING|trainer.py:803] 2025-04-26 16:33:12,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:33:12,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4372 4310 [WARNING|trainer.py:803] 2025-04-26 16:33:13,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4353 [WARNING|trainer.py:803] 2025-04-26 16:33:13,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:13,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4373 4311 [WARNING|trainer.py:803] 2025-04-26 16:33:14,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4354 [WARNING|trainer.py:803] 2025-04-26 16:33:14,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:14,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4374 4312 [WARNING|trainer.py:803] 2025-04-26 16:33:15,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4355 [WARNING|trainer.py:803] 2025-04-26 16:33:15,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:16,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4375 4313 [WARNING|trainer.py:803] 2025-04-26 16:33:16,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4356 [WARNING|trainer.py:803] 2025-04-26 16:33:17,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:17,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4376 [WARNING|trainer.py:803] 2025-04-26 16:33:17,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4314 4357 [WARNING|trainer.py:803] 2025-04-26 16:33:18,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:18,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4377 [WARNING|trainer.py:803] 2025-04-26 16:33:18,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4315 4358 [WARNING|trainer.py:803] 2025-04-26 16:33:19,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:19,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4378 [WARNING|trainer.py:803] 2025-04-26 16:33:20,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4316 4359 [WARNING|trainer.py:803] 2025-04-26 16:33:20,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4379 [WARNING|trainer.py:803] 2025-04-26 16:33:21,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:21,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4317 4360 [WARNING|trainer.py:803] 2025-04-26 16:33:21,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4380 [WARNING|trainer.py:803] 2025-04-26 16:33:22,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:22,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4318 4361 [WARNING|trainer.py:803] 2025-04-26 16:33:23,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4381 [WARNING|trainer.py:803] 2025-04-26 16:33:23,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:23,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4319 4362 [WARNING|trainer.py:803] 2025-04-26 16:33:24,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4382 [WARNING|trainer.py:803] 2025-04-26 16:33:24,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:24,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4320 4363 [WARNING|trainer.py:803] 2025-04-26 16:33:25,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4383 [WARNING|trainer.py:803] 2025-04-26 16:33:26,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:26,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4364 [WARNING|trainer.py:803] 2025-04-26 16:33:26,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4321 4384 [WARNING|trainer.py:803] 2025-04-26 16:33:27,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:27,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4365 [WARNING|trainer.py:803] 2025-04-26 16:33:27,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4322 4385 [WARNING|trainer.py:803] 2025-04-26 16:33:28,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:28,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4366 [WARNING|trainer.py:803] 2025-04-26 16:33:28,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4323 4386 [WARNING|trainer.py:803] 2025-04-26 16:33:29,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:29,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:30,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4367 4324 4387 [WARNING|trainer.py:803] 2025-04-26 16:33:30,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:31,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:31,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4368 4325 4388 [WARNING|trainer.py:803] 2025-04-26 16:33:32,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:33:32,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:32,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4369 4389 4326 [WARNING|trainer.py:803] 2025-04-26 16:33:33,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:33,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4370 [WARNING|trainer.py:803] 2025-04-26 16:33:33,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4390 4327 [WARNING|trainer.py:803] 2025-04-26 16:33:34,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4371 [WARNING|trainer.py:803] 2025-04-26 16:33:34,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:35,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4391 4328 [WARNING|trainer.py:803] 2025-04-26 16:33:35,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4372 [WARNING|trainer.py:803] 2025-04-26 16:33:36,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:36,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4392 4329 [WARNING|trainer.py:803] 2025-04-26 16:33:36,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4373 [WARNING|trainer.py:803] 2025-04-26 16:33:37,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:37,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4393 4330 [WARNING|trainer.py:803] 2025-04-26 16:33:38,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4374 [WARNING|trainer.py:803] 2025-04-26 16:33:38,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:38,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4394 4331 [WARNING|trainer.py:803] 2025-04-26 16:33:39,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4375 [WARNING|trainer.py:803] 2025-04-26 16:33:39,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:39,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4395 4332 [WARNING|trainer.py:803] 2025-04-26 16:33:40,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4376 [WARNING|trainer.py:803] 2025-04-26 16:33:40,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:41,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4396 4333 [WARNING|trainer.py:803] 2025-04-26 16:33:41,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4377 [WARNING|trainer.py:803] 2025-04-26 16:33:42,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4397 [WARNING|trainer.py:803] 2025-04-26 16:33:42,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:42,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4334 4378 [WARNING|trainer.py:803] 2025-04-26 16:33:43,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4398 [WARNING|trainer.py:803] 2025-04-26 16:33:43,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:43,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4335 4379 [WARNING|trainer.py:803] 2025-04-26 16:33:44,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4399 [WARNING|trainer.py:803] 2025-04-26 16:33:44,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:45,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4336 4380 [WARNING|trainer.py:803] 2025-04-26 16:33:45,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4400 [WARNING|trainer.py:803] 2025-04-26 16:33:46,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:46,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4337 4381 [WARNING|trainer.py:803] 2025-04-26 16:33:46,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4401 [WARNING|trainer.py:803] 2025-04-26 16:33:47,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:47,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4338 4382 [WARNING|trainer.py:803] 2025-04-26 16:33:47,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4402 [WARNING|trainer.py:803] 2025-04-26 16:33:48,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:33:48,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4339 4383 [WARNING|trainer.py:803] 2025-04-26 16:33:49,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4403 [WARNING|trainer.py:803] 2025-04-26 16:33:49,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:49,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4340 4384 [WARNING|trainer.py:803] 2025-04-26 16:33:50,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4404 [WARNING|trainer.py:803] 2025-04-26 16:33:51,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:51,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4341 4385 [WARNING|trainer.py:803] 2025-04-26 16:33:51,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4405 [WARNING|trainer.py:803] 2025-04-26 16:33:52,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:52,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4386 [WARNING|trainer.py:803] 2025-04-26 16:33:52,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4342 4406 [WARNING|trainer.py:803] 2025-04-26 16:33:53,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:53,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4387 [WARNING|trainer.py:803] 2025-04-26 16:33:53,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4343 4407 [WARNING|trainer.py:803] 2025-04-26 16:33:54,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:54,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4388 [WARNING|trainer.py:803] 2025-04-26 16:33:55,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4344 4408 [WARNING|trainer.py:803] 2025-04-26 16:33:55,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:56,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4389 [WARNING|trainer.py:803] 2025-04-26 16:33:56,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4345 4409 [WARNING|trainer.py:803] 2025-04-26 16:33:57,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:57,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4390 [WARNING|trainer.py:803] 2025-04-26 16:33:57,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4346 4410 [WARNING|trainer.py:803] 2025-04-26 16:33:58,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:33:58,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4391 [WARNING|trainer.py:803] 2025-04-26 16:33:58,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4347 4411 [WARNING|trainer.py:803] 2025-04-26 16:33:59,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:33:59,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4392 [WARNING|trainer.py:803] 2025-04-26 16:33:59,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4412 4348 [WARNING|trainer.py:803] 2025-04-26 16:34:00,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:01,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4393 [WARNING|trainer.py:803] 2025-04-26 16:34:01,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4413 4349 [WARNING|trainer.py:803] 2025-04-26 16:34:01,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:02,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4394 [WARNING|trainer.py:803] 2025-04-26 16:34:02,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4414 4350 [WARNING|trainer.py:803] 2025-04-26 16:34:03,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:03,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4395 [WARNING|trainer.py:803] 2025-04-26 16:34:03,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4415 4351 [WARNING|trainer.py:803] 2025-04-26 16:34:04,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:04,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4396 [WARNING|trainer.py:803] 2025-04-26 16:34:04,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4416 4352 [WARNING|trainer.py:803] 2025-04-26 16:34:05,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:05,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4397 [WARNING|trainer.py:803] 2025-04-26 16:34:05,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4417 4353 [WARNING|trainer.py:803] 2025-04-26 16:34:06,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:06,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4398 [WARNING|trainer.py:803] 2025-04-26 16:34:07,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4418 4354 [WARNING|trainer.py:803] 2025-04-26 16:34:07,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:34:08,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4399 [WARNING|trainer.py:803] 2025-04-26 16:34:08,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4419 4355 [WARNING|trainer.py:803] 2025-04-26 16:34:09,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:09,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4400 [WARNING|trainer.py:803] 2025-04-26 16:34:09,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4420 4356 [WARNING|trainer.py:803] 2025-04-26 16:34:10,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:10,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4401 [WARNING|trainer.py:803] 2025-04-26 16:34:10,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4421 4357 [WARNING|trainer.py:803] 2025-04-26 16:34:11,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:11,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4402 [WARNING|trainer.py:803] 2025-04-26 16:34:11,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4422 4358 [WARNING|trainer.py:803] 2025-04-26 16:34:12,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:12,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4403 [WARNING|trainer.py:803] 2025-04-26 16:34:13,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4423 4359 [WARNING|trainer.py:803] 2025-04-26 16:34:13,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:14,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4404 [WARNING|trainer.py:803] 2025-04-26 16:34:14,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4424 4360 [WARNING|trainer.py:803] 2025-04-26 16:34:15,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:15,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4405 [WARNING|trainer.py:803] 2025-04-26 16:34:15,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4425 4361 [WARNING|trainer.py:803] 2025-04-26 16:34:16,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:16,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4406 [WARNING|trainer.py:803] 2025-04-26 16:34:16,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4426 4362 [WARNING|trainer.py:803] 2025-04-26 16:34:17,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:17,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4407 [WARNING|trainer.py:803] 2025-04-26 16:34:17,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4427 4363 [WARNING|trainer.py:803] 2025-04-26 16:34:18,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:18,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4408 [WARNING|trainer.py:803] 2025-04-26 16:34:19,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4428 4364 [WARNING|trainer.py:803] 2025-04-26 16:34:19,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:20,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4409 [WARNING|trainer.py:803] 2025-04-26 16:34:20,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4429 4365 [WARNING|trainer.py:803] 2025-04-26 16:34:21,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:21,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4410 [WARNING|trainer.py:803] 2025-04-26 16:34:21,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4430 4366 [WARNING|trainer.py:803] 2025-04-26 16:34:22,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:22,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4411 [WARNING|trainer.py:803] 2025-04-26 16:34:22,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4431 4367 [WARNING|trainer.py:803] 2025-04-26 16:34:23,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:23,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4412 [WARNING|trainer.py:803] 2025-04-26 16:34:23,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4432 4368 [WARNING|trainer.py:803] 2025-04-26 16:34:24,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:24,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4413 [WARNING|trainer.py:803] 2025-04-26 16:34:25,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4433 4369 [WARNING|trainer.py:803] 2025-04-26 16:34:25,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:26,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4414 [WARNING|trainer.py:803] 2025-04-26 16:34:26,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4434 4370 [WARNING|trainer.py:803] 2025-04-26 16:34:27,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:27,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4415 [WARNING|trainer.py:803] 2025-04-26 16:34:27,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4435 4371 [WARNING|trainer.py:803] 2025-04-26 16:34:28,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:28,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4416 [WARNING|trainer.py:803] 2025-04-26 16:34:28,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4436 4372 [WARNING|trainer.py:803] 2025-04-26 16:34:29,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:29,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4417 [WARNING|trainer.py:803] 2025-04-26 16:34:29,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4437 4373 [WARNING|trainer.py:803] 2025-04-26 16:34:30,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:30,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4418 [WARNING|trainer.py:803] 2025-04-26 16:34:31,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4438 4374 [WARNING|trainer.py:803] 2025-04-26 16:34:31,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:32,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4419 [WARNING|trainer.py:803] 2025-04-26 16:34:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4439 4375 [WARNING|trainer.py:803] 2025-04-26 16:34:33,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:33,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4420 [WARNING|trainer.py:803] 2025-04-26 16:34:33,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4440 4376 [WARNING|trainer.py:803] 2025-04-26 16:34:34,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:34,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4421 [WARNING|trainer.py:803] 2025-04-26 16:34:34,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4441 4377 [WARNING|trainer.py:803] 2025-04-26 16:34:35,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:35,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4422 [WARNING|trainer.py:803] 2025-04-26 16:34:35,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4442 4378 [WARNING|trainer.py:803] 2025-04-26 16:34:36,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:36,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4423 [WARNING|trainer.py:803] 2025-04-26 16:34:37,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4443 4379 [WARNING|trainer.py:803] 2025-04-26 16:34:37,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:38,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4424 [WARNING|trainer.py:803] 2025-04-26 16:34:38,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4444 4380 [WARNING|trainer.py:803] 2025-04-26 16:34:38,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:39,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4425 [WARNING|trainer.py:803] 2025-04-26 16:34:39,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4445 4381 [WARNING|trainer.py:803] 2025-04-26 16:34:40,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4426 [WARNING|trainer.py:803] 2025-04-26 16:34:40,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4446 4382 [WARNING|trainer.py:803] 2025-04-26 16:34:41,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:41,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4427 [WARNING|trainer.py:803] 2025-04-26 16:34:41,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4447 4383 [WARNING|trainer.py:803] 2025-04-26 16:34:42,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:42,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4428 [WARNING|trainer.py:803] 2025-04-26 16:34:43,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4448 4384 [WARNING|trainer.py:803] 2025-04-26 16:34:43,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:43,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4429 [WARNING|trainer.py:803] 2025-04-26 16:34:44,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4449 4385 [WARNING|trainer.py:803] 2025-04-26 16:34:44,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:45,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4430 [WARNING|trainer.py:803] 2025-04-26 16:34:45,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4450 4386 [WARNING|trainer.py:803] 2025-04-26 16:34:46,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:46,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4431 [WARNING|trainer.py:803] 2025-04-26 16:34:46,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4451 4387 [WARNING|trainer.py:803] 2025-04-26 16:34:47,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:47,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4432 [WARNING|trainer.py:803] 2025-04-26 16:34:47,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4452 4388 [WARNING|trainer.py:803] 2025-04-26 16:34:48,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:48,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4433 4453 [WARNING|trainer.py:803] 2025-04-26 16:34:49,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4389 [WARNING|trainer.py:803] 2025-04-26 16:34:49,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:49,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4434 4454 [WARNING|trainer.py:803] 2025-04-26 16:34:50,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4390 [WARNING|trainer.py:803] 2025-04-26 16:34:50,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:51,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4435 4455 [WARNING|trainer.py:803] 2025-04-26 16:34:51,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4391 [WARNING|trainer.py:803] 2025-04-26 16:34:52,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:52,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4436 4456 [WARNING|trainer.py:803] 2025-04-26 16:34:52,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4392 [WARNING|trainer.py:803] 2025-04-26 16:34:53,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:53,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4437 4457 [WARNING|trainer.py:803] 2025-04-26 16:34:53,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4393 [WARNING|trainer.py:803] 2025-04-26 16:34:54,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:54,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4438 4458 [WARNING|trainer.py:803] 2025-04-26 16:34:55,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4394 [WARNING|trainer.py:803] 2025-04-26 16:34:55,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:55,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4439 4459 [WARNING|trainer.py:803] 2025-04-26 16:34:56,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4395 [WARNING|trainer.py:803] 2025-04-26 16:34:56,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:56,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4440 4460 [WARNING|trainer.py:803] 2025-04-26 16:34:57,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4396 [WARNING|trainer.py:803] 2025-04-26 16:34:58,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:34:58,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4441 4461 [WARNING|trainer.py:803] 2025-04-26 16:34:58,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4397 [WARNING|trainer.py:803] 2025-04-26 16:34:59,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:34:59,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4442 4462 [WARNING|trainer.py:803] 2025-04-26 16:34:59,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4398 [WARNING|trainer.py:803] 2025-04-26 16:35:00,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:00,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4443 4463 [WARNING|trainer.py:803] 2025-04-26 16:35:01,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4399 [WARNING|trainer.py:803] 2025-04-26 16:35:01,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:01,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4444 4464 [WARNING|trainer.py:803] 2025-04-26 16:35:02,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4400 [WARNING|trainer.py:803] 2025-04-26 16:35:02,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:02,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4445 4465 [WARNING|trainer.py:803] 2025-04-26 16:35:03,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4401 [WARNING|trainer.py:803] 2025-04-26 16:35:04,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 16:35:04,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo NoNo 4446 4466 [WARNING|trainer.py:803] 2025-04-26 16:35:04,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4402 [WARNING|trainer.py:803] 2025-04-26 16:35:05,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:05,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4447 4467 [WARNING|trainer.py:803] 2025-04-26 16:35:05,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4403 [WARNING|trainer.py:803] 2025-04-26 16:35:06,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:06,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4448 4468 [WARNING|trainer.py:803] 2025-04-26 16:35:06,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4404 [WARNING|trainer.py:803] 2025-04-26 16:35:07,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:07,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4449 4469 [WARNING|trainer.py:803] 2025-04-26 16:35:08,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4405 [WARNING|trainer.py:803] 2025-04-26 16:35:08,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:08,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4450 4470 [WARNING|trainer.py:803] 2025-04-26 16:35:09,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4406 [WARNING|trainer.py:803] 2025-04-26 16:35:10,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:10,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4471 4451 [WARNING|trainer.py:803] 2025-04-26 16:35:10,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4407 [WARNING|trainer.py:803] 2025-04-26 16:35:11,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:11,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:11,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4472 4452 4408 [WARNING|trainer.py:803] 2025-04-26 16:35:12,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:35:12,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:12,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4473 4453 4409 [WARNING|trainer.py:803] 2025-04-26 16:35:13,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:13,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:14,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4474 4454 4410 [WARNING|trainer.py:803] 2025-04-26 16:35:14,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:14,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:15,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4475 4455 4411 [WARNING|trainer.py:803] 2025-04-26 16:35:16,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:16,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:16,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4476 4456 4412 [WARNING|trainer.py:803] 2025-04-26 16:35:17,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:17,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:17,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4477 4457 4413 [WARNING|trainer.py:803] 2025-04-26 16:35:18,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:18,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:18,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4478 4458 4414 [WARNING|trainer.py:803] 2025-04-26 16:35:19,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:19,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:35:19,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4479 4459 4415 [WARNING|trainer.py:803] 2025-04-26 16:35:20,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:20,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:21,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4480 4460 4416 [WARNING|trainer.py:803] 2025-04-26 16:35:22,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:22,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:22,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4481 4461 4417 [WARNING|trainer.py:803] 2025-04-26 16:35:23,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:23,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:23,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4482 4462 4418 [WARNING|trainer.py:803] 2025-04-26 16:35:24,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:24,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:24,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4483 4463 4419 [WARNING|trainer.py:803] 2025-04-26 16:35:25,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:25,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:25,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4484 4464 4420 [WARNING|trainer.py:803] 2025-04-26 16:35:26,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:26,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4485 [WARNING|trainer.py:803] 2025-04-26 16:35:27,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4465 4421 [WARNING|trainer.py:803] 2025-04-26 16:35:27,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:28,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4486 [WARNING|trainer.py:803] 2025-04-26 16:35:28,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4466 4422 [WARNING|trainer.py:803] 2025-04-26 16:35:29,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:29,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4487 [WARNING|trainer.py:803] 2025-04-26 16:35:29,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4467 4423 [WARNING|trainer.py:803] 2025-04-26 16:35:30,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:30,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4488 [WARNING|trainer.py:803] 2025-04-26 16:35:30,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4468 4424 [WARNING|trainer.py:803] 2025-04-26 16:35:31,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:31,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4489 [WARNING|trainer.py:803] 2025-04-26 16:35:31,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4469 4425 [WARNING|trainer.py:803] 2025-04-26 16:35:32,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:32,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4490 [WARNING|trainer.py:803] 2025-04-26 16:35:33,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4470 4426 [WARNING|trainer.py:803] 2025-04-26 16:35:33,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:34,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4491 [WARNING|trainer.py:803] 2025-04-26 16:35:34,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4471 4427 [WARNING|trainer.py:803] 2025-04-26 16:35:35,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:35,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:35,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4492 4472 4428 [WARNING|trainer.py:803] 2025-04-26 16:35:36,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:36,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:35:36,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4493 4473 4429 [WARNING|trainer.py:803] 2025-04-26 16:35:37,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:37,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:37,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4494 4474 4430 [WARNING|trainer.py:803] 2025-04-26 16:35:38,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:38,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:39,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4495 4475 4431 [WARNING|trainer.py:803] 2025-04-26 16:35:40,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:40,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:40,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4496 4476 4432 [WARNING|trainer.py:803] 2025-04-26 16:35:41,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:41,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:41,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4497 4477 4433 [WARNING|trainer.py:803] 2025-04-26 16:35:42,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:42,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:42,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4498 4478 4434 [WARNING|trainer.py:803] 2025-04-26 16:35:43,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:43,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:43,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4479 4499 4435 [WARNING|trainer.py:803] 2025-04-26 16:35:44,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:45,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:45,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4480 4500 4436 [WARNING|trainer.py:803] 2025-04-26 16:35:46,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:46,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4481 4501 4437 [WARNING|trainer.py:803] 2025-04-26 16:35:47,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:47,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:35:47,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4482 4502 4438 [WARNING|trainer.py:803] 2025-04-26 16:35:48,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:48,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:48,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4483 4503 4439 [WARNING|trainer.py:803] 2025-04-26 16:35:49,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:49,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:49,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4484 4504 4440 [WARNING|trainer.py:803] 2025-04-26 16:35:50,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:51,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:51,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4485 4505 4441 [WARNING|trainer.py:803] 2025-04-26 16:35:52,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:52,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:52,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4486 4506 4442 [WARNING|trainer.py:803] 2025-04-26 16:35:53,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:53,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:53,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4487 4507 4443 [WARNING|trainer.py:803] 2025-04-26 16:35:54,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:54,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:54,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4488 4508 4444 [WARNING|trainer.py:803] 2025-04-26 16:35:55,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:55,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:55,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4489 4509 4445 [WARNING|trainer.py:803] 2025-04-26 16:35:56,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:57,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:57,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4490 4510 4446 [WARNING|trainer.py:803] 2025-04-26 16:35:58,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:58,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:58,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4491 4511 4447 [WARNING|trainer.py:803] 2025-04-26 16:35:59,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:35:59,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:35:59,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4492 4512 4448 [WARNING|trainer.py:803] 2025-04-26 16:36:00,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:00,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:00,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4493 4513 4449 [WARNING|trainer.py:803] 2025-04-26 16:36:01,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:01,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:01,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4494 4514 4450 [WARNING|trainer.py:803] 2025-04-26 16:36:02,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:03,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:03,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4495 4515 4451 [WARNING|trainer.py:803] 2025-04-26 16:36:04,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:04,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:36:04,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4496 4516 4452 [WARNING|trainer.py:803] 2025-04-26 16:36:05,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:05,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:05,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4497 4517 4453 [WARNING|trainer.py:803] 2025-04-26 16:36:06,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:06,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:06,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4498 4518 4454 [WARNING|trainer.py:803] 2025-04-26 16:36:07,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:07,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:07,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4519 4455 4499 [WARNING|trainer.py:803] 2025-04-26 16:36:09,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:09,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:09,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4520 4456 4500 [WARNING|trainer.py:803] 2025-04-26 16:36:10,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:10,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:10,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4521 4457 4501 [WARNING|trainer.py:803] 2025-04-26 16:36:11,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:36:11,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:11,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4522 4458 4502 [WARNING|trainer.py:803] 2025-04-26 16:36:12,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:12,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:36:12,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4523 4459 4503 [WARNING|trainer.py:803] 2025-04-26 16:36:13,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:14,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:14,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4524 4460 4504 [WARNING|trainer.py:803] 2025-04-26 16:36:15,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:15,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:15,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4525 4461 4505 [WARNING|trainer.py:803] 2025-04-26 16:36:16,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:16,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:16,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4526 4462 4506 [WARNING|trainer.py:803] 2025-04-26 16:36:17,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:17,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:17,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4527 4463 4507 [WARNING|trainer.py:803] 2025-04-26 16:36:18,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:18,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:18,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4528 4464 4508 [WARNING|trainer.py:803] 2025-04-26 16:36:20,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:20,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:20,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4529 4465 4509 [WARNING|trainer.py:803] 2025-04-26 16:36:21,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:21,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:21,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4530 4466 4510 [WARNING|trainer.py:803] 2025-04-26 16:36:22,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:22,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:22,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4531 4467 4511 [WARNING|trainer.py:803] 2025-04-26 16:36:23,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:23,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:23,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4532 4468 4512 [WARNING|trainer.py:803] 2025-04-26 16:36:24,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:24,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:25,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4533 4469 4513 [WARNING|trainer.py:803] 2025-04-26 16:36:26,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:26,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4470 4534 [WARNING|trainer.py:803] 2025-04-26 16:36:26,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4514 [WARNING|trainer.py:803] 2025-04-26 16:36:27,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:27,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4471 [WARNING|trainer.py:803] 2025-04-26 16:36:27,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4535 4515 [WARNING|trainer.py:803] 2025-04-26 16:36:28,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:28,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4472 [WARNING|trainer.py:803] 2025-04-26 16:36:28,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4536 4516 [WARNING|trainer.py:803] 2025-04-26 16:36:29,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:36:29,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4473 [WARNING|trainer.py:803] 2025-04-26 16:36:30,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4537 4517 [WARNING|trainer.py:803] 2025-04-26 16:36:30,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:30,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4474 [WARNING|trainer.py:803] 2025-04-26 16:36:31,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4538 4518 [WARNING|trainer.py:803] 2025-04-26 16:36:32,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:32,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4475 [WARNING|trainer.py:803] 2025-04-26 16:36:32,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4539 4519 [WARNING|trainer.py:803] 2025-04-26 16:36:33,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:33,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4476 [WARNING|trainer.py:803] 2025-04-26 16:36:33,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4540 4520 [WARNING|trainer.py:803] 2025-04-26 16:36:34,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:34,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4477 [WARNING|trainer.py:803] 2025-04-26 16:36:34,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4541 4521 [WARNING|trainer.py:803] 2025-04-26 16:36:35,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:35,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4478 [WARNING|trainer.py:803] 2025-04-26 16:36:36,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4542 4522 [WARNING|trainer.py:803] 2025-04-26 16:36:36,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:36,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4479 [WARNING|trainer.py:803] 2025-04-26 16:36:37,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4543 4523 [WARNING|trainer.py:803] 2025-04-26 16:36:38,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:38,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4480 [WARNING|trainer.py:803] 2025-04-26 16:36:38,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4544 4524 [WARNING|trainer.py:803] 2025-04-26 16:36:39,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:39,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4481 [WARNING|trainer.py:803] 2025-04-26 16:36:39,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4545 4525 [WARNING|trainer.py:803] 2025-04-26 16:36:40,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:40,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4482 [WARNING|trainer.py:803] 2025-04-26 16:36:40,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4546 4526 [WARNING|trainer.py:803] 2025-04-26 16:36:41,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:41,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4483 [WARNING|trainer.py:803] 2025-04-26 16:36:42,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4547 4527 [WARNING|trainer.py:803] 2025-04-26 16:36:42,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:43,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4484 [WARNING|trainer.py:803] 2025-04-26 16:36:43,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4548 4528 [WARNING|trainer.py:803] 2025-04-26 16:36:43,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:44,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4485 [WARNING|trainer.py:803] 2025-04-26 16:36:44,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4549 4529 [WARNING|trainer.py:803] 2025-04-26 16:36:45,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:36:45,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4486 [WARNING|trainer.py:803] 2025-04-26 16:36:45,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4550 4530 [WARNING|trainer.py:803] 2025-04-26 16:36:46,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:46,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4487 [WARNING|trainer.py:803] 2025-04-26 16:36:46,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4551 4531 [WARNING|trainer.py:803] 2025-04-26 16:36:47,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:47,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4488 [WARNING|trainer.py:803] 2025-04-26 16:36:48,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4552 4532 [WARNING|trainer.py:803] 2025-04-26 16:36:48,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:49,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4489 [WARNING|trainer.py:803] 2025-04-26 16:36:49,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4553 4533 [WARNING|trainer.py:803] 2025-04-26 16:36:49,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4490 [WARNING|trainer.py:803] 2025-04-26 16:36:50,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:50,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4554 4534 [WARNING|trainer.py:803] 2025-04-26 16:36:51,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:51,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4491 [WARNING|trainer.py:803] 2025-04-26 16:36:51,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4555 4535 [WARNING|trainer.py:803] 2025-04-26 16:36:52,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:52,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4492 [WARNING|trainer.py:803] 2025-04-26 16:36:52,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4556 4536 [WARNING|trainer.py:803] 2025-04-26 16:36:53,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:53,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4493 [WARNING|trainer.py:803] 2025-04-26 16:36:54,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4557 4537 [WARNING|trainer.py:803] 2025-04-26 16:36:54,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:55,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4494 [WARNING|trainer.py:803] 2025-04-26 16:36:55,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4558 4538 [WARNING|trainer.py:803] 2025-04-26 16:36:55,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:56,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4495 [WARNING|trainer.py:803] 2025-04-26 16:36:56,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4559 4539 [WARNING|trainer.py:803] 2025-04-26 16:36:57,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:57,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4496 [WARNING|trainer.py:803] 2025-04-26 16:36:57,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4560 4540 [WARNING|trainer.py:803] 2025-04-26 16:36:58,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:58,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4497 [WARNING|trainer.py:803] 2025-04-26 16:36:58,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4561 4541 [WARNING|trainer.py:803] 2025-04-26 16:36:59,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:36:59,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4498 [WARNING|trainer.py:803] 2025-04-26 16:37:00,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4562 4542 [WARNING|trainer.py:803] 2025-04-26 16:37:00,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:01,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4499 [WARNING|trainer.py:803] 2025-04-26 16:37:01,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4563 4543 [WARNING|trainer.py:803] 2025-04-26 16:37:02,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:02,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4500 [WARNING|trainer.py:803] 2025-04-26 16:37:02,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4564 4544 [WARNING|trainer.py:803] 2025-04-26 16:37:03,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:03,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4501 [WARNING|trainer.py:803] 2025-04-26 16:37:03,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4565 4545 [WARNING|trainer.py:803] 2025-04-26 16:37:04,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:37:04,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4502 [WARNING|trainer.py:803] 2025-04-26 16:37:04,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4566 4546 [WARNING|trainer.py:803] 2025-04-26 16:37:05,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:05,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4503 [WARNING|trainer.py:803] 2025-04-26 16:37:06,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4567 4547 [WARNING|trainer.py:803] 2025-04-26 16:37:06,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:07,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4504 [WARNING|trainer.py:803] 2025-04-26 16:37:07,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4568 4548 [WARNING|trainer.py:803] 2025-04-26 16:37:08,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:08,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4505 [WARNING|trainer.py:803] 2025-04-26 16:37:08,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4569 4549 [WARNING|trainer.py:803] 2025-04-26 16:37:09,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:09,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4506 [WARNING|trainer.py:803] 2025-04-26 16:37:09,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4570 4550 [WARNING|trainer.py:803] 2025-04-26 16:37:10,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:10,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4507 [WARNING|trainer.py:803] 2025-04-26 16:37:10,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4571 4551 [WARNING|trainer.py:803] 2025-04-26 16:37:11,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:11,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4508 [WARNING|trainer.py:803] 2025-04-26 16:37:12,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4572 4552 [WARNING|trainer.py:803] 2025-04-26 16:37:12,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:13,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4509 [WARNING|trainer.py:803] 2025-04-26 16:37:13,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4573 4553 [WARNING|trainer.py:803] 2025-04-26 16:37:14,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:14,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4510 [WARNING|trainer.py:803] 2025-04-26 16:37:14,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4574 4554 [WARNING|trainer.py:803] 2025-04-26 16:37:15,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:15,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4511 [WARNING|trainer.py:803] 2025-04-26 16:37:15,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4575 4555 [WARNING|trainer.py:803] 2025-04-26 16:37:16,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:16,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4512 [WARNING|trainer.py:803] 2025-04-26 16:37:17,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4576 4556 [WARNING|trainer.py:803] 2025-04-26 16:37:17,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:17,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4513 [WARNING|trainer.py:803] 2025-04-26 16:37:18,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4577 4557 [WARNING|trainer.py:803] 2025-04-26 16:37:18,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:19,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4514 [WARNING|trainer.py:803] 2025-04-26 16:37:19,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4578 4558 [WARNING|trainer.py:803] 2025-04-26 16:37:20,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:20,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4515 [WARNING|trainer.py:803] 2025-04-26 16:37:20,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4579 4559 [WARNING|trainer.py:803] 2025-04-26 16:37:21,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:37:21,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4516 [WARNING|trainer.py:803] 2025-04-26 16:37:21,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4580 4560 [WARNING|trainer.py:803] 2025-04-26 16:37:22,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:22,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4517 [WARNING|trainer.py:803] 2025-04-26 16:37:23,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4581 4561 [WARNING|trainer.py:803] 2025-04-26 16:37:23,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:23,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4518 [WARNING|trainer.py:803] 2025-04-26 16:37:24,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4582 4562 [WARNING|trainer.py:803] 2025-04-26 16:37:24,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:25,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4519 [WARNING|trainer.py:803] 2025-04-26 16:37:25,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4583 4563 [WARNING|trainer.py:803] 2025-04-26 16:37:25,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:26,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4520 [WARNING|trainer.py:803] 2025-04-26 16:37:26,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4584 4564 [WARNING|trainer.py:803] 2025-04-26 16:37:27,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4521 [WARNING|trainer.py:803] 2025-04-26 16:37:27,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:27,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4585 4565 [WARNING|trainer.py:803] 2025-04-26 16:37:28,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4522 [WARNING|trainer.py:803] 2025-04-26 16:37:28,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:37:29,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4586 4566 [WARNING|trainer.py:803] 2025-04-26 16:37:29,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4523 [WARNING|trainer.py:803] 2025-04-26 16:37:29,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:37:30,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4587 4567 [WARNING|trainer.py:803] 2025-04-26 16:37:30,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4524 [WARNING|trainer.py:803] 2025-04-26 16:37:31,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4588 [WARNING|trainer.py:803] 2025-04-26 16:37:31,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4568 [WARNING|trainer.py:803] 2025-04-26 16:37:31,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4525 [WARNING|trainer.py:803] 2025-04-26 16:37:32,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4589 [WARNING|trainer.py:803] 2025-04-26 16:37:32,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4569 [WARNING|trainer.py:803] 2025-04-26 16:37:33,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4526 [WARNING|trainer.py:803] 2025-04-26 16:37:33,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4590 [WARNING|trainer.py:803] 2025-04-26 16:37:33,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4570 [WARNING|trainer.py:803] 2025-04-26 16:37:34,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4527 [WARNING|trainer.py:803] 2025-04-26 16:37:34,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4591 [WARNING|trainer.py:803] 2025-04-26 16:37:35,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4571 [WARNING|trainer.py:803] 2025-04-26 16:37:35,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:35,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4528 4592 [WARNING|trainer.py:803] 2025-04-26 16:37:36,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4572 [WARNING|trainer.py:803] 2025-04-26 16:37:36,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:37,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4529 4593 [WARNING|trainer.py:803] 2025-04-26 16:37:37,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4573 [WARNING|trainer.py:803] 2025-04-26 16:37:37,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:38,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4530 4594 [WARNING|trainer.py:803] 2025-04-26 16:37:38,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4574 [WARNING|trainer.py:803] 2025-04-26 16:37:39,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:39,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4531 4595 [WARNING|trainer.py:803] 2025-04-26 16:37:39,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4575 [WARNING|trainer.py:803] 2025-04-26 16:37:40,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:40,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4532 4596 [WARNING|trainer.py:803] 2025-04-26 16:37:41,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4576 [WARNING|trainer.py:803] 2025-04-26 16:37:41,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:41,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4533 4597 [WARNING|trainer.py:803] 2025-04-26 16:37:42,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4577 [WARNING|trainer.py:803] 2025-04-26 16:37:42,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:43,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4534 4598 [WARNING|trainer.py:803] 2025-04-26 16:37:43,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4578 [WARNING|trainer.py:803] 2025-04-26 16:37:43,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:44,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4535 4599 [WARNING|trainer.py:803] 2025-04-26 16:37:44,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4579 [WARNING|trainer.py:803] 2025-04-26 16:37:45,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:45,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4536 4600 [WARNING|trainer.py:803] 2025-04-26 16:37:45,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4580 [WARNING|trainer.py:803] 2025-04-26 16:37:46,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:46,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4537 4601 [WARNING|trainer.py:803] 2025-04-26 16:37:47,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4581 [WARNING|trainer.py:803] 2025-04-26 16:37:47,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:47,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4538 4602 [WARNING|trainer.py:803] 2025-04-26 16:37:48,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4582 [WARNING|trainer.py:803] 2025-04-26 16:37:48,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4539 [WARNING|trainer.py:803] 2025-04-26 16:37:49,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4603 [WARNING|trainer.py:803] 2025-04-26 16:37:49,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4583 [WARNING|trainer.py:803] 2025-04-26 16:37:49,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4540 [WARNING|trainer.py:803] 2025-04-26 16:37:50,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4604 [WARNING|trainer.py:803] 2025-04-26 16:37:50,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4584 [WARNING|trainer.py:803] 2025-04-26 16:37:51,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4541 [WARNING|trainer.py:803] 2025-04-26 16:37:51,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4605 [WARNING|trainer.py:803] 2025-04-26 16:37:51,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4585 [WARNING|trainer.py:803] 2025-04-26 16:37:52,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4542 [WARNING|trainer.py:803] 2025-04-26 16:37:52,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4606 [WARNING|trainer.py:803] 2025-04-26 16:37:53,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4586 [WARNING|trainer.py:803] 2025-04-26 16:37:53,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4543 [WARNING|trainer.py:803] 2025-04-26 16:37:53,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4607 [WARNING|trainer.py:803] 2025-04-26 16:37:54,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4587 [WARNING|trainer.py:803] 2025-04-26 16:37:54,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4544 [WARNING|trainer.py:803] 2025-04-26 16:37:55,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4608 [WARNING|trainer.py:803] 2025-04-26 16:37:55,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:37:55,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 4588 NoNo 4545 [WARNING|trainer.py:803] 2025-04-26 16:37:56,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4609 [WARNING|trainer.py:803] 2025-04-26 16:37:56,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:57,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4589 4546 [WARNING|trainer.py:803] 2025-04-26 16:37:57,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4610 [WARNING|trainer.py:803] 2025-04-26 16:37:57,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4590 [WARNING|trainer.py:803] 2025-04-26 16:37:58,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:58,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4547 4611 [WARNING|trainer.py:803] 2025-04-26 16:37:59,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4591 [WARNING|trainer.py:803] 2025-04-26 16:37:59,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:37:59,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4548 4612 [WARNING|trainer.py:803] 2025-04-26 16:38:00,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4592 [WARNING|trainer.py:803] 2025-04-26 16:38:00,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:01,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4549 4613 [WARNING|trainer.py:803] 2025-04-26 16:38:01,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4593 [WARNING|trainer.py:803] 2025-04-26 16:38:02,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:02,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4550 4614 [WARNING|trainer.py:803] 2025-04-26 16:38:02,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4594 [WARNING|trainer.py:803] 2025-04-26 16:38:03,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:03,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4551 4615 [WARNING|trainer.py:803] 2025-04-26 16:38:03,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4595 [WARNING|trainer.py:803] 2025-04-26 16:38:04,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:04,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4552 4616 [WARNING|trainer.py:803] 2025-04-26 16:38:05,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4596 [WARNING|trainer.py:803] 2025-04-26 16:38:05,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:05,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4553 4617 [WARNING|trainer.py:803] 2025-04-26 16:38:06,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4597 [WARNING|trainer.py:803] 2025-04-26 16:38:06,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:07,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4554 [WARNING|trainer.py:803] 2025-04-26 16:38:07,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4618 4598 [WARNING|trainer.py:803] 2025-04-26 16:38:08,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:08,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4555 [WARNING|trainer.py:803] 2025-04-26 16:38:08,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4619 4599 [WARNING|trainer.py:803] 2025-04-26 16:38:09,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:09,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4556 [WARNING|trainer.py:803] 2025-04-26 16:38:09,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4620 4600 [WARNING|trainer.py:803] 2025-04-26 16:38:10,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:10,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4557 [WARNING|trainer.py:803] 2025-04-26 16:38:11,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4621 4601 [WARNING|trainer.py:803] 2025-04-26 16:38:11,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4558 [WARNING|trainer.py:803] 2025-04-26 16:38:11,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:12,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4622 4602 [WARNING|trainer.py:803] 2025-04-26 16:38:12,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4559 [WARNING|trainer.py:803] 2025-04-26 16:38:13,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4623 [WARNING|trainer.py:803] 2025-04-26 16:38:13,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4603 [WARNING|trainer.py:803] 2025-04-26 16:38:13,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4560 [WARNING|trainer.py:803] 2025-04-26 16:38:14,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 16:38:14,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4604 [WARNING|trainer.py:803] 2025-04-26 16:38:15,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4561 [WARNING|trainer.py:803] 2025-04-26 16:38:15,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4625 [WARNING|trainer.py:803] 2025-04-26 16:38:15,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4605 [WARNING|trainer.py:803] 2025-04-26 16:38:16,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4562 [WARNING|trainer.py:803] 2025-04-26 16:38:16,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4626 [WARNING|trainer.py:803] 2025-04-26 16:38:17,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:17,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4606 4563 [WARNING|trainer.py:803] 2025-04-26 16:38:17,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4627 [WARNING|trainer.py:803] 2025-04-26 16:38:18,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:38:18,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4607 4564 [WARNING|trainer.py:803] 2025-04-26 16:38:19,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4628 [WARNING|trainer.py:803] 2025-04-26 16:38:19,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:19,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4608 4565 [WARNING|trainer.py:803] 2025-04-26 16:38:20,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4629 [WARNING|trainer.py:803] 2025-04-26 16:38:20,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:21,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4609 4566 [WARNING|trainer.py:803] 2025-04-26 16:38:21,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4630 [WARNING|trainer.py:803] 2025-04-26 16:38:22,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:22,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4610 4567 [WARNING|trainer.py:803] 2025-04-26 16:38:22,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4631 [WARNING|trainer.py:803] 2025-04-26 16:38:23,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:23,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4611 4568 [WARNING|trainer.py:803] 2025-04-26 16:38:23,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4632 [WARNING|trainer.py:803] 2025-04-26 16:38:24,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:24,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4612 4569 [WARNING|trainer.py:803] 2025-04-26 16:38:25,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4633 [WARNING|trainer.py:803] 2025-04-26 16:38:25,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:25,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4613 4570 [WARNING|trainer.py:803] 2025-04-26 16:38:26,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4634 [WARNING|trainer.py:803] 2025-04-26 16:38:26,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:27,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4614 4571 [WARNING|trainer.py:803] 2025-04-26 16:38:27,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4635 [WARNING|trainer.py:803] 2025-04-26 16:38:28,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:28,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4615 4572 [WARNING|trainer.py:803] 2025-04-26 16:38:28,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4636 [WARNING|trainer.py:803] 2025-04-26 16:38:29,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:29,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4616 4573 [WARNING|trainer.py:803] 2025-04-26 16:38:29,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4637 [WARNING|trainer.py:803] 2025-04-26 16:38:30,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:30,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4617 4574 [WARNING|trainer.py:803] 2025-04-26 16:38:31,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4638 [WARNING|trainer.py:803] 2025-04-26 16:38:31,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:31,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4618 4575 [WARNING|trainer.py:803] 2025-04-26 16:38:32,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4639 [WARNING|trainer.py:803] 2025-04-26 16:38:33,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:33,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4619 4576 [WARNING|trainer.py:803] 2025-04-26 16:38:33,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4640 [WARNING|trainer.py:803] 2025-04-26 16:38:34,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:34,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4620 4577 [WARNING|trainer.py:803] 2025-04-26 16:38:34,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 [WARNING|trainer.py:803] 2025-04-26 16:38:35,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:35,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4621 4578 [WARNING|trainer.py:803] 2025-04-26 16:38:35,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4642 [WARNING|trainer.py:803] 2025-04-26 16:38:36,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:36,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4622 4579 [WARNING|trainer.py:803] 2025-04-26 16:38:37,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4643 [WARNING|trainer.py:803] 2025-04-26 16:38:37,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:37,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4623 4580 [WARNING|trainer.py:803] 2025-04-26 16:38:38,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4644 [WARNING|trainer.py:803] 2025-04-26 16:38:39,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:39,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 16:38:39,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 4581 NoNo 4645 [WARNING|trainer.py:803] 2025-04-26 16:38:40,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:40,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4625 4582 [WARNING|trainer.py:803] 2025-04-26 16:38:40,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4646 [WARNING|trainer.py:803] 2025-04-26 16:38:41,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:41,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4626 [WARNING|trainer.py:803] 2025-04-26 16:38:41,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4583 4647 [WARNING|trainer.py:803] 2025-04-26 16:38:42,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:38:42,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4627 [WARNING|trainer.py:803] 2025-04-26 16:38:43,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4584 4648 [WARNING|trainer.py:803] 2025-04-26 16:38:43,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:44,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:44,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4628 4585 4649 [WARNING|trainer.py:803] 2025-04-26 16:38:45,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:45,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:38:45,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4629 4586 4650 [WARNING|trainer.py:803] 2025-04-26 16:38:46,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:46,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:38:46,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4630 4587 4651 [WARNING|trainer.py:803] 2025-04-26 16:38:47,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:47,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:38:47,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4588 4631 4652 [WARNING|trainer.py:803] 2025-04-26 16:38:48,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:48,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:49,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4589 4632 4653 [WARNING|trainer.py:803] 2025-04-26 16:38:50,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:50,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:50,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4590 4633 4654 [WARNING|trainer.py:803] 2025-04-26 16:38:51,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:51,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:51,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4591 4634 4655 [WARNING|trainer.py:803] 2025-04-26 16:38:52,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:52,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:52,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4592 4635 4656 [WARNING|trainer.py:803] 2025-04-26 16:38:53,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:53,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:54,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4593 4636 4657 [WARNING|trainer.py:803] 2025-04-26 16:38:54,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:54,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4594 4637 [WARNING|trainer.py:803] 2025-04-26 16:38:55,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4658 [WARNING|trainer.py:803] 2025-04-26 16:38:56,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:56,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4595 4638 [WARNING|trainer.py:803] 2025-04-26 16:38:56,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4659 [WARNING|trainer.py:803] 2025-04-26 16:38:57,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:57,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4596 4639 [WARNING|trainer.py:803] 2025-04-26 16:38:57,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4660 [WARNING|trainer.py:803] 2025-04-26 16:38:58,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:38:58,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4597 4640 [WARNING|trainer.py:803] 2025-04-26 16:38:58,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4661 [WARNING|trainer.py:803] 2025-04-26 16:38:59,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:38:59,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4598 [WARNING|trainer.py:803] 2025-04-26 16:39:00,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 4662 [WARNING|trainer.py:803] 2025-04-26 16:39:00,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:00,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4599 [WARNING|trainer.py:803] 2025-04-26 16:39:01,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4642 4663 [WARNING|trainer.py:803] 2025-04-26 16:39:02,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:02,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:02,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4600 4643 4664 [WARNING|trainer.py:803] 2025-04-26 16:39:03,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:03,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:03,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4601 4644 4665 [WARNING|trainer.py:803] 2025-04-26 16:39:04,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:39:04,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:04,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4602 4645 4666 [WARNING|trainer.py:803] 2025-04-26 16:39:05,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:05,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:06,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4603 4646 4667 [WARNING|trainer.py:803] 2025-04-26 16:39:06,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:06,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:07,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4604 4647 4668 [WARNING|trainer.py:803] 2025-04-26 16:39:08,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:08,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:08,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4648 4605 4669 [WARNING|trainer.py:803] 2025-04-26 16:39:09,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:09,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:09,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4649 4606 4670 [WARNING|trainer.py:803] 2025-04-26 16:39:10,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:10,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:39:10,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4650 4607 4671 [WARNING|trainer.py:803] 2025-04-26 16:39:11,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:11,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:11,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4651 4608 4672 [WARNING|trainer.py:803] 2025-04-26 16:39:12,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:13,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:13,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4652 4609 4673 [WARNING|trainer.py:803] 2025-04-26 16:39:14,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:14,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:14,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4653 4610 4674 [WARNING|trainer.py:803] 2025-04-26 16:39:15,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:15,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:15,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4654 4611 4675 [WARNING|trainer.py:803] 2025-04-26 16:39:16,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:16,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:16,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4655 4612 4676 [WARNING|trainer.py:803] 2025-04-26 16:39:17,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:17,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:17,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4656 4677 4613 [WARNING|trainer.py:803] 2025-04-26 16:39:19,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:19,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:19,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4657 4678 4614 [WARNING|trainer.py:803] 2025-04-26 16:39:20,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:20,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:20,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4658 4679 4615 [WARNING|trainer.py:803] 2025-04-26 16:39:21,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:21,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:21,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4659 4680 4616 [WARNING|trainer.py:803] 2025-04-26 16:39:22,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:22,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:22,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4660 4681 4617 [WARNING|trainer.py:803] 2025-04-26 16:39:23,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:23,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:24,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4661 4682 4618 [WARNING|trainer.py:803] 2025-04-26 16:39:25,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:25,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4662 [WARNING|trainer.py:803] 2025-04-26 16:39:25,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4683 4619 [WARNING|trainer.py:803] 2025-04-26 16:39:26,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:26,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4663 [WARNING|trainer.py:803] 2025-04-26 16:39:26,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4684 4620 [WARNING|trainer.py:803] 2025-04-26 16:39:27,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:27,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4664 [WARNING|trainer.py:803] 2025-04-26 16:39:27,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4685 4621 [WARNING|trainer.py:803] 2025-04-26 16:39:28,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:28,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4665 4686 [WARNING|trainer.py:803] 2025-04-26 16:39:29,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4622 [WARNING|trainer.py:803] 2025-04-26 16:39:29,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:29,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4666 4687 [WARNING|trainer.py:803] 2025-04-26 16:39:30,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4623 [WARNING|trainer.py:803] 2025-04-26 16:39:31,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:31,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4667 4688 [WARNING|trainer.py:803] 2025-04-26 16:39:31,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 16:39:32,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:32,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4668 4689 [WARNING|trainer.py:803] 2025-04-26 16:39:32,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4625 [WARNING|trainer.py:803] 2025-04-26 16:39:33,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:33,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4690 4669 [WARNING|trainer.py:803] 2025-04-26 16:39:33,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4626 [WARNING|trainer.py:803] 2025-04-26 16:39:34,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:34,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4691 4670 [WARNING|trainer.py:803] 2025-04-26 16:39:35,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4627 [WARNING|trainer.py:803] 2025-04-26 16:39:35,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:35,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4692 4671 [WARNING|trainer.py:803] 2025-04-26 16:39:36,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4628 [WARNING|trainer.py:803] 2025-04-26 16:39:37,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:37,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4693 [WARNING|trainer.py:803] 2025-04-26 16:39:37,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4672 4629 [WARNING|trainer.py:803] 2025-04-26 16:39:38,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:38,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4694 [WARNING|trainer.py:803] 2025-04-26 16:39:38,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4673 4630 [WARNING|trainer.py:803] 2025-04-26 16:39:39,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:39,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4695 [WARNING|trainer.py:803] 2025-04-26 16:39:40,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4674 4631 [WARNING|trainer.py:803] 2025-04-26 16:39:40,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:40,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4696 [WARNING|trainer.py:803] 2025-04-26 16:39:41,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4675 4632 [WARNING|trainer.py:803] 2025-04-26 16:39:41,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4697 [WARNING|trainer.py:803] 2025-04-26 16:39:42,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:42,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4676 4633 [WARNING|trainer.py:803] 2025-04-26 16:39:42,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4698 [WARNING|trainer.py:803] 2025-04-26 16:39:43,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:43,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4677 4634 [WARNING|trainer.py:803] 2025-04-26 16:39:44,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4699 [WARNING|trainer.py:803] 2025-04-26 16:39:44,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:44,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4678 4635 [WARNING|trainer.py:803] 2025-04-26 16:39:45,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4700 [WARNING|trainer.py:803] 2025-04-26 16:39:45,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:46,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4679 4636 [WARNING|trainer.py:803] 2025-04-26 16:39:46,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4701 [WARNING|trainer.py:803] 2025-04-26 16:39:47,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:47,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4680 4637 [WARNING|trainer.py:803] 2025-04-26 16:39:47,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4702 [WARNING|trainer.py:803] 2025-04-26 16:39:48,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:48,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4681 [WARNING|trainer.py:803] 2025-04-26 16:39:48,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4638 4703 [WARNING|trainer.py:803] 2025-04-26 16:39:49,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:49,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4682 [WARNING|trainer.py:803] 2025-04-26 16:39:50,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4639 4704 [WARNING|trainer.py:803] 2025-04-26 16:39:50,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:50,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4683 [WARNING|trainer.py:803] 2025-04-26 16:39:51,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4640 4705 [WARNING|trainer.py:803] 2025-04-26 16:39:51,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:52,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4684 [WARNING|trainer.py:803] 2025-04-26 16:39:52,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 4706 [WARNING|trainer.py:803] 2025-04-26 16:39:53,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:53,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4685 [WARNING|trainer.py:803] 2025-04-26 16:39:53,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4642 4707 [WARNING|trainer.py:803] 2025-04-26 16:39:54,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:39:54,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4686 [WARNING|trainer.py:803] 2025-04-26 16:39:54,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4643 4708 [WARNING|trainer.py:803] 2025-04-26 16:39:55,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:55,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4687 [WARNING|trainer.py:803] 2025-04-26 16:39:55,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4644 4709 [WARNING|trainer.py:803] 2025-04-26 16:39:56,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:56,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4688 [WARNING|trainer.py:803] 2025-04-26 16:39:57,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4645 4710 [WARNING|trainer.py:803] 2025-04-26 16:39:57,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:58,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4689 [WARNING|trainer.py:803] 2025-04-26 16:39:58,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4646 4711 [WARNING|trainer.py:803] 2025-04-26 16:39:59,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:39:59,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4690 [WARNING|trainer.py:803] 2025-04-26 16:39:59,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4647 4712 [WARNING|trainer.py:803] 2025-04-26 16:40:00,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:00,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4691 [WARNING|trainer.py:803] 2025-04-26 16:40:00,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4648 4713 [WARNING|trainer.py:803] 2025-04-26 16:40:01,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:01,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4692 [WARNING|trainer.py:803] 2025-04-26 16:40:01,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4649 4714 [WARNING|trainer.py:803] 2025-04-26 16:40:02,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:02,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4693 [WARNING|trainer.py:803] 2025-04-26 16:40:03,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4650 4715 [WARNING|trainer.py:803] 2025-04-26 16:40:03,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:40:04,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:04,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4694 4716 4651 [WARNING|trainer.py:803] 2025-04-26 16:40:05,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:05,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:05,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4695 4717 4652 [WARNING|trainer.py:803] 2025-04-26 16:40:06,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:06,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4696 [WARNING|trainer.py:803] 2025-04-26 16:40:06,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4718 4653 [WARNING|trainer.py:803] 2025-04-26 16:40:07,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:07,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4697 [WARNING|trainer.py:803] 2025-04-26 16:40:07,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4719 4654 [WARNING|trainer.py:803] 2025-04-26 16:40:08,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:08,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4698 [WARNING|trainer.py:803] 2025-04-26 16:40:09,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4720 4655 [WARNING|trainer.py:803] 2025-04-26 16:40:09,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:10,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4699 [WARNING|trainer.py:803] 2025-04-26 16:40:10,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4721 4656 [WARNING|trainer.py:803] 2025-04-26 16:40:11,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:11,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4700 [WARNING|trainer.py:803] 2025-04-26 16:40:11,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4722 4657 [WARNING|trainer.py:803] 2025-04-26 16:40:12,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:40:12,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4701 [WARNING|trainer.py:803] 2025-04-26 16:40:12,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4723 4658 [WARNING|trainer.py:803] 2025-04-26 16:40:13,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:13,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:13,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4702 4724 4659 [WARNING|trainer.py:803] 2025-04-26 16:40:14,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:14,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:15,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4703 4725 4660 [WARNING|trainer.py:803] 2025-04-26 16:40:15,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:16,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4704 [WARNING|trainer.py:803] 2025-04-26 16:40:16,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4726 4661 [WARNING|trainer.py:803] 2025-04-26 16:40:17,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:17,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4705 [WARNING|trainer.py:803] 2025-04-26 16:40:17,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4727 4662 [WARNING|trainer.py:803] 2025-04-26 16:40:18,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:18,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4706 [WARNING|trainer.py:803] 2025-04-26 16:40:18,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4728 4663 [WARNING|trainer.py:803] 2025-04-26 16:40:19,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:19,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4707 [WARNING|trainer.py:803] 2025-04-26 16:40:19,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4729 4664 [WARNING|trainer.py:803] 2025-04-26 16:40:20,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:21,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4708 [WARNING|trainer.py:803] 2025-04-26 16:40:21,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4730 4665 [WARNING|trainer.py:803] 2025-04-26 16:40:21,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:40:22,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4709 [WARNING|trainer.py:803] 2025-04-26 16:40:22,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4731 4666 [WARNING|trainer.py:803] 2025-04-26 16:40:23,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:23,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4710 [WARNING|trainer.py:803] 2025-04-26 16:40:23,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4732 4667 [WARNING|trainer.py:803] 2025-04-26 16:40:24,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:24,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4711 [WARNING|trainer.py:803] 2025-04-26 16:40:24,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4733 4668 [WARNING|trainer.py:803] 2025-04-26 16:40:25,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:25,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4712 [WARNING|trainer.py:803] 2025-04-26 16:40:26,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4734 4669 [WARNING|trainer.py:803] 2025-04-26 16:40:26,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:27,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4713 [WARNING|trainer.py:803] 2025-04-26 16:40:27,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4735 4670 [WARNING|trainer.py:803] 2025-04-26 16:40:27,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:28,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4714 [WARNING|trainer.py:803] 2025-04-26 16:40:28,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4736 4671 [WARNING|trainer.py:803] 2025-04-26 16:40:29,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:29,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4715 [WARNING|trainer.py:803] 2025-04-26 16:40:29,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4737 4672 [WARNING|trainer.py:803] 2025-04-26 16:40:30,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:30,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4716 [WARNING|trainer.py:803] 2025-04-26 16:40:30,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4738 4673 [WARNING|trainer.py:803] 2025-04-26 16:40:31,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:31,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4717 [WARNING|trainer.py:803] 2025-04-26 16:40:32,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4739 4674 [WARNING|trainer.py:803] 2025-04-26 16:40:32,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:33,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4718 [WARNING|trainer.py:803] 2025-04-26 16:40:33,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4740 4675 [WARNING|trainer.py:803] 2025-04-26 16:40:33,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:34,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4719 [WARNING|trainer.py:803] 2025-04-26 16:40:34,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4741 4676 [WARNING|trainer.py:803] 2025-04-26 16:40:35,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:35,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4720 [WARNING|trainer.py:803] 2025-04-26 16:40:35,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4742 4677 [WARNING|trainer.py:803] 2025-04-26 16:40:36,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:36,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4721 [WARNING|trainer.py:803] 2025-04-26 16:40:36,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4743 4678 [WARNING|trainer.py:803] 2025-04-26 16:40:37,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:37,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4722 [WARNING|trainer.py:803] 2025-04-26 16:40:38,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4744 4679 [WARNING|trainer.py:803] 2025-04-26 16:40:38,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:39,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4723 [WARNING|trainer.py:803] 2025-04-26 16:40:39,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4745 4680 [WARNING|trainer.py:803] 2025-04-26 16:40:39,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4724 [WARNING|trainer.py:803] 2025-04-26 16:40:40,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4746 4681 [WARNING|trainer.py:803] 2025-04-26 16:40:41,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:41,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4725 [WARNING|trainer.py:803] 2025-04-26 16:40:41,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4747 4682 [WARNING|trainer.py:803] 2025-04-26 16:40:42,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4726 [WARNING|trainer.py:803] 2025-04-26 16:40:42,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:42,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4748 4683 [WARNING|trainer.py:803] 2025-04-26 16:40:43,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4727 [WARNING|trainer.py:803] 2025-04-26 16:40:43,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:44,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4749 4684 [WARNING|trainer.py:803] 2025-04-26 16:40:44,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4728 [WARNING|trainer.py:803] 2025-04-26 16:40:45,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:45,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4750 4685 [WARNING|trainer.py:803] 2025-04-26 16:40:45,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4729 [WARNING|trainer.py:803] 2025-04-26 16:40:46,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:46,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4751 4686 [WARNING|trainer.py:803] 2025-04-26 16:40:47,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4730 [WARNING|trainer.py:803] 2025-04-26 16:40:47,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:47,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4752 4687 [WARNING|trainer.py:803] 2025-04-26 16:40:48,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4731 [WARNING|trainer.py:803] 2025-04-26 16:40:48,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:48,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4753 4688 [WARNING|trainer.py:803] 2025-04-26 16:40:49,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4732 [WARNING|trainer.py:803] 2025-04-26 16:40:49,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:50,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4754 4689 [WARNING|trainer.py:803] 2025-04-26 16:40:50,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4733 [WARNING|trainer.py:803] 2025-04-26 16:40:51,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:51,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4755 4690 [WARNING|trainer.py:803] 2025-04-26 16:40:51,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4734 [WARNING|trainer.py:803] 2025-04-26 16:40:52,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:52,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4756 4691 [WARNING|trainer.py:803] 2025-04-26 16:40:53,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4735 [WARNING|trainer.py:803] 2025-04-26 16:40:53,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:53,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4757 4692 [WARNING|trainer.py:803] 2025-04-26 16:40:54,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4736 [WARNING|trainer.py:803] 2025-04-26 16:40:54,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:54,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4758 4693 [WARNING|trainer.py:803] 2025-04-26 16:40:55,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4737 [WARNING|trainer.py:803] 2025-04-26 16:40:55,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:40:56,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4759 4694 [WARNING|trainer.py:803] 2025-04-26 16:40:56,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4738 [WARNING|trainer.py:803] 2025-04-26 16:40:57,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:57,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4760 4695 [WARNING|trainer.py:803] 2025-04-26 16:40:57,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4739 [WARNING|trainer.py:803] 2025-04-26 16:40:58,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:58,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4761 4696 [WARNING|trainer.py:803] 2025-04-26 16:40:59,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4740 [WARNING|trainer.py:803] 2025-04-26 16:40:59,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:40:59,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4762 4697 [WARNING|trainer.py:803] 2025-04-26 16:41:00,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4741 [WARNING|trainer.py:803] 2025-04-26 16:41:00,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:00,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4763 4698 [WARNING|trainer.py:803] 2025-04-26 16:41:01,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4742 [WARNING|trainer.py:803] 2025-04-26 16:41:01,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:02,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4764 4699 [WARNING|trainer.py:803] 2025-04-26 16:41:02,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4743 [WARNING|trainer.py:803] 2025-04-26 16:41:03,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:03,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4765 4700 [WARNING|trainer.py:803] 2025-04-26 16:41:03,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4744 [WARNING|trainer.py:803] 2025-04-26 16:41:04,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:04,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4766 4701 [WARNING|trainer.py:803] 2025-04-26 16:41:04,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4745 [WARNING|trainer.py:803] 2025-04-26 16:41:05,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:05,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4767 4702 [WARNING|trainer.py:803] 2025-04-26 16:41:06,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4746 [WARNING|trainer.py:803] 2025-04-26 16:41:06,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:06,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4768 4703 [WARNING|trainer.py:803] 2025-04-26 16:41:07,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4747 [WARNING|trainer.py:803] 2025-04-26 16:41:07,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:08,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4769 4704 [WARNING|trainer.py:803] 2025-04-26 16:41:08,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4748 [WARNING|trainer.py:803] 2025-04-26 16:41:09,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:09,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4770 4705 [WARNING|trainer.py:803] 2025-04-26 16:41:09,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4749 [WARNING|trainer.py:803] 2025-04-26 16:41:10,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:10,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4771 4706 [WARNING|trainer.py:803] 2025-04-26 16:41:10,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4750 [WARNING|trainer.py:803] 2025-04-26 16:41:11,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:41:11,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4772 4707 [WARNING|trainer.py:803] 2025-04-26 16:41:12,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4751 [WARNING|trainer.py:803] 2025-04-26 16:41:12,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:12,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4773 4708 [WARNING|trainer.py:803] 2025-04-26 16:41:13,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4752 [WARNING|trainer.py:803] 2025-04-26 16:41:13,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:14,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4774 4709 [WARNING|trainer.py:803] 2025-04-26 16:41:14,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4753 [WARNING|trainer.py:803] 2025-04-26 16:41:15,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:15,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4775 4710 [WARNING|trainer.py:803] 2025-04-26 16:41:15,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4754 [WARNING|trainer.py:803] 2025-04-26 16:41:16,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:16,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4776 4711 [WARNING|trainer.py:803] 2025-04-26 16:41:16,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4755 [WARNING|trainer.py:803] 2025-04-26 16:41:17,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:17,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4777 4712 [WARNING|trainer.py:803] 2025-04-26 16:41:18,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4756 [WARNING|trainer.py:803] 2025-04-26 16:41:18,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:18,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4778 4713 [WARNING|trainer.py:803] 2025-04-26 16:41:19,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4757 [WARNING|trainer.py:803] 2025-04-26 16:41:19,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:20,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4779 4714 [WARNING|trainer.py:803] 2025-04-26 16:41:20,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4758 [WARNING|trainer.py:803] 2025-04-26 16:41:21,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:21,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4780 4715 [WARNING|trainer.py:803] 2025-04-26 16:41:21,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4759 [WARNING|trainer.py:803] 2025-04-26 16:41:22,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:22,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4781 4716 [WARNING|trainer.py:803] 2025-04-26 16:41:22,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4760 [WARNING|trainer.py:803] 2025-04-26 16:41:23,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:23,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4782 4717 [WARNING|trainer.py:803] 2025-04-26 16:41:24,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4761 [WARNING|trainer.py:803] 2025-04-26 16:41:24,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:24,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4783 4718 [WARNING|trainer.py:803] 2025-04-26 16:41:25,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4762 [WARNING|trainer.py:803] 2025-04-26 16:41:25,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:26,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4784 4719 [WARNING|trainer.py:803] 2025-04-26 16:41:26,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4763 [WARNING|trainer.py:803] 2025-04-26 16:41:26,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:27,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4785 4720 [WARNING|trainer.py:803] 2025-04-26 16:41:27,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4764 [WARNING|trainer.py:803] 2025-04-26 16:41:28,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:28,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4786 4721 [WARNING|trainer.py:803] 2025-04-26 16:41:28,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4765 [WARNING|trainer.py:803] 2025-04-26 16:41:29,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:29,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4787 4722 [WARNING|trainer.py:803] 2025-04-26 16:41:30,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4766 [WARNING|trainer.py:803] 2025-04-26 16:41:30,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:30,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4788 4723 [WARNING|trainer.py:803] 2025-04-26 16:41:31,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4767 [WARNING|trainer.py:803] 2025-04-26 16:41:31,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:32,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4789 4724 [WARNING|trainer.py:803] 2025-04-26 16:41:32,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4768 [WARNING|trainer.py:803] 2025-04-26 16:41:32,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:33,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4790 4725 [WARNING|trainer.py:803] 2025-04-26 16:41:33,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4769 [WARNING|trainer.py:803] 2025-04-26 16:41:34,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:34,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 4726 [WARNING|trainer.py:803] 2025-04-26 16:41:34,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4770 [WARNING|trainer.py:803] 2025-04-26 16:41:35,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:35,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4792 [WARNING|trainer.py:803] 2025-04-26 16:41:36,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4727 4771 [WARNING|trainer.py:803] 2025-04-26 16:41:36,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:36,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4793 4728 [WARNING|trainer.py:803] 2025-04-26 16:41:37,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4772 [WARNING|trainer.py:803] 2025-04-26 16:41:37,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:38,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4794 4729 [WARNING|trainer.py:803] 2025-04-26 16:41:38,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4773 [WARNING|trainer.py:803] 2025-04-26 16:41:38,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:39,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4795 4730 [WARNING|trainer.py:803] 2025-04-26 16:41:39,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4774 [WARNING|trainer.py:803] 2025-04-26 16:41:40,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:40,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4796 4731 [WARNING|trainer.py:803] 2025-04-26 16:41:40,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4775 [WARNING|trainer.py:803] 2025-04-26 16:41:41,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:41,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4797 4732 [WARNING|trainer.py:803] 2025-04-26 16:41:42,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4776 [WARNING|trainer.py:803] 2025-04-26 16:41:42,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:42,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4798 4733 [WARNING|trainer.py:803] 2025-04-26 16:41:43,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:43,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4777 [WARNING|trainer.py:803] 2025-04-26 16:41:44,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4799 4734 [WARNING|trainer.py:803] 2025-04-26 16:41:44,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:44,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4778 [WARNING|trainer.py:803] 2025-04-26 16:41:45,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4800 4735 [WARNING|trainer.py:803] 2025-04-26 16:41:45,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:46,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4779 [WARNING|trainer.py:803] 2025-04-26 16:41:46,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4801 4736 [WARNING|trainer.py:803] 2025-04-26 16:41:47,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:47,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4780 [WARNING|trainer.py:803] 2025-04-26 16:41:47,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4802 4737 [WARNING|trainer.py:803] 2025-04-26 16:41:48,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:48,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4781 [WARNING|trainer.py:803] 2025-04-26 16:41:48,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4803 4738 [WARNING|trainer.py:803] 2025-04-26 16:41:49,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:49,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4782 [WARNING|trainer.py:803] 2025-04-26 16:41:50,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4804 4739 [WARNING|trainer.py:803] 2025-04-26 16:41:50,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:50,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4783 [WARNING|trainer.py:803] 2025-04-26 16:41:51,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4805 4740 [WARNING|trainer.py:803] 2025-04-26 16:41:52,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:52,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:52,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4784 4806 4741 [WARNING|trainer.py:803] 2025-04-26 16:41:53,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:53,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:53,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4785 4807 4742 [WARNING|trainer.py:803] 2025-04-26 16:41:54,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:54,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:54,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4808 4786 4743 [WARNING|trainer.py:803] 2025-04-26 16:41:55,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:55,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:55,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4809 4787 4744 [WARNING|trainer.py:803] 2025-04-26 16:41:56,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:56,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:57,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4810 4788 4745 [WARNING|trainer.py:803] 2025-04-26 16:41:58,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:41:58,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4811 [WARNING|trainer.py:803] 2025-04-26 16:41:58,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4789 4746 [WARNING|trainer.py:803] 2025-04-26 16:41:59,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:41:59,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4812 [WARNING|trainer.py:803] 2025-04-26 16:41:59,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4790 4747 [WARNING|trainer.py:803] 2025-04-26 16:42:00,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:00,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4813 [WARNING|trainer.py:803] 2025-04-26 16:42:00,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 4748 [WARNING|trainer.py:803] 2025-04-26 16:42:01,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:01,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4814 [WARNING|trainer.py:803] 2025-04-26 16:42:01,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4792 4749 [WARNING|trainer.py:803] 2025-04-26 16:42:02,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:03,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:03,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4793 4815 4750 [WARNING|trainer.py:803] 2025-04-26 16:42:04,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:04,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:04,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4794 4816 4751 [WARNING|trainer.py:803] 2025-04-26 16:42:05,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:05,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:05,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4795 4817 4752 [WARNING|trainer.py:803] 2025-04-26 16:42:06,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:06,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:06,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4796 4818 4753 [WARNING|trainer.py:803] 2025-04-26 16:42:07,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:07,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:07,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4797 4819 4754 [WARNING|trainer.py:803] 2025-04-26 16:42:08,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:09,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:09,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4798 4820 4755 [WARNING|trainer.py:803] 2025-04-26 16:42:10,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:10,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:10,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4799 4821 4756 [WARNING|trainer.py:803] 2025-04-26 16:42:11,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:11,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:11,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4800 4822 4757 [WARNING|trainer.py:803] 2025-04-26 16:42:12,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:12,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:12,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4801 4823 4758 [WARNING|trainer.py:803] 2025-04-26 16:42:13,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:13,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:14,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4802 4824 4759 [WARNING|trainer.py:803] 2025-04-26 16:42:14,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:15,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:15,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4803 4825 4760 [WARNING|trainer.py:803] 2025-04-26 16:42:16,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:16,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:16,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4804 4826 4761 [WARNING|trainer.py:803] 2025-04-26 16:42:17,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:17,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:17,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4805 4827 4762 [WARNING|trainer.py:803] 2025-04-26 16:42:18,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:18,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:18,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4806 4828 4763 [WARNING|trainer.py:803] 2025-04-26 16:42:19,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:19,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:20,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4807 4829 4764 [WARNING|trainer.py:803] 2025-04-26 16:42:20,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:21,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:21,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4808 4830 4765 [WARNING|trainer.py:803] 2025-04-26 16:42:22,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:22,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4809 [WARNING|trainer.py:803] 2025-04-26 16:42:22,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4831 4766 [WARNING|trainer.py:803] 2025-04-26 16:42:23,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:23,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4810 [WARNING|trainer.py:803] 2025-04-26 16:42:23,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4832 4767 [WARNING|trainer.py:803] 2025-04-26 16:42:24,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:24,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4811 [WARNING|trainer.py:803] 2025-04-26 16:42:24,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4833 4768 [WARNING|trainer.py:803] 2025-04-26 16:42:25,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:25,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4812 [WARNING|trainer.py:803] 2025-04-26 16:42:26,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4834 4769 [WARNING|trainer.py:803] 2025-04-26 16:42:26,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:27,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4813 [WARNING|trainer.py:803] 2025-04-26 16:42:27,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4835 4770 [WARNING|trainer.py:803] 2025-04-26 16:42:28,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:28,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4814 [WARNING|trainer.py:803] 2025-04-26 16:42:28,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4836 4771 [WARNING|trainer.py:803] 2025-04-26 16:42:29,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:29,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4815 [WARNING|trainer.py:803] 2025-04-26 16:42:29,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4837 4772 [WARNING|trainer.py:803] 2025-04-26 16:42:30,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:30,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4816 [WARNING|trainer.py:803] 2025-04-26 16:42:30,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4838 4773 [WARNING|trainer.py:803] 2025-04-26 16:42:31,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:31,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4817 [WARNING|trainer.py:803] 2025-04-26 16:42:32,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4839 4774 [WARNING|trainer.py:803] 2025-04-26 16:42:32,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:32,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4818 [WARNING|trainer.py:803] 2025-04-26 16:42:33,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4840 4775 [WARNING|trainer.py:803] 2025-04-26 16:42:33,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:34,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4819 [WARNING|trainer.py:803] 2025-04-26 16:42:34,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4841 4776 [WARNING|trainer.py:803] 2025-04-26 16:42:35,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:35,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4820 [WARNING|trainer.py:803] 2025-04-26 16:42:35,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4842 4777 [WARNING|trainer.py:803] 2025-04-26 16:42:36,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:36,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4821 [WARNING|trainer.py:803] 2025-04-26 16:42:36,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4843 4778 [WARNING|trainer.py:803] 2025-04-26 16:42:37,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:37,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4822 [WARNING|trainer.py:803] 2025-04-26 16:42:37,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4844 4779 [WARNING|trainer.py:803] 2025-04-26 16:42:38,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:38,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4823 [WARNING|trainer.py:803] 2025-04-26 16:42:39,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4845 4780 [WARNING|trainer.py:803] 2025-04-26 16:42:39,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:40,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4824 [WARNING|trainer.py:803] 2025-04-26 16:42:40,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4846 4781 [WARNING|trainer.py:803] 2025-04-26 16:42:41,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:41,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4825 [WARNING|trainer.py:803] 2025-04-26 16:42:41,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4847 4782 [WARNING|trainer.py:803] 2025-04-26 16:42:42,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:42,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4826 [WARNING|trainer.py:803] 2025-04-26 16:42:42,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4848 4783 [WARNING|trainer.py:803] 2025-04-26 16:42:43,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:43,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4827 [WARNING|trainer.py:803] 2025-04-26 16:42:43,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4849 4784 [WARNING|trainer.py:803] 2025-04-26 16:42:44,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:44,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4828 [WARNING|trainer.py:803] 2025-04-26 16:42:45,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4850 4785 [WARNING|trainer.py:803] 2025-04-26 16:42:45,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:46,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4829 [WARNING|trainer.py:803] 2025-04-26 16:42:46,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4851 4786 [WARNING|trainer.py:803] 2025-04-26 16:42:47,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:47,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4830 4852 [WARNING|trainer.py:803] 2025-04-26 16:42:47,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4787 [WARNING|trainer.py:803] 2025-04-26 16:42:48,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:48,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4831 4853 [WARNING|trainer.py:803] 2025-04-26 16:42:48,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4788 [WARNING|trainer.py:803] 2025-04-26 16:42:49,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:49,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4832 4854 [WARNING|trainer.py:803] 2025-04-26 16:42:49,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4789 [WARNING|trainer.py:803] 2025-04-26 16:42:50,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:50,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4833 4855 [WARNING|trainer.py:803] 2025-04-26 16:42:51,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4790 [WARNING|trainer.py:803] 2025-04-26 16:42:51,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:51,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4834 4856 [WARNING|trainer.py:803] 2025-04-26 16:42:52,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 [WARNING|trainer.py:803] 2025-04-26 16:42:52,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:53,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4835 4857 [WARNING|trainer.py:803] 2025-04-26 16:42:53,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4792 [WARNING|trainer.py:803] 2025-04-26 16:42:54,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:54,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4836 4858 [WARNING|trainer.py:803] 2025-04-26 16:42:54,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4793 [WARNING|trainer.py:803] 2025-04-26 16:42:55,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:42:55,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4837 4859 [WARNING|trainer.py:803] 2025-04-26 16:42:55,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4794 [WARNING|trainer.py:803] 2025-04-26 16:42:56,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:56,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4838 4860 [WARNING|trainer.py:803] 2025-04-26 16:42:57,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4795 [WARNING|trainer.py:803] 2025-04-26 16:42:57,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:42:57,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4839 4861 [WARNING|trainer.py:803] 2025-04-26 16:42:58,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4796 [WARNING|trainer.py:803] 2025-04-26 16:42:58,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:42:59,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4840 4862 [WARNING|trainer.py:803] 2025-04-26 16:42:59,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4797 [WARNING|trainer.py:803] 2025-04-26 16:43:00,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:00,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4841 4863 [WARNING|trainer.py:803] 2025-04-26 16:43:00,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4798 [WARNING|trainer.py:803] 2025-04-26 16:43:01,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:01,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4842 4864 [WARNING|trainer.py:803] 2025-04-26 16:43:01,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4799 [WARNING|trainer.py:803] 2025-04-26 16:43:02,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:02,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4843 4865 [WARNING|trainer.py:803] 2025-04-26 16:43:03,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4800 [WARNING|trainer.py:803] 2025-04-26 16:43:03,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:03,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4844 4866 [WARNING|trainer.py:803] 2025-04-26 16:43:04,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4801 [WARNING|trainer.py:803] 2025-04-26 16:43:04,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:04,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4845 4867 [WARNING|trainer.py:803] 2025-04-26 16:43:05,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4802 [WARNING|trainer.py:803] 2025-04-26 16:43:06,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:06,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4846 4868 [WARNING|trainer.py:803] 2025-04-26 16:43:06,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4803 [WARNING|trainer.py:803] 2025-04-26 16:43:07,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:07,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4847 4869 [WARNING|trainer.py:803] 2025-04-26 16:43:07,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4804 [WARNING|trainer.py:803] 2025-04-26 16:43:08,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:08,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4848 4870 [WARNING|trainer.py:803] 2025-04-26 16:43:08,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4805 [WARNING|trainer.py:803] 2025-04-26 16:43:09,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:09,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4871 4849 [WARNING|trainer.py:803] 2025-04-26 16:43:10,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4806 [WARNING|trainer.py:803] 2025-04-26 16:43:10,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:10,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4872 4850 [WARNING|trainer.py:803] 2025-04-26 16:43:11,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4807 [WARNING|trainer.py:803] 2025-04-26 16:43:12,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:12,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4873 [WARNING|trainer.py:803] 2025-04-26 16:43:12,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4851 4808 [WARNING|trainer.py:803] 2025-04-26 16:43:13,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:13,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4874 [WARNING|trainer.py:803] 2025-04-26 16:43:13,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4852 4809 [WARNING|trainer.py:803] 2025-04-26 16:43:14,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:14,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:14,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4875 4853 4810 [WARNING|trainer.py:803] 2025-04-26 16:43:15,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:15,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:16,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4876 4854 4811 [WARNING|trainer.py:803] 2025-04-26 16:43:16,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:16,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:17,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4877 4855 4812 [WARNING|trainer.py:803] 2025-04-26 16:43:18,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:18,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:18,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4878 4856 4813 [WARNING|trainer.py:803] 2025-04-26 16:43:19,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:19,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:19,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4879 4857 4814 [WARNING|trainer.py:803] 2025-04-26 16:43:20,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:20,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:20,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4880 4858 4815 [WARNING|trainer.py:803] 2025-04-26 16:43:21,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:21,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:21,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4881 4859 4816 [WARNING|trainer.py:803] 2025-04-26 16:43:22,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:22,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:23,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4882 4860 4817 [WARNING|trainer.py:803] 2025-04-26 16:43:24,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:24,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:24,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4883 4861 4818 [WARNING|trainer.py:803] 2025-04-26 16:43:25,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:25,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:25,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4884 4862 4819 [WARNING|trainer.py:803] 2025-04-26 16:43:26,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:26,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:26,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4885 4863 4820 [WARNING|trainer.py:803] 2025-04-26 16:43:27,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:27,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:43:27,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4886 4864 4821 [WARNING|trainer.py:803] 2025-04-26 16:43:28,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:28,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:29,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4887 4865 4822 [WARNING|trainer.py:803] 2025-04-26 16:43:29,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:30,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:30,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4888 4866 4823 [WARNING|trainer.py:803] 2025-04-26 16:43:31,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:43:31,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:31,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4889 4867 4824 [WARNING|trainer.py:803] 2025-04-26 16:43:32,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:32,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:43:32,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4890 4868 4825 [WARNING|trainer.py:803] 2025-04-26 16:43:33,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:33,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:33,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4891 4869 4826 [WARNING|trainer.py:803] 2025-04-26 16:43:34,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:34,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:35,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4892 4870 4827 [WARNING|trainer.py:803] 2025-04-26 16:43:35,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:35,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:36,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4871 4893 4828 [WARNING|trainer.py:803] 2025-04-26 16:43:37,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:37,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:37,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4872 4894 4829 [WARNING|trainer.py:803] 2025-04-26 16:43:38,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:38,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:38,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4873 4895 4830 [WARNING|trainer.py:803] 2025-04-26 16:43:39,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:39,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:39,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4874 4896 4831 [WARNING|trainer.py:803] 2025-04-26 16:43:40,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:40,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:43:40,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4875 4897 4832 [WARNING|trainer.py:803] 2025-04-26 16:43:41,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:41,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:43:42,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4876 4898 4833 [WARNING|trainer.py:803] 2025-04-26 16:43:43,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:43,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:43,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4877 4899 4834 [WARNING|trainer.py:803] 2025-04-26 16:43:44,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:44,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:44,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4878 4900 4835 [WARNING|trainer.py:803] 2025-04-26 16:43:45,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:45,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:43:45,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4879 4901 4836 [WARNING|trainer.py:803] 2025-04-26 16:43:46,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:46,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:43:46,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4880 4902 4837 [WARNING|trainer.py:803] 2025-04-26 16:43:47,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:47,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:48,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4881 4903 4838 [WARNING|trainer.py:803] 2025-04-26 16:43:49,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:49,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:49,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4882 4904 4839 [WARNING|trainer.py:803] 2025-04-26 16:43:50,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:50,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:50,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4883 4905 4840 [WARNING|trainer.py:803] 2025-04-26 16:43:51,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:51,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:51,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4884 4906 4841 [WARNING|trainer.py:803] 2025-04-26 16:43:52,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:52,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:52,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4885 4907 4842 [WARNING|trainer.py:803] 2025-04-26 16:43:53,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:53,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:54,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4886 4908 4843 [WARNING|trainer.py:803] 2025-04-26 16:43:54,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:55,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:55,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4887 4909 4844 [WARNING|trainer.py:803] 2025-04-26 16:43:56,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:43:56,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:43:56,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4888 4910 4845 [WARNING|trainer.py:803] 2025-04-26 16:43:57,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:43:57,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:57,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4889 4911 4846 [WARNING|trainer.py:803] 2025-04-26 16:43:58,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:58,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 16:43:58,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4890 4912 4847 [WARNING|trainer.py:803] 2025-04-26 16:43:59,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:43:59,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:00,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4891 4913 4848 [WARNING|trainer.py:803] 2025-04-26 16:44:00,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:01,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:01,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4892 4914 4849 [WARNING|trainer.py:803] 2025-04-26 16:44:02,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:02,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:02,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4893 4915 4850 [WARNING|trainer.py:803] 2025-04-26 16:44:03,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:03,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:03,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4894 4916 4851 [WARNING|trainer.py:803] 2025-04-26 16:44:04,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:04,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:04,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4895 4917 4852 [WARNING|trainer.py:803] 2025-04-26 16:44:05,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:06,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:06,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4896 4853 4918 [WARNING|trainer.py:803] 2025-04-26 16:44:06,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:44:07,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4897 [WARNING|trainer.py:803] 2025-04-26 16:44:07,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4854 4919 [WARNING|trainer.py:803] 2025-04-26 16:44:08,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:44:08,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4898 [WARNING|trainer.py:803] 2025-04-26 16:44:08,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4855 4920 [WARNING|trainer.py:803] 2025-04-26 16:44:09,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4899 [WARNING|trainer.py:803] 2025-04-26 16:44:09,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:09,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4856 4921 [WARNING|trainer.py:803] 2025-04-26 16:44:10,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4900 [WARNING|trainer.py:803] 2025-04-26 16:44:10,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:10,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4857 4922 [WARNING|trainer.py:803] 2025-04-26 16:44:11,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4901 [WARNING|trainer.py:803] 2025-04-26 16:44:11,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:12,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4858 4923 [WARNING|trainer.py:803] 2025-04-26 16:44:12,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:44:13,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4902 [WARNING|trainer.py:803] 2025-04-26 16:44:13,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4859 4924 [WARNING|trainer.py:803] 2025-04-26 16:44:13,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:14,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4903 [WARNING|trainer.py:803] 2025-04-26 16:44:14,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4860 4925 [WARNING|trainer.py:803] 2025-04-26 16:44:15,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:15,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4904 [WARNING|trainer.py:803] 2025-04-26 16:44:15,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4861 4926 [WARNING|trainer.py:803] 2025-04-26 16:44:16,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:16,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4905 [WARNING|trainer.py:803] 2025-04-26 16:44:16,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4862 4927 [WARNING|trainer.py:803] 2025-04-26 16:44:17,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:17,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4906 [WARNING|trainer.py:803] 2025-04-26 16:44:18,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4863 4928 [WARNING|trainer.py:803] 2025-04-26 16:44:18,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:19,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4907 [WARNING|trainer.py:803] 2025-04-26 16:44:19,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4864 4929 [WARNING|trainer.py:803] 2025-04-26 16:44:19,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:20,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4908 [WARNING|trainer.py:803] 2025-04-26 16:44:20,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4865 4930 [WARNING|trainer.py:803] 2025-04-26 16:44:21,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:21,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4909 [WARNING|trainer.py:803] 2025-04-26 16:44:21,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4866 4931 [WARNING|trainer.py:803] 2025-04-26 16:44:22,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:44:22,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4910 [WARNING|trainer.py:803] 2025-04-26 16:44:22,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4867 4932 [WARNING|trainer.py:803] 2025-04-26 16:44:23,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:23,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4911 [WARNING|trainer.py:803] 2025-04-26 16:44:24,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4868 4933 [WARNING|trainer.py:803] 2025-04-26 16:44:24,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 16:44:24,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4912 [WARNING|trainer.py:803] 2025-04-26 16:44:25,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4869 4934 [WARNING|trainer.py:803] 2025-04-26 16:44:25,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:26,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4913 [WARNING|trainer.py:803] 2025-04-26 16:44:26,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4870 4935 [WARNING|trainer.py:803] 2025-04-26 16:44:27,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:27,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4914 [WARNING|trainer.py:803] 2025-04-26 16:44:27,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4871 4936 [WARNING|trainer.py:803] 2025-04-26 16:44:28,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:28,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4915 [WARNING|trainer.py:803] 2025-04-26 16:44:28,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4872 4937 [WARNING|trainer.py:803] 2025-04-26 16:44:29,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:29,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4916 [WARNING|trainer.py:803] 2025-04-26 16:44:29,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4873 4938 [WARNING|trainer.py:803] 2025-04-26 16:44:30,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:30,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4917 [WARNING|trainer.py:803] 2025-04-26 16:44:31,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4874 4939 [WARNING|trainer.py:803] 2025-04-26 16:44:31,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:32,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4918 [WARNING|trainer.py:803] 2025-04-26 16:44:32,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4875 4940 [WARNING|trainer.py:803] 2025-04-26 16:44:33,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:33,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4919 [WARNING|trainer.py:803] 2025-04-26 16:44:33,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4876 4941 [WARNING|trainer.py:803] 2025-04-26 16:44:34,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:34,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4920 [WARNING|trainer.py:803] 2025-04-26 16:44:34,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4877 4942 [WARNING|trainer.py:803] 2025-04-26 16:44:35,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:35,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4921 [WARNING|trainer.py:803] 2025-04-26 16:44:35,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4878 4943 [WARNING|trainer.py:803] 2025-04-26 16:44:36,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:36,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:36,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4922 4879 4944 [WARNING|trainer.py:803] 2025-04-26 16:44:37,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:37,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:38,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4923 4880 4945 [WARNING|trainer.py:803] 2025-04-26 16:44:39,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:39,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:39,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4924 4881 4946 [WARNING|trainer.py:803] 2025-04-26 16:44:40,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:40,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:40,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4925 4882 4947 [WARNING|trainer.py:803] 2025-04-26 16:44:41,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:41,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:41,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4926 4883 4948 [WARNING|trainer.py:803] 2025-04-26 16:44:42,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:42,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:42,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4927 4884 4949 [WARNING|trainer.py:803] 2025-04-26 16:44:43,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:43,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:44,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4928 4885 4950 [WARNING|trainer.py:803] 2025-04-26 16:44:44,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:45,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:45,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4929 4886 4951 [WARNING|trainer.py:803] 2025-04-26 16:44:46,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:46,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:46,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4930 4887 4952 [WARNING|trainer.py:803] 2025-04-26 16:44:47,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:47,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:47,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4931 4888 4953 [WARNING|trainer.py:803] 2025-04-26 16:44:48,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:48,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:44:48,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4932 4889 4954 [WARNING|trainer.py:803] 2025-04-26 16:44:49,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:49,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:49,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4933 4890 4955 [WARNING|trainer.py:803] 2025-04-26 16:44:50,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:51,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:51,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4934 4891 4956 [WARNING|trainer.py:803] 2025-04-26 16:44:52,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:52,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:52,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4935 4892 4957 [WARNING|trainer.py:803] 2025-04-26 16:44:53,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:53,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:53,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4936 4893 4958 [WARNING|trainer.py:803] 2025-04-26 16:44:54,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:54,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:54,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4937 4894 4959 [WARNING|trainer.py:803] 2025-04-26 16:44:55,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:55,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:44:55,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4938 4895 4960 [WARNING|trainer.py:803] 2025-04-26 16:44:56,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:44:57,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:57,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4939 4896 4961 [WARNING|trainer.py:803] 2025-04-26 16:44:58,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:44:58,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:44:58,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4940 4897 4962 [WARNING|trainer.py:803] 2025-04-26 16:44:59,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:44:59,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:44:59,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4941 4898 4963 [WARNING|trainer.py:803] 2025-04-26 16:45:00,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:45:00,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:00,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4942 4899 4964 [WARNING|trainer.py:803] 2025-04-26 16:45:01,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:01,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:01,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4943 4900 4965 [WARNING|trainer.py:803] 2025-04-26 16:45:03,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:03,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:45:03,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4944 4901 4966 [WARNING|trainer.py:803] 2025-04-26 16:45:04,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:04,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:04,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4945 4967 4902 [WARNING|trainer.py:803] 2025-04-26 16:45:05,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:05,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:05,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4946 4968 4903 [WARNING|trainer.py:803] 2025-04-26 16:45:06,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:06,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:45:06,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4947 4969 4904 [WARNING|trainer.py:803] 2025-04-26 16:45:07,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:45:07,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:07,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4970 4948 4905 [WARNING|trainer.py:803] 2025-04-26 16:45:09,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:09,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:09,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4971 4949 4906 [WARNING|trainer.py:803] 2025-04-26 16:45:10,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:10,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:10,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4972 4950 4907 [WARNING|trainer.py:803] 2025-04-26 16:45:11,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:11,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:11,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4973 4951 4908 [WARNING|trainer.py:803] 2025-04-26 16:45:12,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:12,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:12,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4974 4952 4909 [WARNING|trainer.py:803] 2025-04-26 16:45:13,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:13,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:13,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4975 4953 4910 [WARNING|trainer.py:803] 2025-04-26 16:45:14,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:15,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:15,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4976 4954 4911 [WARNING|trainer.py:803] 2025-04-26 16:45:16,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:16,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:16,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4977 4955 4912 [WARNING|trainer.py:803] 2025-04-26 16:45:17,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:17,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:17,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4978 4956 4913 [WARNING|trainer.py:803] 2025-04-26 16:45:18,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:18,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:18,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4979 4957 4914 [WARNING|trainer.py:803] 2025-04-26 16:45:19,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:19,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:19,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4980 4958 4915 [WARNING|trainer.py:803] 2025-04-26 16:45:20,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:21,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:21,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4981 4959 4916 [WARNING|trainer.py:803] 2025-04-26 16:45:21,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:22,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:22,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4982 4960 4917 [WARNING|trainer.py:803] 2025-04-26 16:45:23,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:23,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4983 [WARNING|trainer.py:803] 2025-04-26 16:45:23,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4961 4918 [WARNING|trainer.py:803] 2025-04-26 16:45:24,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4984 [WARNING|trainer.py:803] 2025-04-26 16:45:24,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:24,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4962 4919 [WARNING|trainer.py:803] 2025-04-26 16:45:25,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4985 [WARNING|trainer.py:803] 2025-04-26 16:45:25,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:45:25,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4963 4920 [WARNING|trainer.py:803] 2025-04-26 16:45:26,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4986 [WARNING|trainer.py:803] 2025-04-26 16:45:27,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:27,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4964 4921 [WARNING|trainer.py:803] 2025-04-26 16:45:27,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4987 [WARNING|trainer.py:803] 2025-04-26 16:45:28,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:28,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4965 4922 [WARNING|trainer.py:803] 2025-04-26 16:45:29,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4988 [WARNING|trainer.py:803] 2025-04-26 16:45:29,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:29,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4966 4923 [WARNING|trainer.py:803] 2025-04-26 16:45:30,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4989 [WARNING|trainer.py:803] 2025-04-26 16:45:30,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:30,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4967 4924 [WARNING|trainer.py:803] 2025-04-26 16:45:31,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4990 [WARNING|trainer.py:803] 2025-04-26 16:45:31,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:31,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4968 4925 [WARNING|trainer.py:803] 2025-04-26 16:45:32,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4991 [WARNING|trainer.py:803] 2025-04-26 16:45:33,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:45:33,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4969 4926 [WARNING|trainer.py:803] 2025-04-26 16:45:33,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4992 [WARNING|trainer.py:803] 2025-04-26 16:45:34,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:45:34,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4970 4927 [WARNING|trainer.py:803] 2025-04-26 16:45:35,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4993 [WARNING|trainer.py:803] 2025-04-26 16:45:35,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:35,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4971 4928 [WARNING|trainer.py:803] 2025-04-26 16:45:36,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4994 [WARNING|trainer.py:803] 2025-04-26 16:45:36,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:36,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4972 4929 [WARNING|trainer.py:803] 2025-04-26 16:45:37,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4995 [WARNING|trainer.py:803] 2025-04-26 16:45:37,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:37,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4930 4973 [WARNING|trainer.py:803] 2025-04-26 16:45:38,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4996 [WARNING|trainer.py:803] 2025-04-26 16:45:39,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:39,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4974 4931 [WARNING|trainer.py:803] 2025-04-26 16:45:39,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4997 [WARNING|trainer.py:803] 2025-04-26 16:45:40,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:40,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4975 4932 [WARNING|trainer.py:803] 2025-04-26 16:45:40,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4998 [WARNING|trainer.py:803] 2025-04-26 16:45:41,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:41,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4976 4933 [WARNING|trainer.py:803] 2025-04-26 16:45:42,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4999 [WARNING|trainer.py:803] 2025-04-26 16:45:42,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:42,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4977 4934 [WARNING|trainer.py:803] 2025-04-26 16:45:43,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5000 [WARNING|trainer.py:803] 2025-04-26 16:45:43,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:43,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4935 4978 [WARNING|trainer.py:803] 2025-04-26 16:45:44,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5001 [WARNING|trainer.py:803] 2025-04-26 16:45:45,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:45,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4936 4979 [WARNING|trainer.py:803] 2025-04-26 16:45:45,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5002 [WARNING|trainer.py:803] 2025-04-26 16:45:46,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:46,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4937 4980 [WARNING|trainer.py:803] 2025-04-26 16:45:46,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5003 [WARNING|trainer.py:803] 2025-04-26 16:45:47,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:47,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4938 4981 [WARNING|trainer.py:803] 2025-04-26 16:45:48,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5004 [WARNING|trainer.py:803] 2025-04-26 16:45:48,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:45:48,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4939 4982 [WARNING|trainer.py:803] 2025-04-26 16:45:49,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5005 [WARNING|trainer.py:803] 2025-04-26 16:45:49,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:49,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4940 4983 [WARNING|trainer.py:803] 2025-04-26 16:45:50,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5006 [WARNING|trainer.py:803] 2025-04-26 16:45:51,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:45:51,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4941 4984 [WARNING|trainer.py:803] 2025-04-26 16:45:51,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5007 [WARNING|trainer.py:803] 2025-04-26 16:45:52,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:45:52,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4942 4985 [WARNING|trainer.py:803] 2025-04-26 16:45:52,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5008 [WARNING|trainer.py:803] 2025-04-26 16:45:53,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:53,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4943 4986 [WARNING|trainer.py:803] 2025-04-26 16:45:53,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5009 [WARNING|trainer.py:803] 2025-04-26 16:45:54,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:54,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4944 4987 [WARNING|trainer.py:803] 2025-04-26 16:45:55,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5010 [WARNING|trainer.py:803] 2025-04-26 16:45:55,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:55,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4945 4988 [WARNING|trainer.py:803] 2025-04-26 16:45:56,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5011 [WARNING|trainer.py:803] 2025-04-26 16:45:56,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:45:57,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4946 4989 [WARNING|trainer.py:803] 2025-04-26 16:45:57,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5012 [WARNING|trainer.py:803] 2025-04-26 16:45:58,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:58,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4947 4990 [WARNING|trainer.py:803] 2025-04-26 16:45:58,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5013 [WARNING|trainer.py:803] 2025-04-26 16:45:59,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:45:59,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4948 4991 [WARNING|trainer.py:803] 2025-04-26 16:45:59,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5014 [WARNING|trainer.py:803] 2025-04-26 16:46:00,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:00,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4949 [WARNING|trainer.py:803] 2025-04-26 16:46:01,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4992 5015 [WARNING|trainer.py:803] 2025-04-26 16:46:01,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:01,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4950 [WARNING|trainer.py:803] 2025-04-26 16:46:02,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4993 5016 [WARNING|trainer.py:803] 2025-04-26 16:46:02,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:03,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4951 [WARNING|trainer.py:803] 2025-04-26 16:46:03,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4994 5017 [WARNING|trainer.py:803] 2025-04-26 16:46:04,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:04,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4952 [WARNING|trainer.py:803] 2025-04-26 16:46:04,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4995 5018 [WARNING|trainer.py:803] 2025-04-26 16:46:05,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:05,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4953 [WARNING|trainer.py:803] 2025-04-26 16:46:05,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4996 5019 [WARNING|trainer.py:803] 2025-04-26 16:46:06,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:06,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4954 [WARNING|trainer.py:803] 2025-04-26 16:46:07,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4997 5020 [WARNING|trainer.py:803] 2025-04-26 16:46:07,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:07,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4955 [WARNING|trainer.py:803] 2025-04-26 16:46:08,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4998 5021 [WARNING|trainer.py:803] 2025-04-26 16:46:08,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:09,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4956 [WARNING|trainer.py:803] 2025-04-26 16:46:09,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4999 5022 [WARNING|trainer.py:803] 2025-04-26 16:46:09,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:10,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4957 [WARNING|trainer.py:803] 2025-04-26 16:46:10,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5000 5023 [WARNING|trainer.py:803] 2025-04-26 16:46:11,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:11,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4958 [WARNING|trainer.py:803] 2025-04-26 16:46:11,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5001 5024 [WARNING|trainer.py:803] 2025-04-26 16:46:12,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:12,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4959 [WARNING|trainer.py:803] 2025-04-26 16:46:13,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5002 5025 [WARNING|trainer.py:803] 2025-04-26 16:46:13,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:13,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4960 [WARNING|trainer.py:803] 2025-04-26 16:46:14,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5003 5026 [WARNING|trainer.py:803] 2025-04-26 16:46:14,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:15,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4961 [WARNING|trainer.py:803] 2025-04-26 16:46:15,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5004 5027 [WARNING|trainer.py:803] 2025-04-26 16:46:15,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4962 [WARNING|trainer.py:803] 2025-04-26 16:46:16,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:16,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5005 5028 [WARNING|trainer.py:803] 2025-04-26 16:46:17,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4963 [WARNING|trainer.py:803] 2025-04-26 16:46:17,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:17,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5006 5029 [WARNING|trainer.py:803] 2025-04-26 16:46:18,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:18,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4964 [WARNING|trainer.py:803] 2025-04-26 16:46:18,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5007 5030 [WARNING|trainer.py:803] 2025-04-26 16:46:19,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:19,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4965 [WARNING|trainer.py:803] 2025-04-26 16:46:20,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5008 5031 [WARNING|trainer.py:803] 2025-04-26 16:46:20,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:21,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4966 [WARNING|trainer.py:803] 2025-04-26 16:46:21,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5009 5032 [WARNING|trainer.py:803] 2025-04-26 16:46:22,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:22,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4967 [WARNING|trainer.py:803] 2025-04-26 16:46:22,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5010 5033 [WARNING|trainer.py:803] 2025-04-26 16:46:23,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:23,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4968 [WARNING|trainer.py:803] 2025-04-26 16:46:23,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5011 5034 [WARNING|trainer.py:803] 2025-04-26 16:46:24,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:46:24,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4969 [WARNING|trainer.py:803] 2025-04-26 16:46:24,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5012 5035 [WARNING|trainer.py:803] 2025-04-26 16:46:25,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:46:25,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4970 [WARNING|trainer.py:803] 2025-04-26 16:46:26,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5013 5036 [WARNING|trainer.py:803] 2025-04-26 16:46:26,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:26,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4971 [WARNING|trainer.py:803] 2025-04-26 16:46:27,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5014 5037 [WARNING|trainer.py:803] 2025-04-26 16:46:27,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:28,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4972 [WARNING|trainer.py:803] 2025-04-26 16:46:28,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5015 5038 [WARNING|trainer.py:803] 2025-04-26 16:46:29,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:29,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4973 [WARNING|trainer.py:803] 2025-04-26 16:46:29,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5016 5039 [WARNING|trainer.py:803] 2025-04-26 16:46:30,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:30,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4974 [WARNING|trainer.py:803] 2025-04-26 16:46:30,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5017 5040 [WARNING|trainer.py:803] 2025-04-26 16:46:31,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:31,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4975 [WARNING|trainer.py:803] 2025-04-26 16:46:32,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5018 5041 [WARNING|trainer.py:803] 2025-04-26 16:46:32,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:32,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4976 [WARNING|trainer.py:803] 2025-04-26 16:46:33,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5019 5042 [WARNING|trainer.py:803] 2025-04-26 16:46:33,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:34,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4977 [WARNING|trainer.py:803] 2025-04-26 16:46:34,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5020 5043 [WARNING|trainer.py:803] 2025-04-26 16:46:35,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:35,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4978 [WARNING|trainer.py:803] 2025-04-26 16:46:35,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5021 5044 [WARNING|trainer.py:803] 2025-04-26 16:46:36,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:36,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4979 [WARNING|trainer.py:803] 2025-04-26 16:46:36,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5022 5045 [WARNING|trainer.py:803] 2025-04-26 16:46:37,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:37,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4980 [WARNING|trainer.py:803] 2025-04-26 16:46:37,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5023 5046 [WARNING|trainer.py:803] 2025-04-26 16:46:38,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:38,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4981 [WARNING|trainer.py:803] 2025-04-26 16:46:39,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5024 5047 [WARNING|trainer.py:803] 2025-04-26 16:46:39,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:40,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4982 [WARNING|trainer.py:803] 2025-04-26 16:46:40,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5025 5048 [WARNING|trainer.py:803] 2025-04-26 16:46:40,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:41,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4983 [WARNING|trainer.py:803] 2025-04-26 16:46:41,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5026 5049 [WARNING|trainer.py:803] 2025-04-26 16:46:42,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:42,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4984 [WARNING|trainer.py:803] 2025-04-26 16:46:42,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5027 5050 [WARNING|trainer.py:803] 2025-04-26 16:46:43,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:43,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4985 [WARNING|trainer.py:803] 2025-04-26 16:46:43,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5028 5051 [WARNING|trainer.py:803] 2025-04-26 16:46:44,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:46:44,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4986 [WARNING|trainer.py:803] 2025-04-26 16:46:45,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5029 5052 [WARNING|trainer.py:803] 2025-04-26 16:46:45,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4987 [WARNING|trainer.py:803] 2025-04-26 16:46:45,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:46,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5030 5053 [WARNING|trainer.py:803] 2025-04-26 16:46:46,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4988 [WARNING|trainer.py:803] 2025-04-26 16:46:47,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:47,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5031 5054 [WARNING|trainer.py:803] 2025-04-26 16:46:47,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4989 [WARNING|trainer.py:803] 2025-04-26 16:46:48,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:48,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5032 5055 [WARNING|trainer.py:803] 2025-04-26 16:46:49,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4990 [WARNING|trainer.py:803] 2025-04-26 16:46:49,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:49,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5033 5056 [WARNING|trainer.py:803] 2025-04-26 16:46:50,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4991 [WARNING|trainer.py:803] 2025-04-26 16:46:50,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:51,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5034 5057 [WARNING|trainer.py:803] 2025-04-26 16:46:51,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4992 [WARNING|trainer.py:803] 2025-04-26 16:46:51,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:52,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5035 5058 [WARNING|trainer.py:803] 2025-04-26 16:46:52,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4993 [WARNING|trainer.py:803] 2025-04-26 16:46:53,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:53,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5036 5059 [WARNING|trainer.py:803] 2025-04-26 16:46:53,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4994 [WARNING|trainer.py:803] 2025-04-26 16:46:54,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:54,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5037 5060 [WARNING|trainer.py:803] 2025-04-26 16:46:55,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4995 [WARNING|trainer.py:803] 2025-04-26 16:46:55,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:55,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5038 5061 [WARNING|trainer.py:803] 2025-04-26 16:46:56,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4996 [WARNING|trainer.py:803] 2025-04-26 16:46:56,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:56,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5039 5062 [WARNING|trainer.py:803] 2025-04-26 16:46:57,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4997 [WARNING|trainer.py:803] 2025-04-26 16:46:57,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:46:58,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5040 5063 [WARNING|trainer.py:803] 2025-04-26 16:46:58,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4998 [WARNING|trainer.py:803] 2025-04-26 16:46:59,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:46:59,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5041 5064 [WARNING|trainer.py:803] 2025-04-26 16:46:59,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4999 [WARNING|trainer.py:803] 2025-04-26 16:47:00,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:00,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5042 5065 [WARNING|trainer.py:803] 2025-04-26 16:47:00,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5000 [WARNING|trainer.py:803] 2025-04-26 16:47:01,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:01,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5043 5066 [WARNING|trainer.py:803] 2025-04-26 16:47:02,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5001 [WARNING|trainer.py:803] 2025-04-26 16:47:02,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:02,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5044 5067 [WARNING|trainer.py:803] 2025-04-26 16:47:03,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5002 [WARNING|trainer.py:803] 2025-04-26 16:47:03,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:04,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5045 5068 [WARNING|trainer.py:803] 2025-04-26 16:47:04,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5003 [WARNING|trainer.py:803] 2025-04-26 16:47:04,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:05,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5046 [WARNING|trainer.py:803] 2025-04-26 16:47:05,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5069 5004 [WARNING|trainer.py:803] 2025-04-26 16:47:06,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:06,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5047 [WARNING|trainer.py:803] 2025-04-26 16:47:06,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5070 5005 [WARNING|trainer.py:803] 2025-04-26 16:47:07,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:07,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5048 [WARNING|trainer.py:803] 2025-04-26 16:47:08,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5071 5006 [WARNING|trainer.py:803] 2025-04-26 16:47:08,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:47:08,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5049 [WARNING|trainer.py:803] 2025-04-26 16:47:09,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5072 5007 [WARNING|trainer.py:803] 2025-04-26 16:47:09,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:10,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5050 [WARNING|trainer.py:803] 2025-04-26 16:47:10,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5073 5008 [WARNING|trainer.py:803] 2025-04-26 16:47:10,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5051 [WARNING|trainer.py:803] 2025-04-26 16:47:11,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:11,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5074 5009 [WARNING|trainer.py:803] 2025-04-26 16:47:12,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5052 [WARNING|trainer.py:803] 2025-04-26 16:47:12,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:12,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5075 5010 [WARNING|trainer.py:803] 2025-04-26 16:47:13,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5053 [WARNING|trainer.py:803] 2025-04-26 16:47:13,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:13,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5076 5011 [WARNING|trainer.py:803] 2025-04-26 16:47:14,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:14,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5054 [WARNING|trainer.py:803] 2025-04-26 16:47:15,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5077 5012 [WARNING|trainer.py:803] 2025-04-26 16:47:15,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5055 [WARNING|trainer.py:803] 2025-04-26 16:47:16,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:16,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5078 5013 [WARNING|trainer.py:803] 2025-04-26 16:47:16,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5056 [WARNING|trainer.py:803] 2025-04-26 16:47:17,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:17,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5079 5014 [WARNING|trainer.py:803] 2025-04-26 16:47:18,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5057 [WARNING|trainer.py:803] 2025-04-26 16:47:18,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:18,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5080 5015 [WARNING|trainer.py:803] 2025-04-26 16:47:19,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:19,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5058 [WARNING|trainer.py:803] 2025-04-26 16:47:19,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5081 5016 [WARNING|trainer.py:803] 2025-04-26 16:47:20,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:20,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5059 [WARNING|trainer.py:803] 2025-04-26 16:47:21,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5082 5017 [WARNING|trainer.py:803] 2025-04-26 16:47:21,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:21,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5060 [WARNING|trainer.py:803] 2025-04-26 16:47:22,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5083 5018 [WARNING|trainer.py:803] 2025-04-26 16:47:22,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:23,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5061 [WARNING|trainer.py:803] 2025-04-26 16:47:23,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5084 5019 [WARNING|trainer.py:803] 2025-04-26 16:47:24,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:24,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5062 [WARNING|trainer.py:803] 2025-04-26 16:47:24,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5085 5020 [WARNING|trainer.py:803] 2025-04-26 16:47:25,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:25,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5063 [WARNING|trainer.py:803] 2025-04-26 16:47:25,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5086 5021 [WARNING|trainer.py:803] 2025-04-26 16:47:26,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:26,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5064 [WARNING|trainer.py:803] 2025-04-26 16:47:26,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5087 5022 [WARNING|trainer.py:803] 2025-04-26 16:47:27,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:27,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5065 [WARNING|trainer.py:803] 2025-04-26 16:47:28,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5088 5023 [WARNING|trainer.py:803] 2025-04-26 16:47:28,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:47:29,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5066 [WARNING|trainer.py:803] 2025-04-26 16:47:29,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5089 5024 [WARNING|trainer.py:803] 2025-04-26 16:47:29,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:30,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5067 [WARNING|trainer.py:803] 2025-04-26 16:47:30,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5090 5025 [WARNING|trainer.py:803] 2025-04-26 16:47:31,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:31,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5068 [WARNING|trainer.py:803] 2025-04-26 16:47:31,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5091 5026 [WARNING|trainer.py:803] 2025-04-26 16:47:32,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:32,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5069 [WARNING|trainer.py:803] 2025-04-26 16:47:32,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5092 5027 [WARNING|trainer.py:803] 2025-04-26 16:47:33,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:33,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5070 [WARNING|trainer.py:803] 2025-04-26 16:47:34,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5093 5028 [WARNING|trainer.py:803] 2025-04-26 16:47:34,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:35,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5071 [WARNING|trainer.py:803] 2025-04-26 16:47:35,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5094 5029 [WARNING|trainer.py:803] 2025-04-26 16:47:35,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:47:36,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5072 [WARNING|trainer.py:803] 2025-04-26 16:47:36,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5095 5030 [WARNING|trainer.py:803] 2025-04-26 16:47:37,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:37,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5073 [WARNING|trainer.py:803] 2025-04-26 16:47:37,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5096 5031 [WARNING|trainer.py:803] 2025-04-26 16:47:38,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:38,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5074 [WARNING|trainer.py:803] 2025-04-26 16:47:38,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5097 5032 [WARNING|trainer.py:803] 2025-04-26 16:47:39,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5075 [WARNING|trainer.py:803] 2025-04-26 16:47:39,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:40,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5098 5033 [WARNING|trainer.py:803] 2025-04-26 16:47:40,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5076 [WARNING|trainer.py:803] 2025-04-26 16:47:41,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:41,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5099 5034 [WARNING|trainer.py:803] 2025-04-26 16:47:41,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5077 [WARNING|trainer.py:803] 2025-04-26 16:47:42,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:42,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5100 5035 [WARNING|trainer.py:803] 2025-04-26 16:47:42,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5078 [WARNING|trainer.py:803] 2025-04-26 16:47:43,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:43,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5101 5036 [WARNING|trainer.py:803] 2025-04-26 16:47:44,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5079 [WARNING|trainer.py:803] 2025-04-26 16:47:44,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:47:44,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5037 5102 [WARNING|trainer.py:803] 2025-04-26 16:47:45,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5080 [WARNING|trainer.py:803] 2025-04-26 16:47:45,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:46,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5038 [WARNING|trainer.py:803] 2025-04-26 16:47:46,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5103 5081 [WARNING|trainer.py:803] 2025-04-26 16:47:47,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5039 [WARNING|trainer.py:803] 2025-04-26 16:47:47,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:47,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5104 5082 [WARNING|trainer.py:803] 2025-04-26 16:47:48,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5040 [WARNING|trainer.py:803] 2025-04-26 16:47:48,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:47:48,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5083 5105 [WARNING|trainer.py:803] 2025-04-26 16:47:49,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5041 [WARNING|trainer.py:803] 2025-04-26 16:47:50,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:50,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5084 [WARNING|trainer.py:803] 2025-04-26 16:47:50,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5106 5042 [WARNING|trainer.py:803] 2025-04-26 16:47:51,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:51,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5085 [WARNING|trainer.py:803] 2025-04-26 16:47:51,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5107 5043 [WARNING|trainer.py:803] 2025-04-26 16:47:52,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5086 [WARNING|trainer.py:803] 2025-04-26 16:47:53,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:53,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5044 5108 [WARNING|trainer.py:803] 2025-04-26 16:47:53,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5087 [WARNING|trainer.py:803] 2025-04-26 16:47:54,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:54,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5045 5109 [WARNING|trainer.py:803] 2025-04-26 16:47:54,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5088 [WARNING|trainer.py:803] 2025-04-26 16:47:55,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:55,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5046 [WARNING|trainer.py:803] 2025-04-26 16:47:56,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5110 5089 [WARNING|trainer.py:803] 2025-04-26 16:47:56,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5047 [WARNING|trainer.py:803] 2025-04-26 16:47:57,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:47:57,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5111 5090 [WARNING|trainer.py:803] 2025-04-26 16:47:57,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5048 [WARNING|trainer.py:803] 2025-04-26 16:47:58,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:47:58,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5091 [WARNING|trainer.py:803] 2025-04-26 16:47:58,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5112 5049 [WARNING|trainer.py:803] 2025-04-26 16:47:59,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:47:59,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5092 [WARNING|trainer.py:803] 2025-04-26 16:48:00,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5113 5050 [WARNING|trainer.py:803] 2025-04-26 16:48:00,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:01,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5093 [WARNING|trainer.py:803] 2025-04-26 16:48:01,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5051 5114 [WARNING|trainer.py:803] 2025-04-26 16:48:02,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5094 [WARNING|trainer.py:803] 2025-04-26 16:48:02,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:02,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5052 5115 [WARNING|trainer.py:803] 2025-04-26 16:48:03,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5095 [WARNING|trainer.py:803] 2025-04-26 16:48:03,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:04,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5053 5116 [WARNING|trainer.py:803] 2025-04-26 16:48:04,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5096 [WARNING|trainer.py:803] 2025-04-26 16:48:04,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5054 [WARNING|trainer.py:803] 2025-04-26 16:48:05,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:48:05,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5117 5097 [WARNING|trainer.py:803] 2025-04-26 16:48:06,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5055 [WARNING|trainer.py:803] 2025-04-26 16:48:06,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:06,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5098 [WARNING|trainer.py:803] 2025-04-26 16:48:07,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5118 5056 [WARNING|trainer.py:803] 2025-04-26 16:48:08,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:08,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:08,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 5099 NoYes 5119 5057 [WARNING|trainer.py:803] 2025-04-26 16:48:09,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:09,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5100 [WARNING|trainer.py:803] 2025-04-26 16:48:09,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5058 5120 [WARNING|trainer.py:803] 2025-04-26 16:48:10,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:10,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:10,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5101 5059 5121 [WARNING|trainer.py:803] 2025-04-26 16:48:11,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:12,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:12,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5102 5060 5122 [WARNING|trainer.py:803] 2025-04-26 16:48:13,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:13,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5061 [WARNING|trainer.py:803] 2025-04-26 16:48:13,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5103 5123 [WARNING|trainer.py:803] 2025-04-26 16:48:14,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:14,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5062 [WARNING|trainer.py:803] 2025-04-26 16:48:15,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5104 [WARNING|trainer.py:803] 2025-04-26 16:48:15,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5124 5063 [WARNING|trainer.py:803] 2025-04-26 16:48:16,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:16,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5105 [WARNING|trainer.py:803] 2025-04-26 16:48:16,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5125 5064 [WARNING|trainer.py:803] 2025-04-26 16:48:17,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:17,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5106 [WARNING|trainer.py:803] 2025-04-26 16:48:18,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5065 5126 [WARNING|trainer.py:803] 2025-04-26 16:48:18,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:19,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:19,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5107 5066 5127 [WARNING|trainer.py:803] 2025-04-26 16:48:20,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:20,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:20,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5108 5067 5128 [WARNING|trainer.py:803] 2025-04-26 16:48:21,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:21,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:21,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5068 5109 5129 [WARNING|trainer.py:803] 2025-04-26 16:48:22,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:22,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5069 [WARNING|trainer.py:803] 2025-04-26 16:48:23,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5110 5130 [WARNING|trainer.py:803] 2025-04-26 16:48:24,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:24,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5070 [WARNING|trainer.py:803] 2025-04-26 16:48:24,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5111 [WARNING|trainer.py:803] 2025-04-26 16:48:25,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5131 5071 [WARNING|trainer.py:803] 2025-04-26 16:48:25,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:26,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5112 [WARNING|trainer.py:803] 2025-04-26 16:48:26,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5132 5072 [WARNING|trainer.py:803] 2025-04-26 16:48:26,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:27,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5113 [WARNING|trainer.py:803] 2025-04-26 16:48:27,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5073 5133 [WARNING|trainer.py:803] 2025-04-26 16:48:28,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:28,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:28,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5114 5074 5134 [WARNING|trainer.py:803] 2025-04-26 16:48:29,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:48:29,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:30,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5115 5075 5135 [WARNING|trainer.py:803] 2025-04-26 16:48:31,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:48:31,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5076 [WARNING|trainer.py:803] 2025-04-26 16:48:31,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5116 5136 [WARNING|trainer.py:803] 2025-04-26 16:48:32,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:32,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5077 [WARNING|trainer.py:803] 2025-04-26 16:48:32,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5117 5137 [WARNING|trainer.py:803] 2025-04-26 16:48:33,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:33,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5078 [WARNING|trainer.py:803] 2025-04-26 16:48:34,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5118 [WARNING|trainer.py:803] 2025-04-26 16:48:34,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5138 5079 [WARNING|trainer.py:803] 2025-04-26 16:48:35,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:35,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5119 [WARNING|trainer.py:803] 2025-04-26 16:48:35,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5139 5080 [WARNING|trainer.py:803] 2025-04-26 16:48:36,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:36,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:48:37,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5120 5081 5140 [WARNING|trainer.py:803] 2025-04-26 16:48:37,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:38,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:38,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5121 5082 5141 [WARNING|trainer.py:803] 2025-04-26 16:48:39,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:39,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:39,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5083 5122 5142 [WARNING|trainer.py:803] 2025-04-26 16:48:40,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:40,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5084 [WARNING|trainer.py:803] 2025-04-26 16:48:41,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5123 5143 [WARNING|trainer.py:803] 2025-04-26 16:48:41,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:42,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5085 [WARNING|trainer.py:803] 2025-04-26 16:48:42,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5124 5144 [WARNING|trainer.py:803] 2025-04-26 16:48:43,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5086 [WARNING|trainer.py:803] 2025-04-26 16:48:43,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:43,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5125 [WARNING|trainer.py:803] 2025-04-26 16:48:44,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5145 5087 [WARNING|trainer.py:803] 2025-04-26 16:48:44,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:48:45,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5126 [WARNING|trainer.py:803] 2025-04-26 16:48:45,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5088 5146 [WARNING|trainer.py:803] 2025-04-26 16:48:46,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:46,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:46,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5127 5089 5147 [WARNING|trainer.py:803] 2025-04-26 16:48:47,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:48:47,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:48,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5128 5090 5148 [WARNING|trainer.py:803] 2025-04-26 16:48:48,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:48,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5091 [WARNING|trainer.py:803] 2025-04-26 16:48:49,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5129 5149 [WARNING|trainer.py:803] 2025-04-26 16:48:50,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:50,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5092 [WARNING|trainer.py:803] 2025-04-26 16:48:50,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5130 [WARNING|trainer.py:803] 2025-04-26 16:48:51,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5150 [WARNING|trainer.py:803] 2025-04-26 16:48:51,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5093 5131 [WARNING|trainer.py:803] 2025-04-26 16:48:52,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:52,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5151 5094 [WARNING|trainer.py:803] 2025-04-26 16:48:52,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5132 [WARNING|trainer.py:803] 2025-04-26 16:48:53,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:53,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5095 5152 [WARNING|trainer.py:803] 2025-04-26 16:48:54,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5133 [WARNING|trainer.py:803] 2025-04-26 16:48:54,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:54,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5096 5153 [WARNING|trainer.py:803] 2025-04-26 16:48:55,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:56,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5134 [WARNING|trainer.py:803] 2025-04-26 16:48:56,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5097 5154 [WARNING|trainer.py:803] 2025-04-26 16:48:57,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:48:57,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5135 5098 [WARNING|trainer.py:803] 2025-04-26 16:48:57,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5155 [WARNING|trainer.py:803] 2025-04-26 16:48:58,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:48:58,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5099 5136 [WARNING|trainer.py:803] 2025-04-26 16:48:59,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5156 [WARNING|trainer.py:803] 2025-04-26 16:48:59,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:48:59,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5100 5137 [WARNING|trainer.py:803] 2025-04-26 16:49:00,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:00,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5157 [WARNING|trainer.py:803] 2025-04-26 16:49:01,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5101 5138 [WARNING|trainer.py:803] 2025-04-26 16:49:01,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:02,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5158 [WARNING|trainer.py:803] 2025-04-26 16:49:02,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5102 5139 [WARNING|trainer.py:803] 2025-04-26 16:49:03,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:03,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5159 [WARNING|trainer.py:803] 2025-04-26 16:49:03,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5103 5140 [WARNING|trainer.py:803] 2025-04-26 16:49:04,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:04,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5160 [WARNING|trainer.py:803] 2025-04-26 16:49:05,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5104 5141 [WARNING|trainer.py:803] 2025-04-26 16:49:05,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:06,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5161 [WARNING|trainer.py:803] 2025-04-26 16:49:06,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5105 5142 [WARNING|trainer.py:803] 2025-04-26 16:49:07,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:07,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5162 [WARNING|trainer.py:803] 2025-04-26 16:49:07,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5106 5143 [WARNING|trainer.py:803] 2025-04-26 16:49:08,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:09,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5163 [WARNING|trainer.py:803] 2025-04-26 16:49:09,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5107 5144 [WARNING|trainer.py:803] 2025-04-26 16:49:09,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:10,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5164 [WARNING|trainer.py:803] 2025-04-26 16:49:10,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5108 5145 [WARNING|trainer.py:803] 2025-04-26 16:49:11,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:11,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5165 [WARNING|trainer.py:803] 2025-04-26 16:49:12,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5109 5146 [WARNING|trainer.py:803] 2025-04-26 16:49:12,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:13,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5166 [WARNING|trainer.py:803] 2025-04-26 16:49:13,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5110 5147 [WARNING|trainer.py:803] 2025-04-26 16:49:14,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:14,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5167 [WARNING|trainer.py:803] 2025-04-26 16:49:14,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5111 5148 [WARNING|trainer.py:803] 2025-04-26 16:49:15,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5168 [WARNING|trainer.py:803] 2025-04-26 16:49:15,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:49:16,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5112 5149 [WARNING|trainer.py:803] 2025-04-26 16:49:16,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5169 [WARNING|trainer.py:803] 2025-04-26 16:49:17,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:17,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5113 5150 [WARNING|trainer.py:803] 2025-04-26 16:49:18,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:18,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5170 [WARNING|trainer.py:803] 2025-04-26 16:49:18,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5114 5151 [WARNING|trainer.py:803] 2025-04-26 16:49:19,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:20,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5171 [WARNING|trainer.py:803] 2025-04-26 16:49:20,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5115 5152 [WARNING|trainer.py:803] 2025-04-26 16:49:20,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:49:21,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5172 [WARNING|trainer.py:803] 2025-04-26 16:49:21,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5116 5153 [WARNING|trainer.py:803] 2025-04-26 16:49:22,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:22,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5173 [WARNING|trainer.py:803] 2025-04-26 16:49:23,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5117 5154 [WARNING|trainer.py:803] 2025-04-26 16:49:23,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:24,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5174 [WARNING|trainer.py:803] 2025-04-26 16:49:24,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5118 5155 [WARNING|trainer.py:803] 2025-04-26 16:49:25,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:25,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5175 [WARNING|trainer.py:803] 2025-04-26 16:49:25,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5119 5156 [WARNING|trainer.py:803] 2025-04-26 16:49:26,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:26,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5176 [WARNING|trainer.py:803] 2025-04-26 16:49:27,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5120 5157 [WARNING|trainer.py:803] 2025-04-26 16:49:27,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:28,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5177 [WARNING|trainer.py:803] 2025-04-26 16:49:28,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5121 5158 [WARNING|trainer.py:803] 2025-04-26 16:49:29,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5178 [WARNING|trainer.py:803] 2025-04-26 16:49:29,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:29,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5159 5122 [WARNING|trainer.py:803] 2025-04-26 16:49:30,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5179 [WARNING|trainer.py:803] 2025-04-26 16:49:31,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:31,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5160 5123 [WARNING|trainer.py:803] 2025-04-26 16:49:31,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5180 [WARNING|trainer.py:803] 2025-04-26 16:49:32,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:32,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5161 5124 [WARNING|trainer.py:803] 2025-04-26 16:49:33,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:33,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5181 [WARNING|trainer.py:803] 2025-04-26 16:49:34,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5162 5125 [WARNING|trainer.py:803] 2025-04-26 16:49:34,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5182 [WARNING|trainer.py:803] 2025-04-26 16:49:35,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:35,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5163 5126 [WARNING|trainer.py:803] 2025-04-26 16:49:36,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:49:36,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5183 [WARNING|trainer.py:803] 2025-04-26 16:49:36,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5164 5127 [WARNING|trainer.py:803] 2025-04-26 16:49:37,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:37,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5184 [WARNING|trainer.py:803] 2025-04-26 16:49:38,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5165 5128 [WARNING|trainer.py:803] 2025-04-26 16:49:39,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:39,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5185 [WARNING|trainer.py:803] 2025-04-26 16:49:39,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5166 5129 [WARNING|trainer.py:803] 2025-04-26 16:49:40,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:40,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5186 [WARNING|trainer.py:803] 2025-04-26 16:49:41,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5167 5130 [WARNING|trainer.py:803] 2025-04-26 16:49:41,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:42,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5187 [WARNING|trainer.py:803] 2025-04-26 16:49:42,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5168 5131 [WARNING|trainer.py:803] 2025-04-26 16:49:43,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:43,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5188 [WARNING|trainer.py:803] 2025-04-26 16:49:43,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5169 5132 [WARNING|trainer.py:803] 2025-04-26 16:49:44,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:44,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:45,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5189 5170 5133 [WARNING|trainer.py:803] 2025-04-26 16:49:45,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:49:46,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:46,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5190 5171 5134 [WARNING|trainer.py:803] 2025-04-26 16:49:47,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:47,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:49:47,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5191 5172 5135 [WARNING|trainer.py:803] 2025-04-26 16:49:48,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:48,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:49,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5192 5173 5136 [WARNING|trainer.py:803] 2025-04-26 16:49:50,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:50,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5193 [WARNING|trainer.py:803] 2025-04-26 16:49:50,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5174 5137 [WARNING|trainer.py:803] 2025-04-26 16:49:51,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:51,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5194 [WARNING|trainer.py:803] 2025-04-26 16:49:51,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5175 5138 [WARNING|trainer.py:803] 2025-04-26 16:49:52,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:53,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5195 [WARNING|trainer.py:803] 2025-04-26 16:49:53,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5176 5139 [WARNING|trainer.py:803] 2025-04-26 16:49:54,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:54,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:49:54,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5196 5177 5140 [WARNING|trainer.py:803] 2025-04-26 16:49:55,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:55,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:56,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5197 5178 5141 [WARNING|trainer.py:803] 2025-04-26 16:49:56,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:57,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:57,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5198 5179 5142 [WARNING|trainer.py:803] 2025-04-26 16:49:58,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:49:58,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:49:58,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5199 5180 5143 [WARNING|trainer.py:803] 2025-04-26 16:49:59,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:49:59,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:00,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5200 5181 5144 [WARNING|trainer.py:803] 2025-04-26 16:50:01,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:01,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:01,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5201 5182 5145 [WARNING|trainer.py:803] 2025-04-26 16:50:02,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:02,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:02,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5202 5183 5146 [WARNING|trainer.py:803] 2025-04-26 16:50:03,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:50:04,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:04,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5203 5184 5147 [WARNING|trainer.py:803] 2025-04-26 16:50:05,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:05,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:05,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5204 5185 5148 [WARNING|trainer.py:803] 2025-04-26 16:50:06,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:06,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:07,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5205 5186 5149 [WARNING|trainer.py:803] 2025-04-26 16:50:08,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:08,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:08,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5206 5187 5150 [WARNING|trainer.py:803] 2025-04-26 16:50:09,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:09,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:09,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5188 5207 5151 [WARNING|trainer.py:803] 2025-04-26 16:50:11,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:11,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:11,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5189 5208 5152 [WARNING|trainer.py:803] 2025-04-26 16:50:12,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:12,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:50:12,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5190 5209 5153 [WARNING|trainer.py:803] 2025-04-26 16:50:13,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:13,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:14,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5191 5210 5154 [WARNING|trainer.py:803] 2025-04-26 16:50:15,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:15,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:15,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5211 5192 5155 [WARNING|trainer.py:803] 2025-04-26 16:50:16,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:50:16,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:16,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5212 5193 5156 [WARNING|trainer.py:803] 2025-04-26 16:50:18,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:18,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:18,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5213 5194 5157 [WARNING|trainer.py:803] 2025-04-26 16:50:19,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:19,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:19,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5214 5195 5158 [WARNING|trainer.py:803] 2025-04-26 16:50:20,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:20,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:20,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5196 5215 5159 [WARNING|trainer.py:803] 2025-04-26 16:50:22,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:50:22,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:22,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5197 5160 5216 [WARNING|trainer.py:803] 2025-04-26 16:50:23,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:23,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:23,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5198 5161 5217 [WARNING|trainer.py:803] 2025-04-26 16:50:25,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:25,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:25,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5199 5162 5218 [WARNING|trainer.py:803] 2025-04-26 16:50:26,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:26,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:26,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5200 5163 5219 [WARNING|trainer.py:803] 2025-04-26 16:50:27,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:27,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:28,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5201 5164 5220 [WARNING|trainer.py:803] 2025-04-26 16:50:29,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:29,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:29,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5202 5165 5221 [WARNING|trainer.py:803] 2025-04-26 16:50:30,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:30,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:30,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5166 5203 5222 [WARNING|trainer.py:803] 2025-04-26 16:50:31,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:31,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:32,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5167 5204 5223 [WARNING|trainer.py:803] 2025-04-26 16:50:33,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:33,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:33,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5168 5205 5224 [WARNING|trainer.py:803] 2025-04-26 16:50:34,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:34,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:35,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5169 5206 5225 [WARNING|trainer.py:803] 2025-04-26 16:50:36,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:36,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:36,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5170 5207 5226 [WARNING|trainer.py:803] 2025-04-26 16:50:37,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:37,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:37,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5171 5208 5227 [WARNING|trainer.py:803] 2025-04-26 16:50:38,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:38,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:50:39,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5172 5209 5228 [WARNING|trainer.py:803] 2025-04-26 16:50:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:40,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:40,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5210 5173 5229 [WARNING|trainer.py:803] 2025-04-26 16:50:41,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:41,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:42,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5211 5174 5230 [WARNING|trainer.py:803] 2025-04-26 16:50:43,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:50:43,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:43,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5212 5175 5231 [WARNING|trainer.py:803] 2025-04-26 16:50:44,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:44,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:44,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5213 5176 5232 [WARNING|trainer.py:803] 2025-04-26 16:50:45,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:45,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5214 [WARNING|trainer.py:803] 2025-04-26 16:50:46,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5177 5233 [WARNING|trainer.py:803] 2025-04-26 16:50:47,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:50:47,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:47,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5215 5178 5234 [WARNING|trainer.py:803] 2025-04-26 16:50:48,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:48,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:49,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5216 5179 5235 [WARNING|trainer.py:803] 2025-04-26 16:50:50,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:50,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:50,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5217 5180 5236 [WARNING|trainer.py:803] 2025-04-26 16:50:51,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:51,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:51,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5181 5218 5237 [WARNING|trainer.py:803] 2025-04-26 16:50:52,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:52,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:53,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5182 5219 5238 [WARNING|trainer.py:803] 2025-04-26 16:50:54,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:50:54,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:54,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5220 5183 5239 [WARNING|trainer.py:803] 2025-04-26 16:50:55,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:55,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:56,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5221 5184 5240 [WARNING|trainer.py:803] 2025-04-26 16:50:57,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:50:57,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:57,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5185 5222 5241 [WARNING|trainer.py:803] 2025-04-26 16:50:58,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:58,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:50:58,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5186 5223 5242 [WARNING|trainer.py:803] 2025-04-26 16:50:59,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:50:59,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:00,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5187 5224 5243 [WARNING|trainer.py:803] 2025-04-26 16:51:01,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:01,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:01,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5188 5225 5244 [WARNING|trainer.py:803] 2025-04-26 16:51:02,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:02,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:03,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5189 5226 5245 [WARNING|trainer.py:803] 2025-04-26 16:51:04,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:51:04,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:04,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5190 5227 5246 [WARNING|trainer.py:803] 2025-04-26 16:51:05,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:05,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5191 [WARNING|trainer.py:803] 2025-04-26 16:51:05,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5228 5247 [WARNING|trainer.py:803] 2025-04-26 16:51:06,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:06,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:07,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5192 5229 5248 [WARNING|trainer.py:803] 2025-04-26 16:51:08,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:08,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:08,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5193 5230 5249 [WARNING|trainer.py:803] 2025-04-26 16:51:09,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:09,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:10,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5194 5231 5250 [WARNING|trainer.py:803] 2025-04-26 16:51:10,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:11,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:11,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5195 5232 5251 [WARNING|trainer.py:803] 2025-04-26 16:51:12,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:12,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:12,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5196 5233 5252 [WARNING|trainer.py:803] 2025-04-26 16:51:13,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:13,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:14,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5234 5197 5253 [WARNING|trainer.py:803] 2025-04-26 16:51:15,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:15,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:15,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5235 5198 5254 [WARNING|trainer.py:803] 2025-04-26 16:51:16,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:16,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:16,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5236 5199 5255 [WARNING|trainer.py:803] 2025-04-26 16:51:17,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:17,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:18,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5200 5237 5256 [WARNING|trainer.py:803] 2025-04-26 16:51:19,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:19,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:19,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5201 5238 5257 [WARNING|trainer.py:803] 2025-04-26 16:51:20,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:20,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:21,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5239 5202 5258 [WARNING|trainer.py:803] 2025-04-26 16:51:22,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:22,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:51:22,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5240 5203 5259 [WARNING|trainer.py:803] 2025-04-26 16:51:23,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:23,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:23,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5241 5204 5260 [WARNING|trainer.py:803] 2025-04-26 16:51:24,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:25,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:25,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5242 5205 5261 [WARNING|trainer.py:803] 2025-04-26 16:51:26,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:26,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:26,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5243 5206 5262 [WARNING|trainer.py:803] 2025-04-26 16:51:27,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:27,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:51:28,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5244 5207 5263 [WARNING|trainer.py:803] 2025-04-26 16:51:29,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:29,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:29,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5245 5208 5264 [WARNING|trainer.py:803] 2025-04-26 16:51:30,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:30,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:51:30,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5246 5209 5265 [WARNING|trainer.py:803] 2025-04-26 16:51:32,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:32,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:32,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5247 5210 5266 [WARNING|trainer.py:803] 2025-04-26 16:51:33,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:33,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:33,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5248 5211 5267 [WARNING|trainer.py:803] 2025-04-26 16:51:34,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:34,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:51:34,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5249 5212 5268 [WARNING|trainer.py:803] 2025-04-26 16:51:36,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:36,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:36,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5250 5213 5269 [WARNING|trainer.py:803] 2025-04-26 16:51:37,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:37,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:37,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5251 5270 5214 [WARNING|trainer.py:803] 2025-04-26 16:51:39,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:39,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:39,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5252 5271 5215 [WARNING|trainer.py:803] 2025-04-26 16:51:40,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:40,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:40,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5253 5272 5216 [WARNING|trainer.py:803] 2025-04-26 16:51:41,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:41,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:42,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5254 5273 5217 [WARNING|trainer.py:803] 2025-04-26 16:51:43,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:43,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:43,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5255 5274 5218 [WARNING|trainer.py:803] 2025-04-26 16:51:44,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:44,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:51:44,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5256 5275 5219 [WARNING|trainer.py:803] 2025-04-26 16:51:45,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:45,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:46,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5257 5276 5220 [WARNING|trainer.py:803] 2025-04-26 16:51:47,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:47,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:47,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5277 5258 5221 [WARNING|trainer.py:803] 2025-04-26 16:51:48,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:48,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:49,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5278 5259 5222 [WARNING|trainer.py:803] 2025-04-26 16:51:50,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:50,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:51:50,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5279 5260 5223 [WARNING|trainer.py:803] 2025-04-26 16:51:51,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:51,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:51,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5280 5261 5224 [WARNING|trainer.py:803] 2025-04-26 16:51:52,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:52,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:53,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5262 5281 5225 [WARNING|trainer.py:803] 2025-04-26 16:51:54,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:54,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:54,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5263 5282 5226 [WARNING|trainer.py:803] 2025-04-26 16:51:55,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:55,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:56,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5264 5283 5227 [WARNING|trainer.py:803] 2025-04-26 16:51:57,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:51:57,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:51:57,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5265 5284 5228 [WARNING|trainer.py:803] 2025-04-26 16:51:58,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:51:58,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5285 5266 [WARNING|trainer.py:803] 2025-04-26 16:51:59,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5229 [WARNING|trainer.py:803] 2025-04-26 16:51:59,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:51:59,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5286 5267 [WARNING|trainer.py:803] 2025-04-26 16:52:00,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5230 [WARNING|trainer.py:803] 2025-04-26 16:52:01,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:01,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5287 5268 [WARNING|trainer.py:803] 2025-04-26 16:52:02,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5231 [WARNING|trainer.py:803] 2025-04-26 16:52:02,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:02,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5288 5269 [WARNING|trainer.py:803] 2025-04-26 16:52:03,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5232 [WARNING|trainer.py:803] 2025-04-26 16:52:04,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:04,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5270 5289 [WARNING|trainer.py:803] 2025-04-26 16:52:04,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5233 [WARNING|trainer.py:803] 2025-04-26 16:52:05,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:05,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5271 5290 [WARNING|trainer.py:803] 2025-04-26 16:52:06,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5234 [WARNING|trainer.py:803] 2025-04-26 16:52:06,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:06,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5272 5291 [WARNING|trainer.py:803] 2025-04-26 16:52:07,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5235 [WARNING|trainer.py:803] 2025-04-26 16:52:08,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:08,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5273 5292 [WARNING|trainer.py:803] 2025-04-26 16:52:08,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5236 [WARNING|trainer.py:803] 2025-04-26 16:52:09,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:09,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5274 5293 [WARNING|trainer.py:803] 2025-04-26 16:52:10,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5237 [WARNING|trainer.py:803] 2025-04-26 16:52:10,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:52:10,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5275 5294 [WARNING|trainer.py:803] 2025-04-26 16:52:11,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5238 [WARNING|trainer.py:803] 2025-04-26 16:52:12,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:12,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5276 5295 [WARNING|trainer.py:803] 2025-04-26 16:52:12,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5239 [WARNING|trainer.py:803] 2025-04-26 16:52:13,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:13,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5277 5296 [WARNING|trainer.py:803] 2025-04-26 16:52:14,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5240 [WARNING|trainer.py:803] 2025-04-26 16:52:14,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:15,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5278 5297 [WARNING|trainer.py:803] 2025-04-26 16:52:15,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5241 [WARNING|trainer.py:803] 2025-04-26 16:52:16,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:16,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5279 5298 [WARNING|trainer.py:803] 2025-04-26 16:52:17,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5242 [WARNING|trainer.py:803] 2025-04-26 16:52:17,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:17,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5280 5299 [WARNING|trainer.py:803] 2025-04-26 16:52:18,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5243 [WARNING|trainer.py:803] 2025-04-26 16:52:19,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:19,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5281 5300 [WARNING|trainer.py:803] 2025-04-26 16:52:19,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:20,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5244 [WARNING|trainer.py:803] 2025-04-26 16:52:20,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5282 5301 [WARNING|trainer.py:803] 2025-04-26 16:52:21,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:21,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5245 [WARNING|trainer.py:803] 2025-04-26 16:52:22,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5283 5302 [WARNING|trainer.py:803] 2025-04-26 16:52:22,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:23,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5246 [WARNING|trainer.py:803] 2025-04-26 16:52:23,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5284 5303 [WARNING|trainer.py:803] 2025-04-26 16:52:24,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:24,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5247 [WARNING|trainer.py:803] 2025-04-26 16:52:24,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5285 5304 [WARNING|trainer.py:803] 2025-04-26 16:52:25,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:26,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5248 [WARNING|trainer.py:803] 2025-04-26 16:52:26,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5286 5305 [WARNING|trainer.py:803] 2025-04-26 16:52:27,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:27,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5249 [WARNING|trainer.py:803] 2025-04-26 16:52:27,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5287 5306 [WARNING|trainer.py:803] 2025-04-26 16:52:28,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:28,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5250 [WARNING|trainer.py:803] 2025-04-26 16:52:28,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5288 5307 [WARNING|trainer.py:803] 2025-04-26 16:52:29,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:30,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5251 [WARNING|trainer.py:803] 2025-04-26 16:52:30,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5289 5308 [WARNING|trainer.py:803] 2025-04-26 16:52:31,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:31,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5252 [WARNING|trainer.py:803] 2025-04-26 16:52:31,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5290 5309 [WARNING|trainer.py:803] 2025-04-26 16:52:32,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:33,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5253 [WARNING|trainer.py:803] 2025-04-26 16:52:33,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5291 5310 [WARNING|trainer.py:803] 2025-04-26 16:52:33,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:34,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5254 [WARNING|trainer.py:803] 2025-04-26 16:52:34,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5292 5311 [WARNING|trainer.py:803] 2025-04-26 16:52:35,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5255 [WARNING|trainer.py:803] 2025-04-26 16:52:35,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:35,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5293 5312 [WARNING|trainer.py:803] 2025-04-26 16:52:36,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:37,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5256 [WARNING|trainer.py:803] 2025-04-26 16:52:37,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5294 5313 [WARNING|trainer.py:803] 2025-04-26 16:52:38,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:38,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5257 [WARNING|trainer.py:803] 2025-04-26 16:52:38,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5295 5314 [WARNING|trainer.py:803] 2025-04-26 16:52:39,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:39,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5258 [WARNING|trainer.py:803] 2025-04-26 16:52:40,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5296 5315 [WARNING|trainer.py:803] 2025-04-26 16:52:40,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:52:41,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5259 [WARNING|trainer.py:803] 2025-04-26 16:52:41,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5297 5316 [WARNING|trainer.py:803] 2025-04-26 16:52:42,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:52:42,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5260 [WARNING|trainer.py:803] 2025-04-26 16:52:42,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5298 5317 [WARNING|trainer.py:803] 2025-04-26 16:52:43,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:44,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5261 [WARNING|trainer.py:803] 2025-04-26 16:52:44,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5299 5318 [WARNING|trainer.py:803] 2025-04-26 16:52:45,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:45,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5262 [WARNING|trainer.py:803] 2025-04-26 16:52:45,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5300 5319 [WARNING|trainer.py:803] 2025-04-26 16:52:46,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:46,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5263 [WARNING|trainer.py:803] 2025-04-26 16:52:47,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5301 5320 [WARNING|trainer.py:803] 2025-04-26 16:52:47,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5264 [WARNING|trainer.py:803] 2025-04-26 16:52:48,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:48,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5302 5321 [WARNING|trainer.py:803] 2025-04-26 16:52:49,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5265 [WARNING|trainer.py:803] 2025-04-26 16:52:49,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:49,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5303 5322 [WARNING|trainer.py:803] 2025-04-26 16:52:50,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5266 [WARNING|trainer.py:803] 2025-04-26 16:52:51,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:51,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5304 5323 [WARNING|trainer.py:803] 2025-04-26 16:52:51,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5267 [WARNING|trainer.py:803] 2025-04-26 16:52:52,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:52,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5305 5324 [WARNING|trainer.py:803] 2025-04-26 16:52:53,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5268 [WARNING|trainer.py:803] 2025-04-26 16:52:53,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:54,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5306 5325 [WARNING|trainer.py:803] 2025-04-26 16:52:54,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:52:55,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5269 [WARNING|trainer.py:803] 2025-04-26 16:52:55,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5307 5326 [WARNING|trainer.py:803] 2025-04-26 16:52:56,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:52:56,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5270 [WARNING|trainer.py:803] 2025-04-26 16:52:56,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5308 5327 [WARNING|trainer.py:803] 2025-04-26 16:52:57,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5271 [WARNING|trainer.py:803] 2025-04-26 16:52:58,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:52:58,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5309 5328 [WARNING|trainer.py:803] 2025-04-26 16:52:58,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5272 [WARNING|trainer.py:803] 2025-04-26 16:52:59,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:52:59,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5310 5329 [WARNING|trainer.py:803] 2025-04-26 16:53:00,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5273 [WARNING|trainer.py:803] 2025-04-26 16:53:00,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:00,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5311 5330 [WARNING|trainer.py:803] 2025-04-26 16:53:01,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5274 [WARNING|trainer.py:803] 2025-04-26 16:53:02,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:02,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5312 5331 [WARNING|trainer.py:803] 2025-04-26 16:53:02,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5275 [WARNING|trainer.py:803] 2025-04-26 16:53:03,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:03,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5332 5313 [WARNING|trainer.py:803] 2025-04-26 16:53:04,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5276 [WARNING|trainer.py:803] 2025-04-26 16:53:04,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:04,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5333 5314 [WARNING|trainer.py:803] 2025-04-26 16:53:05,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5277 [WARNING|trainer.py:803] 2025-04-26 16:53:06,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:06,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5334 5315 [WARNING|trainer.py:803] 2025-04-26 16:53:07,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5278 [WARNING|trainer.py:803] 2025-04-26 16:53:07,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:07,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5335 5316 [WARNING|trainer.py:803] 2025-04-26 16:53:08,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:08,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5279 [WARNING|trainer.py:803] 2025-04-26 16:53:09,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5336 5317 [WARNING|trainer.py:803] 2025-04-26 16:53:09,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:10,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5280 [WARNING|trainer.py:803] 2025-04-26 16:53:10,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5337 5318 [WARNING|trainer.py:803] 2025-04-26 16:53:11,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:11,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5281 [WARNING|trainer.py:803] 2025-04-26 16:53:11,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5338 5319 [WARNING|trainer.py:803] 2025-04-26 16:53:12,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5282 [WARNING|trainer.py:803] 2025-04-26 16:53:13,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:13,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5339 5320 [WARNING|trainer.py:803] 2025-04-26 16:53:13,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5283 [WARNING|trainer.py:803] 2025-04-26 16:53:14,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:14,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5340 5321 [WARNING|trainer.py:803] 2025-04-26 16:53:15,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5284 [WARNING|trainer.py:803] 2025-04-26 16:53:15,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:15,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5341 5322 [WARNING|trainer.py:803] 2025-04-26 16:53:16,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5285 [WARNING|trainer.py:803] 2025-04-26 16:53:17,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:17,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5342 5323 [WARNING|trainer.py:803] 2025-04-26 16:53:18,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5286 [WARNING|trainer.py:803] 2025-04-26 16:53:18,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:18,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5343 5324 [WARNING|trainer.py:803] 2025-04-26 16:53:19,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:20,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5287 [WARNING|trainer.py:803] 2025-04-26 16:53:20,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5344 5325 [WARNING|trainer.py:803] 2025-04-26 16:53:20,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:21,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:21,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5288 5345 5326 [WARNING|trainer.py:803] 2025-04-26 16:53:22,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:22,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:22,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5289 5327 5346 [WARNING|trainer.py:803] 2025-04-26 16:53:23,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:24,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:24,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5290 5328 5347 [WARNING|trainer.py:803] 2025-04-26 16:53:25,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:25,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:25,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5291 5329 5348 [WARNING|trainer.py:803] 2025-04-26 16:53:26,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:26,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:27,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5292 5330 5349 [WARNING|trainer.py:803] 2025-04-26 16:53:28,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:28,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:28,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5293 5331 5350 [WARNING|trainer.py:803] 2025-04-26 16:53:29,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:29,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:29,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5294 5332 5351 [WARNING|trainer.py:803] 2025-04-26 16:53:30,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:31,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:31,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5295 5333 5352 [WARNING|trainer.py:803] 2025-04-26 16:53:32,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:32,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:32,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5296 5334 5353 [WARNING|trainer.py:803] 2025-04-26 16:53:33,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:33,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:33,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5297 5335 5354 [WARNING|trainer.py:803] 2025-04-26 16:53:35,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:35,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:35,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5298 5336 5355 [WARNING|trainer.py:803] 2025-04-26 16:53:36,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:53:36,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:36,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5299 5337 5356 [WARNING|trainer.py:803] 2025-04-26 16:53:37,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:38,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:38,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5300 5338 5357 [WARNING|trainer.py:803] 2025-04-26 16:53:39,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:39,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:39,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5301 5339 5358 [WARNING|trainer.py:803] 2025-04-26 16:53:40,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:40,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:40,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5340 5359 5302 [WARNING|trainer.py:803] 2025-04-26 16:53:42,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:42,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:42,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5360 5341 5303 [WARNING|trainer.py:803] 2025-04-26 16:53:43,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:43,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:43,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5361 5342 5304 [WARNING|trainer.py:803] 2025-04-26 16:53:44,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:44,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:45,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5362 5343 5305 [WARNING|trainer.py:803] 2025-04-26 16:53:46,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:46,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:46,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5363 5344 5306 [WARNING|trainer.py:803] 2025-04-26 16:53:47,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:47,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:47,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5364 5345 5307 [WARNING|trainer.py:803] 2025-04-26 16:53:49,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:49,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:49,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5365 5346 5308 [WARNING|trainer.py:803] 2025-04-26 16:53:50,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:50,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:53:50,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5366 5347 5309 [WARNING|trainer.py:803] 2025-04-26 16:53:51,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:51,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:52,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5367 5348 5310 [WARNING|trainer.py:803] 2025-04-26 16:53:53,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:53:53,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:53,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5368 5349 5311 [WARNING|trainer.py:803] 2025-04-26 16:53:54,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:54,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:54,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5369 5350 5312 [WARNING|trainer.py:803] 2025-04-26 16:53:55,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:56,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:53:56,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5370 5351 5313 [WARNING|trainer.py:803] 2025-04-26 16:53:57,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:53:57,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:57,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5371 5352 5314 [WARNING|trainer.py:803] 2025-04-26 16:53:58,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:58,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:53:59,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5372 5353 5315 [WARNING|trainer.py:803] 2025-04-26 16:54:00,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:00,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:00,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5373 5354 5316 [WARNING|trainer.py:803] 2025-04-26 16:54:01,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:01,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:01,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5374 5355 5317 [WARNING|trainer.py:803] 2025-04-26 16:54:02,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:03,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:03,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5375 5356 5318 [WARNING|trainer.py:803] 2025-04-26 16:54:04,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:04,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 16:54:04,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5376 5357 5319 [WARNING|trainer.py:803] 2025-04-26 16:54:05,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:05,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:06,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5377 5358 5320 [WARNING|trainer.py:803] 2025-04-26 16:54:06,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:07,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5378 [WARNING|trainer.py:803] 2025-04-26 16:54:07,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5359 5321 [WARNING|trainer.py:803] 2025-04-26 16:54:08,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:08,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5379 [WARNING|trainer.py:803] 2025-04-26 16:54:08,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5360 5322 [WARNING|trainer.py:803] 2025-04-26 16:54:09,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:09,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5380 [WARNING|trainer.py:803] 2025-04-26 16:54:10,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5361 5323 [WARNING|trainer.py:803] 2025-04-26 16:54:11,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:11,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5381 [WARNING|trainer.py:803] 2025-04-26 16:54:11,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5362 5324 [WARNING|trainer.py:803] 2025-04-26 16:54:12,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:12,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5382 [WARNING|trainer.py:803] 2025-04-26 16:54:13,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5363 5325 [WARNING|trainer.py:803] 2025-04-26 16:54:13,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:14,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5383 [WARNING|trainer.py:803] 2025-04-26 16:54:14,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5364 5326 [WARNING|trainer.py:803] 2025-04-26 16:54:15,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:15,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5384 [WARNING|trainer.py:803] 2025-04-26 16:54:15,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5365 5327 [WARNING|trainer.py:803] 2025-04-26 16:54:16,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:16,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5385 [WARNING|trainer.py:803] 2025-04-26 16:54:17,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5366 5328 [WARNING|trainer.py:803] 2025-04-26 16:54:17,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:18,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5386 [WARNING|trainer.py:803] 2025-04-26 16:54:18,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5367 5329 [WARNING|trainer.py:803] 2025-04-26 16:54:19,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:19,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5387 [WARNING|trainer.py:803] 2025-04-26 16:54:19,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5368 5330 [WARNING|trainer.py:803] 2025-04-26 16:54:20,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:20,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5388 [WARNING|trainer.py:803] 2025-04-26 16:54:21,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5369 5331 [WARNING|trainer.py:803] 2025-04-26 16:54:21,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:22,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5389 [WARNING|trainer.py:803] 2025-04-26 16:54:22,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5370 5332 [WARNING|trainer.py:803] 2025-04-26 16:54:23,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:23,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5390 [WARNING|trainer.py:803] 2025-04-26 16:54:23,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5371 5333 [WARNING|trainer.py:803] 2025-04-26 16:54:24,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:25,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:25,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5391 5372 5334 [WARNING|trainer.py:803] 2025-04-26 16:54:26,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:26,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:26,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5392 5373 5335 [WARNING|trainer.py:803] 2025-04-26 16:54:27,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:27,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:27,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5393 5374 5336 [WARNING|trainer.py:803] 2025-04-26 16:54:28,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:29,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:29,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5394 5375 5337 [WARNING|trainer.py:803] 2025-04-26 16:54:30,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:30,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:30,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5395 5376 5338 [WARNING|trainer.py:803] 2025-04-26 16:54:31,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:32,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:32,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5396 5377 5339 [WARNING|trainer.py:803] 2025-04-26 16:54:33,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:33,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:33,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5397 5378 5340 [WARNING|trainer.py:803] 2025-04-26 16:54:34,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:34,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:34,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5398 5379 5341 [WARNING|trainer.py:803] 2025-04-26 16:54:35,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:36,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:36,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5399 5380 5342 [WARNING|trainer.py:803] 2025-04-26 16:54:37,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:37,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:37,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5400 5381 5343 [WARNING|trainer.py:803] 2025-04-26 16:54:38,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:38,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:39,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5401 5382 5344 [WARNING|trainer.py:803] 2025-04-26 16:54:40,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:40,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:40,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5402 5383 5345 [WARNING|trainer.py:803] 2025-04-26 16:54:41,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:41,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:41,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5384 5403 5346 [WARNING|trainer.py:803] 2025-04-26 16:54:43,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:43,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:43,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5385 5404 5347 [WARNING|trainer.py:803] 2025-04-26 16:54:44,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:44,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:44,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5386 5348 5405 [WARNING|trainer.py:803] 2025-04-26 16:54:45,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:46,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:46,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5387 5349 5406 [WARNING|trainer.py:803] 2025-04-26 16:54:47,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:47,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:47,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5388 5350 5407 [WARNING|trainer.py:803] 2025-04-26 16:54:48,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:48,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5389 [WARNING|trainer.py:803] 2025-04-26 16:54:49,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5351 5408 [WARNING|trainer.py:803] 2025-04-26 16:54:49,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:50,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5390 [WARNING|trainer.py:803] 2025-04-26 16:54:50,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5352 [WARNING|trainer.py:803] 2025-04-26 16:54:51,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5409 [WARNING|trainer.py:803] 2025-04-26 16:54:51,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5391 [WARNING|trainer.py:803] 2025-04-26 16:54:52,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5353 [WARNING|trainer.py:803] 2025-04-26 16:54:52,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5410 [WARNING|trainer.py:803] 2025-04-26 16:54:53,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5392 [WARNING|trainer.py:803] 2025-04-26 16:54:53,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5354 [WARNING|trainer.py:803] 2025-04-26 16:54:54,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5411 [WARNING|trainer.py:803] 2025-04-26 16:54:54,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5393 5355 [WARNING|trainer.py:803] 2025-04-26 16:54:55,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:54:55,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5412 [WARNING|trainer.py:803] 2025-04-26 16:54:55,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5394 5356 [WARNING|trainer.py:803] 2025-04-26 16:54:56,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:54:56,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:57,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5413 5395 5357 [WARNING|trainer.py:803] 2025-04-26 16:54:58,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:58,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:58,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5396 5414 5358 [WARNING|trainer.py:803] 2025-04-26 16:54:59,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:54:59,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:54:59,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5397 5415 5359 [WARNING|trainer.py:803] 2025-04-26 16:55:00,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:01,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:01,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5398 5416 5360 [WARNING|trainer.py:803] 2025-04-26 16:55:02,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:02,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:02,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5399 5417 5361 [WARNING|trainer.py:803] 2025-04-26 16:55:03,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:04,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5400 [WARNING|trainer.py:803] 2025-04-26 16:55:04,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5362 5418 [WARNING|trainer.py:803] 2025-04-26 16:55:04,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:05,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
[WARNING|trainer.py:803] 2025-04-26 16:55:05,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes :Yes 5401 5363 5419 [WARNING|trainer.py:803] 2025-04-26 16:55:06,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:06,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:07,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5402 5364 5420 [WARNING|trainer.py:803] 2025-04-26 16:55:07,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:08,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:08,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5403 5365 5421 [WARNING|trainer.py:803] 2025-04-26 16:55:09,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:09,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:10,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5366 5404 5422 [WARNING|trainer.py:803] 2025-04-26 16:55:10,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:10,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5367 [WARNING|trainer.py:803] 2025-04-26 16:55:11,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5405 5423 [WARNING|trainer.py:803] 2025-04-26 16:55:12,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:12,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5368 [WARNING|trainer.py:803] 2025-04-26 16:55:13,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5406 [WARNING|trainer.py:803] 2025-04-26 16:55:13,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5424 [WARNING|trainer.py:803] 2025-04-26 16:55:13,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5369 [WARNING|trainer.py:803] 2025-04-26 16:55:14,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5407 [WARNING|trainer.py:803] 2025-04-26 16:55:15,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5425 [WARNING|trainer.py:803] 2025-04-26 16:55:15,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5370 [WARNING|trainer.py:803] 2025-04-26 16:55:16,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5408 [WARNING|trainer.py:803] 2025-04-26 16:55:16,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5426 5371 [WARNING|trainer.py:803] 2025-04-26 16:55:16,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:17,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5409 [WARNING|trainer.py:803] 2025-04-26 16:55:17,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5427 5372 [WARNING|trainer.py:803] 2025-04-26 16:55:18,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:19,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5410 [WARNING|trainer.py:803] 2025-04-26 16:55:19,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5373 5428 [WARNING|trainer.py:803] 2025-04-26 16:55:19,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5411 [WARNING|trainer.py:803] 2025-04-26 16:55:20,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:20,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5374 5429 [WARNING|trainer.py:803] 2025-04-26 16:55:21,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:21,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5412 [WARNING|trainer.py:803] 2025-04-26 16:55:22,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5375 5430 [WARNING|trainer.py:803] 2025-04-26 16:55:22,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:23,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5413 [WARNING|trainer.py:803] 2025-04-26 16:55:23,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5376 [WARNING|trainer.py:803] 2025-04-26 16:55:24,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5431 [WARNING|trainer.py:803] 2025-04-26 16:55:24,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5414 [WARNING|trainer.py:803] 2025-04-26 16:55:25,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5377 [WARNING|trainer.py:803] 2025-04-26 16:55:25,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 5432:Yes [WARNING|trainer.py:803] 2025-04-26 16:55:26,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5415 5378 [WARNING|trainer.py:803] 2025-04-26 16:55:26,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5433 [WARNING|trainer.py:803] 2025-04-26 16:55:27,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:27,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5379 5416 [WARNING|trainer.py:803] 2025-04-26 16:55:28,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5434 [WARNING|trainer.py:803] 2025-04-26 16:55:28,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:28,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5380 5417 [WARNING|trainer.py:803] 2025-04-26 16:55:29,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:30,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5435 [WARNING|trainer.py:803] 2025-04-26 16:55:30,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5381 5418 [WARNING|trainer.py:803] 2025-04-26 16:55:31,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:31,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5436 [WARNING|trainer.py:803] 2025-04-26 16:55:31,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5382 5419 [WARNING|trainer.py:803] 2025-04-26 16:55:32,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:32,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5437 [WARNING|trainer.py:803] 2025-04-26 16:55:33,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5383 5420 [WARNING|trainer.py:803] 2025-04-26 16:55:34,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:34,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:34,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5438 5384 5421 [WARNING|trainer.py:803] 2025-04-26 16:55:35,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:35,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5385 [WARNING|trainer.py:803] 2025-04-26 16:55:36,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5439 5422 [WARNING|trainer.py:803] 2025-04-26 16:55:37,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:37,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5386 [WARNING|trainer.py:803] 2025-04-26 16:55:37,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5440 [WARNING|trainer.py:803] 2025-04-26 16:55:38,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5423 [WARNING|trainer.py:803] 2025-04-26 16:55:38,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5387 5441 [WARNING|trainer.py:803] 2025-04-26 16:55:39,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:39,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5424 [WARNING|trainer.py:803] 2025-04-26 16:55:40,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5388 5442 [WARNING|trainer.py:803] 2025-04-26 16:55:40,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:55:41,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5425 [WARNING|trainer.py:803] 2025-04-26 16:55:41,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5389 [WARNING|trainer.py:803] 2025-04-26 16:55:42,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5443 [WARNING|trainer.py:803] 2025-04-26 16:55:42,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5426 5390 [WARNING|trainer.py:803] 2025-04-26 16:55:43,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:43,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5444 [WARNING|trainer.py:803] 2025-04-26 16:55:43,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5391 5427 [WARNING|trainer.py:803] 2025-04-26 16:55:44,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:45,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:45,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5445 5392 5428 [WARNING|trainer.py:803] 2025-04-26 16:55:46,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:46,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5446 [WARNING|trainer.py:803] 2025-04-26 16:55:46,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5393 5429 [WARNING|trainer.py:803] 2025-04-26 16:55:47,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:48,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5447 [WARNING|trainer.py:803] 2025-04-26 16:55:48,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5394 5430 [WARNING|trainer.py:803] 2025-04-26 16:55:49,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:49,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5448 [WARNING|trainer.py:803] 2025-04-26 16:55:50,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5395 5431 [WARNING|trainer.py:803] 2025-04-26 16:55:50,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:50,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5396 5449 [WARNING|trainer.py:803] 2025-04-26 16:55:51,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5432 [WARNING|trainer.py:803] 2025-04-26 16:55:52,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:55:52,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5397 5450 [WARNING|trainer.py:803] 2025-04-26 16:55:53,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:53,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5433 [WARNING|trainer.py:803] 2025-04-26 16:55:53,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5398 [WARNING|trainer.py:803] 2025-04-26 16:55:54,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5451 [WARNING|trainer.py:803] 2025-04-26 16:55:55,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5434 [WARNING|trainer.py:803] 2025-04-26 16:55:55,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5399 5452 [WARNING|trainer.py:803] 2025-04-26 16:55:56,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:55:56,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5435 [WARNING|trainer.py:803] 2025-04-26 16:55:56,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5400 5453 [WARNING|trainer.py:803] 2025-04-26 16:55:57,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:55:57,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5436 [WARNING|trainer.py:803] 2025-04-26 16:55:58,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5401 5454 [WARNING|trainer.py:803] 2025-04-26 16:55:59,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:55:59,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5437 [WARNING|trainer.py:803] 2025-04-26 16:55:59,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5402 5455 [WARNING|trainer.py:803] 2025-04-26 16:56:00,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:00,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5438 [WARNING|trainer.py:803] 2025-04-26 16:56:01,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5403 5456 [WARNING|trainer.py:803] 2025-04-26 16:56:02,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:02,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5439 [WARNING|trainer.py:803] 2025-04-26 16:56:02,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5404 5457 [WARNING|trainer.py:803] 2025-04-26 16:56:03,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:03,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:04,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5440 5405 5458 [WARNING|trainer.py:803] 2025-04-26 16:56:05,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:05,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:05,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5441 5406 5459 [WARNING|trainer.py:803] 2025-04-26 16:56:06,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:07,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:07,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5442 5407 5460 [WARNING|trainer.py:803] 2025-04-26 16:56:08,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:08,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:08,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5443 5408 5461 [WARNING|trainer.py:803] 2025-04-26 16:56:09,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:10,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:10,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5444 5409 5462 [WARNING|trainer.py:803] 2025-04-26 16:56:11,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:11,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:11,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5445 5410 5463 [WARNING|trainer.py:803] 2025-04-26 16:56:12,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:13,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:13,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5446 5464 5411 [WARNING|trainer.py:803] 2025-04-26 16:56:14,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:14,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:56:14,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5447 5412 5465 [WARNING|trainer.py:803] 2025-04-26 16:56:16,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:16,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:16,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5448 5466 5413 [WARNING|trainer.py:803] 2025-04-26 16:56:17,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:17,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:17,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5467 5449 5414 [WARNING|trainer.py:803] 2025-04-26 16:56:19,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:19,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:19,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5450 5468 5415 [WARNING|trainer.py:803] 2025-04-26 16:56:20,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:20,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:20,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5469 5451 5416 [WARNING|trainer.py:803] 2025-04-26 16:56:22,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:22,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:22,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5470 5452 5417 [WARNING|trainer.py:803] 2025-04-26 16:56:23,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:23,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:23,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5471 5453 5418 [WARNING|trainer.py:803] 2025-04-26 16:56:25,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:56:25,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:25,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5472 5454 5419 [WARNING|trainer.py:803] 2025-04-26 16:56:26,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:26,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:26,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5473 5420 5455 [WARNING|trainer.py:803] 2025-04-26 16:56:28,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:28,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:28,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5474 5421 5456 [WARNING|trainer.py:803] 2025-04-26 16:56:29,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:29,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:29,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5457 5422 5475 [WARNING|trainer.py:803] 2025-04-26 16:56:31,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:31,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:31,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5458 5423 5476 [WARNING|trainer.py:803] 2025-04-26 16:56:32,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:32,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:32,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5459 5477 5424 [WARNING|trainer.py:803] 2025-04-26 16:56:34,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:34,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:34,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5478 5460 5425 [WARNING|trainer.py:803] 2025-04-26 16:56:35,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:56:35,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:36,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5461 5479 5426 [WARNING|trainer.py:803] 2025-04-26 16:56:37,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:37,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:37,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5480 5462 5427 [WARNING|trainer.py:803] 2025-04-26 16:56:38,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:38,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:39,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5481 5463 5428 [WARNING|trainer.py:803] 2025-04-26 16:56:40,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:40,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:40,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5482 5464 5429 [WARNING|trainer.py:803] 2025-04-26 16:56:41,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:56:41,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:56:42,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5483 5465 5430 [WARNING|trainer.py:803] 2025-04-26 16:56:43,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:43,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:43,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5466 5484 5431 [WARNING|trainer.py:803] 2025-04-26 16:56:44,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:44,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:45,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5467 5485 5432 [WARNING|trainer.py:803] 2025-04-26 16:56:46,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:46,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:46,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5468 5486 5433 [WARNING|trainer.py:803] 2025-04-26 16:56:47,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:48,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:48,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5469 5487 5434 [WARNING|trainer.py:803] 2025-04-26 16:56:49,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:49,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:49,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5488 5470 5435 [WARNING|trainer.py:803] 2025-04-26 16:56:51,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:56:51,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:51,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5489 5471 5436 [WARNING|trainer.py:803] 2025-04-26 16:56:52,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:52,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:56:52,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5490 5472 5437 [WARNING|trainer.py:803] 2025-04-26 16:56:53,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:54,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 16:56:54,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes NoYes 5491 5473 5438 [WARNING|trainer.py:803] 2025-04-26 16:56:55,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 16:56:55,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:55,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5492 5474 5439 [WARNING|trainer.py:803] 2025-04-26 16:56:57,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:57,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:56:57,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5493 5440 5475 [WARNING|trainer.py:803] 2025-04-26 16:56:58,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:56:58,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:56:58,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5494 5441 5476 [WARNING|trainer.py:803] 2025-04-26 16:56:59,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:00,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:00,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5495 5477 5442 [WARNING|trainer.py:803] 2025-04-26 16:57:01,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:01,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:57:01,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5496 5478 5443 [WARNING|trainer.py:803] 2025-04-26 16:57:02,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:03,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:03,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5497 5479 5444 [WARNING|trainer.py:803] 2025-04-26 16:57:04,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:04,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:04,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5498 5480 5445 [WARNING|trainer.py:803] 2025-04-26 16:57:06,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:06,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:06,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5499 5481 5446 [WARNING|trainer.py:803] 2025-04-26 16:57:07,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:07,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:08,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5482 5500 5447 [WARNING|trainer.py:803] 2025-04-26 16:57:09,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:09,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:09,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5501 5483 5448 [WARNING|trainer.py:803] 2025-04-26 16:57:10,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:10,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:11,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5502 5484 5449 [WARNING|trainer.py:803] 2025-04-26 16:57:12,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:12,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:12,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5503 5485 5450 [WARNING|trainer.py:803] 2025-04-26 16:57:13,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:13,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5504 [WARNING|trainer.py:803] 2025-04-26 16:57:14,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5486 5451 [WARNING|trainer.py:803] 2025-04-26 16:57:14,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:15,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5505 [WARNING|trainer.py:803] 2025-04-26 16:57:15,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5487 [WARNING|trainer.py:803] 2025-04-26 16:57:16,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5452 [WARNING|trainer.py:803] 2025-04-26 16:57:16,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5506 [WARNING|trainer.py:803] 2025-04-26 16:57:17,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5488 [WARNING|trainer.py:803] 2025-04-26 16:57:17,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5453 [WARNING|trainer.py:803] 2025-04-26 16:57:18,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5507 5489 [WARNING|trainer.py:803] 2025-04-26 16:57:18,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:19,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5454 [WARNING|trainer.py:803] 2025-04-26 16:57:19,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5508 5490 [WARNING|trainer.py:803] 2025-04-26 16:57:20,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:20,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5455 [WARNING|trainer.py:803] 2025-04-26 16:57:21,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5509 5491 [WARNING|trainer.py:803] 2025-04-26 16:57:21,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:22,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5456 [WARNING|trainer.py:803] 2025-04-26 16:57:22,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5510 5492 [WARNING|trainer.py:803] 2025-04-26 16:57:23,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:23,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5457 5511 [WARNING|trainer.py:803] 2025-04-26 16:57:24,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5493 [WARNING|trainer.py:803] 2025-04-26 16:57:24,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:24,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5512 5458 [WARNING|trainer.py:803] 2025-04-26 16:57:25,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5494 [WARNING|trainer.py:803] 2025-04-26 16:57:26,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:26,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5513 5459 [WARNING|trainer.py:803] 2025-04-26 16:57:27,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:27,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5495 [WARNING|trainer.py:803] 2025-04-26 16:57:27,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5514 5460 [WARNING|trainer.py:803] 2025-04-26 16:57:28,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:29,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5496 [WARNING|trainer.py:803] 2025-04-26 16:57:29,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5515 5461 [WARNING|trainer.py:803] 2025-04-26 16:57:30,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:30,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:30,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5497 5516 5462 [WARNING|trainer.py:803] 2025-04-26 16:57:31,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:32,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:32,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5498 5517 5463 [WARNING|trainer.py:803] 2025-04-26 16:57:33,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:33,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:33,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5499 5518 5464 [WARNING|trainer.py:803] 2025-04-26 16:57:34,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:35,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:35,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5500 5519 5465 [WARNING|trainer.py:803] 2025-04-26 16:57:36,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:36,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:57:36,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5501 5520 5466 [WARNING|trainer.py:803] 2025-04-26 16:57:37,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:38,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:38,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5502 5521 5467 [WARNING|trainer.py:803] 2025-04-26 16:57:39,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:39,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:39,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5503 5522 5468 [WARNING|trainer.py:803] 2025-04-26 16:57:40,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:57:40,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:41,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5504 5523 5469 [WARNING|trainer.py:803] 2025-04-26 16:57:42,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:42,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5505 [WARNING|trainer.py:803] 2025-04-26 16:57:42,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5524 5470 [WARNING|trainer.py:803] 2025-04-26 16:57:43,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:43,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5506 5525 [WARNING|trainer.py:803] 2025-04-26 16:57:44,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5471 [WARNING|trainer.py:803] 2025-04-26 16:57:45,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:45,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5507 [WARNING|trainer.py:803] 2025-04-26 16:57:45,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5526 [WARNING|trainer.py:803] 2025-04-26 16:57:46,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5472 [WARNING|trainer.py:803] 2025-04-26 16:57:46,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5508 5527 [WARNING|trainer.py:803] 2025-04-26 16:57:47,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:48,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5473 [WARNING|trainer.py:803] 2025-04-26 16:57:48,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5509 5528 [WARNING|trainer.py:803] 2025-04-26 16:57:48,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:49,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5474 [WARNING|trainer.py:803] 2025-04-26 16:57:49,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5510 5529 [WARNING|trainer.py:803] 2025-04-26 16:57:50,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:50,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:51,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5475 5511 5530 [WARNING|trainer.py:803] 2025-04-26 16:57:52,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:52,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:52,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5476 5512 5531 [WARNING|trainer.py:803] 2025-04-26 16:57:53,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:53,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:54,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5477 5513 5532 [WARNING|trainer.py:803] 2025-04-26 16:57:55,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:55,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:55,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5478 5514 5533 [WARNING|trainer.py:803] 2025-04-26 16:57:56,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:57:56,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:56,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5479 5515 5534 [WARNING|trainer.py:803] 2025-04-26 16:57:58,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:58,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:57:58,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5516 5480 5535 [WARNING|trainer.py:803] 2025-04-26 16:57:59,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:57:59,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:57:59,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5517 5481 5536 [WARNING|trainer.py:803] 2025-04-26 16:58:00,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:01,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:01,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5518 5482 5537 [WARNING|trainer.py:803] 2025-04-26 16:58:02,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:58:02,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:58:02,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5519 5483 5538 [WARNING|trainer.py:803] 2025-04-26 16:58:03,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:04,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:04,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5520 5484 [WARNING|trainer.py:803] 2025-04-26 16:58:05,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5539 [WARNING|trainer.py:803] 2025-04-26 16:58:05,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5521 [WARNING|trainer.py:803] 2025-04-26 16:58:06,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5485 [WARNING|trainer.py:803] 2025-04-26 16:58:06,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5540 [WARNING|trainer.py:803] 2025-04-26 16:58:07,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5522 5486 [WARNING|trainer.py:803] 2025-04-26 16:58:08,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:08,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:58:08,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5523 5541 5487 [WARNING|trainer.py:803] 2025-04-26 16:58:09,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:09,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:10,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5524 5542 5488 [WARNING|trainer.py:803] 2025-04-26 16:58:11,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:11,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:11,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5525 5489 5543 [WARNING|trainer.py:803] 2025-04-26 16:58:12,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:13,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5526 [WARNING|trainer.py:803] 2025-04-26 16:58:13,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5490 [WARNING|trainer.py:803] 2025-04-26 16:58:14,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5544 [WARNING|trainer.py:803] 2025-04-26 16:58:14,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5527 [WARNING|trainer.py:803] 2025-04-26 16:58:15,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5491 [WARNING|trainer.py:803] 2025-04-26 16:58:15,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5545 5528 [WARNING|trainer.py:803] 2025-04-26 16:58:16,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:16,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5492 [WARNING|trainer.py:803] 2025-04-26 16:58:16,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5529 [WARNING|trainer.py:803] 2025-04-26 16:58:17,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5546 5493 [WARNING|trainer.py:803] 2025-04-26 16:58:18,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:18,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5530 [WARNING|trainer.py:803] 2025-04-26 16:58:19,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5547 [WARNING|trainer.py:803] 2025-04-26 16:58:19,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5494 [WARNING|trainer.py:803] 2025-04-26 16:58:20,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 5531 YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:20,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:21,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5495 5548 5532 [WARNING|trainer.py:803] 2025-04-26 16:58:22,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:22,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:22,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5496 5549 5533 [WARNING|trainer.py:803] 2025-04-26 16:58:23,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:23,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:24,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5497 5550 5534 [WARNING|trainer.py:803] 2025-04-26 16:58:25,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:25,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:25,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5498 5551 5535 [WARNING|trainer.py:803] 2025-04-26 16:58:26,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:58:26,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:27,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5499 5552 5536 [WARNING|trainer.py:803] 2025-04-26 16:58:28,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:28,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:28,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5500 5553 5537 [WARNING|trainer.py:803] 2025-04-26 16:58:29,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:30,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:30,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5501 5554 5538 [WARNING|trainer.py:803] 2025-04-26 16:58:31,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:31,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:31,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5502 5555 5539 [WARNING|trainer.py:803] 2025-04-26 16:58:32,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:33,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5503 [WARNING|trainer.py:803] 2025-04-26 16:58:33,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5556 [WARNING|trainer.py:803] 2025-04-26 16:58:34,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:58:34,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5540 5504 5557 [WARNING|trainer.py:803] 2025-04-26 16:58:35,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:35,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:36,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5541 5505 5558 [WARNING|trainer.py:803] 2025-04-26 16:58:37,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:37,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:37,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5506 5542 5559 [WARNING|trainer.py:803] 2025-04-26 16:58:38,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:38,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:39,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5507 5560 5543 [WARNING|trainer.py:803] 2025-04-26 16:58:40,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 16:58:40,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:40,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5508 5561 [WARNING|trainer.py:803] 2025-04-26 16:58:41,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5544 [WARNING|trainer.py:803] 2025-04-26 16:58:42,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5509 [WARNING|trainer.py:803] 2025-04-26 16:58:42,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5562 [WARNING|trainer.py:803] 2025-04-26 16:58:43,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5545 5510 [WARNING|trainer.py:803] 2025-04-26 16:58:43,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:58:44,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5563 [WARNING|trainer.py:803] 2025-04-26 16:58:44,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5511 5546 [WARNING|trainer.py:803] 2025-04-26 16:58:45,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:45,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5564 [WARNING|trainer.py:803] 2025-04-26 16:58:46,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5512 [WARNING|trainer.py:803] 2025-04-26 16:58:46,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5547 [WARNING|trainer.py:803] 2025-04-26 16:58:47,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5565 5513 [WARNING|trainer.py:803] 2025-04-26 16:58:47,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:48,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:48,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5548 5566 5514 [WARNING|trainer.py:803] 2025-04-26 16:58:49,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:49,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:50,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5549 5567 5515 [WARNING|trainer.py:803] 2025-04-26 16:58:51,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:51,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:51,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5550 5568 5516 [WARNING|trainer.py:803] 2025-04-26 16:58:52,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:53,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:53,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5551 5517 5569 [WARNING|trainer.py:803] 2025-04-26 16:58:54,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:54,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:54,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5552 5570 5518 [WARNING|trainer.py:803] 2025-04-26 16:58:55,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:56,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:58:56,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5553 5571 5519 [WARNING|trainer.py:803] 2025-04-26 16:58:57,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:57,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:58:57,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5554 5572 5520 [WARNING|trainer.py:803] 2025-04-26 16:58:58,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:58:59,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:58:59,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5555 5573 5521 [WARNING|trainer.py:803] 2025-04-26 16:59:00,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:00,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:00,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5556 5574 5522 [WARNING|trainer.py:803] 2025-04-26 16:59:01,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:02,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:59:02,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5557 5575 5523 [WARNING|trainer.py:803] 2025-04-26 16:59:03,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:59:03,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:03,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5576 5524 5558 [WARNING|trainer.py:803] 2025-04-26 16:59:05,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:05,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:05,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5525 5577 5559 [WARNING|trainer.py:803] 2025-04-26 16:59:06,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:06,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:06,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5526 5578 5560 [WARNING|trainer.py:803] 2025-04-26 16:59:07,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:08,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:08,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5527 5561 5579 [WARNING|trainer.py:803] 2025-04-26 16:59:09,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:09,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:09,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5528 5562 5580 [WARNING|trainer.py:803] 2025-04-26 16:59:10,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:11,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:59:11,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5529 5563 5581 [WARNING|trainer.py:803] 2025-04-26 16:59:12,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:12,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:12,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5530 5582 5564 [WARNING|trainer.py:803] 2025-04-26 16:59:13,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5531 [WARNING|trainer.py:803] 2025-04-26 16:59:14,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:14,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5583 5565 [WARNING|trainer.py:803] 2025-04-26 16:59:15,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5532 [WARNING|trainer.py:803] 2025-04-26 16:59:15,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:59:15,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5566 5584 [WARNING|trainer.py:803] 2025-04-26 16:59:16,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5533 [WARNING|trainer.py:803] 2025-04-26 16:59:17,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:17,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:18,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5567 5585 5534 [WARNING|trainer.py:803] 2025-04-26 16:59:18,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:18,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:19,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5586 5568 5535 [WARNING|trainer.py:803] 2025-04-26 16:59:20,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:20,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:20,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5587 5569 5536 [WARNING|trainer.py:803] 2025-04-26 16:59:21,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:21,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:22,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5570 5588 5537 [WARNING|trainer.py:803] 2025-04-26 16:59:23,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:23,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:59:23,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5571 5589 5538 [WARNING|trainer.py:803] 2025-04-26 16:59:24,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:59:25,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:25,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5572 5590 [WARNING|trainer.py:803] 2025-04-26 16:59:26,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5539 [WARNING|trainer.py:803] 2025-04-26 16:59:26,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5573 5591 [WARNING|trainer.py:803] 2025-04-26 16:59:27,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:27,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:27,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5540 5574 5592 [WARNING|trainer.py:803] 2025-04-26 16:59:29,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:29,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:59:29,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5575 5541 5593 [WARNING|trainer.py:803] 2025-04-26 16:59:30,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:30,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:31,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5576 5542 5594 [WARNING|trainer.py:803] 2025-04-26 16:59:32,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:32,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:32,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5577 5595 5543 [WARNING|trainer.py:803] 2025-04-26 16:59:33,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:34,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:34,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5578 5596 5544 [WARNING|trainer.py:803] 2025-04-26 16:59:35,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:35,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5579 [WARNING|trainer.py:803] 2025-04-26 16:59:36,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5597 [WARNING|trainer.py:803] 2025-04-26 16:59:37,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5545 [WARNING|trainer.py:803] 2025-04-26 16:59:37,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5580 [WARNING|trainer.py:803] 2025-04-26 16:59:37,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5598 [WARNING|trainer.py:803] 2025-04-26 16:59:38,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5546 [WARNING|trainer.py:803] 2025-04-26 16:59:38,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5581 5599 [WARNING|trainer.py:803] 2025-04-26 16:59:39,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 16:59:40,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 16:59:40,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5547 5582 5600 [WARNING|trainer.py:803] 2025-04-26 16:59:41,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:41,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:41,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5583 5548 5601 [WARNING|trainer.py:803] 2025-04-26 16:59:43,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:59:43,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:43,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5584 5549 5602 [WARNING|trainer.py:803] 2025-04-26 16:59:44,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:44,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:44,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5585 5603 5550 [WARNING|trainer.py:803] 2025-04-26 16:59:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:46,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:46,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5586 5604 5551 [WARNING|trainer.py:803] 2025-04-26 16:59:47,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:47,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:47,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5605 5587 5552 [WARNING|trainer.py:803] 2025-04-26 16:59:49,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:49,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:49,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5606 5588 5553 [WARNING|trainer.py:803] 2025-04-26 16:59:50,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:50,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 16:59:51,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5607 5589 5554 [WARNING|trainer.py:803] 2025-04-26 16:59:52,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:52,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:52,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5608 5590 5555 [WARNING|trainer.py:803] 2025-04-26 16:59:53,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:54,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:54,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5609 5591 5556 [WARNING|trainer.py:803] 2025-04-26 16:59:55,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:55,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 16:59:55,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5610 5592 5557 [WARNING|trainer.py:803] 2025-04-26 16:59:56,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 16:59:57,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 16:59:57,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5611 5593 5558 [WARNING|trainer.py:803] 2025-04-26 16:59:58,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 16:59:58,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5612 [WARNING|trainer.py:803] 2025-04-26 16:59:58,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5594 5559 [WARNING|trainer.py:803] 2025-04-26 16:59:59,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5613 [WARNING|trainer.py:803] 2025-04-26 17:00:00,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:00:00,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5595 [WARNING|trainer.py:803] 2025-04-26 17:00:00,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5560 [WARNING|trainer.py:803] 2025-04-26 17:00:01,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5614 [WARNING|trainer.py:803] 2025-04-26 17:00:01,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5596 [WARNING|trainer.py:803] 2025-04-26 17:00:02,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5561 5615 [WARNING|trainer.py:803] 2025-04-26 17:00:03,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:03,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5597 [WARNING|trainer.py:803] 2025-04-26 17:00:03,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5562 5616 [WARNING|trainer.py:803] 2025-04-26 17:00:04,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:04,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:00:05,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5598 5563 5617 [WARNING|trainer.py:803] 2025-04-26 17:00:06,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:00:06,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5599 [WARNING|trainer.py:803] 2025-04-26 17:00:06,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5564 5618 [WARNING|trainer.py:803] 2025-04-26 17:00:07,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:00:08,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5600 [WARNING|trainer.py:803] 2025-04-26 17:00:08,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5565 5619 [WARNING|trainer.py:803] 2025-04-26 17:00:09,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:09,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5601 [WARNING|trainer.py:803] 2025-04-26 17:00:09,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5566 5620 [WARNING|trainer.py:803] 2025-04-26 17:00:10,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:11,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5602 [WARNING|trainer.py:803] 2025-04-26 17:00:11,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5567 5621 [WARNING|trainer.py:803] 2025-04-26 17:00:12,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:12,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5603 [WARNING|trainer.py:803] 2025-04-26 17:00:12,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5568 5622 [WARNING|trainer.py:803] 2025-04-26 17:00:13,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:14,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5604 [WARNING|trainer.py:803] 2025-04-26 17:00:14,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5569 5623 [WARNING|trainer.py:803] 2025-04-26 17:00:15,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5605 [WARNING|trainer.py:803] 2025-04-26 17:00:15,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:00:15,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5570 5624 [WARNING|trainer.py:803] 2025-04-26 17:00:16,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5606 [WARNING|trainer.py:803] 2025-04-26 17:00:17,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:17,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5571 5625 [WARNING|trainer.py:803] 2025-04-26 17:00:18,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5607 [WARNING|trainer.py:803] 2025-04-26 17:00:18,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:18,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5572 5626 [WARNING|trainer.py:803] 2025-04-26 17:00:19,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5608 [WARNING|trainer.py:803] 2025-04-26 17:00:20,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:20,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5573 5627 [WARNING|trainer.py:803] 2025-04-26 17:00:21,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5609 [WARNING|trainer.py:803] 2025-04-26 17:00:21,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:21,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5574 5628 [WARNING|trainer.py:803] 2025-04-26 17:00:22,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:23,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5610 [WARNING|trainer.py:803] 2025-04-26 17:00:23,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5575 5629 [WARNING|trainer.py:803] 2025-04-26 17:00:24,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:00:24,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5611 [WARNING|trainer.py:803] 2025-04-26 17:00:24,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5630 5576 [WARNING|trainer.py:803] 2025-04-26 17:00:25,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5612 [WARNING|trainer.py:803] 2025-04-26 17:00:26,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:26,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5631 5577 [WARNING|trainer.py:803] 2025-04-26 17:00:27,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5613 [WARNING|trainer.py:803] 2025-04-26 17:00:27,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:27,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5632 [WARNING|trainer.py:803] 2025-04-26 17:00:28,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5578 5614 [WARNING|trainer.py:803] 2025-04-26 17:00:29,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:29,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5633 [WARNING|trainer.py:803] 2025-04-26 17:00:29,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5579 5615 [WARNING|trainer.py:803] 2025-04-26 17:00:30,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:30,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5634 [WARNING|trainer.py:803] 2025-04-26 17:00:31,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5580 5616 [WARNING|trainer.py:803] 2025-04-26 17:00:32,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:32,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5635 [WARNING|trainer.py:803] 2025-04-26 17:00:32,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5581 [WARNING|trainer.py:803] 2025-04-26 17:00:33,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5617 [WARNING|trainer.py:803] 2025-04-26 17:00:34,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5636 [WARNING|trainer.py:803] 2025-04-26 17:00:34,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5582 [WARNING|trainer.py:803] 2025-04-26 17:00:35,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5618 [WARNING|trainer.py:803] 2025-04-26 17:00:35,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5637 [WARNING|trainer.py:803] 2025-04-26 17:00:35,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5583 [WARNING|trainer.py:803] 2025-04-26 17:00:36,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5619 [WARNING|trainer.py:803] 2025-04-26 17:00:37,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5638 [WARNING|trainer.py:803] 2025-04-26 17:00:37,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5584 [WARNING|trainer.py:803] 2025-04-26 17:00:38,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5620 [WARNING|trainer.py:803] 2025-04-26 17:00:38,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5639 [WARNING|trainer.py:803] 2025-04-26 17:00:39,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5585 [WARNING|trainer.py:803] 2025-04-26 17:00:39,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5621 5640 [WARNING|trainer.py:803] 2025-04-26 17:00:40,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:40,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5586 [WARNING|trainer.py:803] 2025-04-26 17:00:40,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5622 5641 [WARNING|trainer.py:803] 2025-04-26 17:00:41,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:42,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:42,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5587 5623 5642 [WARNING|trainer.py:803] 2025-04-26 17:00:43,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:43,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:00:43,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5588 5624 5643 [WARNING|trainer.py:803] 2025-04-26 17:00:44,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:44,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:45,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5625 5589 5644 [WARNING|trainer.py:803] 2025-04-26 17:00:46,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:46,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:46,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5626 5590 5645 [WARNING|trainer.py:803] 2025-04-26 17:00:47,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:48,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:48,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5627 5591 5646 [WARNING|trainer.py:803] 2025-04-26 17:00:49,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:49,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:00:49,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5628 5592 5647 [WARNING|trainer.py:803] 2025-04-26 17:00:50,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:51,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:51,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5629 5593 5648 [WARNING|trainer.py:803] 2025-04-26 17:00:52,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:52,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:00:52,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5630 5649 5594 [WARNING|trainer.py:803] 2025-04-26 17:00:53,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:00:54,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:54,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5631 5650 5595 [WARNING|trainer.py:803] 2025-04-26 17:00:55,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:55,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:00:55,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5632 5651 5596 [WARNING|trainer.py:803] 2025-04-26 17:00:56,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:00:57,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5633 [WARNING|trainer.py:803] 2025-04-26 17:00:57,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5652 [WARNING|trainer.py:803] 2025-04-26 17:00:58,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5597 [WARNING|trainer.py:803] 2025-04-26 17:00:58,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5634 [WARNING|trainer.py:803] 2025-04-26 17:00:58,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5653 5598 [WARNING|trainer.py:803] 2025-04-26 17:00:59,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:00,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5635 [WARNING|trainer.py:803] 2025-04-26 17:01:00,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5654 5599 [WARNING|trainer.py:803] 2025-04-26 17:01:01,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:01,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5636 [WARNING|trainer.py:803] 2025-04-26 17:01:01,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5655 5600 [WARNING|trainer.py:803] 2025-04-26 17:01:02,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:02,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:03,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5637 5656 5601 [WARNING|trainer.py:803] 2025-04-26 17:01:04,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:04,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5638 [WARNING|trainer.py:803] 2025-04-26 17:01:04,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5657 5602 [WARNING|trainer.py:803] 2025-04-26 17:01:05,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:05,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5639 [WARNING|trainer.py:803] 2025-04-26 17:01:06,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5658 5603 [WARNING|trainer.py:803] 2025-04-26 17:01:07,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:07,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5640 [WARNING|trainer.py:803] 2025-04-26 17:01:07,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5659 5604 [WARNING|trainer.py:803] 2025-04-26 17:01:08,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:08,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5641 [WARNING|trainer.py:803] 2025-04-26 17:01:09,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5660 5605 [WARNING|trainer.py:803] 2025-04-26 17:01:10,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:10,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5642 [WARNING|trainer.py:803] 2025-04-26 17:01:10,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5661 5606 [WARNING|trainer.py:803] 2025-04-26 17:01:11,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:01:11,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:12,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5643 5662 5607 [WARNING|trainer.py:803] 2025-04-26 17:01:13,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:13,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:13,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5663 5644 5608 [WARNING|trainer.py:803] 2025-04-26 17:01:14,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:14,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:15,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5645 5664 5609 [WARNING|trainer.py:803] 2025-04-26 17:01:16,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:16,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:16,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5665 5646 5610 [WARNING|trainer.py:803] 2025-04-26 17:01:17,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:17,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5666 [WARNING|trainer.py:803] 2025-04-26 17:01:18,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5647 5611 [WARNING|trainer.py:803] 2025-04-26 17:01:19,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:19,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:19,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5667 5648 5612 [WARNING|trainer.py:803] 2025-04-26 17:01:20,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:20,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:21,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5668 5649 5613 [WARNING|trainer.py:803] 2025-04-26 17:01:21,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:22,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5669 [WARNING|trainer.py:803] 2025-04-26 17:01:22,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5650 5614 [WARNING|trainer.py:803] 2025-04-26 17:01:23,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:23,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5670 [WARNING|trainer.py:803] 2025-04-26 17:01:24,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5651 5615 [WARNING|trainer.py:803] 2025-04-26 17:01:24,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:25,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5671 [WARNING|trainer.py:803] 2025-04-26 17:01:25,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5652 5616 [WARNING|trainer.py:803] 2025-04-26 17:01:26,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:26,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5672 [WARNING|trainer.py:803] 2025-04-26 17:01:27,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5653 5617 [WARNING|trainer.py:803] 2025-04-26 17:01:27,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:28,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5673 [WARNING|trainer.py:803] 2025-04-26 17:01:28,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5654 5618 [WARNING|trainer.py:803] 2025-04-26 17:01:29,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:29,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5674 [WARNING|trainer.py:803] 2025-04-26 17:01:30,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5655 5619 [WARNING|trainer.py:803] 2025-04-26 17:01:30,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:30,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5675 [WARNING|trainer.py:803] 2025-04-26 17:01:31,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5656 5620 [WARNING|trainer.py:803] 2025-04-26 17:01:32,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:32,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5676 5657 [WARNING|trainer.py:803] 2025-04-26 17:01:33,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5621 [WARNING|trainer.py:803] 2025-04-26 17:01:33,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:33,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5677 [WARNING|trainer.py:803] 2025-04-26 17:01:34,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5658 5622 [WARNING|trainer.py:803] 2025-04-26 17:01:35,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:35,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5678 [WARNING|trainer.py:803] 2025-04-26 17:01:36,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5659 5623 [WARNING|trainer.py:803] 2025-04-26 17:01:36,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:36,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5679 [WARNING|trainer.py:803] 2025-04-26 17:01:37,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5660 5624 [WARNING|trainer.py:803] 2025-04-26 17:01:38,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:38,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5680 [WARNING|trainer.py:803] 2025-04-26 17:01:38,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5661 5625 [WARNING|trainer.py:803] 2025-04-26 17:01:39,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:39,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5681 [WARNING|trainer.py:803] 2025-04-26 17:01:40,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5662 5626 [WARNING|trainer.py:803] 2025-04-26 17:01:41,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:41,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5682 [WARNING|trainer.py:803] 2025-04-26 17:01:41,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5663 5627 [WARNING|trainer.py:803] 2025-04-26 17:01:42,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:42,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5683 [WARNING|trainer.py:803] 2025-04-26 17:01:43,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5664 5628 [WARNING|trainer.py:803] 2025-04-26 17:01:44,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:44,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5684 [WARNING|trainer.py:803] 2025-04-26 17:01:44,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5665 5629 [WARNING|trainer.py:803] 2025-04-26 17:01:45,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:45,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5685 [WARNING|trainer.py:803] 2025-04-26 17:01:46,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5666 5630 [WARNING|trainer.py:803] 2025-04-26 17:01:47,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:47,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:47,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5686 5667 5631 [WARNING|trainer.py:803] 2025-04-26 17:01:48,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:48,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:49,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5668 5687 5632 [WARNING|trainer.py:803] 2025-04-26 17:01:50,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:01:50,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5688 5669 [WARNING|trainer.py:803] 2025-04-26 17:01:50,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5633 [WARNING|trainer.py:803] 2025-04-26 17:01:51,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:51,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5689 5670 [WARNING|trainer.py:803] 2025-04-26 17:01:52,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5634 [WARNING|trainer.py:803] 2025-04-26 17:01:53,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:01:53,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:01:53,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 5671 NoYes 5690 5635 [WARNING|trainer.py:803] 2025-04-26 17:01:54,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:01:54,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5672 5691 [WARNING|trainer.py:803] 2025-04-26 17:01:55,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5636 [WARNING|trainer.py:803] 2025-04-26 17:01:56,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:01:56,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5692 5673 [WARNING|trainer.py:803] 2025-04-26 17:01:56,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5637 [WARNING|trainer.py:803] 2025-04-26 17:01:57,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:57,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5693 5674 [WARNING|trainer.py:803] 2025-04-26 17:01:58,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5638 [WARNING|trainer.py:803] 2025-04-26 17:01:58,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:01:59,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5694 [WARNING|trainer.py:803] 2025-04-26 17:01:59,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5675 5639 [WARNING|trainer.py:803] 2025-04-26 17:02:00,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:00,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5695 [WARNING|trainer.py:803] 2025-04-26 17:02:01,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5676 5640 [WARNING|trainer.py:803] 2025-04-26 17:02:01,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:02,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5696 [WARNING|trainer.py:803] 2025-04-26 17:02:02,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5677 [WARNING|trainer.py:803] 2025-04-26 17:02:03,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [mov,mp4,m4a,3gp,3g2,mj2 @ 0x8975f900] moov atom not found [17:02:03] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k Traceback (most recent call last): File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__ ret = self.video_get_item(data_item) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item sampled_video = self.load_video_fast(video_path) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast video_reader = VideoReader(video_path) File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__ raise RuntimeError("Error reading " + uri + "...") RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k 5641 [WARNING|trainer.py:803] 2025-04-26 17:02:03,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5697 [WARNING|trainer.py:803] 2025-04-26 17:02:04,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5678 [WARNING|trainer.py:803] 2025-04-26 17:02:04,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5642 [WARNING|trainer.py:803] 2025-04-26 17:02:05,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5698 [WARNING|trainer.py:803] 2025-04-26 17:02:05,634 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :Yes 5679 [WARNING|trainer.py:803] 2025-04-26 17:02:06,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5643 [WARNING|trainer.py:803] 2025-04-26 17:02:06,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5699 [WARNING|trainer.py:803] 2025-04-26 17:02:07,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5680 [WARNING|trainer.py:803] 2025-04-26 17:02:07,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5644 [WARNING|trainer.py:803] 2025-04-26 17:02:08,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5700 [WARNING|trainer.py:803] 2025-04-26 17:02:08,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5681 [WARNING|trainer.py:803] 2025-04-26 17:02:09,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5645 [WARNING|trainer.py:803] 2025-04-26 17:02:09,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:10,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5682 5701 5646 [WARNING|trainer.py:803] 2025-04-26 17:02:11,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:11,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:11,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5683 5647 5702 [WARNING|trainer.py:803] 2025-04-26 17:02:12,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:12,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5684 [WARNING|trainer.py:803] 2025-04-26 17:02:13,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5648 [WARNING|trainer.py:803] 2025-04-26 17:02:14,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:14,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5685 5703 5649 [WARNING|trainer.py:803] 2025-04-26 17:02:15,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:15,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:15,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5686 5650 5704 [WARNING|trainer.py:803] 2025-04-26 17:02:17,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:17,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5687 [WARNING|trainer.py:803] 2025-04-26 17:02:17,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5651 [WARNING|trainer.py:803] 2025-04-26 17:02:18,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:18,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5705 5688 5652 [WARNING|trainer.py:803] 2025-04-26 17:02:20,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:20,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:20,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5689 5653 5706 [WARNING|trainer.py:803] 2025-04-26 17:02:21,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:21,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:22,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5690 5654 [WARNING|trainer.py:803] 2025-04-26 17:02:23,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:23,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5707 5691 5655 [WARNING|trainer.py:803] 2025-04-26 17:02:24,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:24,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:02:24,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5656 5708 5692 [WARNING|trainer.py:803] 2025-04-26 17:02:26,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:26,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:26,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5693 5657 5709
[WARNING|trainer.py:803] 2025-04-26 17:02:27,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[WARNING|trainer.py:803] 2025-04-26 17:02:27,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
[WARNING|trainer.py:803] 2025-04-26 17:02:28,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo 5694 5658
[WARNING|trainer.py:803] 2025-04-26 17:02:29,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[WARNING|trainer.py:803] 2025-04-26 17:02:29,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
Yes 5710 5695 5659
[WARNING|trainer.py:803] 2025-04-26 17:02:30,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes
[WARNING|trainer.py:803] 2025-04-26 17:02:30,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes
[WARNING|trainer.py:803] 2025-04-26 17:02:30,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 5696 5660 5711
[WARNING|trainer.py:803] 2025-04-26 17:02:32,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x3b93e0c0] moov atom not found
[17:02:32] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
[WARNING|trainer.py:803] 2025-04-26 17:02:32,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
Yes
[WARNING|trainer.py:803] 2025-04-26 17:02:32,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes 5697 5661
[WARNING|trainer.py:803] 2025-04-26 17:02:33,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes 5712
[WARNING|trainer.py:803] 2025-04-26 17:02:33,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 5662 5698
[WARNING|trainer.py:803] 2025-04-26 17:02:34,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes
[WARNING|trainer.py:803] 2025-04-26 17:02:35,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
[WARNING|trainer.py:803] 2025-04-26 17:02:35,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo 5663 5713 5699
[WARNING|trainer.py:803] 2025-04-26 17:02:36,501 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:36,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:36,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5664 5700 5714 [WARNING|trainer.py:803] 2025-04-26 17:02:38,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:38,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5665 [WARNING|trainer.py:803] 2025-04-26 17:02:38,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5701 [WARNING|trainer.py:803] 2025-04-26 17:02:39,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5715 5666 [WARNING|trainer.py:803] 2025-04-26 17:02:40,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:40,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:40,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5702 5667 5716 [WARNING|trainer.py:803] 2025-04-26 17:02:42,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:42,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:42,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5668 5703 [WARNING|trainer.py:803] 2025-04-26 17:02:43,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5717 5669 [WARNING|trainer.py:803] 2025-04-26 17:02:44,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:44,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:45,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5704 5670 5718 [WARNING|trainer.py:803] 2025-04-26 17:02:46,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:46,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:02:47,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5671 5705 5719 [WARNING|trainer.py:803] 2025-04-26 17:02:48,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:48,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5672 [WARNING|trainer.py:803] 2025-04-26 17:02:49,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:49,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5706 5720 5673 [WARNING|trainer.py:803] 2025-04-26 17:02:50,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:51,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:51,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5707 5674 5721 [WARNING|trainer.py:803] 2025-04-26 17:02:52,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:52,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:53,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5675 5708 [WARNING|trainer.py:803] 2025-04-26 17:02:54,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5722 [WARNING|trainer.py:803] 2025-04-26 17:02:54,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5676 [WARNING|trainer.py:803] 2025-04-26 17:02:55,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:02:55,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5709 5677 5723 [WARNING|trainer.py:803] 2025-04-26 17:02:56,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:02:57,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:02:57,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5678 5710 [WARNING|trainer.py:803] 2025-04-26 17:02:58,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5724 [WARNING|trainer.py:803] 2025-04-26 17:02:59,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5679 [WARNING|trainer.py:803] 2025-04-26 17:02:59,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5711 [WARNING|trainer.py:803] 2025-04-26 17:03:00,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5725 5680 [WARNING|trainer.py:803] 2025-04-26 17:03:01,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:01,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:01,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5712 5681 5726 [WARNING|trainer.py:803] 2025-04-26 17:03:03,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:03,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:03,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5682 5713 [WARNING|trainer.py:803] 2025-04-26 17:03:04,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5727 [WARNING|trainer.py:803] 2025-04-26 17:03:05,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5683 [WARNING|trainer.py:803] 2025-04-26 17:03:05,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:06,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5714 5684 5728 [WARNING|trainer.py:803] 2025-04-26 17:03:07,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:07,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:08,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5685 5715 5729 [WARNING|trainer.py:803] 2025-04-26 17:03:09,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:09,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:10,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5686 5716 [WARNING|trainer.py:803] 2025-04-26 17:03:11,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5730 [WARNING|trainer.py:803] 2025-04-26 17:03:11,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5687 [WARNING|trainer.py:803] 2025-04-26 17:03:12,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:12,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5717 5688 5731 [WARNING|trainer.py:803] 2025-04-26 17:03:13,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:14,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:14,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5689 5718 5732 [WARNING|trainer.py:803] 2025-04-26 17:03:15,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:15,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5690 [WARNING|trainer.py:803] 2025-04-26 17:03:16,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:17,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5719 5733 5691 [WARNING|trainer.py:803] 2025-04-26 17:03:17,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:18,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:18,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5720 5692 5734 [WARNING|trainer.py:803] 2025-04-26 17:03:19,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:19,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:20,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5693 5721 [WARNING|trainer.py:803] 2025-04-26 17:03:21,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5735 [WARNING|trainer.py:803] 2025-04-26 17:03:21,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5694 [WARNING|trainer.py:803] 2025-04-26 17:03:22,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:22,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5722 5695 5736 [WARNING|trainer.py:803] 2025-04-26 17:03:24,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes
[WARNING|trainer.py:803] 2025-04-26 17:03:24,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes
[WARNING|trainer.py:803] 2025-04-26 17:03:24,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
YesYes 5696 5723 5737
[WARNING|trainer.py:803] 2025-04-26 17:03:25,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x6612ad40] moov atom not found
[17:03:25] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
[WARNING|trainer.py:803] 2025-04-26 17:03:26,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo 5697
[WARNING|trainer.py:803] 2025-04-26 17:03:26,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[WARNING|trainer.py:803] 2025-04-26 17:03:27,360 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. NoYes 5724 5698 5738 [WARNING|trainer.py:803] 2025-04-26 17:03:28,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:28,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:28,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5699 5725 5739 [WARNING|trainer.py:803] 2025-04-26 17:03:30,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:30,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:30,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5700 5726 [WARNING|trainer.py:803] 2025-04-26 17:03:31,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5740 [WARNING|trainer.py:803] 2025-04-26 17:03:32,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:33,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5701 [WARNING|trainer.py:803] 2025-04-26 17:03:33,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5727 5741 [WARNING|trainer.py:803] 2025-04-26 17:03:34,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:34,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5702 5728 5742 [WARNING|trainer.py:803] 2025-04-26 17:03:36,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:37,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:37,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5703 5729 5743 [WARNING|trainer.py:803] 2025-04-26 17:03:38,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:39,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:39,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5704 5730 5744 [WARNING|trainer.py:803] 2025-04-26 17:03:40,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:41,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:41,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5705 5731 5745 [WARNING|trainer.py:803] 2025-04-26 17:03:42,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:43,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:43,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5706 5732 5746 [WARNING|trainer.py:803] 2025-04-26 17:03:44,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:45,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:45,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5707 5733 [WARNING|trainer.py:803] 2025-04-26 17:03:46,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5747 [WARNING|trainer.py:803] 2025-04-26 17:03:47,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:47,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5708 5734 [WARNING|trainer.py:803] 2025-04-26 17:03:49,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5748 [WARNING|trainer.py:803] 2025-04-26 17:03:49,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:49,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5709 5735 5749 [WARNING|trainer.py:803] 2025-04-26 17:03:51,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:51,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:51,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5710 5736 5750 [WARNING|trainer.py:803] 2025-04-26 17:03:53,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:53,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:03:54,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5711 5737 5751 [WARNING|trainer.py:803] 2025-04-26 17:03:55,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:55,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:03:56,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5712 5738 5752 [WARNING|trainer.py:803] 2025-04-26 17:03:57,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:03:58,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:03:58,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5713 5739 5753 [WARNING|trainer.py:803] 2025-04-26 17:03:59,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:00,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:04:00,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5714 5740 5754 [WARNING|trainer.py:803] 2025-04-26 17:04:01,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:02,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:02,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5715 5741 5755 [WARNING|trainer.py:803] 2025-04-26 17:04:04,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:04,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:04,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5742 5716 5756 [WARNING|trainer.py:803] 2025-04-26 17:04:06,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:06,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:06,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5717 5757 5743 [WARNING|trainer.py:803] 2025-04-26 17:04:08,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:08,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:08,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5758 5718 5744 [WARNING|trainer.py:803] 2025-04-26 17:04:10,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:10,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:10,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5759 5745 5719 [WARNING|trainer.py:803] 2025-04-26 17:04:12,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:12,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:12,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5760 5720 5746 [WARNING|trainer.py:803] 2025-04-26 17:04:14,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:14,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:14,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5761 5721 5747 [WARNING|trainer.py:803] 2025-04-26 17:04:16,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:16,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:17,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5762 5722 5748 [WARNING|trainer.py:803] 2025-04-26 17:04:18,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:18,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:19,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5763 5723 5749 [WARNING|trainer.py:803] 2025-04-26 17:04:20,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:20,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:21,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5764 5724 5750 [WARNING|trainer.py:803] 2025-04-26 17:04:22,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:04:23,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:23,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5765 5725 5751 [WARNING|trainer.py:803] 2025-04-26 17:04:24,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:25,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:25,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5766 5726 5752 [WARNING|trainer.py:803] 2025-04-26 17:04:26,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:27,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:27,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5767 5727 5753 [WARNING|trainer.py:803] 2025-04-26 17:04:28,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:29,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:29,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5768 5754 [WARNING|trainer.py:803] 2025-04-26 17:04:30,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5728 [WARNING|trainer.py:803] 2025-04-26 17:04:31,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:31,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5769 [WARNING|trainer.py:803] 2025-04-26 17:04:32,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5755 5729 [WARNING|trainer.py:803] 2025-04-26 17:04:33,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:33,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5770 5756 [WARNING|trainer.py:803] 2025-04-26 17:04:34,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5730 [WARNING|trainer.py:803] 2025-04-26 17:04:35,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:35,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5771 5757 [WARNING|trainer.py:803] 2025-04-26 17:04:36,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5731 [WARNING|trainer.py:803] 2025-04-26 17:04:37,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:37,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5772 5758 5732 [WARNING|trainer.py:803] 2025-04-26 17:04:39,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:39,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:04:39,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5773 5759 5733 [WARNING|trainer.py:803] 2025-04-26 17:04:41,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:04:41,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:41,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5774 5760 [WARNING|trainer.py:803] 2025-04-26 17:04:43,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5734 [WARNING|trainer.py:803] 2025-04-26 17:04:43,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:43,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5775 5761 5735 [WARNING|trainer.py:803] 2025-04-26 17:04:45,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:45,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:46,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5776 5762 5736 [WARNING|trainer.py:803] 2025-04-26 17:04:47,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:47,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:48,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5777 5763 5737 [WARNING|trainer.py:803] 2025-04-26 17:04:49,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:49,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:50,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5778 5764 [WARNING|trainer.py:803] 2025-04-26 17:04:51,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5738 [WARNING|trainer.py:803] 2025-04-26 17:04:51,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:04:52,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5779 5765 [WARNING|trainer.py:803] 2025-04-26 17:04:53,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5739 [WARNING|trainer.py:803] 2025-04-26 17:04:53,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:04:54,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5766 5780 5740 [WARNING|trainer.py:803] 2025-04-26 17:04:55,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:04:55,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:56,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5781 5767 5741 [WARNING|trainer.py:803] 2025-04-26 17:04:57,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:04:58,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:04:58,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5782 5768 5742 [WARNING|trainer.py:803] 2025-04-26 17:04:59,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:00,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:00,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5769 5783 5743 [WARNING|trainer.py:803] 2025-04-26 17:05:02,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:02,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:02,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5770 5784 5744 [WARNING|trainer.py:803] 2025-04-26 17:05:04,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:04,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:04,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5771 5785 5745 [WARNING|trainer.py:803] 2025-04-26 17:05:06,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:06,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:05:06,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5772 5786 5746 [WARNING|trainer.py:803] 2025-04-26 17:05:08,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:08,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:09,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5773 5787 [WARNING|trainer.py:803] 2025-04-26 17:05:10,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5747 [WARNING|trainer.py:803] 2025-04-26 17:05:10,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:11,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5774 5788 [WARNING|trainer.py:803] 2025-04-26 17:05:12,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5748 [WARNING|trainer.py:803] 2025-04-26 17:05:13,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:13,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5775 5789 [WARNING|trainer.py:803] 2025-04-26 17:05:14,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5749 [WARNING|trainer.py:803] 2025-04-26 17:05:14,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:15,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5776 5790 [WARNING|trainer.py:803] 2025-04-26 17:05:16,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5750 [WARNING|trainer.py:803] 2025-04-26 17:05:16,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:17,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5777 5791 [WARNING|trainer.py:803] 2025-04-26 17:05:18,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5751 [WARNING|trainer.py:803] 2025-04-26 17:05:19,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:05:19,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5778 5792 [WARNING|trainer.py:803] 2025-04-26 17:05:20,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:21,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5752 [WARNING|trainer.py:803] 2025-04-26 17:05:22,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5779 5793 [WARNING|trainer.py:803] 2025-04-26 17:05:23,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5753 [WARNING|trainer.py:803] 2025-04-26 17:05:23,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:05:23,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5794 5780 5754 [WARNING|trainer.py:803] 2025-04-26 17:05:25,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:25,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:25,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5795 5781 5755 [WARNING|trainer.py:803] 2025-04-26 17:05:27,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:27,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:28,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5796 5782 5756 [WARNING|trainer.py:803] 2025-04-26 17:05:29,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:29,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:30,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5797 5783 5757 [WARNING|trainer.py:803] 2025-04-26 17:05:31,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:32,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:32,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5798 5758 5784 [WARNING|trainer.py:803] 2025-04-26 17:05:33,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:34,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:05:34,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5799 5759 5785 [WARNING|trainer.py:803] 2025-04-26 17:05:35,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:36,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:36,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5800 5760 [WARNING|trainer.py:803] 2025-04-26 17:05:37,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5786 [WARNING|trainer.py:803] 2025-04-26 17:05:38,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:38,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5801 5761 [WARNING|trainer.py:803] 2025-04-26 17:05:39,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:40,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5787 [WARNING|trainer.py:803] 2025-04-26 17:05:40,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5802 5762 [WARNING|trainer.py:803] 2025-04-26 17:05:41,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5788 [WARNING|trainer.py:803] 2025-04-26 17:05:42,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:42,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5763 5803 5789 [WARNING|trainer.py:803] 2025-04-26 17:05:44,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:44,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:44,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5764 5804 5790 [WARNING|trainer.py:803] 2025-04-26 17:05:46,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:46,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:46,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5765 5805 5791 [WARNING|trainer.py:803] 2025-04-26 17:05:48,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:48,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:48,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5766 5806 5792 [WARNING|trainer.py:803] 2025-04-26 17:05:50,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:50,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:50,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5767 5807 5793 [WARNING|trainer.py:803] 2025-04-26 17:05:52,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:52,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:05:52,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5768 5808 5794 [WARNING|trainer.py:803] 2025-04-26 17:05:54,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:05:54,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:05:54,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5769 5809 5795 [WARNING|trainer.py:803] 2025-04-26 17:05:56,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:56,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:56,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5770 5810 5796 [WARNING|trainer.py:803] 2025-04-26 17:05:58,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:05:58,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:05:59,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5771 5811 5797 [WARNING|trainer.py:803] 2025-04-26 17:06:00,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:06:00,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:06:01,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5812 5772 5798 [WARNING|trainer.py:803] 2025-04-26 17:06:02,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:02,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:03,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5813 5773 5799 [WARNING|trainer.py:803] 2025-04-26 17:06:04,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:04,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:05,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5774 5814 5800 [WARNING|trainer.py:803] 2025-04-26 17:06:06,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:06,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:07,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5815 5775 5801 [WARNING|trainer.py:803] 2025-04-26 17:06:08,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:09,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:09,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5816 5776 5802 [WARNING|trainer.py:803] 2025-04-26 17:06:11,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:11,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:11,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5817 5777 5803 [WARNING|trainer.py:803] 2025-04-26 17:06:13,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:13,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:13,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5778 5818 5804 [WARNING|trainer.py:803] 2025-04-26 17:06:15,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:15,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:16,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5779 5819 5805 [WARNING|trainer.py:803] 2025-04-26 17:06:17,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:17,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:18,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5820 5780 5806 [WARNING|trainer.py:803] 2025-04-26 17:06:19,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:19,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:20,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5781 5821 5807 [WARNING|trainer.py:803] 2025-04-26 17:06:21,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:22,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:22,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5782 5808 5822 [WARNING|trainer.py:803] 2025-04-26 17:06:24,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:06:24,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:24,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5809 5783 5823 [WARNING|trainer.py:803] 2025-04-26 17:06:26,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:26,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:26,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5810 5824 5784 [WARNING|trainer.py:803] 2025-04-26 17:06:28,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:28,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:28,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5811 5785 5825 [WARNING|trainer.py:803] 2025-04-26 17:06:30,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:30,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:30,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5812 5826 5786 [WARNING|trainer.py:803] 2025-04-26 17:06:32,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:32,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:32,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5813 5827 5787 [WARNING|trainer.py:803] 2025-04-26 17:06:34,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:35,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:35,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5814 5828 5788 [WARNING|trainer.py:803] 2025-04-26 17:06:36,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:37,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:37,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5815 5829 5789 [WARNING|trainer.py:803] 2025-04-26 17:06:38,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:39,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:39,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5816 5790 5830 [WARNING|trainer.py:803] 2025-04-26 17:06:40,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:41,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:41,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5817 5831 5791 [WARNING|trainer.py:803] 2025-04-26 17:06:43,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:43,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:43,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5818 5792 5832 [WARNING|trainer.py:803] 2025-04-26 17:06:45,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:45,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:45,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5793 5833 5819 [WARNING|trainer.py:803] 2025-04-26 17:06:47,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:47,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:47,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5794 5834 5820 [WARNING|trainer.py:803] 2025-04-26 17:06:49,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:49,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:49,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5795 5835 5821 [WARNING|trainer.py:803] 2025-04-26 17:06:51,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:51,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:52,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5796 5836 5822 [WARNING|trainer.py:803] 2025-04-26 17:06:53,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:06:53,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:06:54,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5797 5837 5823 [WARNING|trainer.py:803] 2025-04-26 17:06:55,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:55,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:06:56,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5838 5798 5824 [WARNING|trainer.py:803] 2025-04-26 17:06:57,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:06:57,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:06:58,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5799 5839 5825 [WARNING|trainer.py:803] 2025-04-26 17:06:59,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:00,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:00,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5800 5840 5826 [WARNING|trainer.py:803] 2025-04-26 17:07:02,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:02,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:02,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5841 5801 [WARNING|trainer.py:803] 2025-04-26 17:07:04,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5827 [WARNING|trainer.py:803] 2025-04-26 17:07:04,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:05,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5842 5802 5828 [WARNING|trainer.py:803] 2025-04-26 17:07:06,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:06,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:06,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5843 5803 5829 [WARNING|trainer.py:803] 2025-04-26 17:07:08,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:08,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:07:08,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5844 5804 5830 [WARNING|trainer.py:803] 2025-04-26 17:07:10,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:10,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:11,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5845 5805 5831 [WARNING|trainer.py:803] 2025-04-26 17:07:12,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:12,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:13,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5846 5806 5832 [WARNING|trainer.py:803] 2025-04-26 17:07:14,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:07:14,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:07:15,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5847 5807 5833 [WARNING|trainer.py:803] 2025-04-26 17:07:16,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:16,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:07:17,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5848 5808 5834 [WARNING|trainer.py:803] 2025-04-26 17:07:18,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:18,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:19,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5849 5809 5835 [WARNING|trainer.py:803] 2025-04-26 17:07:20,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:20,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:21,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5810 5850 5836 [WARNING|trainer.py:803] 2025-04-26 17:07:22,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:22,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:23,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5811 5851 5837 [WARNING|trainer.py:803] 2025-04-26 17:07:24,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:25,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:25,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5812 5852 5838 [WARNING|trainer.py:803] 2025-04-26 17:07:27,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:27,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:27,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5813 5853 5839 [WARNING|trainer.py:803] 2025-04-26 17:07:29,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:29,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:29,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5814 5854 5840 [WARNING|trainer.py:803] 2025-04-26 17:07:31,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:07:31,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:31,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5815 5855 5841 [WARNING|trainer.py:803] 2025-04-26 17:07:33,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:33,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:33,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5816 5856 5842 [WARNING|trainer.py:803] 2025-04-26 17:07:35,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:07:35,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:35,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5857 5817 5843 [WARNING|trainer.py:803] 2025-04-26 17:07:37,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:07:37,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:38,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5858 5818 5844 [WARNING|trainer.py:803] 2025-04-26 17:07:39,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:39,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:40,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5859 5819 5845 [WARNING|trainer.py:803] 2025-04-26 17:07:41,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:07:42,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:42,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5860 5846 5820 [WARNING|trainer.py:803] 2025-04-26 17:07:44,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:44,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:44,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5861 5847 5821 [WARNING|trainer.py:803] 2025-04-26 17:07:46,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:46,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:46,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5862 5848 5822 [WARNING|trainer.py:803] 2025-04-26 17:07:48,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:48,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:48,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5863 5849 5823 [WARNING|trainer.py:803] 2025-04-26 17:07:50,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:50,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:07:51,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5850 5864 5824 [WARNING|trainer.py:803] 2025-04-26 17:07:52,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:52,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:53,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5851 5865 5825 [WARNING|trainer.py:803] 2025-04-26 17:07:54,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:54,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:55,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5866 5852 5826 [WARNING|trainer.py:803] 2025-04-26 17:07:56,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:57,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:07:57,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5853 5867 5827 [WARNING|trainer.py:803] 2025-04-26 17:07:59,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:07:59,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:07:59,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5854 5868 5828 [WARNING|trainer.py:803] 2025-04-26 17:08:01,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:01,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:01,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5855 5869 5829 [WARNING|trainer.py:803] 2025-04-26 17:08:03,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:03,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:03,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5856 5870 5830 [WARNING|trainer.py:803] 2025-04-26 17:08:05,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:05,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:05,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5857 5871 5831 [WARNING|trainer.py:803] 2025-04-26 17:08:07,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:08:07,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:07,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5858 5872 5832 [WARNING|trainer.py:803] 2025-04-26 17:08:09,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:08:09,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:10,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5859 5873 5833 [WARNING|trainer.py:803] 2025-04-26 17:08:11,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:11,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:12,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5874 5860 5834 [WARNING|trainer.py:803] 2025-04-26 17:08:13,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:08:13,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:14,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5875 5861 5835 [WARNING|trainer.py:803] 2025-04-26 17:08:15,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:15,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:16,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5876 5862 5836 [WARNING|trainer.py:803] 2025-04-26 17:08:17,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:17,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:08:18,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5877 5863 5837 [WARNING|trainer.py:803] 2025-04-26 17:08:19,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:20,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:20,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5878 5864 5838 [WARNING|trainer.py:803] 2025-04-26 17:08:21,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:22,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:22,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5879 5865 5839 [WARNING|trainer.py:803] 2025-04-26 17:08:24,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:08:24,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:24,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5880 5866 5840 [WARNING|trainer.py:803] 2025-04-26 17:08:26,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:26,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:26,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5881 5867 5841 [WARNING|trainer.py:803] 2025-04-26 17:08:28,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:28,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:28,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5882 5868 5842 [WARNING|trainer.py:803] 2025-04-26 17:08:30,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:30,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:31,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5883 5869 5843 [WARNING|trainer.py:803] 2025-04-26 17:08:33,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:33,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:33,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5884 5844 5870 [WARNING|trainer.py:803] 2025-04-26 17:08:35,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:08:35,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:35,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5885 5845 5871 [WARNING|trainer.py:803] 2025-04-26 17:08:37,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:08:37,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:37,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5886 5846 5872 [WARNING|trainer.py:803] 2025-04-26 17:08:39,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:08:39,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:39,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5873 5847 5887 [WARNING|trainer.py:803] 2025-04-26 17:08:41,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:41,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:41,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5874 5848 5888 [WARNING|trainer.py:803] 2025-04-26 17:08:43,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:08:43,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:44,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5875 5849 5889 [WARNING|trainer.py:803] 2025-04-26 17:08:45,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:45,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:08:46,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5876 5850 [WARNING|trainer.py:803] 2025-04-26 17:08:47,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5890 [WARNING|trainer.py:803] 2025-04-26 17:08:47,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:48,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5877 5851 5891 [WARNING|trainer.py:803] 2025-04-26 17:08:49,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:50,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:50,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5878 5852 [WARNING|trainer.py:803] 2025-04-26 17:08:51,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5892 [WARNING|trainer.py:803] 2025-04-26 17:08:52,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:52,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5879 5853 [WARNING|trainer.py:803] 2025-04-26 17:08:53,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5893 [WARNING|trainer.py:803] 2025-04-26 17:08:54,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:08:54,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5880 5854 5894 [WARNING|trainer.py:803] 2025-04-26 17:08:56,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:56,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:56,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5881 5855 5895 [WARNING|trainer.py:803] 2025-04-26 17:08:58,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:08:58,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:08:59,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5882 5856 5896 [WARNING|trainer.py:803] 2025-04-26 17:09:00,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:00,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:01,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5883 5857 [WARNING|trainer.py:803] 2025-04-26 17:09:02,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:02,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5897 [WARNING|trainer.py:803] 2025-04-26 17:09:03,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5884 5858 [WARNING|trainer.py:803] 2025-04-26 17:09:04,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5898 [WARNING|trainer.py:803] 2025-04-26 17:09:05,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:05,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5885 5859 [WARNING|trainer.py:803] 2025-04-26 17:09:06,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5899 [WARNING|trainer.py:803] 2025-04-26 17:09:07,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:07,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5886 5860 [WARNING|trainer.py:803] 2025-04-26 17:09:08,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5900 [WARNING|trainer.py:803] 2025-04-26 17:09:09,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:09,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5887 5861 [WARNING|trainer.py:803] 2025-04-26 17:09:11,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5901 [WARNING|trainer.py:803] 2025-04-26 17:09:11,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:11,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5862 5888 5902 [WARNING|trainer.py:803] 2025-04-26 17:09:13,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:13,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:09:13,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5863 5889 5903 [WARNING|trainer.py:803] 2025-04-26 17:09:15,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:15,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:15,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5864 5904 5890 [WARNING|trainer.py:803] 2025-04-26 17:09:17,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:17,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:18,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5865 5891 5905 [WARNING|trainer.py:803] 2025-04-26 17:09:19,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:19,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:20,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5866 5906 5892 [WARNING|trainer.py:803] 2025-04-26 17:09:22,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:22,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:22,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5907 5867 5893 [WARNING|trainer.py:803] 2025-04-26 17:09:24,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:24,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:24,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5908 5868 5894 [WARNING|trainer.py:803] 2025-04-26 17:09:26,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:26,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:26,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5909 5869 5895 [WARNING|trainer.py:803] 2025-04-26 17:09:28,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:28,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:28,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5910 5870 5896 [WARNING|trainer.py:803] 2025-04-26 17:09:30,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:30,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:30,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5911 5871 [WARNING|trainer.py:803] 2025-04-26 17:09:32,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5897 [WARNING|trainer.py:803] 2025-04-26 17:09:32,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:33,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5912 5872 [WARNING|trainer.py:803] 2025-04-26 17:09:34,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5898 [WARNING|trainer.py:803] 2025-04-26 17:09:34,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:35,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5913 5873 [WARNING|trainer.py:803] 2025-04-26 17:09:36,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5899 [WARNING|trainer.py:803] 2025-04-26 17:09:36,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:37,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5914 5874 [WARNING|trainer.py:803] 2025-04-26 17:09:38,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5900 [WARNING|trainer.py:803] 2025-04-26 17:09:38,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5915 [WARNING|trainer.py:803] 2025-04-26 17:09:39,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:09:40,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5875 5901 [WARNING|trainer.py:803] 2025-04-26 17:09:40,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5916 [WARNING|trainer.py:803] 2025-04-26 17:09:41,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:42,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5876 5902 [WARNING|trainer.py:803] 2025-04-26 17:09:43,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5917 [WARNING|trainer.py:803] 2025-04-26 17:09:43,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:44,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5877 5903 [WARNING|trainer.py:803] 2025-04-26 17:09:45,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5918 [WARNING|trainer.py:803] 2025-04-26 17:09:45,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:46,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5878 5904 [WARNING|trainer.py:803] 2025-04-26 17:09:47,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5919 [WARNING|trainer.py:803] 2025-04-26 17:09:47,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:48,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5879 5905 5920 [WARNING|trainer.py:803] 2025-04-26 17:09:49,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:09:49,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:50,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5906 5880 5921 [WARNING|trainer.py:803] 2025-04-26 17:09:51,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:51,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:52,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5907 5881 5922 [WARNING|trainer.py:803] 2025-04-26 17:09:53,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:54,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:09:54,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5908 5882 5923 [WARNING|trainer.py:803] 2025-04-26 17:09:55,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:56,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:56,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5909 5883 5924 [WARNING|trainer.py:803] 2025-04-26 17:09:57,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:09:58,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:09:58,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5910 5925 [WARNING|trainer.py:803] 2025-04-26 17:09:59,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5884 [WARNING|trainer.py:803] 2025-04-26 17:10:00,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:00,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5911 5926 [WARNING|trainer.py:803] 2025-04-26 17:10:01,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5885 [WARNING|trainer.py:803] 2025-04-26 17:10:02,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:02,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5912 5927 [WARNING|trainer.py:803] 2025-04-26 17:10:03,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5886 [WARNING|trainer.py:803] 2025-04-26 17:10:04,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:04,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5913 5928 [WARNING|trainer.py:803] 2025-04-26 17:10:05,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5887 [WARNING|trainer.py:803] 2025-04-26 17:10:06,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5914 [WARNING|trainer.py:803] 2025-04-26 17:10:07,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5929 [WARNING|trainer.py:803] 2025-04-26 17:10:07,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:08,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5888 5915 5930 [WARNING|trainer.py:803] 2025-04-26 17:10:09,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:09,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:10:10,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5916 5889 5931 [WARNING|trainer.py:803] 2025-04-26 17:10:11,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:11,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:12,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5917 5890 5932 [WARNING|trainer.py:803] 2025-04-26 17:10:13,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:10:14,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:14,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5918 5933 5891 [WARNING|trainer.py:803] 2025-04-26 17:10:15,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:16,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:16,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5919 5934 5892 [WARNING|trainer.py:803] 2025-04-26 17:10:17,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:18,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:18,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5920 5935 [WARNING|trainer.py:803] 2025-04-26 17:10:20,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5893 [WARNING|trainer.py:803] 2025-04-26 17:10:20,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:20,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5921 5936 [WARNING|trainer.py:803] 2025-04-26 17:10:21,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:22,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5894 [WARNING|trainer.py:803] 2025-04-26 17:10:23,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5922 5937 [WARNING|trainer.py:803] 2025-04-26 17:10:24,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:24,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5895 [WARNING|trainer.py:803] 2025-04-26 17:10:25,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5938 5923 [WARNING|trainer.py:803] 2025-04-26 17:10:26,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:10:26,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5896 5939 [WARNING|trainer.py:803] 2025-04-26 17:10:27,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5924 [WARNING|trainer.py:803] 2025-04-26 17:10:28,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:28,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5897 5940 5925 [WARNING|trainer.py:803] 2025-04-26 17:10:29,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:30,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:30,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5898 5941 5926 [WARNING|trainer.py:803] 2025-04-26 17:10:31,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:10:32,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:32,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5899 5942 5927 [WARNING|trainer.py:803] 2025-04-26 17:10:33,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:34,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:34,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5900 5943 5928 [WARNING|trainer.py:803] 2025-04-26 17:10:36,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:36,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:36,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5944 5901 5929 [WARNING|trainer.py:803] 2025-04-26 17:10:38,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:38,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:38,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5945 5930 5902 [WARNING|trainer.py:803] 2025-04-26 17:10:39,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:40,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:40,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5946 5931 5903 [WARNING|trainer.py:803] 2025-04-26 17:10:42,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:42,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:42,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5947 5932 5904 [WARNING|trainer.py:803] 2025-04-26 17:10:43,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:44,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:44,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5948 5933 5905 [WARNING|trainer.py:803] 2025-04-26 17:10:45,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:46,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:46,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5934 5949 5906 [WARNING|trainer.py:803] 2025-04-26 17:10:48,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:48,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:48,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5935 5907 5950 [WARNING|trainer.py:803] 2025-04-26 17:10:49,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:50,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 17:10:50,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes YesYes 5936 5951 5908 [WARNING|trainer.py:803] 2025-04-26 17:10:51,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:52,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:52,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5937 5952 5909 [WARNING|trainer.py:803] 2025-04-26 17:10:54,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:10:54,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:54,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5938 5953 5910 [WARNING|trainer.py:803] 2025-04-26 17:10:56,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:10:56,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:56,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5939 5954 5911 [WARNING|trainer.py:803] 2025-04-26 17:10:57,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:10:58,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:10:58,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5940 5955 5912 [WARNING|trainer.py:803] 2025-04-26 17:11:00,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:00,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:00,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5956 5941 5913 [WARNING|trainer.py:803] 2025-04-26 17:11:02,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:11:02,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:02,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5957 5942 5914 [WARNING|trainer.py:803] 2025-04-26 17:11:04,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:04,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:04,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5958 5943 5915 [WARNING|trainer.py:803] 2025-04-26 17:11:06,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:06,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:06,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5944 5959 5916 [WARNING|trainer.py:803] 2025-04-26 17:11:08,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:08,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:08,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5945 5960 5917 [WARNING|trainer.py:803] 2025-04-26 17:11:10,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:10,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:11:10,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5961 5946 5918 [WARNING|trainer.py:803] 2025-04-26 17:11:12,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:12,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:12,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5947 5962 5919 [WARNING|trainer.py:803] 2025-04-26 17:11:14,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:14,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:14,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5948 5963 5920 [WARNING|trainer.py:803] 2025-04-26 17:11:16,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:16,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:16,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5949 5964 5921 [WARNING|trainer.py:803] 2025-04-26 17:11:18,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:18,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:11:18,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5965 5950 5922 [WARNING|trainer.py:803] 2025-04-26 17:11:20,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:20,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:20,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5966 5951 5923 [WARNING|trainer.py:803] 2025-04-26 17:11:22,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:22,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:22,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5967 5952 5924 [WARNING|trainer.py:803] 2025-04-26 17:11:23,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:24,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:24,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5968 5953 5925 [WARNING|trainer.py:803] 2025-04-26 17:11:25,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:26,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:26,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5954 5969 5926 [WARNING|trainer.py:803] 2025-04-26 17:11:28,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:28,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:28,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5955 5970 5927 [WARNING|trainer.py:803] 2025-04-26 17:11:30,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:30,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:30,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5956 5971 5928 [WARNING|trainer.py:803] 2025-04-26 17:11:32,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:11:32,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:32,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5957 5972 5929 [WARNING|trainer.py:803] 2025-04-26 17:11:34,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:34,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:34,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5958 5973 5930 [WARNING|trainer.py:803] 2025-04-26 17:11:36,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:36,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:36,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5959 5974 5931 [WARNING|trainer.py:803] 2025-04-26 17:11:38,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:38,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:38,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5960 5932 5975 [WARNING|trainer.py:803] 2025-04-26 17:11:40,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:11:40,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:40,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5961 5933 5976 [WARNING|trainer.py:803] 2025-04-26 17:11:42,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:42,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:42,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5962 5934 5977 [WARNING|trainer.py:803] 2025-04-26 17:11:44,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:44,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:44,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5963 5935 5978 [WARNING|trainer.py:803] 2025-04-26 17:11:46,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:46,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:46,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5964 5979 5936 [WARNING|trainer.py:803] 2025-04-26 17:11:48,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:48,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:48,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5965 5937 5980 [WARNING|trainer.py:803] 2025-04-26 17:11:50,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:50,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:50,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5966 5938 5981 [WARNING|trainer.py:803] 2025-04-26 17:11:52,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:11:52,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:52,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5967 5939 5982 [WARNING|trainer.py:803] 2025-04-26 17:11:54,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:54,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:54,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5968 5940 5983 [WARNING|trainer.py:803] 2025-04-26 17:11:56,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:56,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:11:56,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5969 5941 5984 [WARNING|trainer.py:803] 2025-04-26 17:11:58,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:11:58,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:11:58,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5970 5942 5985 [WARNING|trainer.py:803] 2025-04-26 17:12:00,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:00,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:00,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5971 5943 [WARNING|trainer.py:803] 2025-04-26 17:12:02,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:12:02,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5986 [WARNING|trainer.py:803] 2025-04-26 17:12:03,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5944 5972 [WARNING|trainer.py:803] 2025-04-26 17:12:04,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:04,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5987 [WARNING|trainer.py:803] 2025-04-26 17:12:05,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5945 5973 [WARNING|trainer.py:803] 2025-04-26 17:12:06,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:06,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5988 [WARNING|trainer.py:803] 2025-04-26 17:12:07,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5946 5974 [WARNING|trainer.py:803] 2025-04-26 17:12:08,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:08,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5989 [WARNING|trainer.py:803] 2025-04-26 17:12:09,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5947 5975 [WARNING|trainer.py:803] 2025-04-26 17:12:10,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:10,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5990 5948 [WARNING|trainer.py:803] 2025-04-26 17:12:11,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5976 [WARNING|trainer.py:803] 2025-04-26 17:12:12,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:12:12,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5991 5949 [WARNING|trainer.py:803] 2025-04-26 17:12:13,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5977 [WARNING|trainer.py:803] 2025-04-26 17:12:14,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:14,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5992 5950 [WARNING|trainer.py:803] 2025-04-26 17:12:15,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5978 [WARNING|trainer.py:803] 2025-04-26 17:12:16,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:16,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5993 5951 5979 [WARNING|trainer.py:803] 2025-04-26 17:12:17,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:18,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:18,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5994 5952 5980 [WARNING|trainer.py:803] 2025-04-26 17:12:19,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:20,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:20,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5995 5953 [WARNING|trainer.py:803] 2025-04-26 17:12:21,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5981 [WARNING|trainer.py:803] 2025-04-26 17:12:22,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:22,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5996 5954 [WARNING|trainer.py:803] 2025-04-26 17:12:23,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5982 [WARNING|trainer.py:803] 2025-04-26 17:12:24,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:24,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5997 5955 [WARNING|trainer.py:803] 2025-04-26 17:12:25,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5983 [WARNING|trainer.py:803] 2025-04-26 17:12:26,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:26,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5998 5956 [WARNING|trainer.py:803] 2025-04-26 17:12:27,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5984 [WARNING|trainer.py:803] 2025-04-26 17:12:28,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:12:28,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5999 5957 [WARNING|trainer.py:803] 2025-04-26 17:12:29,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5985 [WARNING|trainer.py:803] 2025-04-26 17:12:30,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6000 [WARNING|trainer.py:803] 2025-04-26 17:12:31,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5958 [WARNING|trainer.py:803] 2025-04-26 17:12:31,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5986 [WARNING|trainer.py:803] 2025-04-26 17:12:32,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6001 [WARNING|trainer.py:803] 2025-04-26 17:12:33,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5959 [WARNING|trainer.py:803] 2025-04-26 17:12:33,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:34,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5987 6002 [WARNING|trainer.py:803] 2025-04-26 17:12:35,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5960 [WARNING|trainer.py:803] 2025-04-26 17:12:35,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:12:36,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5988 6003 [WARNING|trainer.py:803] 2025-04-26 17:12:37,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5961 [WARNING|trainer.py:803] 2025-04-26 17:12:38,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:38,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5989 6004 [WARNING|trainer.py:803] 2025-04-26 17:12:39,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5962 [WARNING|trainer.py:803] 2025-04-26 17:12:40,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:40,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5990 6005 5963 [WARNING|trainer.py:803] 2025-04-26 17:12:41,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:42,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:42,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5991 6006 5964 [WARNING|trainer.py:803] 2025-04-26 17:12:43,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:44,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:44,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5992 6007 5965 [WARNING|trainer.py:803] 2025-04-26 17:12:45,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:45,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:46,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5993 6008 5966 [WARNING|trainer.py:803] 2025-04-26 17:12:47,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:47,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:48,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5994 6009 5967 [WARNING|trainer.py:803] 2025-04-26 17:12:50,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:50,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:50,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6010 5995 5968 [WARNING|trainer.py:803] 2025-04-26 17:12:51,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:52,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:52,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6011 5996 5969 [WARNING|trainer.py:803] 2025-04-26 17:12:53,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:54,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:54,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6012 5997 5970 [WARNING|trainer.py:803] 2025-04-26 17:12:55,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:56,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:12:56,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6013 5998 5971 [WARNING|trainer.py:803] 2025-04-26 17:12:57,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:12:58,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:12:58,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6014 5999 5972 [WARNING|trainer.py:803] 2025-04-26 17:13:00,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:00,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:00,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6015 6000 5973 [WARNING|trainer.py:803] 2025-04-26 17:13:02,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:02,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:02,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6016 6001 5974 [WARNING|trainer.py:803] 2025-04-26 17:13:04,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:04,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:04,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6017 6002 [WARNING|trainer.py:803] 2025-04-26 17:13:05,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5975 [WARNING|trainer.py:803] 2025-04-26 17:13:06,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:06,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6018 6003 [WARNING|trainer.py:803] 2025-04-26 17:13:07,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5976 [WARNING|trainer.py:803] 2025-04-26 17:13:08,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:08,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6019 6004 [WARNING|trainer.py:803] 2025-04-26 17:13:09,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5977 [WARNING|trainer.py:803] 2025-04-26 17:13:10,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:10,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6020 6005 [WARNING|trainer.py:803] 2025-04-26 17:13:11,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5978 [WARNING|trainer.py:803] 2025-04-26 17:13:12,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:12,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6021 6006 [WARNING|trainer.py:803] 2025-04-26 17:13:13,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5979 [WARNING|trainer.py:803] 2025-04-26 17:13:14,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:14,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6022 6007 [WARNING|trainer.py:803] 2025-04-26 17:13:15,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5980 [WARNING|trainer.py:803] 2025-04-26 17:13:16,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:16,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6023 6008 [WARNING|trainer.py:803] 2025-04-26 17:13:17,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5981 [WARNING|trainer.py:803] 2025-04-26 17:13:18,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:18,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6024 6009 [WARNING|trainer.py:803] 2025-04-26 17:13:20,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5982 [WARNING|trainer.py:803] 2025-04-26 17:13:20,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:21,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6025 6010 [WARNING|trainer.py:803] 2025-04-26 17:13:22,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:22,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5983 [WARNING|trainer.py:803] 2025-04-26 17:13:23,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6011 6026 [WARNING|trainer.py:803] 2025-04-26 17:13:23,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:24,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5984 6027 [WARNING|trainer.py:803] 2025-04-26 17:13:25,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6012 [WARNING|trainer.py:803] 2025-04-26 17:13:25,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:26,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5985 6013 6028 [WARNING|trainer.py:803] 2025-04-26 17:13:27,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:28,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:28,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5986 6014 6029 [WARNING|trainer.py:803] 2025-04-26 17:13:29,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:30,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:30,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5987 6015 [WARNING|trainer.py:803] 2025-04-26 17:13:31,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6030 [WARNING|trainer.py:803] 2025-04-26 17:13:32,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:32,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5988 6016 [WARNING|trainer.py:803] 2025-04-26 17:13:33,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6031 [WARNING|trainer.py:803] 2025-04-26 17:13:34,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:34,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5989 6017 [WARNING|trainer.py:803] 2025-04-26 17:13:35,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6032 [WARNING|trainer.py:803] 2025-04-26 17:13:36,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:36,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6018 5990 6033 [WARNING|trainer.py:803] 2025-04-26 17:13:38,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:38,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:38,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6019 5991 6034 [WARNING|trainer.py:803] 2025-04-26 17:13:40,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:40,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:40,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6020 5992 6035 [WARNING|trainer.py:803] 2025-04-26 17:13:42,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:42,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:42,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6021 5993 6036 [WARNING|trainer.py:803] 2025-04-26 17:13:44,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:44,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:44,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6022 5994 6037 [WARNING|trainer.py:803] 2025-04-26 17:13:46,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:46,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:46,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5995 6023 6038 [WARNING|trainer.py:803] 2025-04-26 17:13:48,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:48,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:48,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5996 6039 6024 [WARNING|trainer.py:803] 2025-04-26 17:13:50,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:50,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:50,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5997 6040 6025 [WARNING|trainer.py:803] 2025-04-26 17:13:52,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:52,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:13:52,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5998 6026 6041 [WARNING|trainer.py:803] 2025-04-26 17:13:54,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:54,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:13:54,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5999 6027 6042 [WARNING|trainer.py:803] 2025-04-26 17:13:56,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:56,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:13:56,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6000 6028 6043 [WARNING|trainer.py:803] 2025-04-26 17:13:58,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:13:58,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:13:58,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6001 6029 6044 [WARNING|trainer.py:803] 2025-04-26 17:14:00,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:00,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:00,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6002 6045 6030 [WARNING|trainer.py:803] 2025-04-26 17:14:02,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:02,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:03,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6046 6003 6031 [WARNING|trainer.py:803] 2025-04-26 17:14:04,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:04,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:05,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6004 6047 6032 [WARNING|trainer.py:803] 2025-04-26 17:14:06,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:07,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:07,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6005 6048 6033 [WARNING|trainer.py:803] 2025-04-26 17:14:08,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:09,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:14:09,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6006 6049 6034 [WARNING|trainer.py:803] 2025-04-26 17:14:10,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:11,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:11,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6007 6050 6035 [WARNING|trainer.py:803] 2025-04-26 17:14:12,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:13,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:13,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6008 6051 6036 [WARNING|trainer.py:803] 2025-04-26 17:14:14,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:14:15,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:15,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6009 6037 6052 [WARNING|trainer.py:803] 2025-04-26 17:14:16,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:17,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:17,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6010 [WARNING|trainer.py:803] 2025-04-26 17:14:18,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6053 6038 [WARNING|trainer.py:803] 2025-04-26 17:14:19,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:19,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6011 [WARNING|trainer.py:803] 2025-04-26 17:14:20,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6054 6039 [WARNING|trainer.py:803] 2025-04-26 17:14:21,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:21,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6012 6055 6040 [WARNING|trainer.py:803] 2025-04-26 17:14:22,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:23,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:23,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6013 6056 [WARNING|trainer.py:803] 2025-04-26 17:14:24,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6041 [WARNING|trainer.py:803] 2025-04-26 17:14:25,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:25,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6014 6057 [WARNING|trainer.py:803] 2025-04-26 17:14:26,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6042 [WARNING|trainer.py:803] 2025-04-26 17:14:27,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:27,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6015 6058 [WARNING|trainer.py:803] 2025-04-26 17:14:28,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6043 [WARNING|trainer.py:803] 2025-04-26 17:14:29,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:29,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6016 6059 [WARNING|trainer.py:803] 2025-04-26 17:14:30,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:31,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6044 6017 [WARNING|trainer.py:803] 2025-04-26 17:14:31,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6060 [WARNING|trainer.py:803] 2025-04-26 17:14:32,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:32,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6045 [WARNING|trainer.py:803] 2025-04-26 17:14:33,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6061 6018 [WARNING|trainer.py:803] 2025-04-26 17:14:34,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:34,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6046 6062 [WARNING|trainer.py:803] 2025-04-26 17:14:35,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6019 [WARNING|trainer.py:803] 2025-04-26 17:14:36,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:36,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6047 6063 6020 [WARNING|trainer.py:803] 2025-04-26 17:14:38,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:38,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:38,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6048 6064 6021 [WARNING|trainer.py:803] 2025-04-26 17:14:40,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:40,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:40,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6049 6022 6065 [WARNING|trainer.py:803] 2025-04-26 17:14:42,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:42,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:42,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6050 6066 6023 [WARNING|trainer.py:803] 2025-04-26 17:14:44,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:44,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:44,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6067 6051 [WARNING|trainer.py:803] 2025-04-26 17:14:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6024 [WARNING|trainer.py:803] 2025-04-26 17:14:46,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:47,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6068 6052 [WARNING|trainer.py:803] 2025-04-26 17:14:48,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6025 [WARNING|trainer.py:803] 2025-04-26 17:14:48,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:48,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6069 6053 6026 [WARNING|trainer.py:803] 2025-04-26 17:14:50,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:50,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:50,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6070 6054 6027 [WARNING|trainer.py:803] 2025-04-26 17:14:52,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:52,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:14:52,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6071 6055 6028 [WARNING|trainer.py:803] 2025-04-26 17:14:54,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:14:54,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:14:54,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6072 6056 [WARNING|trainer.py:803] 2025-04-26 17:14:56,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6029 [WARNING|trainer.py:803] 2025-04-26 17:14:56,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:14:57,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6057 6073 [WARNING|trainer.py:803] 2025-04-26 17:14:58,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:14:58,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6030 6074 6058 [WARNING|trainer.py:803] 2025-04-26 17:14:59,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:00,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:00,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6031 6059 [WARNING|trainer.py:803] 2025-04-26 17:15:01,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6075 [WARNING|trainer.py:803] 2025-04-26 17:15:02,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:02,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6032 6060 [WARNING|trainer.py:803] 2025-04-26 17:15:03,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6076 [WARNING|trainer.py:803] 2025-04-26 17:15:04,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:04,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6033 6061 [WARNING|trainer.py:803] 2025-04-26 17:15:05,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6077 [WARNING|trainer.py:803] 2025-04-26 17:15:05,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:06,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6034 6062 6078 [WARNING|trainer.py:803] 2025-04-26 17:15:07,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:07,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:08,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6063 6035 6079 [WARNING|trainer.py:803] 2025-04-26 17:15:09,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:09,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:10,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6064 6036 6080 [WARNING|trainer.py:803] 2025-04-26 17:15:11,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:11,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:12,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6037 6081 6065 [WARNING|trainer.py:803] 2025-04-26 17:15:13,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:13,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:13,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6038 6082 6066 [WARNING|trainer.py:803] 2025-04-26 17:15:15,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:15,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:15:15,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6067 6039 6083 [WARNING|trainer.py:803] 2025-04-26 17:15:17,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:17,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:18,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6068 6040 [WARNING|trainer.py:803] 2025-04-26 17:15:19,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:19,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6084 [WARNING|trainer.py:803] 2025-04-26 17:15:20,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6069 6041 [WARNING|trainer.py:803] 2025-04-26 17:15:21,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6085 [WARNING|trainer.py:803] 2025-04-26 17:15:21,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:22,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6070 6042 [WARNING|trainer.py:803] 2025-04-26 17:15:23,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:23,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6086 [WARNING|trainer.py:803] 2025-04-26 17:15:24,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6071 6043 [WARNING|trainer.py:803] 2025-04-26 17:15:25,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:15:25,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6087 [WARNING|trainer.py:803] 2025-04-26 17:15:26,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6072 6044 6088 [WARNING|trainer.py:803] 2025-04-26 17:15:27,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:28,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:28,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6073 6045 6089 [WARNING|trainer.py:803] 2025-04-26 17:15:29,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:29,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:30,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6074 6046 6090 [WARNING|trainer.py:803] 2025-04-26 17:15:31,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:32,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:32,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6075 6091 6047 [WARNING|trainer.py:803] 2025-04-26 17:15:34,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:34,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:34,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6076 6048 6092 [WARNING|trainer.py:803] 2025-04-26 17:15:35,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:36,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:36,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6077 6049 6093 [WARNING|trainer.py:803] 2025-04-26 17:15:37,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:38,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:38,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6078 6094 6050 [WARNING|trainer.py:803] 2025-04-26 17:15:39,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:40,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:40,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6079 6095 6051 [WARNING|trainer.py:803] 2025-04-26 17:15:41,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:15:42,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:42,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6080 6096 [WARNING|trainer.py:803] 2025-04-26 17:15:43,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6052 [WARNING|trainer.py:803] 2025-04-26 17:15:44,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:15:44,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6081 6097 [WARNING|trainer.py:803] 2025-04-26 17:15:45,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6053 [WARNING|trainer.py:803] 2025-04-26 17:15:46,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:46,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6082 [WARNING|trainer.py:803] 2025-04-26 17:15:47,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6054 6098 [WARNING|trainer.py:803] 2025-04-26 17:15:48,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:15:48,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6083 6099 6055 [WARNING|trainer.py:803] 2025-04-26 17:15:49,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:50,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:50,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6084 6100 6056 [WARNING|trainer.py:803] 2025-04-26 17:15:52,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:52,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:15:52,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6085 6057 6101 [WARNING|trainer.py:803] 2025-04-26 17:15:54,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:54,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:54,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6058 6086 6102 [WARNING|trainer.py:803] 2025-04-26 17:15:56,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:56,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:56,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6059 6103 6087 [WARNING|trainer.py:803] 2025-04-26 17:15:58,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:15:58,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:15:58,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6060 6088 6104 [WARNING|trainer.py:803] 2025-04-26 17:16:00,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:00,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:00,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6061 6089 [WARNING|trainer.py:803] 2025-04-26 17:16:01,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6105 [WARNING|trainer.py:803] 2025-04-26 17:16:02,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:02,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6062 6090 [WARNING|trainer.py:803] 2025-04-26 17:16:03,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:04,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6106 6063 [WARNING|trainer.py:803] 2025-04-26 17:16:04,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6091 [WARNING|trainer.py:803] 2025-04-26 17:16:05,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:06,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6107 6064 [WARNING|trainer.py:803] 2025-04-26 17:16:07,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:07,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6092 6108 [WARNING|trainer.py:803] 2025-04-26 17:16:08,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6065 [WARNING|trainer.py:803] 2025-04-26 17:16:09,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6093 [WARNING|trainer.py:803] 2025-04-26 17:16:09,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6109 [WARNING|trainer.py:803] 2025-04-26 17:16:10,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6066 [WARNING|trainer.py:803] 2025-04-26 17:16:10,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6094 [WARNING|trainer.py:803] 2025-04-26 17:16:11,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:12,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6110 6067 [WARNING|trainer.py:803] 2025-04-26 17:16:13,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6095 [WARNING|trainer.py:803] 2025-04-26 17:16:13,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6111 [WARNING|trainer.py:803] 2025-04-26 17:16:14,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6068 [WARNING|trainer.py:803] 2025-04-26 17:16:14,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:15,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6096 6112 [WARNING|trainer.py:803] 2025-04-26 17:16:16,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6069 [WARNING|trainer.py:803] 2025-04-26 17:16:16,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:17,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6097 6113 [WARNING|trainer.py:803] 2025-04-26 17:16:18,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6070 [WARNING|trainer.py:803] 2025-04-26 17:16:19,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:19,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6098 6114 6071 [WARNING|trainer.py:803] 2025-04-26 17:16:20,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:21,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:21,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6099 6115 [WARNING|trainer.py:803] 2025-04-26 17:16:22,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6072 [WARNING|trainer.py:803] 2025-04-26 17:16:23,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:23,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6100 6116 [WARNING|trainer.py:803] 2025-04-26 17:16:24,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6073 [WARNING|trainer.py:803] 2025-04-26 17:16:25,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:16:25,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6101 6074 6117 [WARNING|trainer.py:803] 2025-04-26 17:16:26,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:27,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:27,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6102 [WARNING|trainer.py:803] 2025-04-26 17:16:28,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6075 6118 [WARNING|trainer.py:803] 2025-04-26 17:16:29,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:29,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6103 6076 [WARNING|trainer.py:803] 2025-04-26 17:16:30,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6119 [WARNING|trainer.py:803] 2025-04-26 17:16:31,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:31,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6104 6077 [WARNING|trainer.py:803] 2025-04-26 17:16:32,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6120 [WARNING|trainer.py:803] 2025-04-26 17:16:33,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6105 [WARNING|trainer.py:803] 2025-04-26 17:16:34,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6078 [WARNING|trainer.py:803] 2025-04-26 17:16:34,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6121 [WARNING|trainer.py:803] 2025-04-26 17:16:35,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6106 [WARNING|trainer.py:803] 2025-04-26 17:16:36,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6079 [WARNING|trainer.py:803] 2025-04-26 17:16:37,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:37,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6122 6107 [WARNING|trainer.py:803] 2025-04-26 17:16:38,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6080 [WARNING|trainer.py:803] 2025-04-26 17:16:39,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:39,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6123 6108 6081 [WARNING|trainer.py:803] 2025-04-26 17:16:40,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:16:41,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:41,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6124 6109 6082 [WARNING|trainer.py:803] 2025-04-26 17:16:42,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:43,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:16:43,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6125 6110 6083 [WARNING|trainer.py:803] 2025-04-26 17:16:45,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:16:45,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:16:45,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6111 6126 [WARNING|trainer.py:803] 2025-04-26 17:16:47,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:47,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6084 [WARNING|trainer.py:803] 2025-04-26 17:16:47,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6112 6127 6085 [WARNING|trainer.py:803] 2025-04-26 17:16:49,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:16:49,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:49,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6128 6113 [WARNING|trainer.py:803] 2025-04-26 17:16:51,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:16:51,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6086 [WARNING|trainer.py:803] 2025-04-26 17:16:52,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6129 6114 [WARNING|trainer.py:803] 2025-04-26 17:16:53,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:53,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6087 [WARNING|trainer.py:803] 2025-04-26 17:16:54,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6130 6115 [WARNING|trainer.py:803] 2025-04-26 17:16:55,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6088 [WARNING|trainer.py:803] 2025-04-26 17:16:55,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6131 [WARNING|trainer.py:803] 2025-04-26 17:16:56,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6116 [WARNING|trainer.py:803] 2025-04-26 17:16:56,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6089 [WARNING|trainer.py:803] 2025-04-26 17:16:57,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:57,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6132 6117 [WARNING|trainer.py:803] 2025-04-26 17:16:58,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6090 [WARNING|trainer.py:803] 2025-04-26 17:16:59,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:16:59,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6133 6091 [WARNING|trainer.py:803] 2025-04-26 17:17:01,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6118 [WARNING|trainer.py:803] 2025-04-26 17:17:01,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:17:02,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6134 [WARNING|trainer.py:803] 2025-04-26 17:17:03,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6119 6092 [WARNING|trainer.py:803] 2025-04-26 17:17:04,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:04,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6135 6093 [WARNING|trainer.py:803] 2025-04-26 17:17:05,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6120 [WARNING|trainer.py:803] 2025-04-26 17:17:06,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:06,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6136 6094 [WARNING|trainer.py:803] 2025-04-26 17:17:07,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6121 [WARNING|trainer.py:803] 2025-04-26 17:17:07,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:08,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6137 6095 [WARNING|trainer.py:803] 2025-04-26 17:17:09,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:10,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6122 6138 [WARNING|trainer.py:803] 2025-04-26 17:17:11,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6096 [WARNING|trainer.py:803] 2025-04-26 17:17:11,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:11,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6123 6139 [WARNING|trainer.py:803] 2025-04-26 17:17:12,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6097 [WARNING|trainer.py:803] 2025-04-26 17:17:13,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:13,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6124 [WARNING|trainer.py:803] 2025-04-26 17:17:14,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6140 6098 [WARNING|trainer.py:803] 2025-04-26 17:17:16,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:16,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6125 6141 [WARNING|trainer.py:803] 2025-04-26 17:17:17,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6099 [WARNING|trainer.py:803] 2025-04-26 17:17:18,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:18,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6126 6142 [WARNING|trainer.py:803] 2025-04-26 17:17:19,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6100 [WARNING|trainer.py:803] 2025-04-26 17:17:20,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:20,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6127 6143 [WARNING|trainer.py:803] 2025-04-26 17:17:21,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6101 [WARNING|trainer.py:803] 2025-04-26 17:17:22,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:22,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6128 [WARNING|trainer.py:803] 2025-04-26 17:17:23,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6144 6102 [WARNING|trainer.py:803] 2025-04-26 17:17:24,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:24,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6129 [WARNING|trainer.py:803] 2025-04-26 17:17:25,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6103 6145 [WARNING|trainer.py:803] 2025-04-26 17:17:26,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:26,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6130 [WARNING|trainer.py:803] 2025-04-26 17:17:27,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6146 6104 [WARNING|trainer.py:803] 2025-04-26 17:17:28,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6131 [WARNING|trainer.py:803] 2025-04-26 17:17:28,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:29,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6105 6147 6132 [WARNING|trainer.py:803] 2025-04-26 17:17:30,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:30,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:31,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6148 6106 6133 [WARNING|trainer.py:803] 2025-04-26 17:17:32,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:32,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:33,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6149 6107 6134 [WARNING|trainer.py:803] 2025-04-26 17:17:34,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:34,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:17:35,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6108 6150 [WARNING|trainer.py:803] 2025-04-26 17:17:37,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6135 [WARNING|trainer.py:803] 2025-04-26 17:17:37,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:17:37,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6109 6151 [WARNING|trainer.py:803] 2025-04-26 17:17:38,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:39,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6136 [WARNING|trainer.py:803] 2025-04-26 17:17:39,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6110 6152 [WARNING|trainer.py:803] 2025-04-26 17:17:41,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6137 [WARNING|trainer.py:803] 2025-04-26 17:17:41,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6111 [WARNING|trainer.py:803] 2025-04-26 17:17:41,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6153 [WARNING|trainer.py:803] 2025-04-26 17:17:42,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6138 [WARNING|trainer.py:803] 2025-04-26 17:17:43,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:17:43,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6112 6154 [WARNING|trainer.py:803] 2025-04-26 17:17:44,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6139 [WARNING|trainer.py:803] 2025-04-26 17:17:45,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6113 [WARNING|trainer.py:803] 2025-04-26 17:17:46,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6155 [WARNING|trainer.py:803] 2025-04-26 17:17:47,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:17:47,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6140 [WARNING|trainer.py:803] 2025-04-26 17:17:48,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6114 6156 [WARNING|trainer.py:803] 2025-04-26 17:17:49,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:17:49,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6141 [WARNING|trainer.py:803] 2025-04-26 17:17:50,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6157 6115 [WARNING|trainer.py:803] 2025-04-26 17:17:51,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:17:51,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6142 [WARNING|trainer.py:803] 2025-04-26 17:17:52,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6116 6158 [WARNING|trainer.py:803] 2025-04-26 17:17:53,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6143 [WARNING|trainer.py:803] 2025-04-26 17:17:53,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6117 [WARNING|trainer.py:803] 2025-04-26 17:17:54,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6159 [WARNING|trainer.py:803] 2025-04-26 17:17:55,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6144 [WARNING|trainer.py:803] 2025-04-26 17:17:55,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:17:56,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6118 6160 [WARNING|trainer.py:803] 2025-04-26 17:17:57,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:17:58,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6145 [WARNING|trainer.py:803] 2025-04-26 17:17:59,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6161 6119 6146 [WARNING|trainer.py:803] 2025-04-26 17:17:59,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:18:00,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:00,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6162 6120 [WARNING|trainer.py:803] 2025-04-26 17:18:01,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:02,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6147 [WARNING|trainer.py:803] 2025-04-26 17:18:03,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6163 6121 [WARNING|trainer.py:803] 2025-04-26 17:18:04,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6148 [WARNING|trainer.py:803] 2025-04-26 17:18:04,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:05,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6164 6122 [WARNING|trainer.py:803] 2025-04-26 17:18:06,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6149 [WARNING|trainer.py:803] 2025-04-26 17:18:07,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:07,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6165 6123 [WARNING|trainer.py:803] 2025-04-26 17:18:08,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6150 [WARNING|trainer.py:803] 2025-04-26 17:18:08,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6166 [WARNING|trainer.py:803] 2025-04-26 17:18:09,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:10,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6124 6151 [WARNING|trainer.py:803] 2025-04-26 17:18:10,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6167 [WARNING|trainer.py:803] 2025-04-26 17:18:11,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:12,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6125 6152 6168 [WARNING|trainer.py:803] 2025-04-26 17:18:13,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:13,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:14,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6126 6153 6169 [WARNING|trainer.py:803] 2025-04-26 17:18:15,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:15,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:15,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6127 6170 6154 [WARNING|trainer.py:803] 2025-04-26 17:18:17,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:17,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:17,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6128 6171 6155 [WARNING|trainer.py:803] 2025-04-26 17:18:19,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:19,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:19,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6129 6172 6156 [WARNING|trainer.py:803] 2025-04-26 17:18:21,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:21,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:21,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6130 6173 6157 [WARNING|trainer.py:803] 2025-04-26 17:18:23,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:23,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:23,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6131 6174 [WARNING|trainer.py:803] 2025-04-26 17:18:25,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:25,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6158 [WARNING|trainer.py:803] 2025-04-26 17:18:26,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6175 6132 [WARNING|trainer.py:803] 2025-04-26 17:18:27,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6159 [WARNING|trainer.py:803] 2025-04-26 17:18:27,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:28,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6176 6133 6160 [WARNING|trainer.py:803] 2025-04-26 17:18:29,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:29,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:30,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6177 6134 [WARNING|trainer.py:803] 2025-04-26 17:18:31,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6161 [WARNING|trainer.py:803] 2025-04-26 17:18:31,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:32,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6178 6135 6162 [WARNING|trainer.py:803] 2025-04-26 17:18:33,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 17:18:33,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:34,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6179 6136 6163 [WARNING|trainer.py:803] 2025-04-26 17:18:35,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:36,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:36,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6180 6137 6164 [WARNING|trainer.py:803] 2025-04-26 17:18:37,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:38,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:38,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6181 6138 6165 [WARNING|trainer.py:803] 2025-04-26 17:18:40,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:40,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:18:40,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6182 6166 6139 [WARNING|trainer.py:803] 2025-04-26 17:18:42,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:42,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:42,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6183 6167 6140 [WARNING|trainer.py:803] 2025-04-26 17:18:44,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:44,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:44,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6184 6168 6141 [WARNING|trainer.py:803] 2025-04-26 17:18:46,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:46,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:46,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6169 6185 6142 [WARNING|trainer.py:803] 2025-04-26 17:18:48,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:48,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:48,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6170 6186 [WARNING|trainer.py:803] 2025-04-26 17:18:49,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:49,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6143 [WARNING|trainer.py:803] 2025-04-26 17:18:50,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6171 6187 [WARNING|trainer.py:803] 2025-04-26 17:18:51,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:18:52,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6144 [WARNING|trainer.py:803] 2025-04-26 17:18:52,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6172 6188 [WARNING|trainer.py:803] 2025-04-26 17:18:53,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:54,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6145 6173 [WARNING|trainer.py:803] 2025-04-26 17:18:55,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6189 [WARNING|trainer.py:803] 2025-04-26 17:18:55,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6146 [WARNING|trainer.py:803] 2025-04-26 17:18:56,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6174 [WARNING|trainer.py:803] 2025-04-26 17:18:56,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6190 [WARNING|trainer.py:803] 2025-04-26 17:18:57,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:18:58,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6147 6175 6191 [WARNING|trainer.py:803] 2025-04-26 17:18:59,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:18:59,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:18:59,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6148 6176 [WARNING|trainer.py:803] 2025-04-26 17:19:01,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6192 [WARNING|trainer.py:803] 2025-04-26 17:19:01,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:02,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6149 6177 [WARNING|trainer.py:803] 2025-04-26 17:19:03,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:03,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6193 [WARNING|trainer.py:803] 2025-04-26 17:19:04,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6150 6178 [WARNING|trainer.py:803] 2025-04-26 17:19:05,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6194 [WARNING|trainer.py:803] 2025-04-26 17:19:05,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 17:19:06,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6151 6179 [WARNING|trainer.py:803] 2025-04-26 17:19:07,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6195 [WARNING|trainer.py:803] 2025-04-26 17:19:07,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:08,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6152 6180 [WARNING|trainer.py:803] 2025-04-26 17:19:09,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6196 [WARNING|trainer.py:803] 2025-04-26 17:19:10,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:10,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6153 6181 [WARNING|trainer.py:803] 2025-04-26 17:19:12,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:12,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6197 [WARNING|trainer.py:803] 2025-04-26 17:19:13,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6154 6182 [WARNING|trainer.py:803] 2025-04-26 17:19:13,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:14,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6198 [WARNING|trainer.py:803] 2025-04-26 17:19:15,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6155 6183 6199 [WARNING|trainer.py:803] 2025-04-26 17:19:16,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:19:16,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:16,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6156 6184 [WARNING|trainer.py:803] 2025-04-26 17:19:18,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6200 [WARNING|trainer.py:803] 2025-04-26 17:19:18,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:18,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6157 6185 [WARNING|trainer.py:803] 2025-04-26 17:19:19,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6201 [WARNING|trainer.py:803] 2025-04-26 17:19:20,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:21,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6186 6158 [WARNING|trainer.py:803] 2025-04-26 17:19:22,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6202 [WARNING|trainer.py:803] 2025-04-26 17:19:22,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:23,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6187 6159 [WARNING|trainer.py:803] 2025-04-26 17:19:24,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:24,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6203 [WARNING|trainer.py:803] 2025-04-26 17:19:25,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6188 6160 6204 [WARNING|trainer.py:803] 2025-04-26 17:19:26,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:26,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:27,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6189 6161 6205 [WARNING|trainer.py:803] 2025-04-26 17:19:28,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:28,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:19:28,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6162 6190 6206 [WARNING|trainer.py:803] 2025-04-26 17:19:30,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:30,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:30,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6191 6163 6207 [WARNING|trainer.py:803] 2025-04-26 17:19:32,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:19:32,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:32,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6192 6164 6208 [WARNING|trainer.py:803] 2025-04-26 17:19:34,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:34,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:34,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6165 6193 6209 [WARNING|trainer.py:803] 2025-04-26 17:19:36,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:36,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:37,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6166 6210 6194 [WARNING|trainer.py:803] 2025-04-26 17:19:38,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:38,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:39,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6211 6167 6195 [WARNING|trainer.py:803] 2025-04-26 17:19:40,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:40,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:40,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6212 6168 6196 [WARNING|trainer.py:803] 2025-04-26 17:19:42,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:42,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:43,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6213 6169 [WARNING|trainer.py:803] 2025-04-26 17:19:44,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:19:44,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6197 [WARNING|trainer.py:803] 2025-04-26 17:19:45,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6214 6170 [WARNING|trainer.py:803] 2025-04-26 17:19:46,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:46,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6198 6215 6171 [WARNING|trainer.py:803] 2025-04-26 17:19:47,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:19:48,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:48,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6199 [WARNING|trainer.py:803] 2025-04-26 17:19:49,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6172 6216 [WARNING|trainer.py:803] 2025-04-26 17:19:50,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:50,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6200 6173 6217 [WARNING|trainer.py:803] 2025-04-26 17:19:51,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:52,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:52,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6201 6174 6218 [WARNING|trainer.py:803] 2025-04-26 17:19:53,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:54,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:19:54,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6202 6175 6219 [WARNING|trainer.py:803] 2025-04-26 17:19:55,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:19:56,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:19:56,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6203 6220 6176 [WARNING|trainer.py:803] 2025-04-26 17:19:57,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:58,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:58,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6204 6221 6177 [WARNING|trainer.py:803] 2025-04-26 17:19:59,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:19:59,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:00,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6205 6222 [WARNING|trainer.py:803] 2025-04-26 17:20:01,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6178 [WARNING|trainer.py:803] 2025-04-26 17:20:01,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:02,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 6206 6223 [WARNING|trainer.py:803] 2025-04-26 17:20:03,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6179 [WARNING|trainer.py:803] 2025-04-26 17:20:03,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6207 [WARNING|trainer.py:803] 2025-04-26 17:20:04,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6224 [WARNING|trainer.py:803] 2025-04-26 17:20:05,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:05,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6180 6208 6225 [WARNING|trainer.py:803] 2025-04-26 17:20:06,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:07,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:07,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6181 6226 6209 [WARNING|trainer.py:803] 2025-04-26 17:20:08,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:09,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:09,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6182 6227 6210 [WARNING|trainer.py:803] 2025-04-26 17:20:10,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:11,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:11,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6228 6183 6211 [WARNING|trainer.py:803] 2025-04-26 17:20:12,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:13,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:13,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6229 6212 6184 [WARNING|trainer.py:803] 2025-04-26 17:20:14,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:14,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:15,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6230 6213 6185 [WARNING|trainer.py:803] 2025-04-26 17:20:16,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:16,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:17,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6231 6214 6186 [WARNING|trainer.py:803] 2025-04-26 17:20:18,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:18,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:19,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6232 6215 6187 [WARNING|trainer.py:803] 2025-04-26 17:20:20,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:20,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:21,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6233 6216 [WARNING|trainer.py:803] 2025-04-26 17:20:22,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6188 [WARNING|trainer.py:803] 2025-04-26 17:20:22,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6234 [WARNING|trainer.py:803] 2025-04-26 17:20:23,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6217 [WARNING|trainer.py:803] 2025-04-26 17:20:23,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:24,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6189 6235 [WARNING|trainer.py:803] 2025-04-26 17:20:25,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6218 [WARNING|trainer.py:803] 2025-04-26 17:20:25,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:26,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6190 6236 [WARNING|trainer.py:803] 2025-04-26 17:20:27,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6219 [WARNING|trainer.py:803] 2025-04-26 17:20:27,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:28,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6191 6237 [WARNING|trainer.py:803] 2025-04-26 17:20:29,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6220 [WARNING|trainer.py:803] 2025-04-26 17:20:29,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:30,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6192 6238 6221 [WARNING|trainer.py:803] 2025-04-26 17:20:31,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:31,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:31,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6239 6193 6222 [WARNING|trainer.py:803] 2025-04-26 17:20:33,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:20:33,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:33,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6240 6194 6223 [WARNING|trainer.py:803] 2025-04-26 17:20:35,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:35,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:35,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6241 6224 6195 [WARNING|trainer.py:803] 2025-04-26 17:20:37,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:37,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:20:37,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6242 6225 [WARNING|trainer.py:803] 2025-04-26 17:20:39,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6196 [WARNING|trainer.py:803] 2025-04-26 17:20:39,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:40,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6243 6226 [WARNING|trainer.py:803] 2025-04-26 17:20:41,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:41,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6197 6244 6227 [WARNING|trainer.py:803] 2025-04-26 17:20:42,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:42,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:42,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6198 6245 6228 [WARNING|trainer.py:803] 2025-04-26 17:20:44,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:20:44,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:44,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6199 6246 6229 [WARNING|trainer.py:803] 2025-04-26 17:20:46,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:46,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:46,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6247 6200 6230 [WARNING|trainer.py:803] 2025-04-26 17:20:48,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:48,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:48,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6248 6201 6231 [WARNING|trainer.py:803] 2025-04-26 17:20:50,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:50,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:50,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6249 6232 6202 [WARNING|trainer.py:803] 2025-04-26 17:20:52,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:52,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:52,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6233 6250 6203 [WARNING|trainer.py:803] 2025-04-26 17:20:53,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:54,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:20:54,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6251 6234 6204 [WARNING|trainer.py:803] 2025-04-26 17:20:55,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:55,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:56,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6252 6235 6205 [WARNING|trainer.py:803] 2025-04-26 17:20:57,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:57,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:20:58,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6253 6236 6206 [WARNING|trainer.py:803] 2025-04-26 17:20:59,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:20:59,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:20:59,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6254 6237 [WARNING|trainer.py:803] 2025-04-26 17:21:01,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6207 [WARNING|trainer.py:803] 2025-04-26 17:21:01,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6255 [WARNING|trainer.py:803] 2025-04-26 17:21:02,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6238 [WARNING|trainer.py:803] 2025-04-26 17:21:02,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6208 [WARNING|trainer.py:803] 2025-04-26 17:21:03,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6256 [WARNING|trainer.py:803] 2025-04-26 17:21:04,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6239 [WARNING|trainer.py:803] 2025-04-26 17:21:04,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:05,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6209 6257 [WARNING|trainer.py:803] 2025-04-26 17:21:06,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6240 [WARNING|trainer.py:803] 2025-04-26 17:21:06,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:07,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6210 6258 [WARNING|trainer.py:803] 2025-04-26 17:21:08,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6241 [WARNING|trainer.py:803] 2025-04-26 17:21:08,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:08,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6211 6259 6242 [WARNING|trainer.py:803] 2025-04-26 17:21:09,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:10,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:10,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6212 6260 6243 [WARNING|trainer.py:803] 2025-04-26 17:21:11,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:12,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:12,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6213 6261 6244 [WARNING|trainer.py:803] 2025-04-26 17:21:13,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:14,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:14,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6214 6262 6245 [WARNING|trainer.py:803] 2025-04-26 17:21:15,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:21:16,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:16,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6215 6246 6263 [WARNING|trainer.py:803] 2025-04-26 17:21:17,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:21:17,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:17,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6216 6247 6264 [WARNING|trainer.py:803] 2025-04-26 17:21:19,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:21:19,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:19,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6217 6248 6265 [WARNING|trainer.py:803] 2025-04-26 17:21:21,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:21,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:21,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6218 6249 6266 [WARNING|trainer.py:803] 2025-04-26 17:21:23,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:23,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:23,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6219 6250 6267 [WARNING|trainer.py:803] 2025-04-26 17:21:25,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:25,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:25,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6220 6251 6268 [WARNING|trainer.py:803] 2025-04-26 17:21:27,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:27,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:27,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6221 6252 6269 [WARNING|trainer.py:803] 2025-04-26 17:21:29,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:21:29,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:29,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6253 6222 6270 [WARNING|trainer.py:803] 2025-04-26 17:21:30,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:30,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:31,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6254 6223 [WARNING|trainer.py:803] 2025-04-26 17:21:32,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6271 [WARNING|trainer.py:803] 2025-04-26 17:21:33,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6255 [WARNING|trainer.py:803] 2025-04-26 17:21:33,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6224 [WARNING|trainer.py:803] 2025-04-26 17:21:34,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:34,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6272 6256 [WARNING|trainer.py:803] 2025-04-26 17:21:35,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6225 [WARNING|trainer.py:803] 2025-04-26 17:21:36,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:36,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6273 6257 6226 [WARNING|trainer.py:803] 2025-04-26 17:21:37,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:37,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:38,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6258 6274 6227 [WARNING|trainer.py:803] 2025-04-26 17:21:39,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:39,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:40,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6259 6275 6228 [WARNING|trainer.py:803] 2025-04-26 17:21:41,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:21:42,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:42,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6260 6276 6229 [WARNING|trainer.py:803] 2025-04-26 17:21:43,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:43,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:44,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6261 6277 6230 [WARNING|trainer.py:803] 2025-04-26 17:21:45,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:45,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:45,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6278 6262 6231 [WARNING|trainer.py:803] 2025-04-26 17:21:47,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:47,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:47,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6279 6263 6232 [WARNING|trainer.py:803] 2025-04-26 17:21:49,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:49,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:49,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6280 6264 6233 [WARNING|trainer.py:803] 2025-04-26 17:21:51,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:51,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:51,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6265 6234 6281 [WARNING|trainer.py:803] 2025-04-26 17:21:53,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:21:53,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:53,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6235 6266 6282 [WARNING|trainer.py:803] 2025-04-26 17:21:55,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:21:55,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:55,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6236 6267 6283 [WARNING|trainer.py:803] 2025-04-26 17:21:57,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:57,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:57,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6268 6237 6284 [WARNING|trainer.py:803] 2025-04-26 17:21:59,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:21:59,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:21:59,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6269 6238 6285 [WARNING|trainer.py:803] 2025-04-26 17:22:01,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:01,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:01,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6270 6239 6286 [WARNING|trainer.py:803] 2025-04-26 17:22:03,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:03,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:03,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6271 6240 [WARNING|trainer.py:803] 2025-04-26 17:22:04,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:05,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6287 6241 [WARNING|trainer.py:803] 2025-04-26 17:22:06,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6272 [WARNING|trainer.py:803] 2025-04-26 17:22:06,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6288 [WARNING|trainer.py:803] 2025-04-26 17:22:07,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:07,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6242 6273 6289 [WARNING|trainer.py:803] 2025-04-26 17:22:08,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:09,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:09,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6243 6274 [WARNING|trainer.py:803] 2025-04-26 17:22:10,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6290 [WARNING|trainer.py:803] 2025-04-26 17:22:11,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6244 [WARNING|trainer.py:803] 2025-04-26 17:22:12,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:12,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6275 6291 [WARNING|trainer.py:803] 2025-04-26 17:22:13,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6245 [WARNING|trainer.py:803] 2025-04-26 17:22:14,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:14,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6276 6292 6246 [WARNING|trainer.py:803] 2025-04-26 17:22:15,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:15,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:16,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6277 6293 6247 [WARNING|trainer.py:803] 2025-04-26 17:22:17,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:17,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:17,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6278 6294 [WARNING|trainer.py:803] 2025-04-26 17:22:18,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6248 [WARNING|trainer.py:803] 2025-04-26 17:22:19,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:19,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6279 [WARNING|trainer.py:803] 2025-04-26 17:22:20,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6295 6249 [WARNING|trainer.py:803] 2025-04-26 17:22:21,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:21,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6280 [WARNING|trainer.py:803] 2025-04-26 17:22:22,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6296 6250 [WARNING|trainer.py:803] 2025-04-26 17:22:23,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:22:23,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6281 6251 6297 [WARNING|trainer.py:803] 2025-04-26 17:22:24,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:25,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:25,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6282 6252 6298 [WARNING|trainer.py:803] 2025-04-26 17:22:26,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:27,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:27,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6283 6253 6299 [WARNING|trainer.py:803] 2025-04-26 17:22:28,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:28,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:29,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6254 6284 6300 [WARNING|trainer.py:803] 2025-04-26 17:22:30,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:30,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:31,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6255 6285 6301 [WARNING|trainer.py:803] 2025-04-26 17:22:32,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:32,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:32,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6302 6256 [WARNING|trainer.py:803] 2025-04-26 17:22:34,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6286 [WARNING|trainer.py:803] 2025-04-26 17:22:34,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6303 [WARNING|trainer.py:803] 2025-04-26 17:22:35,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6257 [WARNING|trainer.py:803] 2025-04-26 17:22:35,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6304 [WARNING|trainer.py:803] 2025-04-26 17:22:36,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6287 [WARNING|trainer.py:803] 2025-04-26 17:22:36,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6305 6258 [WARNING|trainer.py:803] 2025-04-26 17:22:37,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:38,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:38,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6288 6306 [WARNING|trainer.py:803] 2025-04-26 17:22:39,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6259 [WARNING|trainer.py:803] 2025-04-26 17:22:39,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6307 [WARNING|trainer.py:803] 2025-04-26 17:22:40,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6289 [WARNING|trainer.py:803] 2025-04-26 17:22:40,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6260 [WARNING|trainer.py:803] 2025-04-26 17:22:41,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6308 [WARNING|trainer.py:803] 2025-04-26 17:22:41,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:42,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6309 6290 6261 [WARNING|trainer.py:803] 2025-04-26 17:22:43,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:43,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6310 [WARNING|trainer.py:803] 2025-04-26 17:22:44,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6291 [WARNING|trainer.py:803] 2025-04-26 17:22:44,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6262 6311 [WARNING|trainer.py:803] 2025-04-26 17:22:45,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:46,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:46,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6312 6292 6263 [WARNING|trainer.py:803] 2025-04-26 17:22:47,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:47,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6313 [WARNING|trainer.py:803] 2025-04-26 17:22:48,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6293 [WARNING|trainer.py:803] 2025-04-26 17:22:48,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6264 6314 [WARNING|trainer.py:803] 2025-04-26 17:22:49,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:49,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:22:50,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6294 6315 6265 [WARNING|trainer.py:803] 2025-04-26 17:22:51,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:51,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:51,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6316 6295 [WARNING|trainer.py:803] 2025-04-26 17:22:52,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6266 [WARNING|trainer.py:803] 2025-04-26 17:22:53,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6317 [WARNING|trainer.py:803] 2025-04-26 17:22:53,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:22:54,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6296 6318 6267 [WARNING|trainer.py:803] 2025-04-26 17:22:55,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:22:55,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:55,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6319 6297 [WARNING|trainer.py:803] 2025-04-26 17:22:56,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6268 [WARNING|trainer.py:803] 2025-04-26 17:22:57,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6320 [WARNING|trainer.py:803] 2025-04-26 17:22:57,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:58,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6298 6321 6269 [WARNING|trainer.py:803] 2025-04-26 17:22:59,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:22:59,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:22:59,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6322 6299 6270 [WARNING|trainer.py:803] 2025-04-26 17:23:00,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:01,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6323 [WARNING|trainer.py:803] 2025-04-26 17:23:01,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:02,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6300 6271 6324 [WARNING|trainer.py:803] 2025-04-26 17:23:03,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:03,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:03,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6301 6325 [WARNING|trainer.py:803] 2025-04-26 17:23:04,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:04,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6272 6302 6326 [WARNING|trainer.py:803] 2025-04-26 17:23:05,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:05,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:06,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6303 6327 6273 [WARNING|trainer.py:803] 2025-04-26 17:23:07,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:07,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6304 [WARNING|trainer.py:803] 2025-04-26 17:23:07,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6328 [WARNING|trainer.py:803] 2025-04-26 17:23:08,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:08,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6305 6274 6329 [WARNING|trainer.py:803] 2025-04-26 17:23:09,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:09,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:10,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6306 6330 [WARNING|trainer.py:803] 2025-04-26 17:23:11,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6275 [WARNING|trainer.py:803] 2025-04-26 17:23:11,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6307 [WARNING|trainer.py:803] 2025-04-26 17:23:11,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6331 [WARNING|trainer.py:803] 2025-04-26 17:23:12,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6308 [WARNING|trainer.py:803] 2025-04-26 17:23:12,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6276 6332 [WARNING|trainer.py:803] 2025-04-26 17:23:13,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:13,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:14,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6309 6333 6277 [WARNING|trainer.py:803] 2025-04-26 17:23:15,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6310 [WARNING|trainer.py:803] 2025-04-26 17:23:15,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:15,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6334 [WARNING|trainer.py:803] 2025-04-26 17:23:16,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6278 6311 [WARNING|trainer.py:803] 2025-04-26 17:23:16,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6335 [WARNING|trainer.py:803] 2025-04-26 17:23:17,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:17,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6312 [WARNING|trainer.py:803] 2025-04-26 17:23:18,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6279 6336 [WARNING|trainer.py:803] 2025-04-26 17:23:18,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:19,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6313 [WARNING|trainer.py:803] 2025-04-26 17:23:19,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6337 [WARNING|trainer.py:803] 2025-04-26 17:23:20,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6280 6314 [WARNING|trainer.py:803] 2025-04-26 17:23:20,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6338 [WARNING|trainer.py:803] 2025-04-26 17:23:21,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:21,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6315 [WARNING|trainer.py:803] 2025-04-26 17:23:22,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6339 6281 [WARNING|trainer.py:803] 2025-04-26 17:23:22,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6316 [WARNING|trainer.py:803] 2025-04-26 17:23:23,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:23,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6340 [WARNING|trainer.py:803] 2025-04-26 17:23:24,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6317 6282 [WARNING|trainer.py:803] 2025-04-26 17:23:24,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6341 [WARNING|trainer.py:803] 2025-04-26 17:23:25,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:25,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6318 [WARNING|trainer.py:803] 2025-04-26 17:23:26,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6342 6283 [WARNING|trainer.py:803] 2025-04-26 17:23:26,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6319 [WARNING|trainer.py:803] 2025-04-26 17:23:27,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:27,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6343 [WARNING|trainer.py:803] 2025-04-26 17:23:28,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6320 6284 [WARNING|trainer.py:803] 2025-04-26 17:23:28,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6344 [WARNING|trainer.py:803] 2025-04-26 17:23:29,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:29,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6321 [WARNING|trainer.py:803] 2025-04-26 17:23:30,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6345 6285 [WARNING|trainer.py:803] 2025-04-26 17:23:30,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6322 [WARNING|trainer.py:803] 2025-04-26 17:23:31,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:31,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6346 [WARNING|trainer.py:803] 2025-04-26 17:23:32,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6323 [WARNING|trainer.py:803] 2025-04-26 17:23:32,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6286 6347 [WARNING|trainer.py:803] 2025-04-26 17:23:33,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6324 [WARNING|trainer.py:803] 2025-04-26 17:23:33,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:34,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:34,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6348 6325 6287 [WARNING|trainer.py:803] 2025-04-26 17:23:35,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:35,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6349 [WARNING|trainer.py:803] 2025-04-26 17:23:36,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6326 [WARNING|trainer.py:803] 2025-04-26 17:23:36,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6288 [WARNING|trainer.py:803] 2025-04-26 17:23:37,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6350 6327 [WARNING|trainer.py:803] 2025-04-26 17:23:37,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:38,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:38,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6351 6289 6328 [WARNING|trainer.py:803] 2025-04-26 17:23:39,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:39,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:39,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6352 6329 [WARNING|trainer.py:803] 2025-04-26 17:23:40,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:41,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6353 6290 6330 [WARNING|trainer.py:803] 2025-04-26 17:23:42,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:42,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:42,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6354 6331 6291 [WARNING|trainer.py:803] 2025-04-26 17:23:43,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:43,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6355 [WARNING|trainer.py:803] 2025-04-26 17:23:44,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6332 [WARNING|trainer.py:803] 2025-04-26 17:23:44,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:45,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6356 6292 6333 [WARNING|trainer.py:803] 2025-04-26 17:23:46,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:46,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:46,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6357 6334 6293 [WARNING|trainer.py:803] 2025-04-26 17:23:47,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:47,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6358 [WARNING|trainer.py:803] 2025-04-26 17:23:48,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6335 [WARNING|trainer.py:803] 2025-04-26 17:23:48,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6294 [WARNING|trainer.py:803] 2025-04-26 17:23:49,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6359 6336 [WARNING|trainer.py:803] 2025-04-26 17:23:49,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:50,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:23:50,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6360 6337 6295 [WARNING|trainer.py:803] 2025-04-26 17:23:51,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:51,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6361 [WARNING|trainer.py:803] 2025-04-26 17:23:51,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6338 [WARNING|trainer.py:803] 2025-04-26 17:23:52,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6296 [WARNING|trainer.py:803] 2025-04-26 17:23:53,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6362 6339 [WARNING|trainer.py:803] 2025-04-26 17:23:53,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:23:54,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:54,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6363 6340 6297 [WARNING|trainer.py:803] 2025-04-26 17:23:55,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:55,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6364 [WARNING|trainer.py:803] 2025-04-26 17:23:55,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6341 [WARNING|trainer.py:803] 2025-04-26 17:23:56,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:57,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6298 6365 6342 [WARNING|trainer.py:803] 2025-04-26 17:23:57,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:23:57,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:23:58,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6366 6343 6299 [WARNING|trainer.py:803] 2025-04-26 17:23:59,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:23:59,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6367 [WARNING|trainer.py:803] 2025-04-26 17:23:59,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6344 [WARNING|trainer.py:803] 2025-04-26 17:24:00,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6300 [WARNING|trainer.py:803] 2025-04-26 17:24:01,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6368 6345 [WARNING|trainer.py:803] 2025-04-26 17:24:01,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:01,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6301 [WARNING|trainer.py:803] 2025-04-26 17:24:02,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6369 6346 [WARNING|trainer.py:803] 2025-04-26 17:24:03,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:03,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6302 [WARNING|trainer.py:803] 2025-04-26 17:24:03,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6370 6347 [WARNING|trainer.py:803] 2025-04-26 17:24:04,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:04,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6303 [WARNING|trainer.py:803] 2025-04-26 17:24:05,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6371 6348 [WARNING|trainer.py:803] 2025-04-26 17:24:05,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:06,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6304 [WARNING|trainer.py:803] 2025-04-26 17:24:06,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6372 6349 [WARNING|trainer.py:803] 2025-04-26 17:24:07,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:07,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6305 [WARNING|trainer.py:803] 2025-04-26 17:24:07,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6373 6350 [WARNING|trainer.py:803] 2025-04-26 17:24:08,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:08,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6306 [WARNING|trainer.py:803] 2025-04-26 17:24:09,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6374 6351 [WARNING|trainer.py:803] 2025-04-26 17:24:09,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:10,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6307 [WARNING|trainer.py:803] 2025-04-26 17:24:10,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6375 6352 [WARNING|trainer.py:803] 2025-04-26 17:24:11,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:11,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6308 [WARNING|trainer.py:803] 2025-04-26 17:24:11,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6376 6353 [WARNING|trainer.py:803] 2025-04-26 17:24:12,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:12,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6309 6377 [WARNING|trainer.py:803] 2025-04-26 17:24:13,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:13,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6354 [WARNING|trainer.py:803] 2025-04-26 17:24:14,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6310 6378 [WARNING|trainer.py:803] 2025-04-26 17:24:14,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6355 [WARNING|trainer.py:803] 2025-04-26 17:24:15,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:15,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6311 6379 [WARNING|trainer.py:803] 2025-04-26 17:24:15,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:16,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6356 [WARNING|trainer.py:803] 2025-04-26 17:24:16,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6312 6380 [WARNING|trainer.py:803] 2025-04-26 17:24:17,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6357 [WARNING|trainer.py:803] 2025-04-26 17:24:17,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:18,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6313 6381 [WARNING|trainer.py:803] 2025-04-26 17:24:18,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6358 [WARNING|trainer.py:803] 2025-04-26 17:24:19,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:19,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6314 6382 [WARNING|trainer.py:803] 2025-04-26 17:24:19,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6359 [WARNING|trainer.py:803] 2025-04-26 17:24:20,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:20,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6315 [WARNING|trainer.py:803] 2025-04-26 17:24:21,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6383 6360 [WARNING|trainer.py:803] 2025-04-26 17:24:21,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:22,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6316 [WARNING|trainer.py:803] 2025-04-26 17:24:22,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6384 6361 [WARNING|trainer.py:803] 2025-04-26 17:24:23,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:23,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6317 [WARNING|trainer.py:803] 2025-04-26 17:24:23,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6385 6362 [WARNING|trainer.py:803] 2025-04-26 17:24:24,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:24,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6318 [WARNING|trainer.py:803] 2025-04-26 17:24:25,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6386 6363 [WARNING|trainer.py:803] 2025-04-26 17:24:25,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:26,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6319 [WARNING|trainer.py:803] 2025-04-26 17:24:26,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6387 6364 [WARNING|trainer.py:803] 2025-04-26 17:24:27,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:27,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6320 [WARNING|trainer.py:803] 2025-04-26 17:24:27,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6388 6365 [WARNING|trainer.py:803] 2025-04-26 17:24:28,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:28,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6321 [WARNING|trainer.py:803] 2025-04-26 17:24:29,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6389 6366 [WARNING|trainer.py:803] 2025-04-26 17:24:29,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:29,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6322 [WARNING|trainer.py:803] 2025-04-26 17:24:30,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6390 6367 [WARNING|trainer.py:803] 2025-04-26 17:24:31,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:31,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6323 [WARNING|trainer.py:803] 2025-04-26 17:24:31,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6391 6368 [WARNING|trainer.py:803] 2025-04-26 17:24:32,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:32,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6324 [WARNING|trainer.py:803] 2025-04-26 17:24:33,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6392 6369 [WARNING|trainer.py:803] 2025-04-26 17:24:33,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:33,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6325 [WARNING|trainer.py:803] 2025-04-26 17:24:34,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6393 6370 [WARNING|trainer.py:803] 2025-04-26 17:24:34,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:35,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6326 [WARNING|trainer.py:803] 2025-04-26 17:24:35,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6394 6371 [WARNING|trainer.py:803] 2025-04-26 17:24:36,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:36,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6327 [WARNING|trainer.py:803] 2025-04-26 17:24:36,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6395 6372 [WARNING|trainer.py:803] 2025-04-26 17:24:37,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:37,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6328 [WARNING|trainer.py:803] 2025-04-26 17:24:38,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6396 6373 [WARNING|trainer.py:803] 2025-04-26 17:24:38,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:39,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6329 [WARNING|trainer.py:803] 2025-04-26 17:24:39,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6397 6374 [WARNING|trainer.py:803] 2025-04-26 17:24:40,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:40,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6330 [WARNING|trainer.py:803] 2025-04-26 17:24:40,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6398 6375 [WARNING|trainer.py:803] 2025-04-26 17:24:41,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:41,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6331 [WARNING|trainer.py:803] 2025-04-26 17:24:42,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6399 6376 [WARNING|trainer.py:803] 2025-04-26 17:24:42,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:24:43,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6332 [WARNING|trainer.py:803] 2025-04-26 17:24:43,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6400 6377 [WARNING|trainer.py:803] 2025-04-26 17:24:44,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:44,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6333 [WARNING|trainer.py:803] 2025-04-26 17:24:44,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6401 6378 [WARNING|trainer.py:803] 2025-04-26 17:24:45,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:45,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6334 [WARNING|trainer.py:803] 2025-04-26 17:24:46,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6402 6379 [WARNING|trainer.py:803] 2025-04-26 17:24:46,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:24:47,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6335 [WARNING|trainer.py:803] 2025-04-26 17:24:47,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6403 6380 [WARNING|trainer.py:803] 2025-04-26 17:24:48,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:48,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6336 [WARNING|trainer.py:803] 2025-04-26 17:24:48,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6404 6381 [WARNING|trainer.py:803] 2025-04-26 17:24:49,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:49,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6337 [WARNING|trainer.py:803] 2025-04-26 17:24:50,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6405 6382 [WARNING|trainer.py:803] 2025-04-26 17:24:50,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:51,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6338 [WARNING|trainer.py:803] 2025-04-26 17:24:51,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6406 6383 [WARNING|trainer.py:803] 2025-04-26 17:24:52,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:52,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6339 [WARNING|trainer.py:803] 2025-04-26 17:24:52,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6407 6384 [WARNING|trainer.py:803] 2025-04-26 17:24:53,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6340 [WARNING|trainer.py:803] 2025-04-26 17:24:53,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:54,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6408 6385 [WARNING|trainer.py:803] 2025-04-26 17:24:54,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6341 [WARNING|trainer.py:803] 2025-04-26 17:24:55,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:55,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6409 6386 [WARNING|trainer.py:803] 2025-04-26 17:24:56,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6342 [WARNING|trainer.py:803] 2025-04-26 17:24:56,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:24:56,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6410 6387 [WARNING|trainer.py:803] 2025-04-26 17:24:57,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6343 [WARNING|trainer.py:803] 2025-04-26 17:24:57,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:24:58,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6411 6388 [WARNING|trainer.py:803] 2025-04-26 17:24:58,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6344 [WARNING|trainer.py:803] 2025-04-26 17:24:59,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:24:59,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6412 6389 [WARNING|trainer.py:803] 2025-04-26 17:25:00,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6345 [WARNING|trainer.py:803] 2025-04-26 17:25:00,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:00,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6413 6390 [WARNING|trainer.py:803] 2025-04-26 17:25:01,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6346 [WARNING|trainer.py:803] 2025-04-26 17:25:01,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:02,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6414 6391 [WARNING|trainer.py:803] 2025-04-26 17:25:02,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6347 [WARNING|trainer.py:803] 2025-04-26 17:25:03,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:03,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6415 6392 [WARNING|trainer.py:803] 2025-04-26 17:25:04,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:04,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6348 [WARNING|trainer.py:803] 2025-04-26 17:25:04,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6416 6393 [WARNING|trainer.py:803] 2025-04-26 17:25:05,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6349 [WARNING|trainer.py:803] 2025-04-26 17:25:05,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:06,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6417 6394 [WARNING|trainer.py:803] 2025-04-26 17:25:06,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6350 [WARNING|trainer.py:803] 2025-04-26 17:25:07,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 17:25:07,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6395 6418 [WARNING|trainer.py:803] 2025-04-26 17:25:08,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6351 [WARNING|trainer.py:803] 2025-04-26 17:25:08,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:08,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6396 6419 [WARNING|trainer.py:803] 2025-04-26 17:25:09,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6352 [WARNING|trainer.py:803] 2025-04-26 17:25:09,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:10,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6397 6420 [WARNING|trainer.py:803] 2025-04-26 17:25:10,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6353 [WARNING|trainer.py:803] 2025-04-26 17:25:11,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:11,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6398 6421 [WARNING|trainer.py:803] 2025-04-26 17:25:12,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6354 [WARNING|trainer.py:803] 2025-04-26 17:25:12,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:12,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6399 6422 [WARNING|trainer.py:803] 2025-04-26 17:25:13,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6355 [WARNING|trainer.py:803] 2025-04-26 17:25:13,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:14,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6400 6423 [WARNING|trainer.py:803] 2025-04-26 17:25:14,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6356 [WARNING|trainer.py:803] 2025-04-26 17:25:15,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:15,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6401 6424 [WARNING|trainer.py:803] 2025-04-26 17:25:15,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6357 [WARNING|trainer.py:803] 2025-04-26 17:25:16,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:16,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6402 6425 [WARNING|trainer.py:803] 2025-04-26 17:25:17,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6358 [WARNING|trainer.py:803] 2025-04-26 17:25:17,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:25:18,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6403 [WARNING|trainer.py:803] 2025-04-26 17:25:18,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6426 6359 [WARNING|trainer.py:803] 2025-04-26 17:25:19,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:25:19,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6404 [WARNING|trainer.py:803] 2025-04-26 17:25:19,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6427 6360 [WARNING|trainer.py:803] 2025-04-26 17:25:20,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:20,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6405 [WARNING|trainer.py:803] 2025-04-26 17:25:21,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6428 6361 [WARNING|trainer.py:803] 2025-04-26 17:25:21,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:25:22,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6406 [WARNING|trainer.py:803] 2025-04-26 17:25:22,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6429 6362 [WARNING|trainer.py:803] 2025-04-26 17:25:23,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:23,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6407 [WARNING|trainer.py:803] 2025-04-26 17:25:23,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6430 6363 [WARNING|trainer.py:803] 2025-04-26 17:25:24,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:24,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6408 [WARNING|trainer.py:803] 2025-04-26 17:25:25,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6431 6364 [WARNING|trainer.py:803] 2025-04-26 17:25:25,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:26,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6409 [WARNING|trainer.py:803] 2025-04-26 17:25:26,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6432 6365 [WARNING|trainer.py:803] 2025-04-26 17:25:27,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:25:27,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6410 [WARNING|trainer.py:803] 2025-04-26 17:25:27,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6433 6366 [WARNING|trainer.py:803] 2025-04-26 17:25:28,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:28,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6411 [WARNING|trainer.py:803] 2025-04-26 17:25:29,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6434 6367 [WARNING|trainer.py:803] 2025-04-26 17:25:29,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:30,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6412 [WARNING|trainer.py:803] 2025-04-26 17:25:30,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6435 6368 [WARNING|trainer.py:803] 2025-04-26 17:25:31,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:31,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6413 [WARNING|trainer.py:803] 2025-04-26 17:25:31,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6436 6369 [WARNING|trainer.py:803] 2025-04-26 17:25:32,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:32,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6414 [WARNING|trainer.py:803] 2025-04-26 17:25:33,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6437 6370 [WARNING|trainer.py:803] 2025-04-26 17:25:33,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:34,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6415 [WARNING|trainer.py:803] 2025-04-26 17:25:34,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6438 6371 [WARNING|trainer.py:803] 2025-04-26 17:25:35,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:35,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6416 [WARNING|trainer.py:803] 2025-04-26 17:25:35,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6439 6372 [WARNING|trainer.py:803] 2025-04-26 17:25:36,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:36,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6417 [WARNING|trainer.py:803] 2025-04-26 17:25:37,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6440 6373 [WARNING|trainer.py:803] 2025-04-26 17:25:37,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:25:38,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6418 [WARNING|trainer.py:803] 2025-04-26 17:25:38,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6441 6374 [WARNING|trainer.py:803] 2025-04-26 17:25:39,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:39,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6419 [WARNING|trainer.py:803] 2025-04-26 17:25:39,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6442 6375 [WARNING|trainer.py:803] 2025-04-26 17:25:40,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:40,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:40,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6420 6443 6376 [WARNING|trainer.py:803] 2025-04-26 17:25:41,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:42,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:42,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6421 6444 6377 [WARNING|trainer.py:803] 2025-04-26 17:25:43,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:43,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:43,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6422 6445 6378 [WARNING|trainer.py:803] 2025-04-26 17:25:44,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:44,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6423 [WARNING|trainer.py:803] 2025-04-26 17:25:44,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6446 6379 [WARNING|trainer.py:803] 2025-04-26 17:25:45,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:46,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:46,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6424 6447 6380 [WARNING|trainer.py:803] 2025-04-26 17:25:47,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:47,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:47,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6425 6448 6381 [WARNING|trainer.py:803] 2025-04-26 17:25:48,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:48,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:48,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6426 6449 6382 [WARNING|trainer.py:803] 2025-04-26 17:25:49,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:25:50,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:50,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6427 6450 6383 [WARNING|trainer.py:803] 2025-04-26 17:25:51,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:51,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6428 [WARNING|trainer.py:803] 2025-04-26 17:25:51,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6451 6384 [WARNING|trainer.py:803] 2025-04-26 17:25:52,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:52,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6429 [WARNING|trainer.py:803] 2025-04-26 17:25:53,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6452 6385 [WARNING|trainer.py:803] 2025-04-26 17:25:53,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:25:54,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6430 [WARNING|trainer.py:803] 2025-04-26 17:25:54,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6453 6386 [WARNING|trainer.py:803] 2025-04-26 17:25:55,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:25:55,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6431 [WARNING|trainer.py:803] 2025-04-26 17:25:55,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6454 6387 [WARNING|trainer.py:803] 2025-04-26 17:25:56,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:25:56,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6432 [WARNING|trainer.py:803] 2025-04-26 17:25:56,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6455 6388 [WARNING|trainer.py:803] 2025-04-26 17:25:57,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:25:58,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6433 [WARNING|trainer.py:803] 2025-04-26 17:25:58,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6456 6389 [WARNING|trainer.py:803] 2025-04-26 17:25:59,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:25:59,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6434 [WARNING|trainer.py:803] 2025-04-26 17:25:59,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6457 6390 [WARNING|trainer.py:803] 2025-04-26 17:26:00,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:00,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6435 [WARNING|trainer.py:803] 2025-04-26 17:26:00,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6458 6391 [WARNING|trainer.py:803] 2025-04-26 17:26:01,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6436 [WARNING|trainer.py:803] 2025-04-26 17:26:02,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:02,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6459 6392 [WARNING|trainer.py:803] 2025-04-26 17:26:02,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6437 [WARNING|trainer.py:803] 2025-04-26 17:26:03,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:03,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6460 6393 [WARNING|trainer.py:803] 2025-04-26 17:26:04,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6438 [WARNING|trainer.py:803] 2025-04-26 17:26:04,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 17:26:04,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6461 6394 [WARNING|trainer.py:803] 2025-04-26 17:26:05,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6439 [WARNING|trainer.py:803] 2025-04-26 17:26:06,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:06,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6462 6395 [WARNING|trainer.py:803] 2025-04-26 17:26:06,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6440 [WARNING|trainer.py:803] 2025-04-26 17:26:07,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:07,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6463 6396 [WARNING|trainer.py:803] 2025-04-26 17:26:08,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6441 [WARNING|trainer.py:803] 2025-04-26 17:26:08,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:08,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6397 6464 [WARNING|trainer.py:803] 2025-04-26 17:26:09,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6442 [WARNING|trainer.py:803] 2025-04-26 17:26:10,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:10,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6465 6398 [WARNING|trainer.py:803] 2025-04-26 17:26:10,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6443 [WARNING|trainer.py:803] 2025-04-26 17:26:11,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:11,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6399 6466 [WARNING|trainer.py:803] 2025-04-26 17:26:12,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6444 [WARNING|trainer.py:803] 2025-04-26 17:26:12,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:12,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6400 6467 [WARNING|trainer.py:803] 2025-04-26 17:26:13,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6445 [WARNING|trainer.py:803] 2025-04-26 17:26:14,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:14,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6401 [WARNING|trainer.py:803] 2025-04-26 17:26:14,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6468 6446 [WARNING|trainer.py:803] 2025-04-26 17:26:15,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:15,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6402 [WARNING|trainer.py:803] 2025-04-26 17:26:16,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6469 6447 [WARNING|trainer.py:803] 2025-04-26 17:26:16,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:26:17,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6403 [WARNING|trainer.py:803] 2025-04-26 17:26:17,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6470 6448 [WARNING|trainer.py:803] 2025-04-26 17:26:18,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:26:18,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6404 [WARNING|trainer.py:803] 2025-04-26 17:26:18,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6471 6449 [WARNING|trainer.py:803] 2025-04-26 17:26:19,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:19,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6405 6472 [WARNING|trainer.py:803] 2025-04-26 17:26:20,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6450 [WARNING|trainer.py:803] 2025-04-26 17:26:20,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:26:21,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6406 6473 [WARNING|trainer.py:803] 2025-04-26 17:26:21,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6451 [WARNING|trainer.py:803] 2025-04-26 17:26:22,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:26:22,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6407 6474 [WARNING|trainer.py:803] 2025-04-26 17:26:22,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6452 [WARNING|trainer.py:803] 2025-04-26 17:26:23,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:23,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6475 6408 [WARNING|trainer.py:803] 2025-04-26 17:26:24,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6453 [WARNING|trainer.py:803] 2025-04-26 17:26:24,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:25,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6409 6476 [WARNING|trainer.py:803] 2025-04-26 17:26:25,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6454 [WARNING|trainer.py:803] 2025-04-26 17:26:26,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:26,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6410 6477 [WARNING|trainer.py:803] 2025-04-26 17:26:26,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6455 [WARNING|trainer.py:803] 2025-04-26 17:26:27,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:27,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6478 6411 [WARNING|trainer.py:803] 2025-04-26 17:26:28,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6456 [WARNING|trainer.py:803] 2025-04-26 17:26:28,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:28,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6412 6479 [WARNING|trainer.py:803] 2025-04-26 17:26:29,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6457 [WARNING|trainer.py:803] 2025-04-26 17:26:30,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:30,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6413 6480 [WARNING|trainer.py:803] 2025-04-26 17:26:30,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6458 [WARNING|trainer.py:803] 2025-04-26 17:26:31,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:31,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6414 6481 [WARNING|trainer.py:803] 2025-04-26 17:26:32,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6459 [WARNING|trainer.py:803] 2025-04-26 17:26:32,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:33,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6415 6482 [WARNING|trainer.py:803] 2025-04-26 17:26:33,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6460 [WARNING|trainer.py:803] 2025-04-26 17:26:34,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:34,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6416 6483 [WARNING|trainer.py:803] 2025-04-26 17:26:35,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6461 [WARNING|trainer.py:803] 2025-04-26 17:26:35,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:35,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6417 [WARNING|trainer.py:803] 2025-04-26 17:26:36,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6484 6462 [WARNING|trainer.py:803] 2025-04-26 17:26:36,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:26:37,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6418 [WARNING|trainer.py:803] 2025-04-26 17:26:37,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6485 6463 [WARNING|trainer.py:803] 2025-04-26 17:26:38,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:38,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6419 [WARNING|trainer.py:803] 2025-04-26 17:26:39,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6486 6464 [WARNING|trainer.py:803] 2025-04-26 17:26:39,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:39,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6420 6487 [WARNING|trainer.py:803] 2025-04-26 17:26:40,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6465 [WARNING|trainer.py:803] 2025-04-26 17:26:40,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:41,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6421 [WARNING|trainer.py:803] 2025-04-26 17:26:41,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6488 6466 [WARNING|trainer.py:803] 2025-04-26 17:26:42,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:42,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6422 [WARNING|trainer.py:803] 2025-04-26 17:26:43,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6489 [WARNING|trainer.py:803] 2025-04-26 17:26:43,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6467 [WARNING|trainer.py:803] 2025-04-26 17:26:43,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6423 6490 [WARNING|trainer.py:803] 2025-04-26 17:26:44,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:44,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6468 [WARNING|trainer.py:803] 2025-04-26 17:26:45,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6424 6491 [WARNING|trainer.py:803] 2025-04-26 17:26:45,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:46,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6469 [WARNING|trainer.py:803] 2025-04-26 17:26:46,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6425 6492 [WARNING|trainer.py:803] 2025-04-26 17:26:47,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:47,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6470 [WARNING|trainer.py:803] 2025-04-26 17:26:47,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6426 6493 [WARNING|trainer.py:803] 2025-04-26 17:26:48,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:48,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6471 [WARNING|trainer.py:803] 2025-04-26 17:26:49,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6427 6494 [WARNING|trainer.py:803] 2025-04-26 17:26:49,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:50,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6472 [WARNING|trainer.py:803] 2025-04-26 17:26:50,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6428 6495 [WARNING|trainer.py:803] 2025-04-26 17:26:51,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:51,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6473 [WARNING|trainer.py:803] 2025-04-26 17:26:51,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6429 6496 [WARNING|trainer.py:803] 2025-04-26 17:26:52,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:52,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6474 [WARNING|trainer.py:803] 2025-04-26 17:26:53,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6430 6497 [WARNING|trainer.py:803] 2025-04-26 17:26:53,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:26:54,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6475 [WARNING|trainer.py:803] 2025-04-26 17:26:54,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6431 6498 [WARNING|trainer.py:803] 2025-04-26 17:26:55,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:55,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6476 [WARNING|trainer.py:803] 2025-04-26 17:26:55,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6432 6499 [WARNING|trainer.py:803] 2025-04-26 17:26:56,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:26:56,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6477 [WARNING|trainer.py:803] 2025-04-26 17:26:57,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6433 6500 [WARNING|trainer.py:803] 2025-04-26 17:26:57,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:26:58,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6478 [WARNING|trainer.py:803] 2025-04-26 17:26:58,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6434 6501 [WARNING|trainer.py:803] 2025-04-26 17:26:59,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:26:59,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6479 [WARNING|trainer.py:803] 2025-04-26 17:26:59,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6435 6502 [WARNING|trainer.py:803] 2025-04-26 17:27:00,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:00,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6480 [WARNING|trainer.py:803] 2025-04-26 17:27:01,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6436 6503 [WARNING|trainer.py:803] 2025-04-26 17:27:01,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:02,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6481 [WARNING|trainer.py:803] 2025-04-26 17:27:02,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6437 6504 [WARNING|trainer.py:803] 2025-04-26 17:27:03,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:03,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6482 [WARNING|trainer.py:803] 2025-04-26 17:27:03,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6438 6505 [WARNING|trainer.py:803] 2025-04-26 17:27:04,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:04,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6483 [WARNING|trainer.py:803] 2025-04-26 17:27:04,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6439 6506 [WARNING|trainer.py:803] 2025-04-26 17:27:05,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:06,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6484 [WARNING|trainer.py:803] 2025-04-26 17:27:06,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6440 6507 [WARNING|trainer.py:803] 2025-04-26 17:27:07,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:07,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6485 [WARNING|trainer.py:803] 2025-04-26 17:27:07,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6441 6508 [WARNING|trainer.py:803] 2025-04-26 17:27:08,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:08,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:08,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6486 6442 6509 [WARNING|trainer.py:803] 2025-04-26 17:27:09,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:10,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:10,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6487 6443 6510 [WARNING|trainer.py:803] 2025-04-26 17:27:11,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:11,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:11,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6488 6444 6511 [WARNING|trainer.py:803] 2025-04-26 17:27:12,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:12,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:12,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6489 6445 6512 [WARNING|trainer.py:803] 2025-04-26 17:27:13,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:14,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:14,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6490 6446 6513 [WARNING|trainer.py:803] 2025-04-26 17:27:15,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:15,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:15,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6491 6447 6514 [WARNING|trainer.py:803] 2025-04-26 17:27:16,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:16,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:16,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6492 6515 6448 [WARNING|trainer.py:803] 2025-04-26 17:27:17,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:18,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:18,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6493 6516 6449 [WARNING|trainer.py:803] 2025-04-26 17:27:19,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:19,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:19,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6494 6517 6450 [WARNING|trainer.py:803] 2025-04-26 17:27:20,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:20,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:20,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6495 6518 6451 [WARNING|trainer.py:803] 2025-04-26 17:27:21,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:22,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:22,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6496 6519 6452 [WARNING|trainer.py:803] 2025-04-26 17:27:23,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:23,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:23,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6497 6520 6453 [WARNING|trainer.py:803] 2025-04-26 17:27:24,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:24,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:24,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6498 6521 6454 [WARNING|trainer.py:803] 2025-04-26 17:27:25,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:27:26,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:26,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6499 6522 6455 [WARNING|trainer.py:803] 2025-04-26 17:27:27,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:27,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:27,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6500 6523 6456 [WARNING|trainer.py:803] 2025-04-26 17:27:28,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:28,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:28,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6501 6524 6457 [WARNING|trainer.py:803] 2025-04-26 17:27:29,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:30,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:30,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6502 6525 6458 [WARNING|trainer.py:803] 2025-04-26 17:27:31,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:31,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:31,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6503 6526 6459 [WARNING|trainer.py:803] 2025-04-26 17:27:32,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:32,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6504 [WARNING|trainer.py:803] 2025-04-26 17:27:33,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6527 6460 [WARNING|trainer.py:803] 2025-04-26 17:27:33,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:34,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6505 [WARNING|trainer.py:803] 2025-04-26 17:27:34,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6528 6461 [WARNING|trainer.py:803] 2025-04-26 17:27:35,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:35,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:35,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6506 6529 6462 [WARNING|trainer.py:803] 2025-04-26 17:27:36,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:36,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:37,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6507 6530 6463 [WARNING|trainer.py:803] 2025-04-26 17:27:37,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:37,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:38,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6508 6531 6464 [WARNING|trainer.py:803] 2025-04-26 17:27:39,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:39,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:39,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6509 6532 6465 [WARNING|trainer.py:803] 2025-04-26 17:27:40,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:40,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6533 [WARNING|trainer.py:803] 2025-04-26 17:27:41,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6510 6466 [WARNING|trainer.py:803] 2025-04-26 17:27:41,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:41,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6534 [WARNING|trainer.py:803] 2025-04-26 17:27:42,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6511 6467 [WARNING|trainer.py:803] 2025-04-26 17:27:43,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:43,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6535 [WARNING|trainer.py:803] 2025-04-26 17:27:43,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6512 6468 [WARNING|trainer.py:803] 2025-04-26 17:27:44,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:44,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6536 [WARNING|trainer.py:803] 2025-04-26 17:27:45,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6513 6469 [WARNING|trainer.py:803] 2025-04-26 17:27:45,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:45,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6537 [WARNING|trainer.py:803] 2025-04-26 17:27:46,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6514 6470 [WARNING|trainer.py:803] 2025-04-26 17:27:47,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:47,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6538 [WARNING|trainer.py:803] 2025-04-26 17:27:47,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6515 6471 [WARNING|trainer.py:803] 2025-04-26 17:27:48,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:48,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6539 [WARNING|trainer.py:803] 2025-04-26 17:27:49,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6516 6472 [WARNING|trainer.py:803] 2025-04-26 17:27:49,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:27:49,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6540 [WARNING|trainer.py:803] 2025-04-26 17:27:50,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6517 6473 [WARNING|trainer.py:803] 2025-04-26 17:27:50,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:27:51,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6541 [WARNING|trainer.py:803] 2025-04-26 17:27:51,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6518 6474 [WARNING|trainer.py:803] 2025-04-26 17:27:52,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:27:52,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6542 [WARNING|trainer.py:803] 2025-04-26 17:27:53,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6519 6475 [WARNING|trainer.py:803] 2025-04-26 17:27:53,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:53,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6543 [WARNING|trainer.py:803] 2025-04-26 17:27:54,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6520 6476 [WARNING|trainer.py:803] 2025-04-26 17:27:54,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6544 [WARNING|trainer.py:803] 2025-04-26 17:27:55,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:55,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6521 [WARNING|trainer.py:803] 2025-04-26 17:27:56,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6477 6545 [WARNING|trainer.py:803] 2025-04-26 17:27:56,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:56,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6522 [WARNING|trainer.py:803] 2025-04-26 17:27:57,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6478 6546 [WARNING|trainer.py:803] 2025-04-26 17:27:57,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:58,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6523 [WARNING|trainer.py:803] 2025-04-26 17:27:58,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6479 6547 [WARNING|trainer.py:803] 2025-04-26 17:27:59,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:27:59,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6524 [WARNING|trainer.py:803] 2025-04-26 17:28:00,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6480 6548 [WARNING|trainer.py:803] 2025-04-26 17:28:00,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:00,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6525 [WARNING|trainer.py:803] 2025-04-26 17:28:01,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6481 6549 [WARNING|trainer.py:803] 2025-04-26 17:28:01,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:02,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6526 [WARNING|trainer.py:803] 2025-04-26 17:28:02,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6482 6550 [WARNING|trainer.py:803] 2025-04-26 17:28:03,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:28:03,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6527 [WARNING|trainer.py:803] 2025-04-26 17:28:04,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6483 6551 [WARNING|trainer.py:803] 2025-04-26 17:28:04,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:04,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6528 [WARNING|trainer.py:803] 2025-04-26 17:28:05,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6484 6552 [WARNING|trainer.py:803] 2025-04-26 17:28:05,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:06,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6529 [WARNING|trainer.py:803] 2025-04-26 17:28:06,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6485 6553 [WARNING|trainer.py:803] 2025-04-26 17:28:07,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:07,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6530 [WARNING|trainer.py:803] 2025-04-26 17:28:07,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6486 6554 [WARNING|trainer.py:803] 2025-04-26 17:28:08,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:08,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6531 [WARNING|trainer.py:803] 2025-04-26 17:28:09,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6487 6555 [WARNING|trainer.py:803] 2025-04-26 17:28:09,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:28:10,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6532 [WARNING|trainer.py:803] 2025-04-26 17:28:10,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6488 [WARNING|trainer.py:803] 2025-04-26 17:28:11,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6556 6533 [WARNING|trainer.py:803] 2025-04-26 17:28:11,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:11,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6489 [WARNING|trainer.py:803] 2025-04-26 17:28:12,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6557 6534 [WARNING|trainer.py:803] 2025-04-26 17:28:12,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:13,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6490 [WARNING|trainer.py:803] 2025-04-26 17:28:13,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6558 6535 [WARNING|trainer.py:803] 2025-04-26 17:28:14,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:28:14,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6491 [WARNING|trainer.py:803] 2025-04-26 17:28:15,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6559 6536 [WARNING|trainer.py:803] 2025-04-26 17:28:15,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:15,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6492 [WARNING|trainer.py:803] 2025-04-26 17:28:16,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6560 6537 [WARNING|trainer.py:803] 2025-04-26 17:28:16,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:17,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6493 [WARNING|trainer.py:803] 2025-04-26 17:28:17,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6561 6538 [WARNING|trainer.py:803] 2025-04-26 17:28:18,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:18,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6494 [WARNING|trainer.py:803] 2025-04-26 17:28:18,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6562 6539 [WARNING|trainer.py:803] 2025-04-26 17:28:19,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:28:19,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6495 [WARNING|trainer.py:803] 2025-04-26 17:28:20,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6563 6540 [WARNING|trainer.py:803] 2025-04-26 17:28:20,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:21,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6496 [WARNING|trainer.py:803] 2025-04-26 17:28:21,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6564 6541 [WARNING|trainer.py:803] 2025-04-26 17:28:22,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:22,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6497 [WARNING|trainer.py:803] 2025-04-26 17:28:22,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6565 6542 [WARNING|trainer.py:803] 2025-04-26 17:28:23,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:23,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6498 [WARNING|trainer.py:803] 2025-04-26 17:28:24,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6566 6543 [WARNING|trainer.py:803] 2025-04-26 17:28:24,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:28:25,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6499 [WARNING|trainer.py:803] 2025-04-26 17:28:25,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6567 6544 [WARNING|trainer.py:803] 2025-04-26 17:28:26,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:26,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6500 [WARNING|trainer.py:803] 2025-04-26 17:28:26,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6568 6545 [WARNING|trainer.py:803] 2025-04-26 17:28:27,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:27,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6501 [WARNING|trainer.py:803] 2025-04-26 17:28:28,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6569 6546 [WARNING|trainer.py:803] 2025-04-26 17:28:28,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:29,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6502 [WARNING|trainer.py:803] 2025-04-26 17:28:29,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6570 6547 [WARNING|trainer.py:803] 2025-04-26 17:28:30,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:30,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6503 [WARNING|trainer.py:803] 2025-04-26 17:28:30,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6571 6548 [WARNING|trainer.py:803] 2025-04-26 17:28:31,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:31,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6504 [WARNING|trainer.py:803] 2025-04-26 17:28:32,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6572 6549 [WARNING|trainer.py:803] 2025-04-26 17:28:32,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:33,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6505 [WARNING|trainer.py:803] 2025-04-26 17:28:33,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6573 6550 [WARNING|trainer.py:803] 2025-04-26 17:28:34,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:34,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6506 [WARNING|trainer.py:803] 2025-04-26 17:28:34,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6574 6551 [WARNING|trainer.py:803] 2025-04-26 17:28:35,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6507 [WARNING|trainer.py:803] 2025-04-26 17:28:35,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:36,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6575 6552 [WARNING|trainer.py:803] 2025-04-26 17:28:36,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6508 [WARNING|trainer.py:803] 2025-04-26 17:28:37,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:37,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6576 6553 [WARNING|trainer.py:803] 2025-04-26 17:28:37,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6509 [WARNING|trainer.py:803] 2025-04-26 17:28:38,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:38,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6577 6554 [WARNING|trainer.py:803] 2025-04-26 17:28:39,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6510 [WARNING|trainer.py:803] 2025-04-26 17:28:39,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:40,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6578 6555 [WARNING|trainer.py:803] 2025-04-26 17:28:40,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6511 [WARNING|trainer.py:803] 2025-04-26 17:28:41,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:41,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6579 6556 [WARNING|trainer.py:803] 2025-04-26 17:28:41,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6512 [WARNING|trainer.py:803] 2025-04-26 17:28:42,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:42,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6580 [WARNING|trainer.py:803] 2025-04-26 17:28:43,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6557 6513 [WARNING|trainer.py:803] 2025-04-26 17:28:43,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:44,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6581 [WARNING|trainer.py:803] 2025-04-26 17:28:44,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6558 6514 [WARNING|trainer.py:803] 2025-04-26 17:28:45,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:45,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6582 [WARNING|trainer.py:803] 2025-04-26 17:28:45,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6559 6515 [WARNING|trainer.py:803] 2025-04-26 17:28:46,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:46,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6583 [WARNING|trainer.py:803] 2025-04-26 17:28:47,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6560 6516 [WARNING|trainer.py:803] 2025-04-26 17:28:47,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:48,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6584 [WARNING|trainer.py:803] 2025-04-26 17:28:48,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6561 6517 [WARNING|trainer.py:803] 2025-04-26 17:28:49,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:49,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6585 [WARNING|trainer.py:803] 2025-04-26 17:28:49,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6562 6518 [WARNING|trainer.py:803] 2025-04-26 17:28:50,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:50,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:28:51,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6586 6563 6519 [WARNING|trainer.py:803] 2025-04-26 17:28:51,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:52,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:52,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6587 6564 6520 [WARNING|trainer.py:803] 2025-04-26 17:28:53,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:53,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6588 [WARNING|trainer.py:803] 2025-04-26 17:28:53,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6565 6521 [WARNING|trainer.py:803] 2025-04-26 17:28:54,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:54,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6589 [WARNING|trainer.py:803] 2025-04-26 17:28:55,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6566 6522 [WARNING|trainer.py:803] 2025-04-26 17:28:55,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:56,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:56,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6590 6567 6523 [WARNING|trainer.py:803] 2025-04-26 17:28:57,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:28:57,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:28:57,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6591 6568 6524 [WARNING|trainer.py:803] 2025-04-26 17:28:58,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:58,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:28:59,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6592 6569 6525 [WARNING|trainer.py:803] 2025-04-26 17:28:59,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:00,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:00,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6593 6570 6526 [WARNING|trainer.py:803] 2025-04-26 17:29:01,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:01,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:01,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6594 6571 6527 [WARNING|trainer.py:803] 2025-04-26 17:29:02,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:02,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:03,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6595 6572 6528 [WARNING|trainer.py:803] 2025-04-26 17:29:03,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:04,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:29:04,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6596 6573 6529 [WARNING|trainer.py:803] 2025-04-26 17:29:05,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:05,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:05,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6597 6574 6530 [WARNING|trainer.py:803] 2025-04-26 17:29:06,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:06,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:06,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6598 6575 6531 [WARNING|trainer.py:803] 2025-04-26 17:29:07,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:08,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:08,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6599 6576 6532 [WARNING|trainer.py:803] 2025-04-26 17:29:09,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:09,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:09,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6600 6577 6533 [WARNING|trainer.py:803] 2025-04-26 17:29:10,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:10,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:10,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6601 6578 6534 [WARNING|trainer.py:803] 2025-04-26 17:29:11,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:12,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:12,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6602 6579 6535 [WARNING|trainer.py:803] 2025-04-26 17:29:13,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:13,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:13,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6603 6580 6536 [WARNING|trainer.py:803] 2025-04-26 17:29:14,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:14,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:14,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6604 6581 6537 [WARNING|trainer.py:803] 2025-04-26 17:29:15,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:29:16,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:16,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6605 6538 6582 [WARNING|trainer.py:803] 2025-04-26 17:29:17,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:17,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:17,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6606 6539 6583 [WARNING|trainer.py:803] 2025-04-26 17:29:18,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:18,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:18,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6607 6540 6584 [WARNING|trainer.py:803] 2025-04-26 17:29:19,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:20,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:29:20,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6608 6541 6585 [WARNING|trainer.py:803] 2025-04-26 17:29:21,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:21,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:21,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6609 6542 6586 [WARNING|trainer.py:803] 2025-04-26 17:29:22,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:22,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:22,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6610 6543 6587 [WARNING|trainer.py:803] 2025-04-26 17:29:23,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:24,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6611 [WARNING|trainer.py:803] 2025-04-26 17:29:24,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6544 6588 [WARNING|trainer.py:803] 2025-04-26 17:29:25,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:25,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6612 [WARNING|trainer.py:803] 2025-04-26 17:29:25,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6545 6589 [WARNING|trainer.py:803] 2025-04-26 17:29:26,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:26,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6613 [WARNING|trainer.py:803] 2025-04-26 17:29:26,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6546 6590 [WARNING|trainer.py:803] 2025-04-26 17:29:27,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:28,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6614 [WARNING|trainer.py:803] 2025-04-26 17:29:28,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6547 6591 [WARNING|trainer.py:803] 2025-04-26 17:29:28,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:29,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6615 [WARNING|trainer.py:803] 2025-04-26 17:29:29,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6548 6592 [WARNING|trainer.py:803] 2025-04-26 17:29:30,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:30,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6616 [WARNING|trainer.py:803] 2025-04-26 17:29:30,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6549 6593 [WARNING|trainer.py:803] 2025-04-26 17:29:31,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:29:31,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6617 [WARNING|trainer.py:803] 2025-04-26 17:29:32,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6550 6594 [WARNING|trainer.py:803] 2025-04-26 17:29:32,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:33,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6618 [WARNING|trainer.py:803] 2025-04-26 17:29:33,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6551 6595 [WARNING|trainer.py:803] 2025-04-26 17:29:34,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:34,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6619 [WARNING|trainer.py:803] 2025-04-26 17:29:34,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6552 [WARNING|trainer.py:803] 2025-04-26 17:29:35,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6596 6620 [WARNING|trainer.py:803] 2025-04-26 17:29:36,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:36,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6553 [WARNING|trainer.py:803] 2025-04-26 17:29:36,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6597 6621 [WARNING|trainer.py:803] 2025-04-26 17:29:37,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:37,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6554 [WARNING|trainer.py:803] 2025-04-26 17:29:38,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6598 6622 [WARNING|trainer.py:803] 2025-04-26 17:29:38,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:38,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6555 [WARNING|trainer.py:803] 2025-04-26 17:29:39,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6599 6623 [WARNING|trainer.py:803] 2025-04-26 17:29:39,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:40,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6556 [WARNING|trainer.py:803] 2025-04-26 17:29:40,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6600 6624 [WARNING|trainer.py:803] 2025-04-26 17:29:41,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:29:41,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6557 [WARNING|trainer.py:803] 2025-04-26 17:29:41,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6601 6625 [WARNING|trainer.py:803] 2025-04-26 17:29:42,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:42,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6558 [WARNING|trainer.py:803] 2025-04-26 17:29:43,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6602 6626 [WARNING|trainer.py:803] 2025-04-26 17:29:43,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:29:44,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6559 [WARNING|trainer.py:803] 2025-04-26 17:29:44,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6603 6627 [WARNING|trainer.py:803] 2025-04-26 17:29:45,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:45,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6560 [WARNING|trainer.py:803] 2025-04-26 17:29:45,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6604 6628 [WARNING|trainer.py:803] 2025-04-26 17:29:46,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:46,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6561 [WARNING|trainer.py:803] 2025-04-26 17:29:47,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6605 6629 [WARNING|trainer.py:803] 2025-04-26 17:29:47,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:48,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6562 [WARNING|trainer.py:803] 2025-04-26 17:29:48,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6606 6630 [WARNING|trainer.py:803] 2025-04-26 17:29:49,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:29:49,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6563 [WARNING|trainer.py:803] 2025-04-26 17:29:49,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6607 6631 [WARNING|trainer.py:803] 2025-04-26 17:29:50,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:50,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6564 [WARNING|trainer.py:803] 2025-04-26 17:29:51,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6608 6632 [WARNING|trainer.py:803] 2025-04-26 17:29:51,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6565 [WARNING|trainer.py:803] 2025-04-26 17:29:52,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:52,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6609 6633 [WARNING|trainer.py:803] 2025-04-26 17:29:53,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6566 [WARNING|trainer.py:803] 2025-04-26 17:29:53,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:53,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6610 6634 [WARNING|trainer.py:803] 2025-04-26 17:29:54,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6567 [WARNING|trainer.py:803] 2025-04-26 17:29:54,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:55,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6611 [WARNING|trainer.py:803] 2025-04-26 17:29:55,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6635 6568 [WARNING|trainer.py:803] 2025-04-26 17:29:56,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:56,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6612 [WARNING|trainer.py:803] 2025-04-26 17:29:56,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6636 6569 [WARNING|trainer.py:803] 2025-04-26 17:29:57,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:29:57,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6613 [WARNING|trainer.py:803] 2025-04-26 17:29:58,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6637 6570 [WARNING|trainer.py:803] 2025-04-26 17:29:58,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:29:59,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6614 [WARNING|trainer.py:803] 2025-04-26 17:29:59,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6638 6571 [WARNING|trainer.py:803] 2025-04-26 17:30:00,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:00,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6615 [WARNING|trainer.py:803] 2025-04-26 17:30:00,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6639 6572 [WARNING|trainer.py:803] 2025-04-26 17:30:01,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:01,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6616 [WARNING|trainer.py:803] 2025-04-26 17:30:02,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6640 6573 [WARNING|trainer.py:803] 2025-04-26 17:30:02,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:03,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6617 [WARNING|trainer.py:803] 2025-04-26 17:30:03,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6641 6574 [WARNING|trainer.py:803] 2025-04-26 17:30:04,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:04,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6618 [WARNING|trainer.py:803] 2025-04-26 17:30:04,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6642 6575 [WARNING|trainer.py:803] 2025-04-26 17:30:05,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:05,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6619 [WARNING|trainer.py:803] 2025-04-26 17:30:06,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6643 6576 [WARNING|trainer.py:803] 2025-04-26 17:30:06,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:07,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6620 [WARNING|trainer.py:803] 2025-04-26 17:30:07,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6644 6577 [WARNING|trainer.py:803] 2025-04-26 17:30:08,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:08,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6621 [WARNING|trainer.py:803] 2025-04-26 17:30:08,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6645 6578 [WARNING|trainer.py:803] 2025-04-26 17:30:09,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:09,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6622 [WARNING|trainer.py:803] 2025-04-26 17:30:10,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6646 6579 [WARNING|trainer.py:803] 2025-04-26 17:30:10,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:11,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6623 [WARNING|trainer.py:803] 2025-04-26 17:30:11,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6647 6580 [WARNING|trainer.py:803] 2025-04-26 17:30:12,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:12,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6624 [WARNING|trainer.py:803] 2025-04-26 17:30:12,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6648 6581 [WARNING|trainer.py:803] 2025-04-26 17:30:13,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:13,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6625 [WARNING|trainer.py:803] 2025-04-26 17:30:14,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6649 6582 [WARNING|trainer.py:803] 2025-04-26 17:30:14,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:15,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6626 [WARNING|trainer.py:803] 2025-04-26 17:30:15,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6650 6583 [WARNING|trainer.py:803] 2025-04-26 17:30:16,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:16,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6627 [WARNING|trainer.py:803] 2025-04-26 17:30:16,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6651 6584 [WARNING|trainer.py:803] 2025-04-26 17:30:17,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:17,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6628 [WARNING|trainer.py:803] 2025-04-26 17:30:18,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6652 6585 [WARNING|trainer.py:803] 2025-04-26 17:30:18,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:19,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6629 [WARNING|trainer.py:803] 2025-04-26 17:30:19,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6653 6586 [WARNING|trainer.py:803] 2025-04-26 17:30:20,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:20,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6630 [WARNING|trainer.py:803] 2025-04-26 17:30:20,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6654 6587 [WARNING|trainer.py:803] 2025-04-26 17:30:21,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:30:21,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6631 [WARNING|trainer.py:803] 2025-04-26 17:30:22,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6655 6588 [WARNING|trainer.py:803] 2025-04-26 17:30:22,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:23,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6632 [WARNING|trainer.py:803] 2025-04-26 17:30:23,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6656 6589 [WARNING|trainer.py:803] 2025-04-26 17:30:23,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:24,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6633 [WARNING|trainer.py:803] 2025-04-26 17:30:24,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6657 [WARNING|trainer.py:803] 2025-04-26 17:30:25,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6590 [WARNING|trainer.py:803] 2025-04-26 17:30:25,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6634 6658 [WARNING|trainer.py:803] 2025-04-26 17:30:26,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:26,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6591 [WARNING|trainer.py:803] 2025-04-26 17:30:27,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6635 6659 [WARNING|trainer.py:803] 2025-04-26 17:30:27,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:27,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6592 [WARNING|trainer.py:803] 2025-04-26 17:30:28,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6636 [WARNING|trainer.py:803] 2025-04-26 17:30:28,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6660 [WARNING|trainer.py:803] 2025-04-26 17:30:29,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6593 [WARNING|trainer.py:803] 2025-04-26 17:30:29,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6637 [WARNING|trainer.py:803] 2025-04-26 17:30:30,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6661 [WARNING|trainer.py:803] 2025-04-26 17:30:30,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6594 [WARNING|trainer.py:803] 2025-04-26 17:30:30,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6638 [WARNING|trainer.py:803] 2025-04-26 17:30:31,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6662 [WARNING|trainer.py:803] 2025-04-26 17:30:31,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6595 [WARNING|trainer.py:803] 2025-04-26 17:30:32,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6639 [WARNING|trainer.py:803] 2025-04-26 17:30:32,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6663 [WARNING|trainer.py:803] 2025-04-26 17:30:33,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6596 [WARNING|trainer.py:803] 2025-04-26 17:30:33,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6640 [WARNING|trainer.py:803] 2025-04-26 17:30:34,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6664 [WARNING|trainer.py:803] 2025-04-26 17:30:34,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6597 [WARNING|trainer.py:803] 2025-04-26 17:30:34,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
6641 NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:35,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6665 [WARNING|trainer.py:803] 2025-04-26 17:30:35,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6598 6642 [WARNING|trainer.py:803] 2025-04-26 17:30:36,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:36,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6666 [WARNING|trainer.py:803] 2025-04-26 17:30:37,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6599 [WARNING|trainer.py:803] 2025-04-26 17:30:37,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6643 [WARNING|trainer.py:803] 2025-04-26 17:30:37,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6667 [WARNING|trainer.py:803] 2025-04-26 17:30:38,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6600 [WARNING|trainer.py:803] 2025-04-26 17:30:38,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6644 [WARNING|trainer.py:803] 2025-04-26 17:30:39,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6668 [WARNING|trainer.py:803] 2025-04-26 17:30:39,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6601 6645 [WARNING|trainer.py:803] 2025-04-26 17:30:40,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:40,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6669 [WARNING|trainer.py:803] 2025-04-26 17:30:41,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6602 6646 [WARNING|trainer.py:803] 2025-04-26 17:30:41,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:30:41,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6670 [WARNING|trainer.py:803] 2025-04-26 17:30:42,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6603 6647 [WARNING|trainer.py:803] 2025-04-26 17:30:42,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:43,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6671 [WARNING|trainer.py:803] 2025-04-26 17:30:43,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6604 6648 [WARNING|trainer.py:803] 2025-04-26 17:30:44,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:44,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6672 [WARNING|trainer.py:803] 2025-04-26 17:30:44,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6605 6649 [WARNING|trainer.py:803] 2025-04-26 17:30:45,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:45,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6673 [WARNING|trainer.py:803] 2025-04-26 17:30:46,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6606 6650 [WARNING|trainer.py:803] 2025-04-26 17:30:46,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:30:47,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6674 [WARNING|trainer.py:803] 2025-04-26 17:30:47,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6607 [WARNING|trainer.py:803] 2025-04-26 17:30:48,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6651 [WARNING|trainer.py:803] 2025-04-26 17:30:48,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6675 [WARNING|trainer.py:803] 2025-04-26 17:30:48,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6608 6652 [WARNING|trainer.py:803] 2025-04-26 17:30:49,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:49,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6676 [WARNING|trainer.py:803] 2025-04-26 17:30:50,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6609 6653 [WARNING|trainer.py:803] 2025-04-26 17:30:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:51,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6677 [WARNING|trainer.py:803] 2025-04-26 17:30:51,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6610 [WARNING|trainer.py:803] 2025-04-26 17:30:52,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6654 [WARNING|trainer.py:803] 2025-04-26 17:30:52,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6678 [WARNING|trainer.py:803] 2025-04-26 17:30:52,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6611 [WARNING|trainer.py:803] 2025-04-26 17:30:53,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6655 [WARNING|trainer.py:803] 2025-04-26 17:30:53,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6679 [WARNING|trainer.py:803] 2025-04-26 17:30:54,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6612 [WARNING|trainer.py:803] 2025-04-26 17:30:54,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6656 [WARNING|trainer.py:803] 2025-04-26 17:30:55,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6680 6613 [WARNING|trainer.py:803] 2025-04-26 17:30:55,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6657 [WARNING|trainer.py:803] 2025-04-26 17:30:56,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:56,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6681 [WARNING|trainer.py:803] 2025-04-26 17:30:56,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6614 6658 [WARNING|trainer.py:803] 2025-04-26 17:30:57,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:30:57,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6682 [WARNING|trainer.py:803] 2025-04-26 17:30:58,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6615 [WARNING|trainer.py:803] 2025-04-26 17:30:58,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6659 [WARNING|trainer.py:803] 2025-04-26 17:30:59,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6683 [WARNING|trainer.py:803] 2025-04-26 17:30:59,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6616 6660 [WARNING|trainer.py:803] 2025-04-26 17:31:00,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:00,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6684 [WARNING|trainer.py:803] 2025-04-26 17:31:00,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6617 6661 [WARNING|trainer.py:803] 2025-04-26 17:31:01,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:01,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6685 [WARNING|trainer.py:803] 2025-04-26 17:31:02,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6618 6662 [WARNING|trainer.py:803] 2025-04-26 17:31:02,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:03,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6686 [WARNING|trainer.py:803] 2025-04-26 17:31:03,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6619 6663 [WARNING|trainer.py:803] 2025-04-26 17:31:04,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:31:04,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6687 [WARNING|trainer.py:803] 2025-04-26 17:31:04,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6620 6664 [WARNING|trainer.py:803] 2025-04-26 17:31:05,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:05,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6688 [WARNING|trainer.py:803] 2025-04-26 17:31:06,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6621 6665 [WARNING|trainer.py:803] 2025-04-26 17:31:06,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:31:06,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6689 [WARNING|trainer.py:803] 2025-04-26 17:31:07,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6622 6666 [WARNING|trainer.py:803] 2025-04-26 17:31:08,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:08,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6690 [WARNING|trainer.py:803] 2025-04-26 17:31:08,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6623 6667 [WARNING|trainer.py:803] 2025-04-26 17:31:09,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:09,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6691 [WARNING|trainer.py:803] 2025-04-26 17:31:10,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6624 6668 [WARNING|trainer.py:803] 2025-04-26 17:31:10,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:10,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6692 [WARNING|trainer.py:803] 2025-04-26 17:31:11,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6625 6669 [WARNING|trainer.py:803] 2025-04-26 17:31:12,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:12,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6693 [WARNING|trainer.py:803] 2025-04-26 17:31:12,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6626 6670 [WARNING|trainer.py:803] 2025-04-26 17:31:13,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:13,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6694 [WARNING|trainer.py:803] 2025-04-26 17:31:13,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6627 6671 [WARNING|trainer.py:803] 2025-04-26 17:31:14,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:14,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6695 [WARNING|trainer.py:803] 2025-04-26 17:31:15,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6628 6672 [WARNING|trainer.py:803] 2025-04-26 17:31:15,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:16,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6696 [WARNING|trainer.py:803] 2025-04-26 17:31:16,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6629 6673 [WARNING|trainer.py:803] 2025-04-26 17:31:17,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:31:17,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6697 [WARNING|trainer.py:803] 2025-04-26 17:31:17,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6630 6674 [WARNING|trainer.py:803] 2025-04-26 17:31:18,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:18,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6698 [WARNING|trainer.py:803] 2025-04-26 17:31:19,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6631 6675 [WARNING|trainer.py:803] 2025-04-26 17:31:19,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:20,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6699 [WARNING|trainer.py:803] 2025-04-26 17:31:20,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6632 6676 [WARNING|trainer.py:803] 2025-04-26 17:31:21,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:21,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6700 [WARNING|trainer.py:803] 2025-04-26 17:31:21,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6633 6677 [WARNING|trainer.py:803] 2025-04-26 17:31:22,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:31:22,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6701 [WARNING|trainer.py:803] 2025-04-26 17:31:23,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6634 6678 [WARNING|trainer.py:803] 2025-04-26 17:31:23,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:31:24,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6702 [WARNING|trainer.py:803] 2025-04-26 17:31:24,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6635 6679 [WARNING|trainer.py:803] 2025-04-26 17:31:25,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:31:25,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6703 [WARNING|trainer.py:803] 2025-04-26 17:31:25,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6636 6680 [WARNING|trainer.py:803] 2025-04-26 17:31:26,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 17:31:26,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6704 [WARNING|trainer.py:803] 2025-04-26 17:31:27,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6637 6681 [WARNING|trainer.py:803] 2025-04-26 17:31:27,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:28,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6705 [WARNING|trainer.py:803] 2025-04-26 17:31:28,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6638 6682 [WARNING|trainer.py:803] 2025-04-26 17:31:29,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:31:29,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:29,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6706 6639 6683 [WARNING|trainer.py:803] 2025-04-26 17:31:30,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:31:30,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:31,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6707 6640 6684 [WARNING|trainer.py:803] 2025-04-26 17:31:31,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:32,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6708 [WARNING|trainer.py:803] 2025-04-26 17:31:32,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6641 6685 [WARNING|trainer.py:803] 2025-04-26 17:31:33,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:33,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6709 [WARNING|trainer.py:803] 2025-04-26 17:31:33,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6642 6686 [WARNING|trainer.py:803] 2025-04-26 17:31:34,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:31:34,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6710 [WARNING|trainer.py:803] 2025-04-26 17:31:35,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6643 6687 [WARNING|trainer.py:803] 2025-04-26 17:31:35,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:36,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6711 [WARNING|trainer.py:803] 2025-04-26 17:31:36,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6644 6688 [WARNING|trainer.py:803] 2025-04-26 17:31:37,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:37,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6712 [WARNING|trainer.py:803] 2025-04-26 17:31:37,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6645 6689 [WARNING|trainer.py:803] 2025-04-26 17:31:38,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:38,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6713 [WARNING|trainer.py:803] 2025-04-26 17:31:39,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6646 6690 [WARNING|trainer.py:803] 2025-04-26 17:31:39,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:40,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6714 [WARNING|trainer.py:803] 2025-04-26 17:31:40,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6647 6691 [WARNING|trainer.py:803] 2025-04-26 17:31:41,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:41,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6715 [WARNING|trainer.py:803] 2025-04-26 17:31:41,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6648 6692 [WARNING|trainer.py:803] 2025-04-26 17:31:42,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:42,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6716 [WARNING|trainer.py:803] 2025-04-26 17:31:43,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6649 6693 [WARNING|trainer.py:803] 2025-04-26 17:31:43,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:44,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6717 [WARNING|trainer.py:803] 2025-04-26 17:31:44,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6650 6694 [WARNING|trainer.py:803] 2025-04-26 17:31:45,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:45,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:45,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6718 6651 6695 [WARNING|trainer.py:803] 2025-04-26 17:31:46,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:31:46,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:46,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6719 6652 6696 [WARNING|trainer.py:803] 2025-04-26 17:31:47,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:48,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:48,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6720 6653 6697 [WARNING|trainer.py:803] 2025-04-26 17:31:49,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:49,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:49,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6721 6654 6698 [WARNING|trainer.py:803] 2025-04-26 17:31:50,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:50,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:50,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6722 6655 6699 [WARNING|trainer.py:803] 2025-04-26 17:31:51,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:52,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:52,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6723 6656 6700 [WARNING|trainer.py:803] 2025-04-26 17:31:53,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:53,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:53,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6724 6657 6701 [WARNING|trainer.py:803] 2025-04-26 17:31:54,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:54,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:54,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6725 6658 6702 [WARNING|trainer.py:803] 2025-04-26 17:31:55,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:56,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:56,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6726 6659 6703 [WARNING|trainer.py:803] 2025-04-26 17:31:57,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:31:57,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:57,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6727 6660 6704 [WARNING|trainer.py:803] 2025-04-26 17:31:58,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:31:58,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:31:58,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6728 6661 6705 [WARNING|trainer.py:803] 2025-04-26 17:31:59,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:00,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:00,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6729 6662 6706 [WARNING|trainer.py:803] 2025-04-26 17:32:01,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:01,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:01,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6730 6663 6707 [WARNING|trainer.py:803] 2025-04-26 17:32:02,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 17:32:02,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:02,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6731 6664 6708 [WARNING|trainer.py:803] 2025-04-26 17:32:03,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:04,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:04,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6732 6665 6709 [WARNING|trainer.py:803] 2025-04-26 17:32:05,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:32:05,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:05,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6733 6666 6710 [WARNING|trainer.py:803] 2025-04-26 17:32:06,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:06,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:06,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6734 6667 6711 [WARNING|trainer.py:803] 2025-04-26 17:32:07,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:08,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:08,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6735 6668 6712 [WARNING|trainer.py:803] 2025-04-26 17:32:09,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:09,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:09,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6736 6669 6713 [WARNING|trainer.py:803] 2025-04-26 17:32:10,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:10,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:10,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6737 6670 6714 [WARNING|trainer.py:803] 2025-04-26 17:32:11,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:12,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:12,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6738 6671 6715 [WARNING|trainer.py:803] 2025-04-26 17:32:13,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:13,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:13,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6739 6672 6716 [WARNING|trainer.py:803] 2025-04-26 17:32:14,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:14,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:14,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6740 6673 6717 [WARNING|trainer.py:803] 2025-04-26 17:32:15,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:16,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:16,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6741 6674 6718 [WARNING|trainer.py:803] 2025-04-26 17:32:17,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:32:17,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6742 [WARNING|trainer.py:803] 2025-04-26 17:32:17,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6675 6719 [WARNING|trainer.py:803] 2025-04-26 17:32:18,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:32:18,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6743 [WARNING|trainer.py:803] 2025-04-26 17:32:18,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6676 6720 [WARNING|trainer.py:803] 2025-04-26 17:32:19,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:20,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6744 [WARNING|trainer.py:803] 2025-04-26 17:32:20,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6677 6721 [WARNING|trainer.py:803] 2025-04-26 17:32:21,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:21,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6745 [WARNING|trainer.py:803] 2025-04-26 17:32:21,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6678 6722 [WARNING|trainer.py:803] 2025-04-26 17:32:22,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:22,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6746 [WARNING|trainer.py:803] 2025-04-26 17:32:23,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6679 6723 [WARNING|trainer.py:803] 2025-04-26 17:32:23,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:24,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6747 [WARNING|trainer.py:803] 2025-04-26 17:32:24,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6680 6724 [WARNING|trainer.py:803] 2025-04-26 17:32:25,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:25,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6748 [WARNING|trainer.py:803] 2025-04-26 17:32:25,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6681 6725 [WARNING|trainer.py:803] 2025-04-26 17:32:26,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:26,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6749 [WARNING|trainer.py:803] 2025-04-26 17:32:27,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6682 6726 [WARNING|trainer.py:803] 2025-04-26 17:32:27,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:28,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6750 [WARNING|trainer.py:803] 2025-04-26 17:32:28,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6683 6727 [WARNING|trainer.py:803] 2025-04-26 17:32:29,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6751 [WARNING|trainer.py:803] 2025-04-26 17:32:29,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:29,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6684 6728 [WARNING|trainer.py:803] 2025-04-26 17:32:30,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6752 [WARNING|trainer.py:803] 2025-04-26 17:32:30,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:31,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6685 6729 [WARNING|trainer.py:803] 2025-04-26 17:32:31,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6753 [WARNING|trainer.py:803] 2025-04-26 17:32:32,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:32,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6686 6730 [WARNING|trainer.py:803] 2025-04-26 17:32:32,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6754 [WARNING|trainer.py:803] 2025-04-26 17:32:33,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:32:33,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6687 6731 [WARNING|trainer.py:803] 2025-04-26 17:32:34,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6755 [WARNING|trainer.py:803] 2025-04-26 17:32:34,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:35,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6688 [WARNING|trainer.py:803] 2025-04-26 17:32:35,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6732 6756 [WARNING|trainer.py:803] 2025-04-26 17:32:36,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:32:36,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6689 [WARNING|trainer.py:803] 2025-04-26 17:32:36,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6733 6757 [WARNING|trainer.py:803] 2025-04-26 17:32:37,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:37,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6690 [WARNING|trainer.py:803] 2025-04-26 17:32:38,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6734 6758 [WARNING|trainer.py:803] 2025-04-26 17:32:38,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:38,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6691 6735 [WARNING|trainer.py:803] 2025-04-26 17:32:39,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6759 [WARNING|trainer.py:803] 2025-04-26 17:32:40,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:40,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6692 6736 [WARNING|trainer.py:803] 2025-04-26 17:32:40,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6760 [WARNING|trainer.py:803] 2025-04-26 17:32:41,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:41,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6693 6737 [WARNING|trainer.py:803] 2025-04-26 17:32:42,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6761 [WARNING|trainer.py:803] 2025-04-26 17:32:42,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:42,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6694 6738 [WARNING|trainer.py:803] 2025-04-26 17:32:43,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6762 [WARNING|trainer.py:803] 2025-04-26 17:32:44,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:44,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6695 6739 [WARNING|trainer.py:803] 2025-04-26 17:32:44,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6763 [WARNING|trainer.py:803] 2025-04-26 17:32:45,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:45,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6696 6740 [WARNING|trainer.py:803] 2025-04-26 17:32:46,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6764 [WARNING|trainer.py:803] 2025-04-26 17:32:46,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:32:46,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6697 6741 [WARNING|trainer.py:803] 2025-04-26 17:32:47,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6765 [WARNING|trainer.py:803] 2025-04-26 17:32:48,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:48,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6698 6742 [WARNING|trainer.py:803] 2025-04-26 17:32:48,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6766 [WARNING|trainer.py:803] 2025-04-26 17:32:49,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:49,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6699 6743 [WARNING|trainer.py:803] 2025-04-26 17:32:50,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6767 [WARNING|trainer.py:803] 2025-04-26 17:32:50,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:50,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6700 6744 [WARNING|trainer.py:803] 2025-04-26 17:32:51,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6768 [WARNING|trainer.py:803] 2025-04-26 17:32:52,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:32:52,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6701 [WARNING|trainer.py:803] 2025-04-26 17:32:52,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6745 6769 [WARNING|trainer.py:803] 2025-04-26 17:32:53,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:32:53,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6702 [WARNING|trainer.py:803] 2025-04-26 17:32:53,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6746 6770 [WARNING|trainer.py:803] 2025-04-26 17:32:54,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:32:54,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:55,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6703 6747 6771 [WARNING|trainer.py:803] 2025-04-26 17:32:56,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:32:56,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:32:56,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6704 6748 6772 [WARNING|trainer.py:803] 2025-04-26 17:32:57,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:57,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:32:57,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6705 6749 6773 [WARNING|trainer.py:803] 2025-04-26 17:32:58,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:32:58,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:32:59,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6750 6706 6774 [WARNING|trainer.py:803] 2025-04-26 17:33:00,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:00,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:33:00,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6751 6707 6775 [WARNING|trainer.py:803] 2025-04-26 17:33:01,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:01,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:01,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6752 6708 6776 [WARNING|trainer.py:803] 2025-04-26 17:33:02,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:02,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:03,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6753 6709 6777 [WARNING|trainer.py:803] 2025-04-26 17:33:04,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:04,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:33:04,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6754 6710 6778 [WARNING|trainer.py:803] 2025-04-26 17:33:05,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:05,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6755 [WARNING|trainer.py:803] 2025-04-26 17:33:05,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6779 6711 [WARNING|trainer.py:803] 2025-04-26 17:33:06,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6756 [WARNING|trainer.py:803] 2025-04-26 17:33:07,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:07,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6780 6712 [WARNING|trainer.py:803] 2025-04-26 17:33:07,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6757 [WARNING|trainer.py:803] 2025-04-26 17:33:08,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:08,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6781 6713 [WARNING|trainer.py:803] 2025-04-26 17:33:09,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:09,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6758 [WARNING|trainer.py:803] 2025-04-26 17:33:09,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6782 6714 [WARNING|trainer.py:803] 2025-04-26 17:33:10,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6759 [WARNING|trainer.py:803] 2025-04-26 17:33:11,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:11,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6783 6715 [WARNING|trainer.py:803] 2025-04-26 17:33:11,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:12,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6760 [WARNING|trainer.py:803] 2025-04-26 17:33:12,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6784 6716 [WARNING|trainer.py:803] 2025-04-26 17:33:13,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:13,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6761 [WARNING|trainer.py:803] 2025-04-26 17:33:13,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6785 6717 [WARNING|trainer.py:803] 2025-04-26 17:33:14,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6762 [WARNING|trainer.py:803] 2025-04-26 17:33:14,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:15,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6786 [WARNING|trainer.py:803] 2025-04-26 17:33:15,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6718 6763 [WARNING|trainer.py:803] 2025-04-26 17:33:16,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:16,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6787 [WARNING|trainer.py:803] 2025-04-26 17:33:17,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6719 6764 [WARNING|trainer.py:803] 2025-04-26 17:33:17,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:18,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6788 [WARNING|trainer.py:803] 2025-04-26 17:33:18,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6720 6765 [WARNING|trainer.py:803] 2025-04-26 17:33:18,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6789 [WARNING|trainer.py:803] 2025-04-26 17:33:19,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:33:19,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6721 6766 [WARNING|trainer.py:803] 2025-04-26 17:33:20,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6790 [WARNING|trainer.py:803] 2025-04-26 17:33:20,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:21,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6722 6767 [WARNING|trainer.py:803] 2025-04-26 17:33:21,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6791 [WARNING|trainer.py:803] 2025-04-26 17:33:22,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:22,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6723 6768 [WARNING|trainer.py:803] 2025-04-26 17:33:22,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6792 [WARNING|trainer.py:803] 2025-04-26 17:33:23,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:23,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6724 6769 [WARNING|trainer.py:803] 2025-04-26 17:33:24,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6793 [WARNING|trainer.py:803] 2025-04-26 17:33:24,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:25,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6725 6770 [WARNING|trainer.py:803] 2025-04-26 17:33:25,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:26,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:26,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6794 6726 6771 [WARNING|trainer.py:803] 2025-04-26 17:33:27,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:27,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6795 [WARNING|trainer.py:803] 2025-04-26 17:33:27,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6727 6772 [WARNING|trainer.py:803] 2025-04-26 17:33:28,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:28,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:33:28,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6796 6728 6773 [WARNING|trainer.py:803] 2025-04-26 17:33:29,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:30,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:30,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6797 6729 6774 [WARNING|trainer.py:803] 2025-04-26 17:33:31,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:31,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:31,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6798 6730 6775 [WARNING|trainer.py:803] 2025-04-26 17:33:32,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:33:32,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:32,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 6799 6731 6776 [WARNING|trainer.py:803] 2025-04-26 17:33:33,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:33:34,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:34,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6800 6777 6732 [WARNING|trainer.py:803] 2025-04-26 17:33:35,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6801 [WARNING|trainer.py:803] 2025-04-26 17:33:35,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:35,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6778 6733 [WARNING|trainer.py:803] 2025-04-26 17:33:36,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6802 [WARNING|trainer.py:803] 2025-04-26 17:33:36,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:36,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6734 6779 [WARNING|trainer.py:803] 2025-04-26 17:33:37,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:38,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 17:33:38,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo NoYes 6803 6780 6735 [WARNING|trainer.py:803] 2025-04-26 17:33:39,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6804 [WARNING|trainer.py:803] 2025-04-26 17:33:39,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:39,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6736 6781 [WARNING|trainer.py:803] 2025-04-26 17:33:40,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6805 [WARNING|trainer.py:803] 2025-04-26 17:33:40,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:40,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6782 6737 [WARNING|trainer.py:803] 2025-04-26 17:33:41,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6806 [WARNING|trainer.py:803] 2025-04-26 17:33:42,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:42,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6783 6738 [WARNING|trainer.py:803] 2025-04-26 17:33:42,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6807 [WARNING|trainer.py:803] 2025-04-26 17:33:43,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:43,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6784 6739 [WARNING|trainer.py:803] 2025-04-26 17:33:44,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6808 [WARNING|trainer.py:803] 2025-04-26 17:33:44,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:44,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6785 6740 [WARNING|trainer.py:803] 2025-04-26 17:33:45,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6809 [WARNING|trainer.py:803] 2025-04-26 17:33:46,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:46,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6786 6741 [WARNING|trainer.py:803] 2025-04-26 17:33:46,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6810 [WARNING|trainer.py:803] 2025-04-26 17:33:47,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:47,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6787 6742 [WARNING|trainer.py:803] 2025-04-26 17:33:48,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6811 [WARNING|trainer.py:803] 2025-04-26 17:33:48,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:48,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6788 [WARNING|trainer.py:803] 2025-04-26 17:33:49,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6743 6812 [WARNING|trainer.py:803] 2025-04-26 17:33:50,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:50,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6789 [WARNING|trainer.py:803] 2025-04-26 17:33:50,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6744 6813 [WARNING|trainer.py:803] 2025-04-26 17:33:51,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:51,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6790 [WARNING|trainer.py:803] 2025-04-26 17:33:51,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6745 6814 [WARNING|trainer.py:803] 2025-04-26 17:33:52,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:52,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6791 [WARNING|trainer.py:803] 2025-04-26 17:33:53,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6746 6815 [WARNING|trainer.py:803] 2025-04-26 17:33:54,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:54,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6792 [WARNING|trainer.py:803] 2025-04-26 17:33:54,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6747 6816 [WARNING|trainer.py:803] 2025-04-26 17:33:55,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:55,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6793 [WARNING|trainer.py:803] 2025-04-26 17:33:55,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6748 6817 [WARNING|trainer.py:803] 2025-04-26 17:33:56,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:33:57,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6794 [WARNING|trainer.py:803] 2025-04-26 17:33:57,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6749 6818 [WARNING|trainer.py:803] 2025-04-26 17:33:57,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:33:58,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6795 [WARNING|trainer.py:803] 2025-04-26 17:33:58,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6750 6819 [WARNING|trainer.py:803] 2025-04-26 17:33:59,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:33:59,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6796 [WARNING|trainer.py:803] 2025-04-26 17:33:59,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6751 6820 [WARNING|trainer.py:803] 2025-04-26 17:34:00,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:01,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6797 [WARNING|trainer.py:803] 2025-04-26 17:34:01,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6752 6821 [WARNING|trainer.py:803] 2025-04-26 17:34:01,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6798 [WARNING|trainer.py:803] 2025-04-26 17:34:02,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:02,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6753 6822 [WARNING|trainer.py:803] 2025-04-26 17:34:03,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6799 [WARNING|trainer.py:803] 2025-04-26 17:34:03,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:03,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6754 6823 [WARNING|trainer.py:803] 2025-04-26 17:34:04,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6800 [WARNING|trainer.py:803] 2025-04-26 17:34:05,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:05,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6755 6824 [WARNING|trainer.py:803] 2025-04-26 17:34:05,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6801 [WARNING|trainer.py:803] 2025-04-26 17:34:06,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:06,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6756 6825 [WARNING|trainer.py:803] 2025-04-26 17:34:07,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6802 [WARNING|trainer.py:803] 2025-04-26 17:34:07,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:07,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6757 6826 [WARNING|trainer.py:803] 2025-04-26 17:34:08,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6803 [WARNING|trainer.py:803] 2025-04-26 17:34:08,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:09,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6758 6827 [WARNING|trainer.py:803] 2025-04-26 17:34:09,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6804 [WARNING|trainer.py:803] 2025-04-26 17:34:10,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:10,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6759 6828 [WARNING|trainer.py:803] 2025-04-26 17:34:10,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6805 [WARNING|trainer.py:803] 2025-04-26 17:34:11,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:11,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6760 6829 [WARNING|trainer.py:803] 2025-04-26 17:34:12,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6806 [WARNING|trainer.py:803] 2025-04-26 17:34:12,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:12,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6761 6830 [WARNING|trainer.py:803] 2025-04-26 17:34:13,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6807 [WARNING|trainer.py:803] 2025-04-26 17:34:14,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:14,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6762 6831 [WARNING|trainer.py:803] 2025-04-26 17:34:14,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6808 [WARNING|trainer.py:803] 2025-04-26 17:34:15,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:34:15,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6763 6832 [WARNING|trainer.py:803] 2025-04-26 17:34:16,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6809 [WARNING|trainer.py:803] 2025-04-26 17:34:16,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:16,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6833 6764 [WARNING|trainer.py:803] 2025-04-26 17:34:17,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6810 [WARNING|trainer.py:803] 2025-04-26 17:34:18,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:18,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6834 [WARNING|trainer.py:803] 2025-04-26 17:34:18,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6765 6811 [WARNING|trainer.py:803] 2025-04-26 17:34:19,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:19,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:19,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 6835 NoNo 6766 6812 [WARNING|trainer.py:803] 2025-04-26 17:34:20,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:20,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:21,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6836 6767 6813 [WARNING|trainer.py:803] 2025-04-26 17:34:22,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:22,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:34:22,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6837 6768 6814 [WARNING|trainer.py:803] 2025-04-26 17:34:23,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:23,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:34:23,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6838 6769 6815 [WARNING|trainer.py:803] 2025-04-26 17:34:24,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:24,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:25,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6839 6770 6816 [WARNING|trainer.py:803] 2025-04-26 17:34:26,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:26,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:26,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6840 6771 6817 [WARNING|trainer.py:803] 2025-04-26 17:34:27,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:27,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:27,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6841 6772 6818 [WARNING|trainer.py:803] 2025-04-26 17:34:28,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:28,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6842 [WARNING|trainer.py:803] 2025-04-26 17:34:29,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6773 6819 [WARNING|trainer.py:803] 2025-04-26 17:34:29,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:30,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6843 [WARNING|trainer.py:803] 2025-04-26 17:34:30,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6774 6820 [WARNING|trainer.py:803] 2025-04-26 17:34:31,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:31,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6844 [WARNING|trainer.py:803] 2025-04-26 17:34:31,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6775 6821 [WARNING|trainer.py:803] 2025-04-26 17:34:32,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:32,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6845 [WARNING|trainer.py:803] 2025-04-26 17:34:33,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6776 6822 [WARNING|trainer.py:803] 2025-04-26 17:34:33,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:34,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:34,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6846 6777 6823 [WARNING|trainer.py:803] 2025-04-26 17:34:35,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6847 [WARNING|trainer.py:803] 2025-04-26 17:34:35,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:35,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6778 6824 [WARNING|trainer.py:803] 2025-04-26 17:34:36,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6848 [WARNING|trainer.py:803] 2025-04-26 17:34:36,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:37,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6825 6779 [WARNING|trainer.py:803] 2025-04-26 17:34:37,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6849 [WARNING|trainer.py:803] 2025-04-26 17:34:38,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:38,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6826 6780 [WARNING|trainer.py:803] 2025-04-26 17:34:39,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6850 [WARNING|trainer.py:803] 2025-04-26 17:34:39,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:39,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6827 6781 [WARNING|trainer.py:803] 2025-04-26 17:34:40,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6851 [WARNING|trainer.py:803] 2025-04-26 17:34:40,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:40,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6828 6782 [WARNING|trainer.py:803] 2025-04-26 17:34:41,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6852 [WARNING|trainer.py:803] 2025-04-26 17:34:42,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:42,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6829 6783 [WARNING|trainer.py:803] 2025-04-26 17:34:42,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6853 [WARNING|trainer.py:803] 2025-04-26 17:34:43,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:43,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6830 6784 [WARNING|trainer.py:803] 2025-04-26 17:34:44,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6854 [WARNING|trainer.py:803] 2025-04-26 17:34:44,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:44,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6831 6785 [WARNING|trainer.py:803] 2025-04-26 17:34:45,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:46,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6855 [WARNING|trainer.py:803] 2025-04-26 17:34:46,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6832 6786 [WARNING|trainer.py:803] 2025-04-26 17:34:46,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6856 [WARNING|trainer.py:803] 2025-04-26 17:34:47,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:47,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6833 6787 [WARNING|trainer.py:803] 2025-04-26 17:34:48,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6857 [WARNING|trainer.py:803] 2025-04-26 17:34:48,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:48,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6834 6788 [WARNING|trainer.py:803] 2025-04-26 17:34:49,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6858 [WARNING|trainer.py:803] 2025-04-26 17:34:49,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:50,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6835 6789 [WARNING|trainer.py:803] 2025-04-26 17:34:50,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6859 [WARNING|trainer.py:803] 2025-04-26 17:34:51,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:51,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6836 [WARNING|trainer.py:803] 2025-04-26 17:34:52,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6790 6860 [WARNING|trainer.py:803] 2025-04-26 17:34:52,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:52,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6837 [WARNING|trainer.py:803] 2025-04-26 17:34:53,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6791 6861 [WARNING|trainer.py:803] 2025-04-26 17:34:53,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:54,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6838 [WARNING|trainer.py:803] 2025-04-26 17:34:54,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6792 6862 [WARNING|trainer.py:803] 2025-04-26 17:34:55,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:34:55,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6839 [WARNING|trainer.py:803] 2025-04-26 17:34:55,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6793 6863 [WARNING|trainer.py:803] 2025-04-26 17:34:56,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:56,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6840 [WARNING|trainer.py:803] 2025-04-26 17:34:57,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6794 6864 [WARNING|trainer.py:803] 2025-04-26 17:34:57,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:34:58,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6841 [WARNING|trainer.py:803] 2025-04-26 17:34:58,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6795 6865 [WARNING|trainer.py:803] 2025-04-26 17:34:59,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6842 [WARNING|trainer.py:803] 2025-04-26 17:34:59,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:34:59,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6796 6866 [WARNING|trainer.py:803] 2025-04-26 17:35:00,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6843 [WARNING|trainer.py:803] 2025-04-26 17:35:00,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:00,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6797 6867 [WARNING|trainer.py:803] 2025-04-26 17:35:01,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6844 [WARNING|trainer.py:803] 2025-04-26 17:35:02,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:02,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6868 6798 [WARNING|trainer.py:803] 2025-04-26 17:35:03,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6845 [WARNING|trainer.py:803] 2025-04-26 17:35:03,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:03,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6869 6799 [WARNING|trainer.py:803] 2025-04-26 17:35:04,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6846 [WARNING|trainer.py:803] 2025-04-26 17:35:04,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:04,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6870 6800 [WARNING|trainer.py:803] 2025-04-26 17:35:05,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6847 [WARNING|trainer.py:803] 2025-04-26 17:35:06,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:06,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6871 6801 [WARNING|trainer.py:803] 2025-04-26 17:35:06,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6848 [WARNING|trainer.py:803] 2025-04-26 17:35:07,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:07,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6872 6802 [WARNING|trainer.py:803] 2025-04-26 17:35:08,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6849 [WARNING|trainer.py:803] 2025-04-26 17:35:08,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:08,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6873 6803 [WARNING|trainer.py:803] 2025-04-26 17:35:09,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6850 [WARNING|trainer.py:803] 2025-04-26 17:35:10,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:10,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6874 6804 [WARNING|trainer.py:803] 2025-04-26 17:35:10,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6851 [WARNING|trainer.py:803] 2025-04-26 17:35:11,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:11,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6875 6805 [WARNING|trainer.py:803] 2025-04-26 17:35:12,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6852 [WARNING|trainer.py:803] 2025-04-26 17:35:12,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:12,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6876 6806 [WARNING|trainer.py:803] 2025-04-26 17:35:13,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6853 [WARNING|trainer.py:803] 2025-04-26 17:35:14,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:14,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6877 [WARNING|trainer.py:803] 2025-04-26 17:35:14,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6807 6854 [WARNING|trainer.py:803] 2025-04-26 17:35:15,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:15,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:15,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6878 6808 6855 [WARNING|trainer.py:803] 2025-04-26 17:35:16,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:16,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:17,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6879 6809 6856 [WARNING|trainer.py:803] 2025-04-26 17:35:18,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:18,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:18,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6880 6810 6857 [WARNING|trainer.py:803] 2025-04-26 17:35:19,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:19,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:19,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6881 6811 6858 [WARNING|trainer.py:803] 2025-04-26 17:35:20,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:20,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:21,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 6812 6882 6859 [WARNING|trainer.py:803] 2025-04-26 17:35:22,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:22,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:22,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6813 6883 6860 [WARNING|trainer.py:803] 2025-04-26 17:35:23,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:23,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:23,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6814 6884 6861 [WARNING|trainer.py:803] 2025-04-26 17:35:24,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:24,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:25,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6815 6885 6862 [WARNING|trainer.py:803] 2025-04-26 17:35:25,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:26,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:26,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6816 6886 6863 [WARNING|trainer.py:803] 2025-04-26 17:35:27,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:27,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:27,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6817 6887 6864 [WARNING|trainer.py:803] 2025-04-26 17:35:28,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:28,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:29,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6888 6818 6865 [WARNING|trainer.py:803] 2025-04-26 17:35:29,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:30,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:30,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6889 6819 6866 [WARNING|trainer.py:803] 2025-04-26 17:35:31,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:31,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:31,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6890 6820 6867 [WARNING|trainer.py:803] 2025-04-26 17:35:32,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:32,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:32,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6891 6821 6868 [WARNING|trainer.py:803] 2025-04-26 17:35:33,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:34,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:34,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6892 6822 6869 [WARNING|trainer.py:803] 2025-04-26 17:35:35,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:35,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:35,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6893 6823 6870 [WARNING|trainer.py:803] 2025-04-26 17:35:36,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:36,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:36,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6894 6824 6871 [WARNING|trainer.py:803] 2025-04-26 17:35:37,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:38,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:38,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6895 6825 6872 [WARNING|trainer.py:803] 2025-04-26 17:35:39,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:39,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:39,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6896 6826 6873 [WARNING|trainer.py:803] 2025-04-26 17:35:40,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:40,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:40,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6897 6827 6874 [WARNING|trainer.py:803] 2025-04-26 17:35:41,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:42,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:42,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6898 6828 6875 [WARNING|trainer.py:803] 2025-04-26 17:35:43,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:43,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:43,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6899 6829 6876 [WARNING|trainer.py:803] 2025-04-26 17:35:44,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:44,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:44,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6900 6830 6877 [WARNING|trainer.py:803] 2025-04-26 17:35:45,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:46,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:46,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6901 6831 6878 [WARNING|trainer.py:803] 2025-04-26 17:35:47,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:47,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:47,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6902 6832 6879 [WARNING|trainer.py:803] 2025-04-26 17:35:48,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:48,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:48,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6903 6833 6880 [WARNING|trainer.py:803] 2025-04-26 17:35:49,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:49,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:50,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 6904 NoNo 6834 6881 [WARNING|trainer.py:803] 2025-04-26 17:35:50,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:35:51,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6905 [WARNING|trainer.py:803] 2025-04-26 17:35:51,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6835 6882 [WARNING|trainer.py:803] 2025-04-26 17:35:52,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6906 [WARNING|trainer.py:803] 2025-04-26 17:35:52,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:52,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6836 6883 [WARNING|trainer.py:803] 2025-04-26 17:35:53,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6907 [WARNING|trainer.py:803] 2025-04-26 17:35:53,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:54,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6837 6884 [WARNING|trainer.py:803] 2025-04-26 17:35:54,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6908 [WARNING|trainer.py:803] 2025-04-26 17:35:55,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:55,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6838 6885 [WARNING|trainer.py:803] 2025-04-26 17:35:55,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6909 [WARNING|trainer.py:803] 2025-04-26 17:35:56,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:35:56,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6839 [WARNING|trainer.py:803] 2025-04-26 17:35:57,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6886 6910 [WARNING|trainer.py:803] 2025-04-26 17:35:57,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:57,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6840 [WARNING|trainer.py:803] 2025-04-26 17:35:58,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6887 6911 [WARNING|trainer.py:803] 2025-04-26 17:35:59,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:35:59,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:35:59,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6841 6888 6912 [WARNING|trainer.py:803] 2025-04-26 17:36:00,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:00,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:00,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6842 6889 6913 [WARNING|trainer.py:803] 2025-04-26 17:36:01,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:01,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:02,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6843 6890 6914 [WARNING|trainer.py:803] 2025-04-26 17:36:03,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:03,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:03,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6844 6891 6915 [WARNING|trainer.py:803] 2025-04-26 17:36:04,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:04,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:04,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6845 6916 6892 [WARNING|trainer.py:803] 2025-04-26 17:36:05,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:05,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:05,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6846 6917 6893 [WARNING|trainer.py:803] 2025-04-26 17:36:07,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:07,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:07,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6918 6847 6894 [WARNING|trainer.py:803] 2025-04-26 17:36:08,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:08,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:08,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6919 6848 6895 [WARNING|trainer.py:803] 2025-04-26 17:36:09,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:09,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:09,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6920 6849 6896 [WARNING|trainer.py:803] 2025-04-26 17:36:10,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:11,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:11,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6921 6850 6897 [WARNING|trainer.py:803] 2025-04-26 17:36:12,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:12,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6922 [WARNING|trainer.py:803] 2025-04-26 17:36:12,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6851 6898 [WARNING|trainer.py:803] 2025-04-26 17:36:13,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:13,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6923 [WARNING|trainer.py:803] 2025-04-26 17:36:13,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6852 6899 [WARNING|trainer.py:803] 2025-04-26 17:36:14,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6924 [WARNING|trainer.py:803] 2025-04-26 17:36:14,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:15,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6853 6900 [WARNING|trainer.py:803] 2025-04-26 17:36:15,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6925 [WARNING|trainer.py:803] 2025-04-26 17:36:16,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:16,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6854 6901 [WARNING|trainer.py:803] 2025-04-26 17:36:16,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6926 [WARNING|trainer.py:803] 2025-04-26 17:36:17,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:17,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6855 6902 [WARNING|trainer.py:803] 2025-04-26 17:36:18,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6927 [WARNING|trainer.py:803] 2025-04-26 17:36:18,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:18,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6903 6856 [WARNING|trainer.py:803] 2025-04-26 17:36:19,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6928 [WARNING|trainer.py:803] 2025-04-26 17:36:20,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:20,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6904 6857 [WARNING|trainer.py:803] 2025-04-26 17:36:20,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6929 [WARNING|trainer.py:803] 2025-04-26 17:36:21,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:36:21,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6905 [WARNING|trainer.py:803] 2025-04-26 17:36:21,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6858 6930 [WARNING|trainer.py:803] 2025-04-26 17:36:22,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:22,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6906 [WARNING|trainer.py:803] 2025-04-26 17:36:23,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6859 6931 [WARNING|trainer.py:803] 2025-04-26 17:36:23,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:24,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6907 [WARNING|trainer.py:803] 2025-04-26 17:36:24,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6860 6932 [WARNING|trainer.py:803] 2025-04-26 17:36:25,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:25,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6908 [WARNING|trainer.py:803] 2025-04-26 17:36:25,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6861 6933 [WARNING|trainer.py:803] 2025-04-26 17:36:26,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6909 [WARNING|trainer.py:803] 2025-04-26 17:36:26,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:26,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6862 6934 [WARNING|trainer.py:803] 2025-04-26 17:36:27,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6910 [WARNING|trainer.py:803] 2025-04-26 17:36:27,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:36:28,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6863 6935 [WARNING|trainer.py:803] 2025-04-26 17:36:28,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6911 [WARNING|trainer.py:803] 2025-04-26 17:36:29,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:29,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6936 6864 [WARNING|trainer.py:803] 2025-04-26 17:36:29,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6912 [WARNING|trainer.py:803] 2025-04-26 17:36:30,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:30,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6937 6865 [WARNING|trainer.py:803] 2025-04-26 17:36:31,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6913 [WARNING|trainer.py:803] 2025-04-26 17:36:31,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:31,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6938 [WARNING|trainer.py:803] 2025-04-26 17:36:32,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6866 6914 [WARNING|trainer.py:803] 2025-04-26 17:36:33,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:33,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6939 [WARNING|trainer.py:803] 2025-04-26 17:36:33,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6867 6915 [WARNING|trainer.py:803] 2025-04-26 17:36:34,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:34,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6940 [WARNING|trainer.py:803] 2025-04-26 17:36:34,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6868 6916 [WARNING|trainer.py:803] 2025-04-26 17:36:35,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:35,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6941 [WARNING|trainer.py:803] 2025-04-26 17:36:36,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6869 6917 [WARNING|trainer.py:803] 2025-04-26 17:36:36,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:37,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6942 [WARNING|trainer.py:803] 2025-04-26 17:36:37,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6870 6918 [WARNING|trainer.py:803] 2025-04-26 17:36:38,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:38,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6943 [WARNING|trainer.py:803] 2025-04-26 17:36:38,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6871 6919 [WARNING|trainer.py:803] 2025-04-26 17:36:39,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:39,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6944 [WARNING|trainer.py:803] 2025-04-26 17:36:39,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6920 6872 [WARNING|trainer.py:803] 2025-04-26 17:36:40,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6945 [WARNING|trainer.py:803] 2025-04-26 17:36:40,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:41,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6921 6873 [WARNING|trainer.py:803] 2025-04-26 17:36:41,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6946 [WARNING|trainer.py:803] 2025-04-26 17:36:42,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:42,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6922 6874 [WARNING|trainer.py:803] 2025-04-26 17:36:43,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6947 [WARNING|trainer.py:803] 2025-04-26 17:36:43,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:43,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6923 6875 [WARNING|trainer.py:803] 2025-04-26 17:36:44,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6948 [WARNING|trainer.py:803] 2025-04-26 17:36:44,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:45,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6924 [WARNING|trainer.py:803] 2025-04-26 17:36:45,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6876 [WARNING|trainer.py:803] 2025-04-26 17:36:45,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6949 6925 [WARNING|trainer.py:803] 2025-04-26 17:36:46,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:46,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6877 [WARNING|trainer.py:803] 2025-04-26 17:36:47,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6950 6926 [WARNING|trainer.py:803] 2025-04-26 17:36:47,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:48,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6878 [WARNING|trainer.py:803] 2025-04-26 17:36:48,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6951 6927 [WARNING|trainer.py:803] 2025-04-26 17:36:49,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:49,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6879 [WARNING|trainer.py:803] 2025-04-26 17:36:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6952 6928 [WARNING|trainer.py:803] 2025-04-26 17:36:50,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:50,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6880 6953 [WARNING|trainer.py:803] 2025-04-26 17:36:51,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6929 [WARNING|trainer.py:803] 2025-04-26 17:36:51,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:51,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6881 6954 [WARNING|trainer.py:803] 2025-04-26 17:36:52,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6930 [WARNING|trainer.py:803] 2025-04-26 17:36:52,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:53,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6882 [WARNING|trainer.py:803] 2025-04-26 17:36:53,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6955 6931 [WARNING|trainer.py:803] 2025-04-26 17:36:54,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:54,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:54,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6883 6956 6932 [WARNING|trainer.py:803] 2025-04-26 17:36:55,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:36:55,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:55,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6957 6884 6933 [WARNING|trainer.py:803] 2025-04-26 17:36:56,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:56,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:57,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6958 6885 6934 [WARNING|trainer.py:803] 2025-04-26 17:36:58,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:58,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:58,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6959 6886 6935 [WARNING|trainer.py:803] 2025-04-26 17:36:59,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:36:59,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:36:59,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6960 6887 6936 [WARNING|trainer.py:803] 2025-04-26 17:37:00,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:37:00,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:00,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6961 6888 6937 [WARNING|trainer.py:803] 2025-04-26 17:37:01,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:02,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:02,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6962 6938 6889 [WARNING|trainer.py:803] 2025-04-26 17:37:03,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:03,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:03,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6963 6939 6890 [WARNING|trainer.py:803] 2025-04-26 17:37:04,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:04,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6964 [WARNING|trainer.py:803] 2025-04-26 17:37:04,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6940 6891 [WARNING|trainer.py:803] 2025-04-26 17:37:05,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:05,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6965 [WARNING|trainer.py:803] 2025-04-26 17:37:06,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6941 6892 [WARNING|trainer.py:803] 2025-04-26 17:37:06,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:07,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6966 [WARNING|trainer.py:803] 2025-04-26 17:37:07,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6942 6893 [WARNING|trainer.py:803] 2025-04-26 17:37:08,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:08,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6967 6943 [WARNING|trainer.py:803] 2025-04-26 17:37:08,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6894 [WARNING|trainer.py:803] 2025-04-26 17:37:09,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:37:09,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6968 6944 [WARNING|trainer.py:803] 2025-04-26 17:37:10,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6895 [WARNING|trainer.py:803] 2025-04-26 17:37:10,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:10,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6969 6945 [WARNING|trainer.py:803] 2025-04-26 17:37:11,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:11,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6896 [WARNING|trainer.py:803] 2025-04-26 17:37:12,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6970 6946 [WARNING|trainer.py:803] 2025-04-26 17:37:12,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:13,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6897 [WARNING|trainer.py:803] 2025-04-26 17:37:13,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6971 6947 [WARNING|trainer.py:803] 2025-04-26 17:37:14,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:14,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6898 [WARNING|trainer.py:803] 2025-04-26 17:37:14,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6972 6948 [WARNING|trainer.py:803] 2025-04-26 17:37:15,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:15,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:15,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6899 6973 6949 [WARNING|trainer.py:803] 2025-04-26 17:37:16,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:16,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:17,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6900 6974 6950 [WARNING|trainer.py:803] 2025-04-26 17:37:18,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:18,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:18,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6901 6975 6951 [WARNING|trainer.py:803] 2025-04-26 17:37:19,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:19,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:19,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6902 6976 6952 [WARNING|trainer.py:803] 2025-04-26 17:37:20,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:20,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:20,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6903 6977 6953 [WARNING|trainer.py:803] 2025-04-26 17:37:21,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:21,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:22,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6904 6978 6954 [WARNING|trainer.py:803] 2025-04-26 17:37:23,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:37:23,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:23,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6905 6979 6955 [WARNING|trainer.py:803] 2025-04-26 17:37:24,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:24,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:24,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6906 6980 6956 [WARNING|trainer.py:803] 2025-04-26 17:37:25,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:25,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:25,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6907 6981 6957 [WARNING|trainer.py:803] 2025-04-26 17:37:26,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:26,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:27,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6908 6982 6958 [WARNING|trainer.py:803] 2025-04-26 17:37:28,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:28,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:28,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6909 6983 6959 [WARNING|trainer.py:803] 2025-04-26 17:37:29,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:29,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:29,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6910 6984 6960 [WARNING|trainer.py:803] 2025-04-26 17:37:30,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:30,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:30,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6985 6911 6961 [WARNING|trainer.py:803] 2025-04-26 17:37:31,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:32,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:32,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6986 6912 6962 [WARNING|trainer.py:803] 2025-04-26 17:37:33,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:37:33,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:33,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6987 6913 6963 [WARNING|trainer.py:803] 2025-04-26 17:37:34,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:34,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:34,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6988 6914 6964 [WARNING|trainer.py:803] 2025-04-26 17:37:35,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:35,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:35,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6989 6915 6965 [WARNING|trainer.py:803] 2025-04-26 17:37:37,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:37,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:37,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6990 6916 6966 [WARNING|trainer.py:803] 2025-04-26 17:37:38,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:38,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:38,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6991 6917 6967 [WARNING|trainer.py:803] 2025-04-26 17:37:39,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:39,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:37:39,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6992 6968 6918 [WARNING|trainer.py:803] 2025-04-26 17:37:40,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:40,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:40,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6993 6969 6919 [WARNING|trainer.py:803] 2025-04-26 17:37:42,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:42,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:42,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6994 6970 6920 [WARNING|trainer.py:803] 2025-04-26 17:37:43,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:43,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:43,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6995 6971 6921 [WARNING|trainer.py:803] 2025-04-26 17:37:44,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:44,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:44,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6996 6972 6922 [WARNING|trainer.py:803] 2025-04-26 17:37:45,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:45,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:46,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6997 6973 6923 [WARNING|trainer.py:803] 2025-04-26 17:37:47,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:37:47,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:47,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6998 6974 6924 [WARNING|trainer.py:803] 2025-04-26 17:37:48,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:48,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:48,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6999 6975 6925 [WARNING|trainer.py:803] 2025-04-26 17:37:49,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:49,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:49,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7000 6976 6926 [WARNING|trainer.py:803] 2025-04-26 17:37:50,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:50,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:51,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7001 6977 6927 [WARNING|trainer.py:803] 2025-04-26 17:37:52,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:37:52,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:52,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7002 6978 6928 [WARNING|trainer.py:803] 2025-04-26 17:37:53,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:37:53,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:53,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7003 6979 6929 [WARNING|trainer.py:803] 2025-04-26 17:37:54,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:54,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:54,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6980 7004 6930 [WARNING|trainer.py:803] 2025-04-26 17:37:55,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:55,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:56,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6981 7005 6931 [WARNING|trainer.py:803] 2025-04-26 17:37:57,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:57,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:57,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6982 7006 6932 [WARNING|trainer.py:803] 2025-04-26 17:37:58,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:37:58,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:37:58,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6983 7007 6933 [WARNING|trainer.py:803] 2025-04-26 17:37:59,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:59,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:37:59,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6984 7008 6934 [WARNING|trainer.py:803] 2025-04-26 17:38:00,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:00,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:01,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6985 7009 6935 [WARNING|trainer.py:803] 2025-04-26 17:38:02,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:02,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:02,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6986 7010 6936 [WARNING|trainer.py:803] 2025-04-26 17:38:03,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:03,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6987 [WARNING|trainer.py:803] 2025-04-26 17:38:03,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7011 6937 [WARNING|trainer.py:803] 2025-04-26 17:38:04,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:04,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6988 [WARNING|trainer.py:803] 2025-04-26 17:38:05,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7012 6938 [WARNING|trainer.py:803] 2025-04-26 17:38:05,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:05,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6989 [WARNING|trainer.py:803] 2025-04-26 17:38:06,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7013 6939 [WARNING|trainer.py:803] 2025-04-26 17:38:07,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:07,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6990 [WARNING|trainer.py:803] 2025-04-26 17:38:07,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7014 6940 [WARNING|trainer.py:803] 2025-04-26 17:38:08,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:08,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6991 [WARNING|trainer.py:803] 2025-04-26 17:38:08,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7015 6941 [WARNING|trainer.py:803] 2025-04-26 17:38:09,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:09,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6992 [WARNING|trainer.py:803] 2025-04-26 17:38:10,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7016 6942 [WARNING|trainer.py:803] 2025-04-26 17:38:10,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:11,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6993 [WARNING|trainer.py:803] 2025-04-26 17:38:11,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7017 6943 [WARNING|trainer.py:803] 2025-04-26 17:38:11,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:12,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6994 [WARNING|trainer.py:803] 2025-04-26 17:38:12,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7018 6944 [WARNING|trainer.py:803] 2025-04-26 17:38:13,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6995 [WARNING|trainer.py:803] 2025-04-26 17:38:13,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:13,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7019 6945 [WARNING|trainer.py:803] 2025-04-26 17:38:14,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6996 [WARNING|trainer.py:803] 2025-04-26 17:38:14,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:14,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7020 6946 [WARNING|trainer.py:803] 2025-04-26 17:38:15,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6997 [WARNING|trainer.py:803] 2025-04-26 17:38:16,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:16,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7021 6947 [WARNING|trainer.py:803] 2025-04-26 17:38:16,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6998 [WARNING|trainer.py:803] 2025-04-26 17:38:17,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:17,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7022 6948 [WARNING|trainer.py:803] 2025-04-26 17:38:18,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6999 [WARNING|trainer.py:803] 2025-04-26 17:38:18,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:18,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7023 6949 [WARNING|trainer.py:803] 2025-04-26 17:38:19,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7000 [WARNING|trainer.py:803] 2025-04-26 17:38:19,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:19,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7024 6950 [WARNING|trainer.py:803] 2025-04-26 17:38:20,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7001 [WARNING|trainer.py:803] 2025-04-26 17:38:21,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:21,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7025 6951 [WARNING|trainer.py:803] 2025-04-26 17:38:21,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7002 [WARNING|trainer.py:803] 2025-04-26 17:38:22,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:22,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7026 6952 [WARNING|trainer.py:803] 2025-04-26 17:38:23,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7003 [WARNING|trainer.py:803] 2025-04-26 17:38:23,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:23,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7027 6953 [WARNING|trainer.py:803] 2025-04-26 17:38:24,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7004 [WARNING|trainer.py:803] 2025-04-26 17:38:24,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:25,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7028 6954 [WARNING|trainer.py:803] 2025-04-26 17:38:25,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7005 [WARNING|trainer.py:803] 2025-04-26 17:38:26,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:26,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7029 6955 [WARNING|trainer.py:803] 2025-04-26 17:38:26,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7006 [WARNING|trainer.py:803] 2025-04-26 17:38:27,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:27,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7030 6956 [WARNING|trainer.py:803] 2025-04-26 17:38:27,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7007 [WARNING|trainer.py:803] 2025-04-26 17:38:28,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:38:28,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7031 6957 [WARNING|trainer.py:803] 2025-04-26 17:38:29,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7008 [WARNING|trainer.py:803] 2025-04-26 17:38:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:29,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7032 6958 [WARNING|trainer.py:803] 2025-04-26 17:38:30,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7009 [WARNING|trainer.py:803] 2025-04-26 17:38:31,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:31,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7033 6959 [WARNING|trainer.py:803] 2025-04-26 17:38:31,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7010 [WARNING|trainer.py:803] 2025-04-26 17:38:32,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:32,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7034 6960 [WARNING|trainer.py:803] 2025-04-26 17:38:32,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7011 [WARNING|trainer.py:803] 2025-04-26 17:38:33,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:33,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7035 [WARNING|trainer.py:803] 2025-04-26 17:38:34,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6961 7012 [WARNING|trainer.py:803] 2025-04-26 17:38:34,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:34,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:35,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7036 6962 7013 [WARNING|trainer.py:803] 2025-04-26 17:38:36,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:36,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:36,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7037 6963 7014 [WARNING|trainer.py:803] 2025-04-26 17:38:37,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:37,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:37,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7038 6964 7015 [WARNING|trainer.py:803] 2025-04-26 17:38:38,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:38,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:39,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7039 6965 7016 [WARNING|trainer.py:803] 2025-04-26 17:38:39,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:39,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7040 6966 7017 [WARNING|trainer.py:803] 2025-04-26 17:38:41,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:41,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:41,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7041 6967 7018 [WARNING|trainer.py:803] 2025-04-26 17:38:42,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:42,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:42,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7042 6968 7019 [WARNING|trainer.py:803] 2025-04-26 17:38:43,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:43,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:44,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7043 6969 7020 [WARNING|trainer.py:803] 2025-04-26 17:38:44,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:44,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:45,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7044 6970 7021 [WARNING|trainer.py:803] 2025-04-26 17:38:46,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:46,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:46,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7045 6971 7022 [WARNING|trainer.py:803] 2025-04-26 17:38:47,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:47,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:47,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7046 6972 7023 [WARNING|trainer.py:803] 2025-04-26 17:38:48,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:48,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:49,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7047 6973 7024 [WARNING|trainer.py:803] 2025-04-26 17:38:49,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:49,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:50,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6974 7048 7025 [WARNING|trainer.py:803] 2025-04-26 17:38:51,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:51,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:38:51,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6975 7049 7026 [WARNING|trainer.py:803] 2025-04-26 17:38:52,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:52,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:38:52,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6976 7050 7027 [WARNING|trainer.py:803] 2025-04-26 17:38:53,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:38:53,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:54,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6977 7051 7028 [WARNING|trainer.py:803] 2025-04-26 17:38:54,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:55,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:55,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6978 7052 7029 [WARNING|trainer.py:803] 2025-04-26 17:38:56,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:56,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:56,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6979 7053 7030 [WARNING|trainer.py:803] 2025-04-26 17:38:57,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:57,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:57,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6980 7054 7031 [WARNING|trainer.py:803] 2025-04-26 17:38:58,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:38:59,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6981 7055 7032 [WARNING|trainer.py:803] 2025-04-26 17:38:59,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:00,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:00,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6982 7056 7033 [WARNING|trainer.py:803] 2025-04-26 17:39:01,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:01,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:01,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6983 7057 7034 [WARNING|trainer.py:803] 2025-04-26 17:39:02,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:02,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:02,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6984 7058 7035 [WARNING|trainer.py:803] 2025-04-26 17:39:03,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:03,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:04,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6985 7059 7036 [WARNING|trainer.py:803] 2025-04-26 17:39:05,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:05,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:05,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6986 7060 7037 [WARNING|trainer.py:803] 2025-04-26 17:39:06,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:39:06,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:06,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6987 7061 7038 [WARNING|trainer.py:803] 2025-04-26 17:39:07,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:07,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:07,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6988 7062 7039 [WARNING|trainer.py:803] 2025-04-26 17:39:08,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:08,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:09,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6989 7063 7040 [WARNING|trainer.py:803] 2025-04-26 17:39:10,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:10,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:10,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6990 7064 7041 [WARNING|trainer.py:803] 2025-04-26 17:39:11,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:11,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:11,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6991 7065 7042 [WARNING|trainer.py:803] 2025-04-26 17:39:12,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:12,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:12,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6992 7066 7043 [WARNING|trainer.py:803] 2025-04-26 17:39:13,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:13,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:14,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6993 7067 7044 [WARNING|trainer.py:803] 2025-04-26 17:39:15,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:15,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:39:15,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6994 7068 7045 [WARNING|trainer.py:803] 2025-04-26 17:39:16,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:16,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:39:16,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6995 7069 7046 [WARNING|trainer.py:803] 2025-04-26 17:39:17,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:17,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:17,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6996 7070 7047 [WARNING|trainer.py:803] 2025-04-26 17:39:18,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:18,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:19,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6997 7071 7048 [WARNING|trainer.py:803] 2025-04-26 17:39:20,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:20,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:20,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6998 7072 7049 [WARNING|trainer.py:803] 2025-04-26 17:39:21,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:21,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:21,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6999 7073 7050 [WARNING|trainer.py:803] 2025-04-26 17:39:22,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:22,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:22,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7000 7074 7051 [WARNING|trainer.py:803] 2025-04-26 17:39:23,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:23,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:24,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7001 7075 7052 [WARNING|trainer.py:803] 2025-04-26 17:39:25,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:39:25,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:25,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7002 7076 7053 [WARNING|trainer.py:803] 2025-04-26 17:39:26,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:26,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:26,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7003 7077 7054 [WARNING|trainer.py:803] 2025-04-26 17:39:27,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:27,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:27,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7004 7078 7055 [WARNING|trainer.py:803] 2025-04-26 17:39:28,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:28,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:29,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7005 7079 7056 [WARNING|trainer.py:803] 2025-04-26 17:39:30,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:30,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:30,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7080 7006 7057 [WARNING|trainer.py:803] 2025-04-26 17:39:31,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:31,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:39:31,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7081 7058 7007 [WARNING|trainer.py:803] 2025-04-26 17:39:32,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:32,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:32,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7082 7059 7008 [WARNING|trainer.py:803] 2025-04-26 17:39:34,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:34,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:34,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7083 7060 7009 [WARNING|trainer.py:803] 2025-04-26 17:39:35,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:35,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:35,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7084 7061 7010 [WARNING|trainer.py:803] 2025-04-26 17:39:36,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 17:39:36,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:36,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7085 7062 7011 [WARNING|trainer.py:803] 2025-04-26 17:39:37,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:37,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:37,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7086 7063 7012 [WARNING|trainer.py:803] 2025-04-26 17:39:39,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:39,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:39,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7087 7064 7013 [WARNING|trainer.py:803] 2025-04-26 17:39:40,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:40,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:40,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7088 7065 7014 [WARNING|trainer.py:803] 2025-04-26 17:39:41,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:41,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:41,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7089 7066 7015 [WARNING|trainer.py:803] 2025-04-26 17:39:42,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:42,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:42,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7067 7090 7016 [WARNING|trainer.py:803] 2025-04-26 17:39:44,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:39:44,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:44,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7068 7091 7017 [WARNING|trainer.py:803] 2025-04-26 17:39:45,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:39:45,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:45,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7069 7092 7018 [WARNING|trainer.py:803] 2025-04-26 17:39:46,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:39:46,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:46,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7070 7093 7019 [WARNING|trainer.py:803] 2025-04-26 17:39:47,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:47,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:48,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7071 7094 7020 [WARNING|trainer.py:803] 2025-04-26 17:39:49,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:49,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:49,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7072 7095 7021 [WARNING|trainer.py:803] 2025-04-26 17:39:50,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:50,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:50,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7073 7096 7022 [WARNING|trainer.py:803] 2025-04-26 17:39:51,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:51,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:51,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7074 7097 7023 [WARNING|trainer.py:803] 2025-04-26 17:39:52,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:52,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:53,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7075 7098 7024 [WARNING|trainer.py:803] 2025-04-26 17:39:53,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:54,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7076 [WARNING|trainer.py:803] 2025-04-26 17:39:54,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7099 7025 [WARNING|trainer.py:803] 2025-04-26 17:39:55,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:55,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7077 [WARNING|trainer.py:803] 2025-04-26 17:39:55,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7100 7026 [WARNING|trainer.py:803] 2025-04-26 17:39:56,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:56,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7078 [WARNING|trainer.py:803] 2025-04-26 17:39:56,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7101 7027 [WARNING|trainer.py:803] 2025-04-26 17:39:57,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:39:57,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7079 [WARNING|trainer.py:803] 2025-04-26 17:39:58,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7102 7028 [WARNING|trainer.py:803] 2025-04-26 17:39:58,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:39:59,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7080 [WARNING|trainer.py:803] 2025-04-26 17:39:59,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7103 7029 [WARNING|trainer.py:803] 2025-04-26 17:40:00,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:00,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7081 [WARNING|trainer.py:803] 2025-04-26 17:40:00,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7104 7030 [WARNING|trainer.py:803] 2025-04-26 17:40:01,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:01,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7082 [WARNING|trainer.py:803] 2025-04-26 17:40:01,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7105 7031 [WARNING|trainer.py:803] 2025-04-26 17:40:02,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:02,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7083 [WARNING|trainer.py:803] 2025-04-26 17:40:03,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7106 7032 [WARNING|trainer.py:803] 2025-04-26 17:40:03,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7084 [WARNING|trainer.py:803] 2025-04-26 17:40:04,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:04,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7107 7033 [WARNING|trainer.py:803] 2025-04-26 17:40:04,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 7085 [WARNING|trainer.py:803] 2025-04-26 17:40:05,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:05,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7108 7034 [WARNING|trainer.py:803] 2025-04-26 17:40:06,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7086 [WARNING|trainer.py:803] 2025-04-26 17:40:06,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:06,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7109 7035 [WARNING|trainer.py:803] 2025-04-26 17:40:07,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7087 [WARNING|trainer.py:803] 2025-04-26 17:40:07,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:08,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7110 7036 [WARNING|trainer.py:803] 2025-04-26 17:40:08,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7088 [WARNING|trainer.py:803] 2025-04-26 17:40:09,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:09,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7111 7037 [WARNING|trainer.py:803] 2025-04-26 17:40:09,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7089 [WARNING|trainer.py:803] 2025-04-26 17:40:10,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:10,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7112 7038 [WARNING|trainer.py:803] 2025-04-26 17:40:11,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7090 [WARNING|trainer.py:803] 2025-04-26 17:40:11,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:11,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7113 7039 [WARNING|trainer.py:803] 2025-04-26 17:40:12,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7091 [WARNING|trainer.py:803] 2025-04-26 17:40:12,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:13,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7114 7040 [WARNING|trainer.py:803] 2025-04-26 17:40:13,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7092 [WARNING|trainer.py:803] 2025-04-26 17:40:14,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:14,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7115 [WARNING|trainer.py:803] 2025-04-26 17:40:14,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7041 7093 [WARNING|trainer.py:803] 2025-04-26 17:40:15,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7116 [WARNING|trainer.py:803] 2025-04-26 17:40:15,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:40:16,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7042 7094 [WARNING|trainer.py:803] 2025-04-26 17:40:16,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7117 [WARNING|trainer.py:803] 2025-04-26 17:40:16,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:17,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7043 7095 [WARNING|trainer.py:803] 2025-04-26 17:40:17,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7118 [WARNING|trainer.py:803] 2025-04-26 17:40:18,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:18,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7044 7096 [WARNING|trainer.py:803] 2025-04-26 17:40:18,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7119 [WARNING|trainer.py:803] 2025-04-26 17:40:19,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:19,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7045 7097 [WARNING|trainer.py:803] 2025-04-26 17:40:20,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7120 [WARNING|trainer.py:803] 2025-04-26 17:40:20,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:21,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7046 7098 [WARNING|trainer.py:803] 2025-04-26 17:40:21,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7121 [WARNING|trainer.py:803] 2025-04-26 17:40:22,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:22,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7047 [WARNING|trainer.py:803] 2025-04-26 17:40:22,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7099 7122 [WARNING|trainer.py:803] 2025-04-26 17:40:23,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:23,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7048 [WARNING|trainer.py:803] 2025-04-26 17:40:23,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7100 7123 [WARNING|trainer.py:803] 2025-04-26 17:40:24,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:24,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7049 7101 [WARNING|trainer.py:803] 2025-04-26 17:40:25,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7124 [WARNING|trainer.py:803] 2025-04-26 17:40:25,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:40:25,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7050 7102 [WARNING|trainer.py:803] 2025-04-26 17:40:26,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7125 [WARNING|trainer.py:803] 2025-04-26 17:40:26,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:27,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7051 7103 [WARNING|trainer.py:803] 2025-04-26 17:40:27,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7126 [WARNING|trainer.py:803] 2025-04-26 17:40:28,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:28,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7052 7104 [WARNING|trainer.py:803] 2025-04-26 17:40:28,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7127 [WARNING|trainer.py:803] 2025-04-26 17:40:29,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:29,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7053 7105 [WARNING|trainer.py:803] 2025-04-26 17:40:30,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7128 [WARNING|trainer.py:803] 2025-04-26 17:40:30,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:30,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7054 7106 [WARNING|trainer.py:803] 2025-04-26 17:40:31,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7129 [WARNING|trainer.py:803] 2025-04-26 17:40:31,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:32,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7055 7107 [WARNING|trainer.py:803] 2025-04-26 17:40:32,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7130 [WARNING|trainer.py:803] 2025-04-26 17:40:33,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:33,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7056 7108 [WARNING|trainer.py:803] 2025-04-26 17:40:33,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7131 [WARNING|trainer.py:803] 2025-04-26 17:40:34,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:34,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7057 7109 [WARNING|trainer.py:803] 2025-04-26 17:40:35,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7132 [WARNING|trainer.py:803] 2025-04-26 17:40:35,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:35,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7058 7110 [WARNING|trainer.py:803] 2025-04-26 17:40:36,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7133 [WARNING|trainer.py:803] 2025-04-26 17:40:36,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:37,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7059 7111 [WARNING|trainer.py:803] 2025-04-26 17:40:37,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7134 [WARNING|trainer.py:803] 2025-04-26 17:40:38,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:38,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7060 7112 [WARNING|trainer.py:803] 2025-04-26 17:40:38,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7135 [WARNING|trainer.py:803] 2025-04-26 17:40:39,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:39,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7061 7113 [WARNING|trainer.py:803] 2025-04-26 17:40:39,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7136 [WARNING|trainer.py:803] 2025-04-26 17:40:40,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:40,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7062 7114 [WARNING|trainer.py:803] 2025-04-26 17:40:41,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7137 [WARNING|trainer.py:803] 2025-04-26 17:40:41,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:41,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7063 7115 [WARNING|trainer.py:803] 2025-04-26 17:40:42,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7138 [WARNING|trainer.py:803] 2025-04-26 17:40:42,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:43,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7064 7116 [WARNING|trainer.py:803] 2025-04-26 17:40:43,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7139 [WARNING|trainer.py:803] 2025-04-26 17:40:44,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:44,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7065 7117 [WARNING|trainer.py:803] 2025-04-26 17:40:44,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7140 [WARNING|trainer.py:803] 2025-04-26 17:40:45,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:45,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7066 7118 [WARNING|trainer.py:803] 2025-04-26 17:40:46,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7141 [WARNING|trainer.py:803] 2025-04-26 17:40:46,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:46,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7067 7119 [WARNING|trainer.py:803] 2025-04-26 17:40:47,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7142 [WARNING|trainer.py:803] 2025-04-26 17:40:47,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:40:48,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7068 7120 [WARNING|trainer.py:803] 2025-04-26 17:40:48,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7143 [WARNING|trainer.py:803] 2025-04-26 17:40:49,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:40:49,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7069 7121 [WARNING|trainer.py:803] 2025-04-26 17:40:49,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7144 [WARNING|trainer.py:803] 2025-04-26 17:40:50,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:50,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7070 7122 [WARNING|trainer.py:803] 2025-04-26 17:40:51,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7145 [WARNING|trainer.py:803] 2025-04-26 17:40:51,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:51,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7071 7123 [WARNING|trainer.py:803] 2025-04-26 17:40:52,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7146 [WARNING|trainer.py:803] 2025-04-26 17:40:52,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:53,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7072 7124 [WARNING|trainer.py:803] 2025-04-26 17:40:53,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:40:54,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7147 [WARNING|trainer.py:803] 2025-04-26 17:40:54,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7073 7125 [WARNING|trainer.py:803] 2025-04-26 17:40:54,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:55,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7148 [WARNING|trainer.py:803] 2025-04-26 17:40:55,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7074 7126 [WARNING|trainer.py:803] 2025-04-26 17:40:56,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:56,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7149 [WARNING|trainer.py:803] 2025-04-26 17:40:56,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7075 7127 [WARNING|trainer.py:803] 2025-04-26 17:40:57,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:40:57,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7150 [WARNING|trainer.py:803] 2025-04-26 17:40:58,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7076 7128 [WARNING|trainer.py:803] 2025-04-26 17:40:58,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7151 [WARNING|trainer.py:803] 2025-04-26 17:40:59,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:40:59,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7077 7129 [WARNING|trainer.py:803] 2025-04-26 17:40:59,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7152 [WARNING|trainer.py:803] 2025-04-26 17:41:00,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:00,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7078 7130 [WARNING|trainer.py:803] 2025-04-26 17:41:01,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7153 [WARNING|trainer.py:803] 2025-04-26 17:41:01,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:01,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7079 7131 [WARNING|trainer.py:803] 2025-04-26 17:41:02,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7154 [WARNING|trainer.py:803] 2025-04-26 17:41:02,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:02,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7080 7132 [WARNING|trainer.py:803] 2025-04-26 17:41:03,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7155 [WARNING|trainer.py:803] 2025-04-26 17:41:04,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:04,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7081 7133 [WARNING|trainer.py:803] 2025-04-26 17:41:04,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7156 [WARNING|trainer.py:803] 2025-04-26 17:41:05,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:05,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7082 7134 [WARNING|trainer.py:803] 2025-04-26 17:41:06,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7157 [WARNING|trainer.py:803] 2025-04-26 17:41:06,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:06,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7083 7135 [WARNING|trainer.py:803] 2025-04-26 17:41:07,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7158 [WARNING|trainer.py:803] 2025-04-26 17:41:07,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:07,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7084 7136 [WARNING|trainer.py:803] 2025-04-26 17:41:08,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7159 [WARNING|trainer.py:803] 2025-04-26 17:41:08,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 17:41:09,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7085 7137 [WARNING|trainer.py:803] 2025-04-26 17:41:09,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7160 [WARNING|trainer.py:803] 2025-04-26 17:41:10,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:10,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7086 7138 [WARNING|trainer.py:803] 2025-04-26 17:41:10,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7161 [WARNING|trainer.py:803] 2025-04-26 17:41:11,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:11,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7087 7139 [WARNING|trainer.py:803] 2025-04-26 17:41:12,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7162 [WARNING|trainer.py:803] 2025-04-26 17:41:12,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:12,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7088 7140 [WARNING|trainer.py:803] 2025-04-26 17:41:13,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7163 [WARNING|trainer.py:803] 2025-04-26 17:41:13,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:14,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7089 7141 [WARNING|trainer.py:803] 2025-04-26 17:41:14,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7164 [WARNING|trainer.py:803] 2025-04-26 17:41:15,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:15,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7090 7142 [WARNING|trainer.py:803] 2025-04-26 17:41:15,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7165 [WARNING|trainer.py:803] 2025-04-26 17:41:16,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:16,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7091 7143 [WARNING|trainer.py:803] 2025-04-26 17:41:17,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7166 [WARNING|trainer.py:803] 2025-04-26 17:41:17,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:17,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7092 7144 [WARNING|trainer.py:803] 2025-04-26 17:41:18,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7167 [WARNING|trainer.py:803] 2025-04-26 17:41:18,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:19,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7093 7145 [WARNING|trainer.py:803] 2025-04-26 17:41:19,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7168 [WARNING|trainer.py:803] 2025-04-26 17:41:20,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:20,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7094 7146 [WARNING|trainer.py:803] 2025-04-26 17:41:20,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7169 [WARNING|trainer.py:803] 2025-04-26 17:41:21,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:21,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7095 7147 [WARNING|trainer.py:803] 2025-04-26 17:41:22,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7170 [WARNING|trainer.py:803] 2025-04-26 17:41:22,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:22,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7096 7148 [WARNING|trainer.py:803] 2025-04-26 17:41:23,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7171 [WARNING|trainer.py:803] 2025-04-26 17:41:23,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:24,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7097 7149 [WARNING|trainer.py:803] 2025-04-26 17:41:24,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7172 [WARNING|trainer.py:803] 2025-04-26 17:41:25,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:25,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7098 7150 [WARNING|trainer.py:803] 2025-04-26 17:41:25,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7173 [WARNING|trainer.py:803] 2025-04-26 17:41:26,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:26,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7099 7151 [WARNING|trainer.py:803] 2025-04-26 17:41:27,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7174 [WARNING|trainer.py:803] 2025-04-26 17:41:27,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:27,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7100 7152 [WARNING|trainer.py:803] 2025-04-26 17:41:28,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7175 [WARNING|trainer.py:803] 2025-04-26 17:41:28,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:41:28,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7101 7153 [WARNING|trainer.py:803] 2025-04-26 17:41:29,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7176 [WARNING|trainer.py:803] 2025-04-26 17:41:30,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:30,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7102 7154 [WARNING|trainer.py:803] 2025-04-26 17:41:30,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7177 [WARNING|trainer.py:803] 2025-04-26 17:41:31,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:31,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7103 7155 [WARNING|trainer.py:803] 2025-04-26 17:41:32,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7178 [WARNING|trainer.py:803] 2025-04-26 17:41:32,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:32,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7104 7156 [WARNING|trainer.py:803] 2025-04-26 17:41:33,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7179 [WARNING|trainer.py:803] 2025-04-26 17:41:33,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:34,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7105 7157 [WARNING|trainer.py:803] 2025-04-26 17:41:34,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7180 [WARNING|trainer.py:803] 2025-04-26 17:41:35,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:35,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7106 7158 [WARNING|trainer.py:803] 2025-04-26 17:41:35,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7181 [WARNING|trainer.py:803] 2025-04-26 17:41:36,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:36,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7107 7159 [WARNING|trainer.py:803] 2025-04-26 17:41:36,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7182 [WARNING|trainer.py:803] 2025-04-26 17:41:37,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:37,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7108 [WARNING|trainer.py:803] 2025-04-26 17:41:38,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7160 7183 [WARNING|trainer.py:803] 2025-04-26 17:41:38,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:39,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7109 [WARNING|trainer.py:803] 2025-04-26 17:41:39,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7161 7184 [WARNING|trainer.py:803] 2025-04-26 17:41:40,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:41:40,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7110 [WARNING|trainer.py:803] 2025-04-26 17:41:40,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7162 7185 [WARNING|trainer.py:803] 2025-04-26 17:41:41,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:41,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7111 [WARNING|trainer.py:803] 2025-04-26 17:41:41,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7163 7186 [WARNING|trainer.py:803] 2025-04-26 17:41:42,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:42,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7112 [WARNING|trainer.py:803] 2025-04-26 17:41:43,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7164 7187 [WARNING|trainer.py:803] 2025-04-26 17:41:43,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:44,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7113 [WARNING|trainer.py:803] 2025-04-26 17:41:44,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7165 7188 [WARNING|trainer.py:803] 2025-04-26 17:41:44,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:45,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7114 [WARNING|trainer.py:803] 2025-04-26 17:41:45,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7166 7189 [WARNING|trainer.py:803] 2025-04-26 17:41:46,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:46,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7115 [WARNING|trainer.py:803] 2025-04-26 17:41:46,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7167 7190 [WARNING|trainer.py:803] 2025-04-26 17:41:47,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:47,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7116 [WARNING|trainer.py:803] 2025-04-26 17:41:48,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7168 7191 [WARNING|trainer.py:803] 2025-04-26 17:41:48,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:49,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7117 [WARNING|trainer.py:803] 2025-04-26 17:41:49,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7169 7192 [WARNING|trainer.py:803] 2025-04-26 17:41:49,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:50,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7118 [WARNING|trainer.py:803] 2025-04-26 17:41:50,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7170 7193 [WARNING|trainer.py:803] 2025-04-26 17:41:51,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:51,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7119 [WARNING|trainer.py:803] 2025-04-26 17:41:51,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7171 7194 [WARNING|trainer.py:803] 2025-04-26 17:41:52,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7120 [WARNING|trainer.py:803] 2025-04-26 17:41:52,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:53,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7172 7195 [WARNING|trainer.py:803] 2025-04-26 17:41:53,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7121 [WARNING|trainer.py:803] 2025-04-26 17:41:54,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:41:54,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7173 7196 [WARNING|trainer.py:803] 2025-04-26 17:41:54,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7122 [WARNING|trainer.py:803] 2025-04-26 17:41:55,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:55,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7174 7197 [WARNING|trainer.py:803] 2025-04-26 17:41:56,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7123 [WARNING|trainer.py:803] 2025-04-26 17:41:56,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:41:56,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7175 7198 [WARNING|trainer.py:803] 2025-04-26 17:41:57,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7124 [WARNING|trainer.py:803] 2025-04-26 17:41:57,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:41:58,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7176 7199 [WARNING|trainer.py:803] 2025-04-26 17:41:58,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7125 [WARNING|trainer.py:803] 2025-04-26 17:41:58,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:41:59,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7177 7200 [WARNING|trainer.py:803] 2025-04-26 17:41:59,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7126 [WARNING|trainer.py:803] 2025-04-26 17:42:00,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:00,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7178 7201 [WARNING|trainer.py:803] 2025-04-26 17:42:00,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7127 [WARNING|trainer.py:803] 2025-04-26 17:42:01,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:01,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7179 7202 [WARNING|trainer.py:803] 2025-04-26 17:42:02,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7128 [WARNING|trainer.py:803] 2025-04-26 17:42:02,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:03,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7180 7203 [WARNING|trainer.py:803] 2025-04-26 17:42:03,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7129 [WARNING|trainer.py:803] 2025-04-26 17:42:03,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:04,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7181 7204 [WARNING|trainer.py:803] 2025-04-26 17:42:04,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:05,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7130 [WARNING|trainer.py:803] 2025-04-26 17:42:05,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7182 7205 [WARNING|trainer.py:803] 2025-04-26 17:42:06,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:06,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7131 [WARNING|trainer.py:803] 2025-04-26 17:42:06,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7183 7206 [WARNING|trainer.py:803] 2025-04-26 17:42:07,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:07,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7132 [WARNING|trainer.py:803] 2025-04-26 17:42:07,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7184 7207 [WARNING|trainer.py:803] 2025-04-26 17:42:08,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:08,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7133 [WARNING|trainer.py:803] 2025-04-26 17:42:09,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7185 7208 [WARNING|trainer.py:803] 2025-04-26 17:42:09,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:10,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7134 [WARNING|trainer.py:803] 2025-04-26 17:42:10,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7186 7209 [WARNING|trainer.py:803] 2025-04-26 17:42:11,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:11,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7135 [WARNING|trainer.py:803] 2025-04-26 17:42:11,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7187 7210 [WARNING|trainer.py:803] 2025-04-26 17:42:12,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:12,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7136 [WARNING|trainer.py:803] 2025-04-26 17:42:12,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7188 7211 [WARNING|trainer.py:803] 2025-04-26 17:42:13,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:13,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7137 [WARNING|trainer.py:803] 2025-04-26 17:42:14,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7189 7212 [WARNING|trainer.py:803] 2025-04-26 17:42:14,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:15,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7138 [WARNING|trainer.py:803] 2025-04-26 17:42:15,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 7190 7213 [WARNING|trainer.py:803] 2025-04-26 17:42:16,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:16,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7139 [WARNING|trainer.py:803] 2025-04-26 17:42:16,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7191 7214 [WARNING|trainer.py:803] 2025-04-26 17:42:17,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:17,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7140 [WARNING|trainer.py:803] 2025-04-26 17:42:17,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7192 7215 [WARNING|trainer.py:803] 2025-04-26 17:42:18,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:18,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7141 [WARNING|trainer.py:803] 2025-04-26 17:42:19,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7193 7216 [WARNING|trainer.py:803] 2025-04-26 17:42:19,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:20,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:20,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7142 7194 7217 [WARNING|trainer.py:803] 2025-04-26 17:42:21,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:21,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:21,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7143 7195 7218 [WARNING|trainer.py:803] 2025-04-26 17:42:22,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:22,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:22,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7144 7196 7219 [WARNING|trainer.py:803] 2025-04-26 17:42:23,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:23,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:24,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7145 7197 7220 [WARNING|trainer.py:803] 2025-04-26 17:42:25,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:25,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:25,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7146 7198 7221 [WARNING|trainer.py:803] 2025-04-26 17:42:26,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:26,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:26,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7147 7199 7222 [WARNING|trainer.py:803] 2025-04-26 17:42:27,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:27,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:27,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7148 7200 7223 [WARNING|trainer.py:803] 2025-04-26 17:42:28,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:28,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:29,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7149 7201 7224 [WARNING|trainer.py:803] 2025-04-26 17:42:30,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:30,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:42:30,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7150 7202 7225 [WARNING|trainer.py:803] 2025-04-26 17:42:31,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:31,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:31,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7151 7203 7226 [WARNING|trainer.py:803] 2025-04-26 17:42:32,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:32,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:32,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7152 7227 7204 [WARNING|trainer.py:803] 2025-04-26 17:42:33,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:33,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:42:33,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7153 7228 7205 [WARNING|trainer.py:803] 2025-04-26 17:42:34,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:35,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:35,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7154 7229 7206 [WARNING|trainer.py:803] 2025-04-26 17:42:36,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:36,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:42:36,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7155 7230 7207 [WARNING|trainer.py:803] 2025-04-26 17:42:37,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:37,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:42:37,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7156 7231 7208 [WARNING|trainer.py:803] 2025-04-26 17:42:38,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:38,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:39,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7157 7232 7209 [WARNING|trainer.py:803] 2025-04-26 17:42:39,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:40,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:40,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7158 7233 7210 [WARNING|trainer.py:803] 2025-04-26 17:42:41,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:41,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:42:41,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7159 7234 7211 [WARNING|trainer.py:803] 2025-04-26 17:42:42,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:42,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:42,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7160 7235 7212 [WARNING|trainer.py:803] 2025-04-26 17:42:43,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:43,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:44,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7161 7236 7213 [WARNING|trainer.py:803] 2025-04-26 17:42:45,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:45,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:45,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7162 7237 7214 [WARNING|trainer.py:803] 2025-04-26 17:42:46,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:46,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:46,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7163 7238 7215 [WARNING|trainer.py:803] 2025-04-26 17:42:47,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:47,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:47,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7164 7239 7216 [WARNING|trainer.py:803] 2025-04-26 17:42:48,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:48,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:42:49,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7165 7240 7217 [WARNING|trainer.py:803] 2025-04-26 17:42:50,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:50,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:50,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7241 7166 7218 [WARNING|trainer.py:803] 2025-04-26 17:42:51,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:51,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:51,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7242 7167 7219 [WARNING|trainer.py:803] 2025-04-26 17:42:52,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:52,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:52,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7243 7168 7220 [WARNING|trainer.py:803] 2025-04-26 17:42:53,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:53,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:54,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7244 7169 7221 [WARNING|trainer.py:803] 2025-04-26 17:42:55,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:55,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:55,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7245 7170 7222 [WARNING|trainer.py:803] 2025-04-26 17:42:56,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:56,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:56,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7246 7171 7223 [WARNING|trainer.py:803] 2025-04-26 17:42:57,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:57,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:42:57,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7247 7172 7224 [WARNING|trainer.py:803] 2025-04-26 17:42:58,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:42:59,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:42:59,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7248 7173 7225 [WARNING|trainer.py:803] 2025-04-26 17:42:59,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:00,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:00,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7249 7226 7174 [WARNING|trainer.py:803] 2025-04-26 17:43:01,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:01,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:01,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7250 7227 7175 [WARNING|trainer.py:803] 2025-04-26 17:43:02,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:02,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7251 [WARNING|trainer.py:803] 2025-04-26 17:43:02,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7228 7176 [WARNING|trainer.py:803] 2025-04-26 17:43:03,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7252 [WARNING|trainer.py:803] 2025-04-26 17:43:04,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:04,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7229 7177 [WARNING|trainer.py:803] 2025-04-26 17:43:04,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7253 [WARNING|trainer.py:803] 2025-04-26 17:43:05,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:05,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7230 7178 [WARNING|trainer.py:803] 2025-04-26 17:43:06,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7254 [WARNING|trainer.py:803] 2025-04-26 17:43:06,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:06,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7231 7179 [WARNING|trainer.py:803] 2025-04-26 17:43:07,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7255 [WARNING|trainer.py:803] 2025-04-26 17:43:07,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:07,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7232 7180 [WARNING|trainer.py:803] 2025-04-26 17:43:08,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7256 [WARNING|trainer.py:803] 2025-04-26 17:43:08,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:09,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7233 7181 [WARNING|trainer.py:803] 2025-04-26 17:43:09,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7257 [WARNING|trainer.py:803] 2025-04-26 17:43:10,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:43:10,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7234 7182 [WARNING|trainer.py:803] 2025-04-26 17:43:10,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7258 [WARNING|trainer.py:803] 2025-04-26 17:43:11,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:11,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7235 [WARNING|trainer.py:803] 2025-04-26 17:43:12,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7183 7259 [WARNING|trainer.py:803] 2025-04-26 17:43:12,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:12,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7236 [WARNING|trainer.py:803] 2025-04-26 17:43:13,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7184 7260 [WARNING|trainer.py:803] 2025-04-26 17:43:13,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:14,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7237 [WARNING|trainer.py:803] 2025-04-26 17:43:14,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7185 7261 [WARNING|trainer.py:803] 2025-04-26 17:43:15,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:15,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7238 [WARNING|trainer.py:803] 2025-04-26 17:43:15,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7186 7262 [WARNING|trainer.py:803] 2025-04-26 17:43:16,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7239 [WARNING|trainer.py:803] 2025-04-26 17:43:16,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:17,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7187 7263 [WARNING|trainer.py:803] 2025-04-26 17:43:17,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7240 [WARNING|trainer.py:803] 2025-04-26 17:43:17,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:18,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7188 7264 [WARNING|trainer.py:803] 2025-04-26 17:43:18,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7241 [WARNING|trainer.py:803] 2025-04-26 17:43:19,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:19,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7189 [WARNING|trainer.py:803] 2025-04-26 17:43:19,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7265 7242 [WARNING|trainer.py:803] 2025-04-26 17:43:20,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:20,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7190 [WARNING|trainer.py:803] 2025-04-26 17:43:21,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7266 7243 [WARNING|trainer.py:803] 2025-04-26 17:43:21,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:22,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7191 [WARNING|trainer.py:803] 2025-04-26 17:43:22,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7267 7244 [WARNING|trainer.py:803] 2025-04-26 17:43:23,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:23,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7192 [WARNING|trainer.py:803] 2025-04-26 17:43:23,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7268 7245 [WARNING|trainer.py:803] 2025-04-26 17:43:24,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:24,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7193 [WARNING|trainer.py:803] 2025-04-26 17:43:24,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7269 7246 [WARNING|trainer.py:803] 2025-04-26 17:43:25,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:25,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7194 [WARNING|trainer.py:803] 2025-04-26 17:43:26,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7270 7247 [WARNING|trainer.py:803] 2025-04-26 17:43:26,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:27,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7195 [WARNING|trainer.py:803] 2025-04-26 17:43:27,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7271 7248 [WARNING|trainer.py:803] 2025-04-26 17:43:27,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:28,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7196 [WARNING|trainer.py:803] 2025-04-26 17:43:28,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7272 7249 [WARNING|trainer.py:803] 2025-04-26 17:43:29,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:29,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7197 [WARNING|trainer.py:803] 2025-04-26 17:43:29,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7273 7250 [WARNING|trainer.py:803] 2025-04-26 17:43:30,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:30,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7198 [WARNING|trainer.py:803] 2025-04-26 17:43:31,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7274 7251 [WARNING|trainer.py:803] 2025-04-26 17:43:31,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:32,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7199 [WARNING|trainer.py:803] 2025-04-26 17:43:32,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7275 7252 [WARNING|trainer.py:803] 2025-04-26 17:43:32,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:33,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7200 [WARNING|trainer.py:803] 2025-04-26 17:43:33,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7276 7253 [WARNING|trainer.py:803] 2025-04-26 17:43:34,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:34,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7201 [WARNING|trainer.py:803] 2025-04-26 17:43:34,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7277 7254 [WARNING|trainer.py:803] 2025-04-26 17:43:35,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:35,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7202 [WARNING|trainer.py:803] 2025-04-26 17:43:35,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7278 7255 [WARNING|trainer.py:803] 2025-04-26 17:43:36,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:36,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7203 [WARNING|trainer.py:803] 2025-04-26 17:43:37,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7279 7256 [WARNING|trainer.py:803] 2025-04-26 17:43:37,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:38,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7204 [WARNING|trainer.py:803] 2025-04-26 17:43:38,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7280 7257 [WARNING|trainer.py:803] 2025-04-26 17:43:39,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:39,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7205 [WARNING|trainer.py:803] 2025-04-26 17:43:39,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7281 7258 [WARNING|trainer.py:803] 2025-04-26 17:43:40,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7206 [WARNING|trainer.py:803] 2025-04-26 17:43:40,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:40,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7282 7259 [WARNING|trainer.py:803] 2025-04-26 17:43:41,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7207 [WARNING|trainer.py:803] 2025-04-26 17:43:42,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:43:42,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7283 7260 [WARNING|trainer.py:803] 2025-04-26 17:43:42,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7208 [WARNING|trainer.py:803] 2025-04-26 17:43:43,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7284 7261 [WARNING|trainer.py:803] 2025-04-26 17:43:44,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7209 [WARNING|trainer.py:803] 2025-04-26 17:43:44,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:44,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7285 7262 [WARNING|trainer.py:803] 2025-04-26 17:43:45,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7210 [WARNING|trainer.py:803] 2025-04-26 17:43:45,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:45,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7286 7263 [WARNING|trainer.py:803] 2025-04-26 17:43:46,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7211 [WARNING|trainer.py:803] 2025-04-26 17:43:47,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:47,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7287 7264 [WARNING|trainer.py:803] 2025-04-26 17:43:47,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7212 [WARNING|trainer.py:803] 2025-04-26 17:43:48,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:48,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7288 7265 [WARNING|trainer.py:803] 2025-04-26 17:43:48,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7213 [WARNING|trainer.py:803] 2025-04-26 17:43:49,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:43:49,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7289 7266 [WARNING|trainer.py:803] 2025-04-26 17:43:50,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7214 [WARNING|trainer.py:803] 2025-04-26 17:43:50,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:50,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7290 7267 [WARNING|trainer.py:803] 2025-04-26 17:43:51,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7215 [WARNING|trainer.py:803] 2025-04-26 17:43:51,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:43:52,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7291 7268 [WARNING|trainer.py:803] 2025-04-26 17:43:52,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7216 [WARNING|trainer.py:803] 2025-04-26 17:43:53,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:53,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7292 7269 [WARNING|trainer.py:803] 2025-04-26 17:43:53,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7217 [WARNING|trainer.py:803] 2025-04-26 17:43:54,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:54,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7293 7270 [WARNING|trainer.py:803] 2025-04-26 17:43:55,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7218 [WARNING|trainer.py:803] 2025-04-26 17:43:55,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:43:55,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7294 7271 [WARNING|trainer.py:803] 2025-04-26 17:43:56,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7219 [WARNING|trainer.py:803] 2025-04-26 17:43:56,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:43:57,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7295 7272 [WARNING|trainer.py:803] 2025-04-26 17:43:57,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7220 [WARNING|trainer.py:803] 2025-04-26 17:43:58,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:43:58,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7296 7273 [WARNING|trainer.py:803] 2025-04-26 17:43:59,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7221 [WARNING|trainer.py:803] 2025-04-26 17:43:59,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:43:59,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7297 7274 [WARNING|trainer.py:803] 2025-04-26 17:44:00,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7222 [WARNING|trainer.py:803] 2025-04-26 17:44:00,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:00,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7298 7275 [WARNING|trainer.py:803] 2025-04-26 17:44:01,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7223 [WARNING|trainer.py:803] 2025-04-26 17:44:01,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:02,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7299 7276 [WARNING|trainer.py:803] 2025-04-26 17:44:02,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7224 [WARNING|trainer.py:803] 2025-04-26 17:44:03,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:03,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7300 7277 [WARNING|trainer.py:803] 2025-04-26 17:44:04,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7225 [WARNING|trainer.py:803] 2025-04-26 17:44:04,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:04,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7301 7278 [WARNING|trainer.py:803] 2025-04-26 17:44:05,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7226 [WARNING|trainer.py:803] 2025-04-26 17:44:05,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:05,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7302 7279 [WARNING|trainer.py:803] 2025-04-26 17:44:06,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7227 [WARNING|trainer.py:803] 2025-04-26 17:44:06,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:07,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7303 7280 [WARNING|trainer.py:803] 2025-04-26 17:44:07,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7228 [WARNING|trainer.py:803] 2025-04-26 17:44:08,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 17:44:08,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7304 7281 [WARNING|trainer.py:803] 2025-04-26 17:44:09,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7229 [WARNING|trainer.py:803] 2025-04-26 17:44:09,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:44:09,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7305 7282 [WARNING|trainer.py:803] 2025-04-26 17:44:10,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7230 [WARNING|trainer.py:803] 2025-04-26 17:44:10,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:10,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7283 7306 [WARNING|trainer.py:803] 2025-04-26 17:44:11,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7231 [WARNING|trainer.py:803] 2025-04-26 17:44:12,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:12,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7284 7307 [WARNING|trainer.py:803] 2025-04-26 17:44:12,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7232 [WARNING|trainer.py:803] 2025-04-26 17:44:13,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:44:13,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7285 7308 [WARNING|trainer.py:803] 2025-04-26 17:44:13,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7233 [WARNING|trainer.py:803] 2025-04-26 17:44:14,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:44:14,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7286 7309 [WARNING|trainer.py:803] 2025-04-26 17:44:15,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7234 [WARNING|trainer.py:803] 2025-04-26 17:44:15,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:15,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7287 7310 [WARNING|trainer.py:803] 2025-04-26 17:44:16,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7235 [WARNING|trainer.py:803] 2025-04-26 17:44:16,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:17,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7288 7311 [WARNING|trainer.py:803] 2025-04-26 17:44:17,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7236 [WARNING|trainer.py:803] 2025-04-26 17:44:18,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:44:18,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7289 7312 [WARNING|trainer.py:803] 2025-04-26 17:44:18,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7237 [WARNING|trainer.py:803] 2025-04-26 17:44:19,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:19,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7290 7313 [WARNING|trainer.py:803] 2025-04-26 17:44:20,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7238 [WARNING|trainer.py:803] 2025-04-26 17:44:20,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:20,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7291 7314 [WARNING|trainer.py:803] 2025-04-26 17:44:21,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7239 [WARNING|trainer.py:803] 2025-04-26 17:44:21,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:22,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7292 7315 [WARNING|trainer.py:803] 2025-04-26 17:44:22,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7240 [WARNING|trainer.py:803] 2025-04-26 17:44:23,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:44:23,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7293 7316 [WARNING|trainer.py:803] 2025-04-26 17:44:23,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7241 [WARNING|trainer.py:803] 2025-04-26 17:44:24,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:24,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7294 7317 [WARNING|trainer.py:803] 2025-04-26 17:44:25,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7242 [WARNING|trainer.py:803] 2025-04-26 17:44:25,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:44:25,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7295 7318 [WARNING|trainer.py:803] 2025-04-26 17:44:26,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7243 [WARNING|trainer.py:803] 2025-04-26 17:44:26,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:27,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7296 7319 [WARNING|trainer.py:803] 2025-04-26 17:44:27,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7244 [WARNING|trainer.py:803] 2025-04-26 17:44:28,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:44:28,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7297 7320 [WARNING|trainer.py:803] 2025-04-26 17:44:28,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7245 [WARNING|trainer.py:803] 2025-04-26 17:44:29,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:29,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7298 7321 [WARNING|trainer.py:803] 2025-04-26 17:44:30,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7246 [WARNING|trainer.py:803] 2025-04-26 17:44:30,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:44:30,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7299 7322 [WARNING|trainer.py:803] 2025-04-26 17:44:31,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7247 [WARNING|trainer.py:803] 2025-04-26 17:44:31,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:32,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7300 7323 [WARNING|trainer.py:803] 2025-04-26 17:44:32,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7248 [WARNING|trainer.py:803] 2025-04-26 17:44:33,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:33,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7301 7324 [WARNING|trainer.py:803] 2025-04-26 17:44:33,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7249 [WARNING|trainer.py:803] 2025-04-26 17:44:34,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:34,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7302 7325 [WARNING|trainer.py:803] 2025-04-26 17:44:35,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7250 [WARNING|trainer.py:803] 2025-04-26 17:44:35,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:35,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7303 7326 [WARNING|trainer.py:803] 2025-04-26 17:44:36,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7251 [WARNING|trainer.py:803] 2025-04-26 17:44:36,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 17:44:37,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7304 7327 [WARNING|trainer.py:803] 2025-04-26 17:44:37,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7252 [WARNING|trainer.py:803] 2025-04-26 17:44:38,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:44:38,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7305 7328 [WARNING|trainer.py:803] 2025-04-26 17:44:38,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7253 [WARNING|trainer.py:803] 2025-04-26 17:44:39,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:39,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7306 7329 [WARNING|trainer.py:803] 2025-04-26 17:44:40,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7254 [WARNING|trainer.py:803] 2025-04-26 17:44:40,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:40,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7307 7330 [WARNING|trainer.py:803] 2025-04-26 17:44:41,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7255 [WARNING|trainer.py:803] 2025-04-26 17:44:41,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:44:42,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7308 [WARNING|trainer.py:803] 2025-04-26 17:44:42,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7331 7256 [WARNING|trainer.py:803] 2025-04-26 17:44:42,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:43,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7309 [WARNING|trainer.py:803] 2025-04-26 17:44:43,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7332 7257 [WARNING|trainer.py:803] 2025-04-26 17:44:44,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:44,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 7310 :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:44,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7333 7258 [WARNING|trainer.py:803] 2025-04-26 17:44:45,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7311 [WARNING|trainer.py:803] 2025-04-26 17:44:45,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:46,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7334 7259 [WARNING|trainer.py:803] 2025-04-26 17:44:46,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 7312 [WARNING|trainer.py:803] 2025-04-26 17:44:47,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:47,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7335 7260 [WARNING|trainer.py:803] 2025-04-26 17:44:47,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7313 [WARNING|trainer.py:803] 2025-04-26 17:44:48,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:48,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7336 7261 [WARNING|trainer.py:803] 2025-04-26 17:44:49,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7314 [WARNING|trainer.py:803] 2025-04-26 17:44:49,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:49,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7337 7262 [WARNING|trainer.py:803] 2025-04-26 17:44:50,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7315 [WARNING|trainer.py:803] 2025-04-26 17:44:50,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:44:51,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7338 7263 [WARNING|trainer.py:803] 2025-04-26 17:44:51,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7316 [WARNING|trainer.py:803] 2025-04-26 17:44:52,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:44:52,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7339 7264 [WARNING|trainer.py:803] 2025-04-26 17:44:52,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7317 [WARNING|trainer.py:803] 2025-04-26 17:44:53,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:53,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7340 7265 [WARNING|trainer.py:803] 2025-04-26 17:44:54,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7318 [WARNING|trainer.py:803] 2025-04-26 17:44:54,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:54,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7341 7266 [WARNING|trainer.py:803] 2025-04-26 17:44:55,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7319 [WARNING|trainer.py:803] 2025-04-26 17:44:55,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:56,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7342 7267 [WARNING|trainer.py:803] 2025-04-26 17:44:56,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7320 [WARNING|trainer.py:803] 2025-04-26 17:44:56,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:44:57,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7343 7268 [WARNING|trainer.py:803] 2025-04-26 17:44:57,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7321 [WARNING|trainer.py:803] 2025-04-26 17:44:58,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:58,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7344 7269 [WARNING|trainer.py:803] 2025-04-26 17:44:58,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7322 [WARNING|trainer.py:803] 2025-04-26 17:44:59,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:44:59,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7345 7270 [WARNING|trainer.py:803] 2025-04-26 17:45:00,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7323 [WARNING|trainer.py:803] 2025-04-26 17:45:00,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:01,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7346 7271 [WARNING|trainer.py:803] 2025-04-26 17:45:01,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7324 [WARNING|trainer.py:803] 2025-04-26 17:45:01,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:45:02,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7347 7272 [WARNING|trainer.py:803] 2025-04-26 17:45:02,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7325 [WARNING|trainer.py:803] 2025-04-26 17:45:03,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:03,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7348 7273 [WARNING|trainer.py:803] 2025-04-26 17:45:03,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7326 [WARNING|trainer.py:803] 2025-04-26 17:45:04,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:04,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7349 7274 [WARNING|trainer.py:803] 2025-04-26 17:45:05,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7327 [WARNING|trainer.py:803] 2025-04-26 17:45:05,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:05,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7350 7275 [WARNING|trainer.py:803] 2025-04-26 17:45:06,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7328 [WARNING|trainer.py:803] 2025-04-26 17:45:06,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:07,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7351 [WARNING|trainer.py:803] 2025-04-26 17:45:07,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7276 7329 [WARNING|trainer.py:803] 2025-04-26 17:45:08,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:08,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7352 [WARNING|trainer.py:803] 2025-04-26 17:45:08,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7277 7330 [WARNING|trainer.py:803] 2025-04-26 17:45:09,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:09,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7353 7278 [WARNING|trainer.py:803] 2025-04-26 17:45:10,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7331 [WARNING|trainer.py:803] 2025-04-26 17:45:10,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:10,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7354 7279 [WARNING|trainer.py:803] 2025-04-26 17:45:11,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7332 [WARNING|trainer.py:803] 2025-04-26 17:45:11,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:12,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7355 [WARNING|trainer.py:803] 2025-04-26 17:45:12,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 7280 :Yes 7333 [WARNING|trainer.py:803] 2025-04-26 17:45:13,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:13,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7356 [WARNING|trainer.py:803] 2025-04-26 17:45:13,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7281 7334 [WARNING|trainer.py:803] 2025-04-26 17:45:14,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:14,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7357 [WARNING|trainer.py:803] 2025-04-26 17:45:14,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7282 7335 [WARNING|trainer.py:803] 2025-04-26 17:45:15,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:45:15,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7358 [WARNING|trainer.py:803] 2025-04-26 17:45:16,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7283 7336 [WARNING|trainer.py:803] 2025-04-26 17:45:16,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:17,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7359 [WARNING|trainer.py:803] 2025-04-26 17:45:17,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7284 7337 [WARNING|trainer.py:803] 2025-04-26 17:45:18,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:18,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7360 [WARNING|trainer.py:803] 2025-04-26 17:45:18,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7285 7338 [WARNING|trainer.py:803] 2025-04-26 17:45:19,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:19,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7361 [WARNING|trainer.py:803] 2025-04-26 17:45:19,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7286 7339 [WARNING|trainer.py:803] 2025-04-26 17:45:20,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:20,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7362 [WARNING|trainer.py:803] 2025-04-26 17:45:21,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7287 7340 [WARNING|trainer.py:803] 2025-04-26 17:45:21,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:22,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7363 [WARNING|trainer.py:803] 2025-04-26 17:45:22,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7288 7341 [WARNING|trainer.py:803] 2025-04-26 17:45:23,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7364 [WARNING|trainer.py:803] 2025-04-26 17:45:23,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:45:23,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7289 7342 [WARNING|trainer.py:803] 2025-04-26 17:45:24,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7365 [WARNING|trainer.py:803] 2025-04-26 17:45:24,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:24,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7290 7343 [WARNING|trainer.py:803] 2025-04-26 17:45:25,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7366 [WARNING|trainer.py:803] 2025-04-26 17:45:25,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:26,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7291 7344 [WARNING|trainer.py:803] 2025-04-26 17:45:26,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7367 [WARNING|trainer.py:803] 2025-04-26 17:45:27,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:27,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7292 7345 [WARNING|trainer.py:803] 2025-04-26 17:45:27,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7368 [WARNING|trainer.py:803] 2025-04-26 17:45:28,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:28,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7293 7346 [WARNING|trainer.py:803] 2025-04-26 17:45:29,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7369 [WARNING|trainer.py:803] 2025-04-26 17:45:29,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:29,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7294 7347 [WARNING|trainer.py:803] 2025-04-26 17:45:30,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7370 [WARNING|trainer.py:803] 2025-04-26 17:45:30,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:31,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7295 7348 [WARNING|trainer.py:803] 2025-04-26 17:45:31,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7371 [WARNING|trainer.py:803] 2025-04-26 17:45:32,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:32,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7296 7349 [WARNING|trainer.py:803] 2025-04-26 17:45:33,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:33,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7372 [WARNING|trainer.py:803] 2025-04-26 17:45:33,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7297 7350 [WARNING|trainer.py:803] 2025-04-26 17:45:34,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:34,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7373 [WARNING|trainer.py:803] 2025-04-26 17:45:34,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7298 7351 [WARNING|trainer.py:803] 2025-04-26 17:45:35,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:35,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7374 [WARNING|trainer.py:803] 2025-04-26 17:45:36,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7299 7352 [WARNING|trainer.py:803] 2025-04-26 17:45:36,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:45:37,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7375 [WARNING|trainer.py:803] 2025-04-26 17:45:37,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7300 7353 [WARNING|trainer.py:803] 2025-04-26 17:45:37,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:38,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7376 [WARNING|trainer.py:803] 2025-04-26 17:45:38,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7301 7354 [WARNING|trainer.py:803] 2025-04-26 17:45:39,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:39,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7377 [WARNING|trainer.py:803] 2025-04-26 17:45:39,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7302 7355 [WARNING|trainer.py:803] 2025-04-26 17:45:40,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:40,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7378 [WARNING|trainer.py:803] 2025-04-26 17:45:41,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7303 7356 [WARNING|trainer.py:803] 2025-04-26 17:45:41,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7379 [WARNING|trainer.py:803] 2025-04-26 17:45:42,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 17:45:42,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7304 7357 [WARNING|trainer.py:803] 2025-04-26 17:45:42,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7380 [WARNING|trainer.py:803] 2025-04-26 17:45:43,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:43,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7305 7358 [WARNING|trainer.py:803] 2025-04-26 17:45:44,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7381 [WARNING|trainer.py:803] 2025-04-26 17:45:44,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:44,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7306 7359 [WARNING|trainer.py:803] 2025-04-26 17:45:45,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7382 [WARNING|trainer.py:803] 2025-04-26 17:45:45,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:45:46,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7307 7360 [WARNING|trainer.py:803] 2025-04-26 17:45:46,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7383 [WARNING|trainer.py:803] 2025-04-26 17:45:47,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:45:47,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7308 7361 [WARNING|trainer.py:803] 2025-04-26 17:45:47,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7384 [WARNING|trainer.py:803] 2025-04-26 17:45:48,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:48,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7362 7309 [WARNING|trainer.py:803] 2025-04-26 17:45:49,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7385 [WARNING|trainer.py:803] 2025-04-26 17:45:49,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:49,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7363 7310 [WARNING|trainer.py:803] 2025-04-26 17:45:50,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7386 [WARNING|trainer.py:803] 2025-04-26 17:45:50,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:51,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7364 7311 [WARNING|trainer.py:803] 2025-04-26 17:45:51,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7387 [WARNING|trainer.py:803] 2025-04-26 17:45:52,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:52,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7365 7312 [WARNING|trainer.py:803] 2025-04-26 17:45:52,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7388 [WARNING|trainer.py:803] 2025-04-26 17:45:53,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:53,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7366 7313 [WARNING|trainer.py:803] 2025-04-26 17:45:54,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7389 [WARNING|trainer.py:803] 2025-04-26 17:45:54,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:54,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7367 7314 [WARNING|trainer.py:803] 2025-04-26 17:45:55,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7390 [WARNING|trainer.py:803] 2025-04-26 17:45:55,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:45:56,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7368 7315 [WARNING|trainer.py:803] 2025-04-26 17:45:56,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7391 [WARNING|trainer.py:803] 2025-04-26 17:45:57,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:45:57,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7369 7316 [WARNING|trainer.py:803] 2025-04-26 17:45:57,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7392 [WARNING|trainer.py:803] 2025-04-26 17:45:58,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:45:58,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7370 [WARNING|trainer.py:803] 2025-04-26 17:45:59,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7317 7393 [WARNING|trainer.py:803] 2025-04-26 17:45:59,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:45:59,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7371 [WARNING|trainer.py:803] 2025-04-26 17:46:00,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7318 7394 [WARNING|trainer.py:803] 2025-04-26 17:46:00,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:01,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7372 [WARNING|trainer.py:803] 2025-04-26 17:46:01,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7319 7395 [WARNING|trainer.py:803] 2025-04-26 17:46:02,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7373 [WARNING|trainer.py:803] 2025-04-26 17:46:02,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:02,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7320 7396 [WARNING|trainer.py:803] 2025-04-26 17:46:03,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7374 [WARNING|trainer.py:803] 2025-04-26 17:46:03,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:04,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7321 7397 [WARNING|trainer.py:803] 2025-04-26 17:46:04,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7375 [WARNING|trainer.py:803] 2025-04-26 17:46:05,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:05,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7322 7398 [WARNING|trainer.py:803] 2025-04-26 17:46:05,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7376 [WARNING|trainer.py:803] 2025-04-26 17:46:06,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:06,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7323 7399 [WARNING|trainer.py:803] 2025-04-26 17:46:07,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7377 [WARNING|trainer.py:803] 2025-04-26 17:46:07,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:07,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7324 7400 [WARNING|trainer.py:803] 2025-04-26 17:46:08,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7378 [WARNING|trainer.py:803] 2025-04-26 17:46:08,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:09,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7325 7401 [WARNING|trainer.py:803] 2025-04-26 17:46:09,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7379 [WARNING|trainer.py:803] 2025-04-26 17:46:09,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:10,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7326 7402 [WARNING|trainer.py:803] 2025-04-26 17:46:10,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7380 [WARNING|trainer.py:803] 2025-04-26 17:46:11,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:11,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7327 [WARNING|trainer.py:803] 2025-04-26 17:46:11,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7403 7381 [WARNING|trainer.py:803] 2025-04-26 17:46:12,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:12,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7328 [WARNING|trainer.py:803] 2025-04-26 17:46:13,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7404 7382 [WARNING|trainer.py:803] 2025-04-26 17:46:13,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:14,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7329 [WARNING|trainer.py:803] 2025-04-26 17:46:14,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7405 7383 [WARNING|trainer.py:803] 2025-04-26 17:46:14,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:15,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7330 [WARNING|trainer.py:803] 2025-04-26 17:46:15,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7406 7384 [WARNING|trainer.py:803] 2025-04-26 17:46:16,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:46:16,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7331 [WARNING|trainer.py:803] 2025-04-26 17:46:16,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7407 7385 [WARNING|trainer.py:803] 2025-04-26 17:46:17,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:17,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7332 [WARNING|trainer.py:803] 2025-04-26 17:46:18,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7408 7386 [WARNING|trainer.py:803] 2025-04-26 17:46:18,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:19,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7333 [WARNING|trainer.py:803] 2025-04-26 17:46:19,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7409 7387 [WARNING|trainer.py:803] 2025-04-26 17:46:19,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:20,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7334 [WARNING|trainer.py:803] 2025-04-26 17:46:20,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7410 7388 [WARNING|trainer.py:803] 2025-04-26 17:46:21,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:21,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7335 [WARNING|trainer.py:803] 2025-04-26 17:46:21,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7411 7389 [WARNING|trainer.py:803] 2025-04-26 17:46:22,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:22,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7336 [WARNING|trainer.py:803] 2025-04-26 17:46:23,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7412 7390 [WARNING|trainer.py:803] 2025-04-26 17:46:23,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:23,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7337 [WARNING|trainer.py:803] 2025-04-26 17:46:24,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7413 7391 [WARNING|trainer.py:803] 2025-04-26 17:46:24,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:25,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7338 [WARNING|trainer.py:803] 2025-04-26 17:46:25,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7414 7392 [WARNING|trainer.py:803] 2025-04-26 17:46:26,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:26,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7339 [WARNING|trainer.py:803] 2025-04-26 17:46:26,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7415 7393 [WARNING|trainer.py:803] 2025-04-26 17:46:27,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:27,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7340 [WARNING|trainer.py:803] 2025-04-26 17:46:27,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7416 7394 [WARNING|trainer.py:803] 2025-04-26 17:46:28,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:28,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7341 [WARNING|trainer.py:803] 2025-04-26 17:46:29,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7417 7395 [WARNING|trainer.py:803] 2025-04-26 17:46:29,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:30,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7342 [WARNING|trainer.py:803] 2025-04-26 17:46:30,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7418 7396 [WARNING|trainer.py:803] 2025-04-26 17:46:31,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:31,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo 7343 [WARNING|trainer.py:803] 2025-04-26 17:46:31,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7419 7397 [WARNING|trainer.py:803] 2025-04-26 17:46:32,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:32,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7344 [WARNING|trainer.py:803] 2025-04-26 17:46:32,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7420 7398 [WARNING|trainer.py:803] 2025-04-26 17:46:33,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:33,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7345 [WARNING|trainer.py:803] 2025-04-26 17:46:34,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7421 7399 [WARNING|trainer.py:803] 2025-04-26 17:46:34,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:35,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7346 [WARNING|trainer.py:803] 2025-04-26 17:46:35,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7422 7400 [WARNING|trainer.py:803] 2025-04-26 17:46:36,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:46:36,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7347 [WARNING|trainer.py:803] 2025-04-26 17:46:36,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7423 7401 [WARNING|trainer.py:803] 2025-04-26 17:46:37,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:37,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7348 [WARNING|trainer.py:803] 2025-04-26 17:46:37,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7424 7402 [WARNING|trainer.py:803] 2025-04-26 17:46:38,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:38,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7349 [WARNING|trainer.py:803] 2025-04-26 17:46:39,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7425 7403 [WARNING|trainer.py:803] 2025-04-26 17:46:39,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:46:40,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7350 [WARNING|trainer.py:803] 2025-04-26 17:46:40,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7426 7404 [WARNING|trainer.py:803] 2025-04-26 17:46:41,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:41,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7351 [WARNING|trainer.py:803] 2025-04-26 17:46:41,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7427 7405 [WARNING|trainer.py:803] 2025-04-26 17:46:42,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:42,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7352 [WARNING|trainer.py:803] 2025-04-26 17:46:42,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7428 7406 [WARNING|trainer.py:803] 2025-04-26 17:46:43,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:43,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7353 [WARNING|trainer.py:803] 2025-04-26 17:46:44,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7429 7407 [WARNING|trainer.py:803] 2025-04-26 17:46:44,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:45,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7354 [WARNING|trainer.py:803] 2025-04-26 17:46:45,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7430 7408 [WARNING|trainer.py:803] 2025-04-26 17:46:46,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:46,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7355 [WARNING|trainer.py:803] 2025-04-26 17:46:46,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7431 7409 [WARNING|trainer.py:803] 2025-04-26 17:46:47,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:47,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7356 [WARNING|trainer.py:803] 2025-04-26 17:46:47,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7432 7410 [WARNING|trainer.py:803] 2025-04-26 17:46:48,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7357 [WARNING|trainer.py:803] 2025-04-26 17:46:48,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:49,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7433 7411 [WARNING|trainer.py:803] 2025-04-26 17:46:49,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7358 [WARNING|trainer.py:803] 2025-04-26 17:46:50,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:50,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7434 7412 [WARNING|trainer.py:803] 2025-04-26 17:46:51,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7359 [WARNING|trainer.py:803] 2025-04-26 17:46:51,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:51,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7435 7413 [WARNING|trainer.py:803] 2025-04-26 17:46:52,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7360 [WARNING|trainer.py:803] 2025-04-26 17:46:52,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:52,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7436 7414 [WARNING|trainer.py:803] 2025-04-26 17:46:53,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7361 [WARNING|trainer.py:803] 2025-04-26 17:46:53,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:53,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7437 7415 [WARNING|trainer.py:803] 2025-04-26 17:46:54,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7362 [WARNING|trainer.py:803] 2025-04-26 17:46:55,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:46:55,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7438 7416 [WARNING|trainer.py:803] 2025-04-26 17:46:55,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7363 [WARNING|trainer.py:803] 2025-04-26 17:46:56,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:46:56,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7439 7417 [WARNING|trainer.py:803] 2025-04-26 17:46:57,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7364 [WARNING|trainer.py:803] 2025-04-26 17:46:57,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:46:57,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7440 7418 [WARNING|trainer.py:803] 2025-04-26 17:46:58,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7365 [WARNING|trainer.py:803] 2025-04-26 17:46:58,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:46:58,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo 7441 7419 [WARNING|trainer.py:803] 2025-04-26 17:46:59,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7366 [WARNING|trainer.py:803] 2025-04-26 17:47:00,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:00,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7442 7420 [WARNING|trainer.py:803] 2025-04-26 17:47:00,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7367 [WARNING|trainer.py:803] 2025-04-26 17:47:01,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:47:01,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7443 7421 [WARNING|trainer.py:803] 2025-04-26 17:47:02,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7368 [WARNING|trainer.py:803] 2025-04-26 17:47:02,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:02,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7444 7422 [WARNING|trainer.py:803] 2025-04-26 17:47:03,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7369 [WARNING|trainer.py:803] 2025-04-26 17:47:03,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:03,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7423 7445 [WARNING|trainer.py:803] 2025-04-26 17:47:04,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7370 [WARNING|trainer.py:803] 2025-04-26 17:47:05,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:05,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7424 7446 [WARNING|trainer.py:803] 2025-04-26 17:47:05,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7371 [WARNING|trainer.py:803] 2025-04-26 17:47:06,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:06,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7425 7447 [WARNING|trainer.py:803] 2025-04-26 17:47:07,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7372 [WARNING|trainer.py:803] 2025-04-26 17:47:07,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:07,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7426 7448 [WARNING|trainer.py:803] 2025-04-26 17:47:08,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7373 [WARNING|trainer.py:803] 2025-04-26 17:47:08,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:08,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7427 7449 [WARNING|trainer.py:803] 2025-04-26 17:47:09,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7374 [WARNING|trainer.py:803] 2025-04-26 17:47:10,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:10,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7428 7450 [WARNING|trainer.py:803] 2025-04-26 17:47:10,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7375 [WARNING|trainer.py:803] 2025-04-26 17:47:11,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:11,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7429 7451 [WARNING|trainer.py:803] 2025-04-26 17:47:11,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7376 [WARNING|trainer.py:803] 2025-04-26 17:47:12,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:12,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7430 7452 [WARNING|trainer.py:803] 2025-04-26 17:47:13,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7377 [WARNING|trainer.py:803] 2025-04-26 17:47:13,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:13,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7431 7453 [WARNING|trainer.py:803] 2025-04-26 17:47:14,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7378 [WARNING|trainer.py:803] 2025-04-26 17:47:15,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:15,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7432 7454 [WARNING|trainer.py:803] 2025-04-26 17:47:15,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7379 [WARNING|trainer.py:803] 2025-04-26 17:47:16,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:16,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7433 7455 [WARNING|trainer.py:803] 2025-04-26 17:47:16,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7380 [WARNING|trainer.py:803] 2025-04-26 17:47:17,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:17,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7434 7456 [WARNING|trainer.py:803] 2025-04-26 17:47:18,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7381 [WARNING|trainer.py:803] 2025-04-26 17:47:18,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:18,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7435 7457 [WARNING|trainer.py:803] 2025-04-26 17:47:19,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7382 [WARNING|trainer.py:803] 2025-04-26 17:47:19,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:20,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7436 7458 [WARNING|trainer.py:803] 2025-04-26 17:47:20,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7383 [WARNING|trainer.py:803] 2025-04-26 17:47:21,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:21,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7437 7459 [WARNING|trainer.py:803] 2025-04-26 17:47:21,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7384 [WARNING|trainer.py:803] 2025-04-26 17:47:22,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:22,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7438 [WARNING|trainer.py:803] 2025-04-26 17:47:23,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7460 7385 [WARNING|trainer.py:803] 2025-04-26 17:47:23,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:23,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7439 [WARNING|trainer.py:803] 2025-04-26 17:47:24,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7461 7386 [WARNING|trainer.py:803] 2025-04-26 17:47:24,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:25,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7440 [WARNING|trainer.py:803] 2025-04-26 17:47:25,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7462 7387 [WARNING|trainer.py:803] 2025-04-26 17:47:26,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:26,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7441 [WARNING|trainer.py:803] 2025-04-26 17:47:26,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7463 7388 [WARNING|trainer.py:803] 2025-04-26 17:47:27,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:27,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7442 [WARNING|trainer.py:803] 2025-04-26 17:47:27,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7464 7389 [WARNING|trainer.py:803] 2025-04-26 17:47:28,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:28,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7443 [WARNING|trainer.py:803] 2025-04-26 17:47:29,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7465 7390 [WARNING|trainer.py:803] 2025-04-26 17:47:29,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:30,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7444 [WARNING|trainer.py:803] 2025-04-26 17:47:30,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7466 7391 [WARNING|trainer.py:803] 2025-04-26 17:47:30,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:31,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7445 [WARNING|trainer.py:803] 2025-04-26 17:47:31,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7467 7392 [WARNING|trainer.py:803] 2025-04-26 17:47:32,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:32,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7446 [WARNING|trainer.py:803] 2025-04-26 17:47:32,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7468 7393 [WARNING|trainer.py:803] 2025-04-26 17:47:33,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:33,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7447 [WARNING|trainer.py:803] 2025-04-26 17:47:34,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7469 7394 [WARNING|trainer.py:803] 2025-04-26 17:47:34,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:35,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7448 [WARNING|trainer.py:803] 2025-04-26 17:47:35,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7470 7395 [WARNING|trainer.py:803] 2025-04-26 17:47:35,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7449 [WARNING|trainer.py:803] 2025-04-26 17:47:36,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7471 [WARNING|trainer.py:803] 2025-04-26 17:47:36,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7396 [WARNING|trainer.py:803] 2025-04-26 17:47:37,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7450 [WARNING|trainer.py:803] 2025-04-26 17:47:37,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:37,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7472 [WARNING|trainer.py:803] 2025-04-26 17:47:38,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 7397:Yes 7451 [WARNING|trainer.py:803] 2025-04-26 17:47:38,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:39,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7473 7398 [WARNING|trainer.py:803] 2025-04-26 17:47:39,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7452 [WARNING|trainer.py:803] 2025-04-26 17:47:39,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:40,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7474 [WARNING|trainer.py:803] 2025-04-26 17:47:40,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7399 7453 [WARNING|trainer.py:803] 2025-04-26 17:47:41,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:41,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7475 [WARNING|trainer.py:803] 2025-04-26 17:47:41,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7400 7454 [WARNING|trainer.py:803] 2025-04-26 17:47:42,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:42,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7476 [WARNING|trainer.py:803] 2025-04-26 17:47:43,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7401 7455 [WARNING|trainer.py:803] 2025-04-26 17:47:43,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:44,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7477 [WARNING|trainer.py:803] 2025-04-26 17:47:44,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7402 7456 [WARNING|trainer.py:803] 2025-04-26 17:47:44,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:45,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7478 [WARNING|trainer.py:803] 2025-04-26 17:47:45,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7403 7457 [WARNING|trainer.py:803] 2025-04-26 17:47:46,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:46,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7479 [WARNING|trainer.py:803] 2025-04-26 17:47:46,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7404 7458 [WARNING|trainer.py:803] 2025-04-26 17:47:47,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:47,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7480 [WARNING|trainer.py:803] 2025-04-26 17:47:48,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7405 7459 [WARNING|trainer.py:803] 2025-04-26 17:47:48,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:47:48,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7481 [WARNING|trainer.py:803] 2025-04-26 17:47:49,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7406 7460 [WARNING|trainer.py:803] 2025-04-26 17:47:49,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:50,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7482 [WARNING|trainer.py:803] 2025-04-26 17:47:50,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7407 7461 [WARNING|trainer.py:803] 2025-04-26 17:47:51,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:51,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7483 [WARNING|trainer.py:803] 2025-04-26 17:47:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7408 7462 [WARNING|trainer.py:803] 2025-04-26 17:47:52,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:47:52,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7484 [WARNING|trainer.py:803] 2025-04-26 17:47:53,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7409 7463 [WARNING|trainer.py:803] 2025-04-26 17:47:53,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:53,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7485 [WARNING|trainer.py:803] 2025-04-26 17:47:54,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7410 7464 [WARNING|trainer.py:803] 2025-04-26 17:47:54,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:47:55,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7486 [WARNING|trainer.py:803] 2025-04-26 17:47:55,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7411 7465 [WARNING|trainer.py:803] 2025-04-26 17:47:56,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:56,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7487 [WARNING|trainer.py:803] 2025-04-26 17:47:56,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7412 7466 [WARNING|trainer.py:803] 2025-04-26 17:47:57,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:57,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7488 [WARNING|trainer.py:803] 2025-04-26 17:47:57,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7413 7467 [WARNING|trainer.py:803] 2025-04-26 17:47:58,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:47:58,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7489 [WARNING|trainer.py:803] 2025-04-26 17:47:59,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7414 7468 [WARNING|trainer.py:803] 2025-04-26 17:47:59,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:00,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7490 [WARNING|trainer.py:803] 2025-04-26 17:48:00,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7415 7469 [WARNING|trainer.py:803] 2025-04-26 17:48:01,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:01,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7491 [WARNING|trainer.py:803] 2025-04-26 17:48:01,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7416 7470 [WARNING|trainer.py:803] 2025-04-26 17:48:02,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:02,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7492 [WARNING|trainer.py:803] 2025-04-26 17:48:02,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7417 7471 [WARNING|trainer.py:803] 2025-04-26 17:48:03,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:03,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7493 [WARNING|trainer.py:803] 2025-04-26 17:48:04,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7418 7472 [WARNING|trainer.py:803] 2025-04-26 17:48:04,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:05,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo 7494 [WARNING|trainer.py:803] 2025-04-26 17:48:05,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7419 7473 [WARNING|trainer.py:803] 2025-04-26 17:48:06,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:06,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7495 [WARNING|trainer.py:803] 2025-04-26 17:48:06,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7420 7474 [WARNING|trainer.py:803] 2025-04-26 17:48:07,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:07,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7496 [WARNING|trainer.py:803] 2025-04-26 17:48:07,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7421 7475 [WARNING|trainer.py:803] 2025-04-26 17:48:08,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:08,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7497 [WARNING|trainer.py:803] 2025-04-26 17:48:09,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7422 7476 [WARNING|trainer.py:803] 2025-04-26 17:48:09,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:10,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7498 [WARNING|trainer.py:803] 2025-04-26 17:48:10,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7423 7477 [WARNING|trainer.py:803] 2025-04-26 17:48:10,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:11,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7499 [WARNING|trainer.py:803] 2025-04-26 17:48:11,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7424 7478 [WARNING|trainer.py:803] 2025-04-26 17:48:12,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:12,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7500 [WARNING|trainer.py:803] 2025-04-26 17:48:12,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7425 7479 [WARNING|trainer.py:803] 2025-04-26 17:48:13,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:13,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:13,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7501 7426 7480 [WARNING|trainer.py:803] 2025-04-26 17:48:14,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:15,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:15,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7502 7427 7481 [WARNING|trainer.py:803] 2025-04-26 17:48:16,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:16,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:48:16,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7428 7482 7503 [WARNING|trainer.py:803] 2025-04-26 17:48:17,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:17,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:17,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7429 7483 7504 [WARNING|trainer.py:803] 2025-04-26 17:48:18,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:18,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:19,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7430 7484 7505 [WARNING|trainer.py:803] 2025-04-26 17:48:20,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:20,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:20,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7431 7485 7506 [WARNING|trainer.py:803] 2025-04-26 17:48:21,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:21,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7486 7432 [WARNING|trainer.py:803] 2025-04-26 17:48:21,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7507 [WARNING|trainer.py:803] 2025-04-26 17:48:22,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:22,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7487 7433 [WARNING|trainer.py:803] 2025-04-26 17:48:23,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7508 [WARNING|trainer.py:803] 2025-04-26 17:48:23,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:24,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7488 7434 [WARNING|trainer.py:803] 2025-04-26 17:48:24,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:25,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:25,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7509 7489 7435 [WARNING|trainer.py:803] 2025-04-26 17:48:26,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:26,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:26,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7510 7490 7436 [WARNING|trainer.py:803] 2025-04-26 17:48:27,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:27,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:27,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7491 7511 7437 [WARNING|trainer.py:803] 2025-04-26 17:48:28,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:29,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:29,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7492 7438 7512 [WARNING|trainer.py:803] 2025-04-26 17:48:30,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:30,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:30,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7493 7439 7513 [WARNING|trainer.py:803] 2025-04-26 17:48:31,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:31,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7494 [WARNING|trainer.py:803] 2025-04-26 17:48:31,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7440 7514 [WARNING|trainer.py:803] 2025-04-26 17:48:32,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:32,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7495 [WARNING|trainer.py:803] 2025-04-26 17:48:33,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7441 7515 [WARNING|trainer.py:803] 2025-04-26 17:48:33,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:34,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7496 7442 [WARNING|trainer.py:803] 2025-04-26 17:48:34,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:35,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7516 [WARNING|trainer.py:803] 2025-04-26 17:48:35,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7497 7443 [WARNING|trainer.py:803] 2025-04-26 17:48:36,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:36,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7517 [WARNING|trainer.py:803] 2025-04-26 17:48:36,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7498 7444 [WARNING|trainer.py:803] 2025-04-26 17:48:37,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:37,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:37,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7518 7499 7445 [WARNING|trainer.py:803] 2025-04-26 17:48:38,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:38,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:39,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7500 7519 7446 [WARNING|trainer.py:803] 2025-04-26 17:48:40,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:40,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:48:40,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7501 7447 7520 [WARNING|trainer.py:803] 2025-04-26 17:48:41,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:41,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:41,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7448 7502 7521 [WARNING|trainer.py:803] 2025-04-26 17:48:42,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:42,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:43,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7449 7503 7522 [WARNING|trainer.py:803] 2025-04-26 17:48:44,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:44,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:44,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7450 7504 7523 [WARNING|trainer.py:803] 2025-04-26 17:48:45,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7451 [WARNING|trainer.py:803] 2025-04-26 17:48:45,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:46,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7505 [WARNING|trainer.py:803] 2025-04-26 17:48:46,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7524 7452 [WARNING|trainer.py:803] 2025-04-26 17:48:47,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:47,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7506 [WARNING|trainer.py:803] 2025-04-26 17:48:47,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7525 7453 [WARNING|trainer.py:803] 2025-04-26 17:48:48,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:48,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:49,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7507 7454 7526 [WARNING|trainer.py:803] 2025-04-26 17:48:50,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:50,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:50,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7508 7455 7527 [WARNING|trainer.py:803] 2025-04-26 17:48:51,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:48:51,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:51,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7456 7509 7528 [WARNING|trainer.py:803] 2025-04-26 17:48:52,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:48:52,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:53,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7457 7510 7529 [WARNING|trainer.py:803] 2025-04-26 17:48:54,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:54,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7458 [WARNING|trainer.py:803] 2025-04-26 17:48:54,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7511 7530 [WARNING|trainer.py:803] 2025-04-26 17:48:55,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7459 [WARNING|trainer.py:803] 2025-04-26 17:48:55,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:56,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7512 [WARNING|trainer.py:803] 2025-04-26 17:48:56,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7531 7460 [WARNING|trainer.py:803] 2025-04-26 17:48:57,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:48:57,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:57,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7513 7461 7532 [WARNING|trainer.py:803] 2025-04-26 17:48:58,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:48:59,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:48:59,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7514 7462 7533 [WARNING|trainer.py:803] 2025-04-26 17:49:00,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:00,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:00,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7515 7463 7534 [WARNING|trainer.py:803] 2025-04-26 17:49:01,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:01,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7464 [WARNING|trainer.py:803] 2025-04-26 17:49:01,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7516 7535 [WARNING|trainer.py:803] 2025-04-26 17:49:02,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:49:02,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7465 [WARNING|trainer.py:803] 2025-04-26 17:49:03,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7517 7536 [WARNING|trainer.py:803] 2025-04-26 17:49:04,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:04,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7466 [WARNING|trainer.py:803] 2025-04-26 17:49:04,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7518 [WARNING|trainer.py:803] 2025-04-26 17:49:05,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7537 7467 [WARNING|trainer.py:803] 2025-04-26 17:49:05,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:06,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7519 [WARNING|trainer.py:803] 2025-04-26 17:49:06,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7538 7468 [WARNING|trainer.py:803] 2025-04-26 17:49:07,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:49:07,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:07,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7520 7469 7539 [WARNING|trainer.py:803] 2025-04-26 17:49:08,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:09,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:09,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7521 7470 7540 [WARNING|trainer.py:803] 2025-04-26 17:49:10,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:10,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:10,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7471 7522 7541 [WARNING|trainer.py:803] 2025-04-26 17:49:11,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:11,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:11,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7472 7523 7542 [WARNING|trainer.py:803] 2025-04-26 17:49:12,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:12,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7473 [WARNING|trainer.py:803] 2025-04-26 17:49:13,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7524 7543 [WARNING|trainer.py:803] 2025-04-26 17:49:14,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7474 [WARNING|trainer.py:803] 2025-04-26 17:49:14,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:14,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7525 [WARNING|trainer.py:803] 2025-04-26 17:49:15,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7544 7475 [WARNING|trainer.py:803] 2025-04-26 17:49:15,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:16,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:16,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7526 7545 7476 [WARNING|trainer.py:803] 2025-04-26 17:49:17,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:17,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:17,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7527 7546 7477 [WARNING|trainer.py:803] 2025-04-26 17:49:18,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:18,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:18,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7478 7528 7547 [WARNING|trainer.py:803] 2025-04-26 17:49:20,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:20,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:20,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7479 7529 7548 [WARNING|trainer.py:803] 2025-04-26 17:49:21,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:21,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:21,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7480 7549 7530 [WARNING|trainer.py:803] 2025-04-26 17:49:22,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7481 [WARNING|trainer.py:803] 2025-04-26 17:49:23,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:23,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7550 7531 [WARNING|trainer.py:803] 2025-04-26 17:49:23,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7482 [WARNING|trainer.py:803] 2025-04-26 17:49:24,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:24,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:25,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7551 7532 7483 [WARNING|trainer.py:803] 2025-04-26 17:49:25,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:26,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:49:26,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7552 7533 7484 [WARNING|trainer.py:803] 2025-04-26 17:49:27,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:27,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:27,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7553 7485 7534 [WARNING|trainer.py:803] 2025-04-26 17:49:28,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:28,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:29,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7486 7554 7535 [WARNING|trainer.py:803] 2025-04-26 17:49:30,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:30,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:30,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7487 7555 7536 [WARNING|trainer.py:803] 2025-04-26 17:49:31,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:31,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:31,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7488 7556 7537 [WARNING|trainer.py:803] 2025-04-26 17:49:32,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7489 [WARNING|trainer.py:803] 2025-04-26 17:49:33,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:33,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7557 7538 [WARNING|trainer.py:803] 2025-04-26 17:49:33,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7490 [WARNING|trainer.py:803] 2025-04-26 17:49:34,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:34,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7558 7539 [WARNING|trainer.py:803] 2025-04-26 17:49:35,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7491 [WARNING|trainer.py:803] 2025-04-26 17:49:35,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:36,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7559 [WARNING|trainer.py:803] 2025-04-26 17:49:36,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7540 7492 [WARNING|trainer.py:803] 2025-04-26 17:49:37,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:37,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7560 [WARNING|trainer.py:803] 2025-04-26 17:49:37,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7541 7493 [WARNING|trainer.py:803] 2025-04-26 17:49:38,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:38,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:38,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7561 7494 7542 [WARNING|trainer.py:803] 2025-04-26 17:49:39,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:40,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7562 [WARNING|trainer.py:803] 2025-04-26 17:49:40,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7495 7543 [WARNING|trainer.py:803] 2025-04-26 17:49:41,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:41,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7563 [WARNING|trainer.py:803] 2025-04-26 17:49:41,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7496 7544 [WARNING|trainer.py:803] 2025-04-26 17:49:42,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:42,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7564 7497 [WARNING|trainer.py:803] 2025-04-26 17:49:43,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7545 [WARNING|trainer.py:803] 2025-04-26 17:49:43,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:43,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7498 7565 [WARNING|trainer.py:803] 2025-04-26 17:49:44,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:45,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7546 [WARNING|trainer.py:803] 2025-04-26 17:49:45,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7499 7566 [WARNING|trainer.py:803] 2025-04-26 17:49:46,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:46,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:46,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7547 7500 7567 [WARNING|trainer.py:803] 2025-04-26 17:49:47,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:47,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:48,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7548 7501 7568 [WARNING|trainer.py:803] 2025-04-26 17:49:48,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:49:49,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:49,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7549 7502 7569 [WARNING|trainer.py:803] 2025-04-26 17:49:50,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:50,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:50,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7550 7503 7570 [WARNING|trainer.py:803] 2025-04-26 17:49:51,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:52,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:52,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7551 7504 7571 [WARNING|trainer.py:803] 2025-04-26 17:49:53,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:49:53,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:49:53,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7552 7505 7572 [WARNING|trainer.py:803] 2025-04-26 17:49:54,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:54,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:55,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7553 7506 7573 [WARNING|trainer.py:803] 2025-04-26 17:49:56,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:56,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:56,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7554 7507 7574 [WARNING|trainer.py:803] 2025-04-26 17:49:57,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:57,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:57,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7555 7508 7575 [WARNING|trainer.py:803] 2025-04-26 17:49:58,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:49:58,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:49:59,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7556 7509 7576 [WARNING|trainer.py:803] 2025-04-26 17:50:00,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:00,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:00,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7557 7510 7577 [WARNING|trainer.py:803] 2025-04-26 17:50:01,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:01,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:50:02,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7558 7511 7578 [WARNING|trainer.py:803] 2025-04-26 17:50:03,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:03,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:03,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7559 7512 7579 [WARNING|trainer.py:803] 2025-04-26 17:50:04,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:04,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:04,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7560 7513 7580 [WARNING|trainer.py:803] 2025-04-26 17:50:05,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:06,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:06,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7561 7514 7581 [WARNING|trainer.py:803] 2025-04-26 17:50:07,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:07,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:07,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7562 7515 7582 [WARNING|trainer.py:803] 2025-04-26 17:50:08,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:08,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:09,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7563 7516 7583 [WARNING|trainer.py:803] 2025-04-26 17:50:10,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:10,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:10,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7564 7517 7584 [WARNING|trainer.py:803] 2025-04-26 17:50:11,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:11,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:12,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7565 7518 7585 [WARNING|trainer.py:803] 2025-04-26 17:50:12,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:13,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:13,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7566 7519 7586 [WARNING|trainer.py:803] 2025-04-26 17:50:14,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:14,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7567 [WARNING|trainer.py:803] 2025-04-26 17:50:14,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7520 7587 [WARNING|trainer.py:803] 2025-04-26 17:50:15,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:15,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7568 [WARNING|trainer.py:803] 2025-04-26 17:50:16,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7521 7588 [WARNING|trainer.py:803] 2025-04-26 17:50:17,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:17,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:17,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7569 7522 7589 [WARNING|trainer.py:803] 2025-04-26 17:50:18,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:18,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7570 [WARNING|trainer.py:803] 2025-04-26 17:50:19,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7523 7590 [WARNING|trainer.py:803] 2025-04-26 17:50:19,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:20,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7571 [WARNING|trainer.py:803] 2025-04-26 17:50:20,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7524 7591 [WARNING|trainer.py:803] 2025-04-26 17:50:21,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:50:21,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7572 [WARNING|trainer.py:803] 2025-04-26 17:50:21,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7525 7592 [WARNING|trainer.py:803] 2025-04-26 17:50:22,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:23,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7573 [WARNING|trainer.py:803] 2025-04-26 17:50:23,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7526 7593 [WARNING|trainer.py:803] 2025-04-26 17:50:24,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:50:24,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7574 [WARNING|trainer.py:803] 2025-04-26 17:50:24,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7527 7594 [WARNING|trainer.py:803] 2025-04-26 17:50:25,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:25,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:26,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7575 7528 7595 [WARNING|trainer.py:803] 2025-04-26 17:50:26,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:27,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7576 [WARNING|trainer.py:803] 2025-04-26 17:50:27,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7529 7596 [WARNING|trainer.py:803] 2025-04-26 17:50:28,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:28,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:28,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7577 7530 7597 [WARNING|trainer.py:803] 2025-04-26 17:50:29,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:30,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7578 [WARNING|trainer.py:803] 2025-04-26 17:50:30,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7531 7598 [WARNING|trainer.py:803] 2025-04-26 17:50:31,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:31,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:31,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7579 7532 7599 [WARNING|trainer.py:803] 2025-04-26 17:50:32,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:33,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7580 [WARNING|trainer.py:803] 2025-04-26 17:50:33,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7533 7600 [WARNING|trainer.py:803] 2025-04-26 17:50:34,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:50:34,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7581 [WARNING|trainer.py:803] 2025-04-26 17:50:34,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7534 7601 [WARNING|trainer.py:803] 2025-04-26 17:50:35,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7582 [WARNING|trainer.py:803] 2025-04-26 17:50:35,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:36,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7535 7602 [WARNING|trainer.py:803] 2025-04-26 17:50:36,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:50:37,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7583 [WARNING|trainer.py:803] 2025-04-26 17:50:37,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7536 7603 [WARNING|trainer.py:803] 2025-04-26 17:50:38,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:50:38,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7584 [WARNING|trainer.py:803] 2025-04-26 17:50:39,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7537 [WARNING|trainer.py:803] 2025-04-26 17:50:39,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7604 [WARNING|trainer.py:803] 2025-04-26 17:50:40,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7585 [WARNING|trainer.py:803] 2025-04-26 17:50:40,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7538 [WARNING|trainer.py:803] 2025-04-26 17:50:41,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7605 [WARNING|trainer.py:803] 2025-04-26 17:50:41,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7586 [WARNING|trainer.py:803] 2025-04-26 17:50:41,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7539 [WARNING|trainer.py:803] 2025-04-26 17:50:42,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7606 [WARNING|trainer.py:803] 2025-04-26 17:50:42,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7587 [WARNING|trainer.py:803] 2025-04-26 17:50:43,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7540 7607 [WARNING|trainer.py:803] 2025-04-26 17:50:44,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:44,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7588 [WARNING|trainer.py:803] 2025-04-26 17:50:44,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7541 [WARNING|trainer.py:803] 2025-04-26 17:50:45,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7608 [WARNING|trainer.py:803] 2025-04-26 17:50:45,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7589 [WARNING|trainer.py:803] 2025-04-26 17:50:46,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7542 [WARNING|trainer.py:803] 2025-04-26 17:50:46,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7609 [WARNING|trainer.py:803] 2025-04-26 17:50:47,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7590 [WARNING|trainer.py:803] 2025-04-26 17:50:47,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7543 [WARNING|trainer.py:803] 2025-04-26 17:50:48,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7610 [WARNING|trainer.py:803] 2025-04-26 17:50:48,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7591 [WARNING|trainer.py:803] 2025-04-26 17:50:49,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7544 [WARNING|trainer.py:803] 2025-04-26 17:50:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7611 [WARNING|trainer.py:803] 2025-04-26 17:50:50,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7592 [WARNING|trainer.py:803] 2025-04-26 17:50:50,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7545 [WARNING|trainer.py:803] 2025-04-26 17:50:51,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7612 [WARNING|trainer.py:803] 2025-04-26 17:50:51,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7593 [WARNING|trainer.py:803] 2025-04-26 17:50:51,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7546 [WARNING|trainer.py:803] 2025-04-26 17:50:52,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7613 [WARNING|trainer.py:803] 2025-04-26 17:50:52,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7594 [WARNING|trainer.py:803] 2025-04-26 17:50:53,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7547 [WARNING|trainer.py:803] 2025-04-26 17:50:53,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7614 7595 [WARNING|trainer.py:803] 2025-04-26 17:50:54,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:50:54,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7548 [WARNING|trainer.py:803] 2025-04-26 17:50:55,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7615 [WARNING|trainer.py:803] 2025-04-26 17:50:55,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7596 [WARNING|trainer.py:803] 2025-04-26 17:50:56,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7549 [WARNING|trainer.py:803] 2025-04-26 17:50:56,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7616 7597 [WARNING|trainer.py:803] 2025-04-26 17:50:57,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:50:57,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7550 [WARNING|trainer.py:803] 2025-04-26 17:50:58,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7617 7598 [WARNING|trainer.py:803] 2025-04-26 17:50:58,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:50:59,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7551 [WARNING|trainer.py:803] 2025-04-26 17:50:59,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7618 7599 [WARNING|trainer.py:803] 2025-04-26 17:51:00,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:00,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7552 [WARNING|trainer.py:803] 2025-04-26 17:51:00,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7619 7600 [WARNING|trainer.py:803] 2025-04-26 17:51:01,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:51:02,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7553 [WARNING|trainer.py:803] 2025-04-26 17:51:02,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7620 7601 [WARNING|trainer.py:803] 2025-04-26 17:51:02,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:03,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7554 [WARNING|trainer.py:803] 2025-04-26 17:51:03,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7621 7602 [WARNING|trainer.py:803] 2025-04-26 17:51:04,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:04,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7555 [WARNING|trainer.py:803] 2025-04-26 17:51:05,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7622 7603 [WARNING|trainer.py:803] 2025-04-26 17:51:05,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:06,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7556 [WARNING|trainer.py:803] 2025-04-26 17:51:06,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7623 [WARNING|trainer.py:803] 2025-04-26 17:51:07,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7604 [WARNING|trainer.py:803] 2025-04-26 17:51:07,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7557 [WARNING|trainer.py:803] 2025-04-26 17:51:08,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7624 7605 [WARNING|trainer.py:803] 2025-04-26 17:51:08,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:09,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7558 [WARNING|trainer.py:803] 2025-04-26 17:51:09,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7625 [WARNING|trainer.py:803] 2025-04-26 17:51:10,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7606 7559 [WARNING|trainer.py:803] 2025-04-26 17:51:10,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:11,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7626 [WARNING|trainer.py:803] 2025-04-26 17:51:11,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7607 7560 [WARNING|trainer.py:803] 2025-04-26 17:51:12,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:12,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7627 [WARNING|trainer.py:803] 2025-04-26 17:51:12,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7608 7561 [WARNING|trainer.py:803] 2025-04-26 17:51:13,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:13,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7628 [WARNING|trainer.py:803] 2025-04-26 17:51:14,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7609 7562 [WARNING|trainer.py:803] 2025-04-26 17:51:14,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:51:15,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7629 [WARNING|trainer.py:803] 2025-04-26 17:51:15,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7610 7563 [WARNING|trainer.py:803] 2025-04-26 17:51:16,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:16,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7630 [WARNING|trainer.py:803] 2025-04-26 17:51:16,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7611 7564 [WARNING|trainer.py:803] 2025-04-26 17:51:17,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:18,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7631 [WARNING|trainer.py:803] 2025-04-26 17:51:18,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7612 7565 [WARNING|trainer.py:803] 2025-04-26 17:51:19,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:19,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7632 [WARNING|trainer.py:803] 2025-04-26 17:51:19,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7613 7566 [WARNING|trainer.py:803] 2025-04-26 17:51:20,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:51:21,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7633 [WARNING|trainer.py:803] 2025-04-26 17:51:21,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7567 7614 [WARNING|trainer.py:803] 2025-04-26 17:51:21,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:22,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:22,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7634 7568 7615 [WARNING|trainer.py:803] 2025-04-26 17:51:23,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:23,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:24,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7635 7569 7616 [WARNING|trainer.py:803] 2025-04-26 17:51:24,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:51:25,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:25,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7636 7570 7617 [WARNING|trainer.py:803] 2025-04-26 17:51:26,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:51:26,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7637 [WARNING|trainer.py:803] 2025-04-26 17:51:27,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7571 7618 [WARNING|trainer.py:803] 2025-04-26 17:51:28,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:28,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7638 [WARNING|trainer.py:803] 2025-04-26 17:51:28,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7572 7619 [WARNING|trainer.py:803] 2025-04-26 17:51:29,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:29,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:30,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7639 7573 7620 [WARNING|trainer.py:803] 2025-04-26 17:51:30,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:30,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7574 7640 [WARNING|trainer.py:803] 2025-04-26 17:51:31,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7621 [WARNING|trainer.py:803] 2025-04-26 17:51:32,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:32,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7641 [WARNING|trainer.py:803] 2025-04-26 17:51:32,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7575 7622 [WARNING|trainer.py:803] 2025-04-26 17:51:33,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:33,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7576 7642 [WARNING|trainer.py:803] 2025-04-26 17:51:34,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7623 [WARNING|trainer.py:803] 2025-04-26 17:51:35,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:35,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:35,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7577 7643 7624 [WARNING|trainer.py:803] 2025-04-26 17:51:36,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:36,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7578 [WARNING|trainer.py:803] 2025-04-26 17:51:37,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7644 7625 [WARNING|trainer.py:803] 2025-04-26 17:51:38,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:38,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7579 [WARNING|trainer.py:803] 2025-04-26 17:51:38,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7645 7626 [WARNING|trainer.py:803] 2025-04-26 17:51:39,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:39,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7580 [WARNING|trainer.py:803] 2025-04-26 17:51:40,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7646 7627 [WARNING|trainer.py:803] 2025-04-26 17:51:40,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:40,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7581 [WARNING|trainer.py:803] 2025-04-26 17:51:41,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7647 7628 [WARNING|trainer.py:803] 2025-04-26 17:51:42,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:42,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7582 [WARNING|trainer.py:803] 2025-04-26 17:51:42,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7648 7629 [WARNING|trainer.py:803] 2025-04-26 17:51:43,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:51:43,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7583 [WARNING|trainer.py:803] 2025-04-26 17:51:44,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7649 7630 [WARNING|trainer.py:803] 2025-04-26 17:51:45,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:45,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7584 [WARNING|trainer.py:803] 2025-04-26 17:51:45,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7650 7631 [WARNING|trainer.py:803] 2025-04-26 17:51:46,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:46,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7585 [WARNING|trainer.py:803] 2025-04-26 17:51:47,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7651 7632 [WARNING|trainer.py:803] 2025-04-26 17:51:47,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:51:48,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7586 [WARNING|trainer.py:803] 2025-04-26 17:51:48,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7652 7633 [WARNING|trainer.py:803] 2025-04-26 17:51:49,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:49,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7587 [WARNING|trainer.py:803] 2025-04-26 17:51:50,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7653 7634 [WARNING|trainer.py:803] 2025-04-26 17:51:50,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:51,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7588 [WARNING|trainer.py:803] 2025-04-26 17:51:51,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7654 7635 [WARNING|trainer.py:803] 2025-04-26 17:51:52,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7589 [WARNING|trainer.py:803] 2025-04-26 17:51:52,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:53,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7655 [WARNING|trainer.py:803] 2025-04-26 17:51:53,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7636 7590 [WARNING|trainer.py:803] 2025-04-26 17:51:54,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:54,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7656 [WARNING|trainer.py:803] 2025-04-26 17:51:55,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7637 7591 [WARNING|trainer.py:803] 2025-04-26 17:51:55,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:55,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7657 [WARNING|trainer.py:803] 2025-04-26 17:51:56,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7638 7592 [WARNING|trainer.py:803] 2025-04-26 17:51:57,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:51:57,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:51:57,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7658 7639 7593 [WARNING|trainer.py:803] 2025-04-26 17:51:58,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:58,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:51:59,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7659 7640 7594 [WARNING|trainer.py:803] 2025-04-26 17:52:00,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:00,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:00,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7641 7660 7595 [WARNING|trainer.py:803] 2025-04-26 17:52:01,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:01,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:02,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7642 7661 7596 [WARNING|trainer.py:803] 2025-04-26 17:52:03,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:03,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:03,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7643 7662 7597 [WARNING|trainer.py:803] 2025-04-26 17:52:04,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:04,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:04,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7644 7663 7598 [WARNING|trainer.py:803] 2025-04-26 17:52:05,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:06,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:06,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7645 7599 7664 [WARNING|trainer.py:803] 2025-04-26 17:52:07,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:07,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:07,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7646 7600 7665 [WARNING|trainer.py:803] 2025-04-26 17:52:08,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:09,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:09,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7647 7601 7666 [WARNING|trainer.py:803] 2025-04-26 17:52:10,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:52:10,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:10,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7648 7602 7667 [WARNING|trainer.py:803] 2025-04-26 17:52:11,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:12,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:52:12,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7649 7603 7668 [WARNING|trainer.py:803] 2025-04-26 17:52:13,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:13,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:13,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7650 7604 7669 [WARNING|trainer.py:803] 2025-04-26 17:52:14,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:14,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7651 [WARNING|trainer.py:803] 2025-04-26 17:52:15,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7605 7670 [WARNING|trainer.py:803] 2025-04-26 17:52:16,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:16,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7652 [WARNING|trainer.py:803] 2025-04-26 17:52:16,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7606 7671 [WARNING|trainer.py:803] 2025-04-26 17:52:17,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:17,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:18,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7653 7607 7672 [WARNING|trainer.py:803] 2025-04-26 17:52:19,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:19,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:19,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7654 7608 7673 [WARNING|trainer.py:803] 2025-04-26 17:52:20,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:20,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:21,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7609 7655 7674 [WARNING|trainer.py:803] 2025-04-26 17:52:22,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:22,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:22,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7656 7610 7675 [WARNING|trainer.py:803] 2025-04-26 17:52:23,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:23,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:23,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7611 7657 7676 [WARNING|trainer.py:803] 2025-04-26 17:52:24,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:52:25,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:25,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7612 7658 7677 [WARNING|trainer.py:803] 2025-04-26 17:52:26,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:26,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:26,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7613 7659 7678 [WARNING|trainer.py:803] 2025-04-26 17:52:27,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:28,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:28,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7614 7660 7679 [WARNING|trainer.py:803] 2025-04-26 17:52:29,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:29,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7615 [WARNING|trainer.py:803] 2025-04-26 17:52:29,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7661 7680 [WARNING|trainer.py:803] 2025-04-26 17:52:30,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:31,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7616 [WARNING|trainer.py:803] 2025-04-26 17:52:31,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7662 7681 [WARNING|trainer.py:803] 2025-04-26 17:52:32,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:32,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7617 [WARNING|trainer.py:803] 2025-04-26 17:52:32,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7663 7682 [WARNING|trainer.py:803] 2025-04-26 17:52:33,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:34,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7618 [WARNING|trainer.py:803] 2025-04-26 17:52:34,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7664 7683 [WARNING|trainer.py:803] 2025-04-26 17:52:35,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:35,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7619 [WARNING|trainer.py:803] 2025-04-26 17:52:35,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7665 7684 [WARNING|trainer.py:803] 2025-04-26 17:52:36,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:37,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7620 [WARNING|trainer.py:803] 2025-04-26 17:52:37,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7666 7685 [WARNING|trainer.py:803] 2025-04-26 17:52:38,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:38,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7621 [WARNING|trainer.py:803] 2025-04-26 17:52:38,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7667 7686 [WARNING|trainer.py:803] 2025-04-26 17:52:39,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7622 [WARNING|trainer.py:803] 2025-04-26 17:52:40,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:40,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7687 7668 [WARNING|trainer.py:803] 2025-04-26 17:52:40,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7623 [WARNING|trainer.py:803] 2025-04-26 17:52:41,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:41,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:42,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7688 7669 7624 [WARNING|trainer.py:803] 2025-04-26 17:52:43,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:43,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:43,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7689 7670 7625 [WARNING|trainer.py:803] 2025-04-26 17:52:44,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:44,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:45,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7690 7671 7626 [WARNING|trainer.py:803] 2025-04-26 17:52:46,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:46,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:52:46,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7672 7691 7627 [WARNING|trainer.py:803] 2025-04-26 17:52:47,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:47,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:48,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7673 7692 7628 [WARNING|trainer.py:803] 2025-04-26 17:52:49,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:52:49,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:49,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7674 7693 7629 [WARNING|trainer.py:803] 2025-04-26 17:52:50,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:52:50,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7675 [WARNING|trainer.py:803] 2025-04-26 17:52:51,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7694 7630 [WARNING|trainer.py:803] 2025-04-26 17:52:51,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:52,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:52,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7676 7695 7631 [WARNING|trainer.py:803] 2025-04-26 17:52:53,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:53,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:53,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7677 7696 7632 [WARNING|trainer.py:803] 2025-04-26 17:52:54,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:55,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:55,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7678 7633 7697 [WARNING|trainer.py:803] 2025-04-26 17:52:56,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:52:56,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:52:56,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7679 7634 7698 [WARNING|trainer.py:803] 2025-04-26 17:52:57,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:58,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:58,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7680 7699 7635 [WARNING|trainer.py:803] 2025-04-26 17:52:59,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:52:59,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:52:59,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7681 7636 7700 [WARNING|trainer.py:803] 2025-04-26 17:53:00,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:01,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:01,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7682 7637 7701 [WARNING|trainer.py:803] 2025-04-26 17:53:02,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:02,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:02,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7683 7638 7702 [WARNING|trainer.py:803] 2025-04-26 17:53:03,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:04,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:04,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7684 7639 7703 [WARNING|trainer.py:803] 2025-04-26 17:53:05,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:05,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:05,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7685 7640 7704 [WARNING|trainer.py:803] 2025-04-26 17:53:06,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:06,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:07,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7686 7641 7705 [WARNING|trainer.py:803] 2025-04-26 17:53:08,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:08,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:08,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7687 7642 7706 [WARNING|trainer.py:803] 2025-04-26 17:53:09,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:09,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:10,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7688 7643 7707 [WARNING|trainer.py:803] 2025-04-26 17:53:11,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:11,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:11,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7689 7644 7708 [WARNING|trainer.py:803] 2025-04-26 17:53:12,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:12,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:12,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7690 7645 7709 [WARNING|trainer.py:803] 2025-04-26 17:53:13,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:14,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7691 [WARNING|trainer.py:803] 2025-04-26 17:53:14,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7646 7710 [WARNING|trainer.py:803] 2025-04-26 17:53:15,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:15,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:53:16,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7692 7647 7711 [WARNING|trainer.py:803] 2025-04-26 17:53:16,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:17,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:17,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7693 7648 7712 [WARNING|trainer.py:803] 2025-04-26 17:53:18,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:18,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:18,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7694 7649 7713 [WARNING|trainer.py:803] 2025-04-26 17:53:19,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:20,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:20,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7695 7650 7714 [WARNING|trainer.py:803] 2025-04-26 17:53:21,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:21,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:21,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7696 7715 7651 [WARNING|trainer.py:803] 2025-04-26 17:53:22,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:23,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 17:53:23,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7716 7697 7652 [WARNING|trainer.py:803] 2025-04-26 17:53:24,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:24,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:24,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7717 7698 7653 [WARNING|trainer.py:803] 2025-04-26 17:53:25,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:26,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:26,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7718 7699 7654 [WARNING|trainer.py:803] 2025-04-26 17:53:27,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:27,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:27,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7719 7700 7655 [WARNING|trainer.py:803] 2025-04-26 17:53:28,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:28,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:29,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7720 7701 7656 [WARNING|trainer.py:803] 2025-04-26 17:53:30,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:30,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7721 [WARNING|trainer.py:803] 2025-04-26 17:53:30,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7702 7657 [WARNING|trainer.py:803] 2025-04-26 17:53:31,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:31,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7722 [WARNING|trainer.py:803] 2025-04-26 17:53:32,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7703 7658 [WARNING|trainer.py:803] 2025-04-26 17:53:32,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:33,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7723 [WARNING|trainer.py:803] 2025-04-26 17:53:33,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7704 [WARNING|trainer.py:803] 2025-04-26 17:53:34,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7659 [WARNING|trainer.py:803] 2025-04-26 17:53:34,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7724 [WARNING|trainer.py:803] 2025-04-26 17:53:35,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7705 [WARNING|trainer.py:803] 2025-04-26 17:53:35,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7660 [WARNING|trainer.py:803] 2025-04-26 17:53:36,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7725 7706 [WARNING|trainer.py:803] 2025-04-26 17:53:36,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:37,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7661 [WARNING|trainer.py:803] 2025-04-26 17:53:37,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7726 7707 [WARNING|trainer.py:803] 2025-04-26 17:53:38,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:38,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7662 [WARNING|trainer.py:803] 2025-04-26 17:53:38,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7727 7708 [WARNING|trainer.py:803] 2025-04-26 17:53:39,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:39,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7663 [WARNING|trainer.py:803] 2025-04-26 17:53:40,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7728 7709 [WARNING|trainer.py:803] 2025-04-26 17:53:41,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:41,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7729 7664 [WARNING|trainer.py:803] 2025-04-26 17:53:42,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7710 [WARNING|trainer.py:803] 2025-04-26 17:53:42,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:42,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7730 7665 [WARNING|trainer.py:803] 2025-04-26 17:53:43,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7711 [WARNING|trainer.py:803] 2025-04-26 17:53:44,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:44,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7731 7666 [WARNING|trainer.py:803] 2025-04-26 17:53:44,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7712 [WARNING|trainer.py:803] 2025-04-26 17:53:45,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:45,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7732 [WARNING|trainer.py:803] 2025-04-26 17:53:46,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7667 7713 [WARNING|trainer.py:803] 2025-04-26 17:53:46,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:47,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7733 [WARNING|trainer.py:803] 2025-04-26 17:53:47,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7668 7714 [WARNING|trainer.py:803] 2025-04-26 17:53:48,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:53:48,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7734 [WARNING|trainer.py:803] 2025-04-26 17:53:49,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7669 7715 [WARNING|trainer.py:803] 2025-04-26 17:53:49,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7735 [WARNING|trainer.py:803] 2025-04-26 17:53:50,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:50,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7670 7716 [WARNING|trainer.py:803] 2025-04-26 17:53:51,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7736 [WARNING|trainer.py:803] 2025-04-26 17:53:51,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:51,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7671 7717 [WARNING|trainer.py:803] 2025-04-26 17:53:52,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7737 [WARNING|trainer.py:803] 2025-04-26 17:53:53,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:53,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7672 7718 [WARNING|trainer.py:803] 2025-04-26 17:53:53,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7738 [WARNING|trainer.py:803] 2025-04-26 17:53:54,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:54,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7719 7673 [WARNING|trainer.py:803] 2025-04-26 17:53:55,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7739 [WARNING|trainer.py:803] 2025-04-26 17:53:56,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:56,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7720 [WARNING|trainer.py:803] 2025-04-26 17:53:56,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7674 7740 [WARNING|trainer.py:803] 2025-04-26 17:53:57,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:53:57,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:53:58,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7721 7675 7741 [WARNING|trainer.py:803] 2025-04-26 17:53:59,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:53:59,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:53:59,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7722 7676 7742 [WARNING|trainer.py:803] 2025-04-26 17:54:00,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:00,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:01,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7723 7677 7743 [WARNING|trainer.py:803] 2025-04-26 17:54:01,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:02,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:02,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7724 7678 7744 [WARNING|trainer.py:803] 2025-04-26 17:54:03,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:54:03,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:54:03,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7725 7679 7745 [WARNING|trainer.py:803] 2025-04-26 17:54:04,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:04,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:05,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7726 7680 7746 [WARNING|trainer.py:803] 2025-04-26 17:54:06,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:06,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:06,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7727 7681 7747 [WARNING|trainer.py:803] 2025-04-26 17:54:07,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:08,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:08,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7728 7748 7682 [WARNING|trainer.py:803] 2025-04-26 17:54:08,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7729 [WARNING|trainer.py:803] 2025-04-26 17:54:09,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:09,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7749 7683 [WARNING|trainer.py:803] 2025-04-26 17:54:10,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7730 [WARNING|trainer.py:803] 2025-04-26 17:54:10,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:11,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7750 [WARNING|trainer.py:803] 2025-04-26 17:54:11,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7684 7731 [WARNING|trainer.py:803] 2025-04-26 17:54:12,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:54:12,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7751 [WARNING|trainer.py:803] 2025-04-26 17:54:13,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7685 7732 [WARNING|trainer.py:803] 2025-04-26 17:54:13,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:14,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7752 [WARNING|trainer.py:803] 2025-04-26 17:54:14,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7686 [WARNING|trainer.py:803] 2025-04-26 17:54:15,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7733 [WARNING|trainer.py:803] 2025-04-26 17:54:15,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7753 [WARNING|trainer.py:803] 2025-04-26 17:54:15,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7687 [WARNING|trainer.py:803] 2025-04-26 17:54:16,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7734 [WARNING|trainer.py:803] 2025-04-26 17:54:16,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7754 [WARNING|trainer.py:803] 2025-04-26 17:54:17,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7688 [WARNING|trainer.py:803] 2025-04-26 17:54:17,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7735 7755 [WARNING|trainer.py:803] 2025-04-26 17:54:18,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:18,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7689 [WARNING|trainer.py:803] 2025-04-26 17:54:19,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7736 7756 [WARNING|trainer.py:803] 2025-04-26 17:54:19,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:20,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7690 [WARNING|trainer.py:803] 2025-04-26 17:54:20,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7737 7757 [WARNING|trainer.py:803] 2025-04-26 17:54:21,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:54:21,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:22,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7738 7691 7758 [WARNING|trainer.py:803] 2025-04-26 17:54:22,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:22,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:23,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7739 7692 7759 [WARNING|trainer.py:803] 2025-04-26 17:54:24,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:24,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 17:54:24,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7740 7693 7760 [WARNING|trainer.py:803] 2025-04-26 17:54:25,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:26,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:26,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7741 7761 7694 [WARNING|trainer.py:803] 2025-04-26 17:54:27,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:27,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:27,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7742 7762 7695 [WARNING|trainer.py:803] 2025-04-26 17:54:28,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:28,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:29,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7743 7763 7696 [WARNING|trainer.py:803] 2025-04-26 17:54:30,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:30,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:30,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7744 7764 7697 [WARNING|trainer.py:803] 2025-04-26 17:54:31,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:31,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7745 [WARNING|trainer.py:803] 2025-04-26 17:54:32,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7765 7698 [WARNING|trainer.py:803] 2025-04-26 17:54:32,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:33,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7746 7766 [WARNING|trainer.py:803] 2025-04-26 17:54:33,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7699 [WARNING|trainer.py:803] 2025-04-26 17:54:34,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:34,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7747 7767 [WARNING|trainer.py:803] 2025-04-26 17:54:35,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7700 [WARNING|trainer.py:803] 2025-04-26 17:54:35,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:35,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7748 7768 [WARNING|trainer.py:803] 2025-04-26 17:54:36,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:37,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:54:37,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7701 7769 7749 [WARNING|trainer.py:803] 2025-04-26 17:54:38,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:38,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:38,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7702 7770 7750 [WARNING|trainer.py:803] 2025-04-26 17:54:39,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:40,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:40,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7703 7771 7751 [WARNING|trainer.py:803] 2025-04-26 17:54:41,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:41,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:41,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7704 7772 7752 [WARNING|trainer.py:803] 2025-04-26 17:54:42,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:42,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:42,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7705 7773 7753 [WARNING|trainer.py:803] 2025-04-26 17:54:43,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:44,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:44,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7706 7774 7754 [WARNING|trainer.py:803] 2025-04-26 17:54:45,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:45,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:45,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7707 7775 7755 [WARNING|trainer.py:803] 2025-04-26 17:54:46,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:47,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:47,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7708 7776 7756 [WARNING|trainer.py:803] 2025-04-26 17:54:48,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:48,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:48,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7709 7777 7757 [WARNING|trainer.py:803] 2025-04-26 17:54:49,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:49,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:50,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7778 7710 7758 [WARNING|trainer.py:803] 2025-04-26 17:54:51,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:51,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:51,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7779 7711 7759 [WARNING|trainer.py:803] 2025-04-26 17:54:52,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:52,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:52,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7780 7712 7760 [WARNING|trainer.py:803] 2025-04-26 17:54:54,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:54:54,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:54,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7781 7713 7761 [WARNING|trainer.py:803] 2025-04-26 17:54:55,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:55,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:54:55,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7782 7762 7714 [WARNING|trainer.py:803] 2025-04-26 17:54:57,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:57,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:54:57,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7783 7763 7715 [WARNING|trainer.py:803] 2025-04-26 17:54:58,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:58,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:58,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7784 7764 7716 [WARNING|trainer.py:803] 2025-04-26 17:54:59,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:54:59,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:00,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7785 7765 7717 [WARNING|trainer.py:803] 2025-04-26 17:55:01,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:01,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:01,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7766 7786 7718 [WARNING|trainer.py:803] 2025-04-26 17:55:02,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:02,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:02,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7767 7787 7719 [WARNING|trainer.py:803] 2025-04-26 17:55:04,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:04,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:04,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7768 7788 7720 [WARNING|trainer.py:803] 2025-04-26 17:55:05,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:05,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:05,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7769 7789 7721 [WARNING|trainer.py:803] 2025-04-26 17:55:06,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:06,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:07,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7770 7790 7722 [WARNING|trainer.py:803] 2025-04-26 17:55:08,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:55:08,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:08,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7771 7791 7723 [WARNING|trainer.py:803] 2025-04-26 17:55:09,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:09,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:10,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7772 7792 7724 [WARNING|trainer.py:803] 2025-04-26 17:55:11,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:11,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:11,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7773 7793 7725 [WARNING|trainer.py:803] 2025-04-26 17:55:12,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:12,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:12,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7794 7774 7726 [WARNING|trainer.py:803] 2025-04-26 17:55:14,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:14,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:14,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7775 7795 7727 [WARNING|trainer.py:803] 2025-04-26 17:55:15,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:15,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:15,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7776 7728 7796 [WARNING|trainer.py:803] 2025-04-26 17:55:17,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:55:17,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:17,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7729 7797 7777 [WARNING|trainer.py:803] 2025-04-26 17:55:18,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:18,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:18,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7730 7798 7778 [WARNING|trainer.py:803] 2025-04-26 17:55:19,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:19,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:20,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7731 7799 7779 [WARNING|trainer.py:803] 2025-04-26 17:55:21,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:21,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:21,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7732 7780 7800 [WARNING|trainer.py:803] 2025-04-26 17:55:22,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:22,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:55:22,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7733 7781 7801 [WARNING|trainer.py:803] 2025-04-26 17:55:24,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:24,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:24,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7734 7782 7802 [WARNING|trainer.py:803] 2025-04-26 17:55:25,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:25,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7735 7783 [WARNING|trainer.py:803] 2025-04-26 17:55:26,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:26,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:27,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7736 7803 7784 [WARNING|trainer.py:803] 2025-04-26 17:55:28,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:28,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:28,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7737 7785 7804 [WARNING|trainer.py:803] 2025-04-26 17:55:29,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:29,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:30,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7738 7786 7805 [WARNING|trainer.py:803] 2025-04-26 17:55:31,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:31,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7739 [WARNING|trainer.py:803] 2025-04-26 17:55:31,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7787 [WARNING|trainer.py:803] 2025-04-26 17:55:32,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7806 [WARNING|trainer.py:803] 2025-04-26 17:55:32,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7740 7788 [WARNING|trainer.py:803] 2025-04-26 17:55:33,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:34,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:34,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7807 7741 7789 [WARNING|trainer.py:803] 2025-04-26 17:55:35,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:35,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:35,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7742 7790 7808 [WARNING|trainer.py:803] 2025-04-26 17:55:36,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:37,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:37,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7743 7791 7809 [WARNING|trainer.py:803] 2025-04-26 17:55:38,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:38,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7744 [WARNING|trainer.py:803] 2025-04-26 17:55:39,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7792 [WARNING|trainer.py:803] 2025-04-26 17:55:39,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7810 [WARNING|trainer.py:803] 2025-04-26 17:55:39,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7745 7793 [WARNING|trainer.py:803] 2025-04-26 17:55:40,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:41,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:41,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7811 7746 7794 [WARNING|trainer.py:803] 2025-04-26 17:55:42,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:42,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:42,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7747 7812 7795 [WARNING|trainer.py:803] 2025-04-26 17:55:43,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:44,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:44,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7748 7796 7813 [WARNING|trainer.py:803] 2025-04-26 17:55:45,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:45,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7749 [WARNING|trainer.py:803] 2025-04-26 17:55:46,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7797 [WARNING|trainer.py:803] 2025-04-26 17:55:46,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7814 [WARNING|trainer.py:803] 2025-04-26 17:55:47,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7750 7798 [WARNING|trainer.py:803] 2025-04-26 17:55:47,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:48,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:55:48,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7815 7751 7799 [WARNING|trainer.py:803] 2025-04-26 17:55:49,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:49,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:50,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7752 7816 7800 [WARNING|trainer.py:803] 2025-04-26 17:55:51,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:51,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:51,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7753 7817 7801 [WARNING|trainer.py:803] 2025-04-26 17:55:52,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:55:53,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7754 [WARNING|trainer.py:803] 2025-04-26 17:55:53,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:55:53,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7818 7802 7755 [WARNING|trainer.py:803] 2025-04-26 17:55:54,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:54,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:55:55,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7819 7756 7803 [WARNING|trainer.py:803] 2025-04-26 17:55:56,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:55:56,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:55:56,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7757 7820 7804 [WARNING|trainer.py:803] 2025-04-26 17:55:58,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:55:58,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:55:58,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7758 7821 [WARNING|trainer.py:803] 2025-04-26 17:55:59,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7805 [WARNING|trainer.py:803] 2025-04-26 17:56:00,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7759 [WARNING|trainer.py:803] 2025-04-26 17:56:00,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:01,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7822 7806 7760 [WARNING|trainer.py:803] 2025-04-26 17:56:01,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:02,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:02,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7761 7807 7823 [WARNING|trainer.py:803] 2025-04-26 17:56:03,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:03,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:03,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7762 7808 7824 [WARNING|trainer.py:803] 2025-04-26 17:56:05,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:05,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7763 [WARNING|trainer.py:803] 2025-04-26 17:56:05,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:06,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7825 7809 7764 [WARNING|trainer.py:803] 2025-04-26 17:56:07,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:07,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:07,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7826 7810 7765 [WARNING|trainer.py:803] 2025-04-26 17:56:09,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:09,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:09,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7766 7827 7811 [WARNING|trainer.py:803] 2025-04-26 17:56:10,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:10,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:11,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7767 7812 [WARNING|trainer.py:803] 2025-04-26 17:56:12,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7828 7768 [WARNING|trainer.py:803] 2025-04-26 17:56:12,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:12,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:13,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7813 7769 7829 [WARNING|trainer.py:803] 2025-04-26 17:56:14,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:14,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:56:14,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7770 7814 7830 [WARNING|trainer.py:803] 2025-04-26 17:56:16,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:16,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:16,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7771 7815 [WARNING|trainer.py:803] 2025-04-26 17:56:17,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7831 [WARNING|trainer.py:803] 2025-04-26 17:56:18,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7772 [WARNING|trainer.py:803] 2025-04-26 17:56:18,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:18,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7816 7832 7773 [WARNING|trainer.py:803] 2025-04-26 17:56:19,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:20,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:20,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7817 7833 7774 [WARNING|trainer.py:803] 2025-04-26 17:56:21,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:21,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:21,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7775 7818 7834 [WARNING|trainer.py:803] 2025-04-26 17:56:23,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:23,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:23,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7776 7819 7835 [WARNING|trainer.py:803] 2025-04-26 17:56:24,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7777 [WARNING|trainer.py:803] 2025-04-26 17:56:25,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:56:25,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:25,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7836 7820 7778 [WARNING|trainer.py:803] 2025-04-26 17:56:27,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:27,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:27,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7837 7779 7821 [WARNING|trainer.py:803] 2025-04-26 17:56:28,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:28,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:28,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7780 7822 7838 [WARNING|trainer.py:803] 2025-04-26 17:56:30,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:30,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:30,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7781 7839 [WARNING|trainer.py:803] 2025-04-26 17:56:31,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7823 7782 [WARNING|trainer.py:803] 2025-04-26 17:56:32,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:56:32,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:33,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7840 7783 7824 [WARNING|trainer.py:803] 2025-04-26 17:56:33,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:34,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7841 [WARNING|trainer.py:803] 2025-04-26 17:56:34,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7784 7825 [WARNING|trainer.py:803] 2025-04-26 17:56:35,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:36,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:36,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7785 7842 7826 [WARNING|trainer.py:803] 2025-04-26 17:56:37,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:37,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:37,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7786 7827 7843 [WARNING|trainer.py:803] 2025-04-26 17:56:38,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7787 [WARNING|trainer.py:803] 2025-04-26 17:56:39,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:39,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:40,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7828 7788 7844 [WARNING|trainer.py:803] 2025-04-26 17:56:41,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:56:41,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:41,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7789 7845 7829 [WARNING|trainer.py:803] 2025-04-26 17:56:43,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:43,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:43,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7790 7830 [WARNING|trainer.py:803] 2025-04-26 17:56:44,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7846 [WARNING|trainer.py:803] 2025-04-26 17:56:45,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7791 [WARNING|trainer.py:803] 2025-04-26 17:56:45,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:46,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7831 7792 7847 [WARNING|trainer.py:803] 2025-04-26 17:56:47,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:47,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:47,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7793 7832 7848 [WARNING|trainer.py:803] 2025-04-26 17:56:49,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:49,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:49,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7794 7833 7849 [WARNING|trainer.py:803] 2025-04-26 17:56:50,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:50,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7795 [WARNING|trainer.py:803] 2025-04-26 17:56:51,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7834 [WARNING|trainer.py:803] 2025-04-26 17:56:51,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7850 [WARNING|trainer.py:803] 2025-04-26 17:56:52,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7796 [WARNING|trainer.py:803] 2025-04-26 17:56:53,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:53,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7835 7797 [WARNING|trainer.py:803] 2025-04-26 17:56:54,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7851 [WARNING|trainer.py:803] 2025-04-26 17:56:54,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7836 [WARNING|trainer.py:803] 2025-04-26 17:56:55,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7798 [WARNING|trainer.py:803] 2025-04-26 17:56:55,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:56,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7852 7837 7799 [WARNING|trainer.py:803] 2025-04-26 17:56:57,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:56:57,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:56:57,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7800 7853 7838 [WARNING|trainer.py:803] 2025-04-26 17:56:59,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:56:59,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:56:59,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7801 7854 7839 [WARNING|trainer.py:803] 2025-04-26 17:57:01,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:01,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:01,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7855 7840 7802 [WARNING|trainer.py:803] 2025-04-26 17:57:02,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:02,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:02,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7841 7856 7803 [WARNING|trainer.py:803] 2025-04-26 17:57:04,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:04,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:04,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7857 7804 7842 [WARNING|trainer.py:803] 2025-04-26 17:57:06,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:06,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:06,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7858 7805 7843 [WARNING|trainer.py:803] 2025-04-26 17:57:07,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:08,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:08,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7859 7806 [WARNING|trainer.py:803] 2025-04-26 17:57:09,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7844 [WARNING|trainer.py:803] 2025-04-26 17:57:10,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7860 [WARNING|trainer.py:803] 2025-04-26 17:57:10,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7807 [WARNING|trainer.py:803] 2025-04-26 17:57:11,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7845 7861 [WARNING|trainer.py:803] 2025-04-26 17:57:11,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:12,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:12,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7808 7862 [WARNING|trainer.py:803] 2025-04-26 17:57:13,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7846 [WARNING|trainer.py:803] 2025-04-26 17:57:14,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:14,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7809 7863 [WARNING|trainer.py:803] 2025-04-26 17:57:15,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7847 [WARNING|trainer.py:803] 2025-04-26 17:57:15,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7810 [WARNING|trainer.py:803] 2025-04-26 17:57:16,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7864 [WARNING|trainer.py:803] 2025-04-26 17:57:17,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7848 [WARNING|trainer.py:803] 2025-04-26 17:57:17,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7811 [WARNING|trainer.py:803] 2025-04-26 17:57:18,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7865 [WARNING|trainer.py:803] 2025-04-26 17:57:18,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7849 [WARNING|trainer.py:803] 2025-04-26 17:57:19,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7812 [WARNING|trainer.py:803] 2025-04-26 17:57:19,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7866 [WARNING|trainer.py:803] 2025-04-26 17:57:20,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:21,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7850 7813 7867 [WARNING|trainer.py:803] 2025-04-26 17:57:22,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:22,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:22,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7851 7814 [WARNING|trainer.py:803] 2025-04-26 17:57:24,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:24,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7868 [WARNING|trainer.py:803] 2025-04-26 17:57:25,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7815 7852 [WARNING|trainer.py:803] 2025-04-26 17:57:25,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7869 [WARNING|trainer.py:803] 2025-04-26 17:57:26,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:26,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7816 7853 [WARNING|trainer.py:803] 2025-04-26 17:57:27,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:28,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7870 7817 [WARNING|trainer.py:803] 2025-04-26 17:57:28,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7854 [WARNING|trainer.py:803] 2025-04-26 17:57:29,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:29,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7871 7818 [WARNING|trainer.py:803] 2025-04-26 17:57:30,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7855 [WARNING|trainer.py:803] 2025-04-26 17:57:31,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7872 [WARNING|trainer.py:803] 2025-04-26 17:57:31,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7819 [WARNING|trainer.py:803] 2025-04-26 17:57:32,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7856 [WARNING|trainer.py:803] 2025-04-26 17:57:33,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7873 [WARNING|trainer.py:803] 2025-04-26 17:57:33,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:33,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7820 7857 7874 [WARNING|trainer.py:803] 2025-04-26 17:57:34,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:57:34,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:35,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7858 7821 [WARNING|trainer.py:803] 2025-04-26 17:57:36,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:36,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7875 7859 7822 [WARNING|trainer.py:803] 2025-04-26 17:57:37,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:38,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:38,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7876 7860 [WARNING|trainer.py:803] 2025-04-26 17:57:39,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7823 [WARNING|trainer.py:803] 2025-04-26 17:57:39,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:40,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7861 7877 [WARNING|trainer.py:803] 2025-04-26 17:57:41,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:41,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7824 7862 7878 [WARNING|trainer.py:803] 2025-04-26 17:57:42,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:43,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:43,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7825 7863 [WARNING|trainer.py:803] 2025-04-26 17:57:44,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7879 [WARNING|trainer.py:803] 2025-04-26 17:57:44,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7826 [WARNING|trainer.py:803] 2025-04-26 17:57:45,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7864 [WARNING|trainer.py:803] 2025-04-26 17:57:45,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7880 [WARNING|trainer.py:803] 2025-04-26 17:57:46,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7827 [WARNING|trainer.py:803] 2025-04-26 17:57:47,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:47,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7865 7881 [WARNING|trainer.py:803] 2025-04-26 17:57:48,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:48,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7828 7866 7882 [WARNING|trainer.py:803] 2025-04-26 17:57:49,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:50,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:50,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7829 7867 [WARNING|trainer.py:803] 2025-04-26 17:57:51,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:51,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7883 7830 [WARNING|trainer.py:803] 2025-04-26 17:57:52,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7868 [WARNING|trainer.py:803] 2025-04-26 17:57:53,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:54,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7884 7831 7869 [WARNING|trainer.py:803] 2025-04-26 17:57:55,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:55,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:57:55,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7885 7832 [WARNING|trainer.py:803] 2025-04-26 17:57:56,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:57:57,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7870 7833 [WARNING|trainer.py:803] 2025-04-26 17:57:57,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7886 [WARNING|trainer.py:803] 2025-04-26 17:57:58,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:57:58,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7871 7834 [WARNING|trainer.py:803] 2025-04-26 17:57:59,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7887 [WARNING|trainer.py:803] 2025-04-26 17:58:00,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7872 [WARNING|trainer.py:803] 2025-04-26 17:58:00,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:01,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7835 7888 7873 [WARNING|trainer.py:803] 2025-04-26 17:58:02,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:02,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:58:03,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7889 7836 7874 [WARNING|trainer.py:803] 2025-04-26 17:58:03,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:04,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:04,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7890 7837 [WARNING|trainer.py:803] 2025-04-26 17:58:05,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:05,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7875 7891 7838 [WARNING|trainer.py:803] 2025-04-26 17:58:07,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:07,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:07,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7876 7839 7892 [WARNING|trainer.py:803] 2025-04-26 17:58:08,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:09,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:58:09,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7877 7840 7893 [WARNING|trainer.py:803] 2025-04-26 17:58:10,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:10,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:11,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7878 7841 [WARNING|trainer.py:803] 2025-04-26 17:58:12,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:12,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7894 7879 [WARNING|trainer.py:803] 2025-04-26 17:58:13,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7842 [WARNING|trainer.py:803] 2025-04-26 17:58:14,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7895 [WARNING|trainer.py:803] 2025-04-26 17:58:14,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:15,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7880 7843 [WARNING|trainer.py:803] 2025-04-26 17:58:16,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7896 [WARNING|trainer.py:803] 2025-04-26 17:58:16,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7881 [WARNING|trainer.py:803] 2025-04-26 17:58:17,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:18,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7844 7897 [WARNING|trainer.py:803] 2025-04-26 17:58:18,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7882 [WARNING|trainer.py:803] 2025-04-26 17:58:19,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7845 [WARNING|trainer.py:803] 2025-04-26 17:58:19,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7898 [WARNING|trainer.py:803] 2025-04-26 17:58:20,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:21,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7883 7846 [WARNING|trainer.py:803] 2025-04-26 17:58:22,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7899 [WARNING|trainer.py:803] 2025-04-26 17:58:22,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:23,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7884 7847 7900 [WARNING|trainer.py:803] 2025-04-26 17:58:24,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:24,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:24,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7885 7848 7901 [WARNING|trainer.py:803] 2025-04-26 17:58:26,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:26,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:58:26,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7849 7886 7902 [WARNING|trainer.py:803] 2025-04-26 17:58:28,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:28,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:28,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7887 7903 7850 [WARNING|trainer.py:803] 2025-04-26 17:58:30,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:58:30,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:58:30,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7888 7851 7904 [WARNING|trainer.py:803] 2025-04-26 17:58:31,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:58:32,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7889 [WARNING|trainer.py:803] 2025-04-26 17:58:32,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7905 [WARNING|trainer.py:803] 2025-04-26 17:58:33,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7852 7890 [WARNING|trainer.py:803] 2025-04-26 17:58:34,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:34,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:34,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7906 7853 [WARNING|trainer.py:803] 2025-04-26 17:58:35,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7891 [WARNING|trainer.py:803] 2025-04-26 17:58:36,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:36,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7907 7854 [WARNING|trainer.py:803] 2025-04-26 17:58:37,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7892 [WARNING|trainer.py:803] 2025-04-26 17:58:38,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7908 [WARNING|trainer.py:803] 2025-04-26 17:58:38,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7855 [WARNING|trainer.py:803] 2025-04-26 17:58:39,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7893 [WARNING|trainer.py:803] 2025-04-26 17:58:39,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7909 [WARNING|trainer.py:803] 2025-04-26 17:58:40,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7856 [WARNING|trainer.py:803] 2025-04-26 17:58:41,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:58:41,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7894 7910 7857 [WARNING|trainer.py:803] 2025-04-26 17:58:42,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:42,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:43,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7911 7858 7895 [WARNING|trainer.py:803] 2025-04-26 17:58:44,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:44,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:44,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7912 7859 7896 [WARNING|trainer.py:803] 2025-04-26 17:58:46,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:46,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:46,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7913 7860 7897 [WARNING|trainer.py:803] 2025-04-26 17:58:47,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:48,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:48,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7861 7914 [WARNING|trainer.py:803] 2025-04-26 17:58:49,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:49,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7898 7862 [WARNING|trainer.py:803] 2025-04-26 17:58:50,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7915 [WARNING|trainer.py:803] 2025-04-26 17:58:51,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:51,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7899 7863 [WARNING|trainer.py:803] 2025-04-26 17:58:52,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7916 [WARNING|trainer.py:803] 2025-04-26 17:58:52,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7900 [WARNING|trainer.py:803] 2025-04-26 17:58:53,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7864 [WARNING|trainer.py:803] 2025-04-26 17:58:54,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7917 [WARNING|trainer.py:803] 2025-04-26 17:58:54,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:55,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7901 7865 [WARNING|trainer.py:803] 2025-04-26 17:58:56,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7918 [WARNING|trainer.py:803] 2025-04-26 17:58:56,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7866 7902 [WARNING|trainer.py:803] 2025-04-26 17:58:57,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:58:58,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:58:58,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7919 7903 7867 [WARNING|trainer.py:803] 2025-04-26 17:58:59,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:59,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:58:59,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7920 [WARNING|trainer.py:803] 2025-04-26 17:59:01,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7904 7868 [WARNING|trainer.py:803] 2025-04-26 17:59:02,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:02,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7921 7905 7869 [WARNING|trainer.py:803] 2025-04-26 17:59:03,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:03,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:03,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7906 7922 7870 [WARNING|trainer.py:803] 2025-04-26 17:59:05,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:59:05,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:06,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7923 7907 7871 [WARNING|trainer.py:803] 2025-04-26 17:59:07,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:07,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 17:59:07,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7908 7872 7924 [WARNING|trainer.py:803] 2025-04-26 17:59:09,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:09,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:09,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7909 7873 7925 [WARNING|trainer.py:803] 2025-04-26 17:59:10,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:11,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:11,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7874 7910 7926 [WARNING|trainer.py:803] 2025-04-26 17:59:12,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:12,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:13,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7911 7875 [WARNING|trainer.py:803] 2025-04-26 17:59:14,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7927 [WARNING|trainer.py:803] 2025-04-26 17:59:15,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7912 [WARNING|trainer.py:803] 2025-04-26 17:59:15,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:16,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7876 7928 7913 [WARNING|trainer.py:803] 2025-04-26 17:59:16,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:17,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:17,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7877 7929 7914 [WARNING|trainer.py:803] 2025-04-26 17:59:18,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:18,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:19,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7878 7930 7915 [WARNING|trainer.py:803] 2025-04-26 17:59:20,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:20,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 17:59:21,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7931 7879 [WARNING|trainer.py:803] 2025-04-26 17:59:22,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7916 [WARNING|trainer.py:803] 2025-04-26 17:59:22,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:23,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7932 7880 7917 [WARNING|trainer.py:803] 2025-04-26 17:59:24,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:24,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:24,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7881 7933 [WARNING|trainer.py:803] 2025-04-26 17:59:26,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7918 [WARNING|trainer.py:803] 2025-04-26 17:59:26,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7882 [WARNING|trainer.py:803] 2025-04-26 17:59:27,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7934 [WARNING|trainer.py:803] 2025-04-26 17:59:27,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7919 [WARNING|trainer.py:803] 2025-04-26 17:59:28,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:28,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7883 7935 7920 [WARNING|trainer.py:803] 2025-04-26 17:59:30,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:30,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:30,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7936 7884 [WARNING|trainer.py:803] 2025-04-26 17:59:32,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7921 [WARNING|trainer.py:803] 2025-04-26 17:59:32,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:33,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7937 7885 [WARNING|trainer.py:803] 2025-04-26 17:59:34,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:34,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7922 7938 [WARNING|trainer.py:803] 2025-04-26 17:59:35,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7886 [WARNING|trainer.py:803] 2025-04-26 17:59:35,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:36,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7923 7939 [WARNING|trainer.py:803] 2025-04-26 17:59:37,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7887 [WARNING|trainer.py:803] 2025-04-26 17:59:37,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:38,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7940 7924 7888 [WARNING|trainer.py:803] 2025-04-26 17:59:39,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:39,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:40,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7925 7941 7889 [WARNING|trainer.py:803] 2025-04-26 17:59:41,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:41,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:41,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7926 7942 7890 [WARNING|trainer.py:803] 2025-04-26 17:59:43,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:43,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:43,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7891 7943 7927 [WARNING|trainer.py:803] 2025-04-26 17:59:45,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:45,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:45,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7944 7928 7892 [WARNING|trainer.py:803] 2025-04-26 17:59:46,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:47,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:47,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7929 7893 7945 [WARNING|trainer.py:803] 2025-04-26 17:59:48,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:48,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:49,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7930 7946 7894 [WARNING|trainer.py:803] 2025-04-26 17:59:50,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:50,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 17:59:51,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7931 7947 [WARNING|trainer.py:803] 2025-04-26 17:59:52,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7895 [WARNING|trainer.py:803] 2025-04-26 17:59:52,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 17:59:53,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7932 7896 7948 [WARNING|trainer.py:803] 2025-04-26 17:59:54,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:54,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:54,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7933 7949 7897 [WARNING|trainer.py:803] 2025-04-26 17:59:56,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:56,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 17:59:56,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7934 7950 7898 [WARNING|trainer.py:803] 2025-04-26 17:59:58,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 17:59:58,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 17:59:58,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7935 7951 7899 [WARNING|trainer.py:803] 2025-04-26 18:00:00,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:00:00,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:00,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7936 7900 [WARNING|trainer.py:803] 2025-04-26 18:00:01,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7952 [WARNING|trainer.py:803] 2025-04-26 18:00:02,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:02,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7937 7901 [WARNING|trainer.py:803] 2025-04-26 18:00:03,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7953 [WARNING|trainer.py:803] 2025-04-26 18:00:04,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7938 [WARNING|trainer.py:803] 2025-04-26 18:00:04,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7902 [WARNING|trainer.py:803] 2025-04-26 18:00:05,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7954 [WARNING|trainer.py:803] 2025-04-26 18:00:06,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:06,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7939 7903 [WARNING|trainer.py:803] 2025-04-26 18:00:07,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7955 [WARNING|trainer.py:803] 2025-04-26 18:00:08,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7940 [WARNING|trainer.py:803] 2025-04-26 18:00:08,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:09,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7956 7904 [WARNING|trainer.py:803] 2025-04-26 18:00:10,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7941 [WARNING|trainer.py:803] 2025-04-26 18:00:10,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7905 7957 [WARNING|trainer.py:803] 2025-04-26 18:00:11,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:11,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 18:00:11,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes Yes 7942 7906 [WARNING|trainer.py:803] 2025-04-26 18:00:13,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7958 [WARNING|trainer.py:803] 2025-04-26 18:00:13,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7943 [WARNING|trainer.py:803] 2025-04-26 18:00:14,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7907 [WARNING|trainer.py:803] 2025-04-26 18:00:14,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7959 [WARNING|trainer.py:803] 2025-04-26 18:00:15,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7944 [WARNING|trainer.py:803] 2025-04-26 18:00:15,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7908 [WARNING|trainer.py:803] 2025-04-26 18:00:16,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7960 [WARNING|trainer.py:803] 2025-04-26 18:00:17,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:18,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7909 7945 7961 [WARNING|trainer.py:803] 2025-04-26 18:00:19,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:00:19,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:19,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7946 7910 7962 [WARNING|trainer.py:803] 2025-04-26 18:00:20,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:00:20,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:21,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7911 7947 7963 [WARNING|trainer.py:803] 2025-04-26 18:00:22,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:22,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:23,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7912 7948 [WARNING|trainer.py:803] 2025-04-26 18:00:24,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7964 [WARNING|trainer.py:803] 2025-04-26 18:00:24,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7913 [WARNING|trainer.py:803] 2025-04-26 18:00:25,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7949 [WARNING|trainer.py:803] 2025-04-26 18:00:26,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:26,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7965 7914 [WARNING|trainer.py:803] 2025-04-26 18:00:27,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7950 [WARNING|trainer.py:803] 2025-04-26 18:00:27,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7966 [WARNING|trainer.py:803] 2025-04-26 18:00:28,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7915 [WARNING|trainer.py:803] 2025-04-26 18:00:29,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:29,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7951 [WARNING|trainer.py:803] 2025-04-26 18:00:30,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7916 7967 [WARNING|trainer.py:803] 2025-04-26 18:00:31,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:31,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7952 7917 7968 [WARNING|trainer.py:803] 2025-04-26 18:00:32,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:33,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:33,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7953 7969 7918 [WARNING|trainer.py:803] 2025-04-26 18:00:34,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:35,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:35,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7954 7970 7919 [WARNING|trainer.py:803] 2025-04-26 18:00:36,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:37,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:37,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7955 7971 [WARNING|trainer.py:803] 2025-04-26 18:00:38,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7920 [WARNING|trainer.py:803] 2025-04-26 18:00:38,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7956 [WARNING|trainer.py:803] 2025-04-26 18:00:39,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7972 [WARNING|trainer.py:803] 2025-04-26 18:00:40,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:40,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7921 7957 7973 [WARNING|trainer.py:803] 2025-04-26 18:00:41,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:41,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:00:42,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7974 7922 7958 [WARNING|trainer.py:803] 2025-04-26 18:00:43,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:00:43,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:44,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7975 7923 7959 [WARNING|trainer.py:803] 2025-04-26 18:00:45,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:45,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:00:45,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7976 7960 7924 [WARNING|trainer.py:803] 2025-04-26 18:00:47,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:00:47,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:48,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7977 7961 7925 [WARNING|trainer.py:803] 2025-04-26 18:00:49,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:49,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:49,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7978 7926 7962 [WARNING|trainer.py:803] 2025-04-26 18:00:51,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:51,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:00:51,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7963 7979 7927 [WARNING|trainer.py:803] 2025-04-26 18:00:53,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:53,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:53,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7928 7980 7964 [WARNING|trainer.py:803] 2025-04-26 18:00:55,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:55,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:55,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7929 7981 7965 [WARNING|trainer.py:803] 2025-04-26 18:00:57,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:57,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:00:57,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7982 7930 7966 [WARNING|trainer.py:803] 2025-04-26 18:00:58,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:00:59,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:00:59,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7931 7983 [WARNING|trainer.py:803] 2025-04-26 18:01:00,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7967 [WARNING|trainer.py:803] 2025-04-26 18:01:00,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:01,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7984 7932 7968 [WARNING|trainer.py:803] 2025-04-26 18:01:02,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:02,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:03,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7933 7985 7969 [WARNING|trainer.py:803] 2025-04-26 18:01:04,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:04,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:05,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7934 7986 7970 [WARNING|trainer.py:803] 2025-04-26 18:01:06,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:06,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:06,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7987 7935 7971 [WARNING|trainer.py:803] 2025-04-26 18:01:08,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:08,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:08,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7972 7936 7988 [WARNING|trainer.py:803] 2025-04-26 18:01:10,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:10,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:10,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7973 7937 7989 [WARNING|trainer.py:803] 2025-04-26 18:01:12,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:12,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:12,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7974 7938 7990 [WARNING|trainer.py:803] 2025-04-26 18:01:13,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:01:14,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:14,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7975 7939 [WARNING|trainer.py:803] 2025-04-26 18:01:15,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7991 [WARNING|trainer.py:803] 2025-04-26 18:01:15,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:16,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7976 7940 [WARNING|trainer.py:803] 2025-04-26 18:01:17,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:17,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7992 7977 [WARNING|trainer.py:803] 2025-04-26 18:01:18,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7941 [WARNING|trainer.py:803] 2025-04-26 18:01:19,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:19,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7993 7978 7942 [WARNING|trainer.py:803] 2025-04-26 18:01:20,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:21,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:21,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7994 7979 7943 [WARNING|trainer.py:803] 2025-04-26 18:01:22,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:23,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:23,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7995 7944 [WARNING|trainer.py:803] 2025-04-26 18:01:24,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7980 [WARNING|trainer.py:803] 2025-04-26 18:01:25,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:25,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7996 7981 [WARNING|trainer.py:803] 2025-04-26 18:01:26,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7945 [WARNING|trainer.py:803] 2025-04-26 18:01:27,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:27,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7997 7982 7946 [WARNING|trainer.py:803] 2025-04-26 18:01:28,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:29,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:29,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7998 7983 7947 [WARNING|trainer.py:803] 2025-04-26 18:01:30,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:01:31,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:31,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7999 7984 7948 [WARNING|trainer.py:803] 2025-04-26 18:01:32,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:32,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:01:33,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7985 8000 7949 [WARNING|trainer.py:803] 2025-04-26 18:01:34,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:34,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:34,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7986 8001 7950 [WARNING|trainer.py:803] 2025-04-26 18:01:36,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:36,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:36,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7987 8002 7951 [WARNING|trainer.py:803] 2025-04-26 18:01:38,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:38,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:39,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7988 8003 7952 [WARNING|trainer.py:803] 2025-04-26 18:01:40,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 18:01:40,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:41,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8004 7989 [WARNING|trainer.py:803] 2025-04-26 18:01:42,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7953 [WARNING|trainer.py:803] 2025-04-26 18:01:42,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:43,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8005 7990 7954 [WARNING|trainer.py:803] 2025-04-26 18:01:44,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:44,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:45,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8006 7991 [WARNING|trainer.py:803] 2025-04-26 18:01:45,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7955 [WARNING|trainer.py:803] 2025-04-26 18:01:46,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:46,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8007 7956 [WARNING|trainer.py:803] 2025-04-26 18:01:47,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7992 [WARNING|trainer.py:803] 2025-04-26 18:01:48,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:48,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8008 7957 [WARNING|trainer.py:803] 2025-04-26 18:01:49,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7993 [WARNING|trainer.py:803] 2025-04-26 18:01:50,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:01:50,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8009 [WARNING|trainer.py:803] 2025-04-26 18:01:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7994 7958 8010 [WARNING|trainer.py:803] 2025-04-26 18:01:52,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:52,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:01:53,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7959 7995 8011 [WARNING|trainer.py:803] 2025-04-26 18:01:54,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:54,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:01:55,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7960 7996 8012 [WARNING|trainer.py:803] 2025-04-26 18:01:56,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:01:56,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:56,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7961 8013 [WARNING|trainer.py:803] 2025-04-26 18:01:58,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7997 [WARNING|trainer.py:803] 2025-04-26 18:01:58,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:01:58,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7962 8014 7998 [WARNING|trainer.py:803] 2025-04-26 18:01:59,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:02:00,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:02:00,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8015 7963 7999 [WARNING|trainer.py:803] 2025-04-26 18:02:01,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:01,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:02:02,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8016 7964 [WARNING|trainer.py:803] 2025-04-26 18:02:03,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8000 [WARNING|trainer.py:803] 2025-04-26 18:02:04,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8017 [WARNING|trainer.py:803] 2025-04-26 18:02:04,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7965 [WARNING|trainer.py:803] 2025-04-26 18:02:05,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:05,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8001 8018 7966 [WARNING|trainer.py:803] 2025-04-26 18:02:06,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:07,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:07,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8019 8002 [WARNING|trainer.py:803] 2025-04-26 18:02:08,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7967 [WARNING|trainer.py:803] 2025-04-26 18:02:09,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:02:09,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8020 8003 [WARNING|trainer.py:803] 2025-04-26 18:02:10,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7968 [WARNING|trainer.py:803] 2025-04-26 18:02:10,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8021 8004 [WARNING|trainer.py:803] 2025-04-26 18:02:11,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:02:12,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:12,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7969 8005 [WARNING|trainer.py:803] 2025-04-26 18:02:13,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8022 [WARNING|trainer.py:803] 2025-04-26 18:02:14,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:02:14,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7970 8006 [WARNING|trainer.py:803] 2025-04-26 18:02:15,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8023 [WARNING|trainer.py:803] 2025-04-26 18:02:15,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7971 [WARNING|trainer.py:803] 2025-04-26 18:02:16,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:16,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8024 8007 7972 [WARNING|trainer.py:803] 2025-04-26 18:02:17,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:17,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:18,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8025 8008 7973 [WARNING|trainer.py:803] 2025-04-26 18:02:19,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:19,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:02:20,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8026 8009 7974 [WARNING|trainer.py:803] 2025-04-26 18:02:21,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:21,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:21,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8027 8010 7975 [WARNING|trainer.py:803] 2025-04-26 18:02:23,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:23,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:23,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8028 8011 [WARNING|trainer.py:803] 2025-04-26 18:02:24,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7976 [WARNING|trainer.py:803] 2025-04-26 18:02:25,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8029 [WARNING|trainer.py:803] 2025-04-26 18:02:25,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8012 [WARNING|trainer.py:803] 2025-04-26 18:02:26,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:26,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7977 8030 8013 [WARNING|trainer.py:803] 2025-04-26 18:02:27,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:02:28,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:28,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7978 8031 8014 [WARNING|trainer.py:803] 2025-04-26 18:02:29,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:29,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:30,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7979 8032 8015 [WARNING|trainer.py:803] 2025-04-26 18:02:31,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:31,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:31,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8016 8033 7980 [WARNING|trainer.py:803] 2025-04-26 18:02:33,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:02:33,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:33,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8017 8034 7981 [WARNING|trainer.py:803] 2025-04-26 18:02:35,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:35,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:35,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8035 8018 7982 [WARNING|trainer.py:803] 2025-04-26 18:02:36,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:37,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:37,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8036 8019 7983 [WARNING|trainer.py:803] 2025-04-26 18:02:38,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:38,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:02:39,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8037 8020 7984 [WARNING|trainer.py:803] 2025-04-26 18:02:40,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:40,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:40,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8038 8021 [WARNING|trainer.py:803] 2025-04-26 18:02:41,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7985 [WARNING|trainer.py:803] 2025-04-26 18:02:42,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8039 [WARNING|trainer.py:803] 2025-04-26 18:02:42,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:43,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8022 7986 [WARNING|trainer.py:803] 2025-04-26 18:02:44,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8040 [WARNING|trainer.py:803] 2025-04-26 18:02:44,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:45,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8023 8041 7987 [WARNING|trainer.py:803] 2025-04-26 18:02:46,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:46,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:47,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8024 8042 [WARNING|trainer.py:803] 2025-04-26 18:02:48,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7988 [WARNING|trainer.py:803] 2025-04-26 18:02:48,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8025 [WARNING|trainer.py:803] 2025-04-26 18:02:49,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8043 [WARNING|trainer.py:803] 2025-04-26 18:02:49,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7989 [WARNING|trainer.py:803] 2025-04-26 18:02:50,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8026 8044 [WARNING|trainer.py:803] 2025-04-26 18:02:51,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:51,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7990 8027 8045 [WARNING|trainer.py:803] 2025-04-26 18:02:52,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:02:53,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:02:53,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7991 8028 [WARNING|trainer.py:803] 2025-04-26 18:02:54,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:02:55,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8046 8029 [WARNING|trainer.py:803] 2025-04-26 18:02:56,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7992 [WARNING|trainer.py:803] 2025-04-26 18:02:56,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8047 [WARNING|trainer.py:803] 2025-04-26 18:02:57,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8030 [WARNING|trainer.py:803] 2025-04-26 18:02:57,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7993 [WARNING|trainer.py:803] 2025-04-26 18:02:58,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8048 [WARNING|trainer.py:803] 2025-04-26 18:02:59,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8031 [WARNING|trainer.py:803] 2025-04-26 18:02:59,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:00,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7994 8049 8032 [WARNING|trainer.py:803] 2025-04-26 18:03:01,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:03:01,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:01,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8050 7995 8033 [WARNING|trainer.py:803] 2025-04-26 18:03:02,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:02,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:03,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8051 7996 8034 [WARNING|trainer.py:803] 2025-04-26 18:03:04,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:05,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:05,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8052 8035 [WARNING|trainer.py:803] 2025-04-26 18:03:06,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7997 [WARNING|trainer.py:803] 2025-04-26 18:03:06,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8053 [WARNING|trainer.py:803] 2025-04-26 18:03:07,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8036 [WARNING|trainer.py:803] 2025-04-26 18:03:07,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7998 8054 [WARNING|trainer.py:803] 2025-04-26 18:03:08,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:09,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:03:09,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8037 8055 7999 [WARNING|trainer.py:803] 2025-04-26 18:03:10,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:11,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:11,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8038 8056 [WARNING|trainer.py:803] 2025-04-26 18:03:12,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8000 [WARNING|trainer.py:803] 2025-04-26 18:03:12,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8039 [WARNING|trainer.py:803] 2025-04-26 18:03:13,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8057 [WARNING|trainer.py:803] 2025-04-26 18:03:13,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:14,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8001 8040 8058 [WARNING|trainer.py:803] 2025-04-26 18:03:15,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:15,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:15,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 8041 8002 8059 [WARNING|trainer.py:803] 2025-04-26 18:03:17,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:17,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:17,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8042 8060 8003 [WARNING|trainer.py:803] 2025-04-26 18:03:19,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:19,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:19,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8043 8061 8004 [WARNING|trainer.py:803] 2025-04-26 18:03:20,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:20,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:20,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8044 8062 8005 [WARNING|trainer.py:803] 2025-04-26 18:03:22,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:22,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:22,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8045 8063 8006 [WARNING|trainer.py:803] 2025-04-26 18:03:24,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:24,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:24,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8064 8007 8046 [WARNING|trainer.py:803] 2025-04-26 18:03:26,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:26,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:26,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8065 8047 8008 [WARNING|trainer.py:803] 2025-04-26 18:03:27,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:28,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:28,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8066 8048 8009 [WARNING|trainer.py:803] 2025-04-26 18:03:29,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:30,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:30,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8067 8049 8010 [WARNING|trainer.py:803] 2025-04-26 18:03:31,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:31,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:31,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8068 8050 8011 [WARNING|trainer.py:803] 2025-04-26 18:03:32,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:33,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:33,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8069 8051 8012 [WARNING|trainer.py:803] 2025-04-26 18:03:34,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:35,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:35,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8070 8013 8052 [WARNING|trainer.py:803] 2025-04-26 18:03:36,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:36,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:36,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8071 8053 8014 [WARNING|trainer.py:803] 2025-04-26 18:03:38,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:38,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:38,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8072 8054 8015 [WARNING|trainer.py:803] 2025-04-26 18:03:39,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:39,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:40,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8055 8016 8073 [WARNING|trainer.py:803] 2025-04-26 18:03:41,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:41,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:41,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8056 8017 [WARNING|trainer.py:803] 2025-04-26 18:03:43,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8074 [WARNING|trainer.py:803] 2025-04-26 18:03:43,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8057 [WARNING|trainer.py:803] 2025-04-26 18:03:43,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8018 [WARNING|trainer.py:803] 2025-04-26 18:03:44,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8075 [WARNING|trainer.py:803] 2025-04-26 18:03:45,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8058 [WARNING|trainer.py:803] 2025-04-26 18:03:46,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8019 [WARNING|trainer.py:803] 2025-04-26 18:03:46,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 8076 [WARNING|trainer.py:803] 2025-04-26 18:03:47,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8059 [WARNING|trainer.py:803] 2025-04-26 18:03:47,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:48,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8020 8077 8060 [WARNING|trainer.py:803] 2025-04-26 18:03:49,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:03:49,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:49,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8021 8078 8061 [WARNING|trainer.py:803] 2025-04-26 18:03:50,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:51,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:51,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8079 8022 8062 [WARNING|trainer.py:803] 2025-04-26 18:03:52,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:52,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:53,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8080 8023 8063 [WARNING|trainer.py:803] 2025-04-26 18:03:54,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:54,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:54,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8081 8024 8064 [WARNING|trainer.py:803] 2025-04-26 18:03:56,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:56,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:03:56,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8082 8065 8025 [WARNING|trainer.py:803] 2025-04-26 18:03:58,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:03:58,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:03:58,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8083 8066 8026 [WARNING|trainer.py:803] 2025-04-26 18:03:59,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:00,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:00,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8084 8067 8027 [WARNING|trainer.py:803] 2025-04-26 18:04:01,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:01,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:04:01,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8085 8068 8028 [WARNING|trainer.py:803] 2025-04-26 18:04:03,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:03,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:03,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8029 8086 8069 [WARNING|trainer.py:803] 2025-04-26 18:04:05,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:05,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:05,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8030 8070 8087 [WARNING|trainer.py:803] 2025-04-26 18:04:07,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:07,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:07,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8031 8071 8088 [WARNING|trainer.py:803] 2025-04-26 18:04:08,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:04:08,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:08,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8032 8072 8089 [WARNING|trainer.py:803] 2025-04-26 18:04:10,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:10,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:10,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8033 8073 [WARNING|trainer.py:803] 2025-04-26 18:04:12,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8090 [WARNING|trainer.py:803] 2025-04-26 18:04:12,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8034 [WARNING|trainer.py:803] 2025-04-26 18:04:13,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8074 8091 [WARNING|trainer.py:803] 2025-04-26 18:04:13,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:04:14,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8035 [WARNING|trainer.py:803] 2025-04-26 18:04:14,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:15,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8092 8075 8036 [WARNING|trainer.py:803] 2025-04-26 18:04:16,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:16,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:17,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8093 8076 [WARNING|trainer.py:803] 2025-04-26 18:04:18,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:18,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8037 8094 8077 [WARNING|trainer.py:803] 2025-04-26 18:04:19,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:20,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:20,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8038 8095 8078 [WARNING|trainer.py:803] 2025-04-26 18:04:20,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:21,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:21,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8039 8096 8079 [WARNING|trainer.py:803] 2025-04-26 18:04:22,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:23,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:23,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8040 8097 8080 [WARNING|trainer.py:803] 2025-04-26 18:04:24,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:24,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:04:24,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8041 8098 [WARNING|trainer.py:803] 2025-04-26 18:04:25,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8081 [WARNING|trainer.py:803] 2025-04-26 18:04:26,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8042 [WARNING|trainer.py:803] 2025-04-26 18:04:26,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8099 [WARNING|trainer.py:803] 2025-04-26 18:04:27,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8082 [WARNING|trainer.py:803] 2025-04-26 18:04:28,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8043 [WARNING|trainer.py:803] 2025-04-26 18:04:28,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8100 [WARNING|trainer.py:803] 2025-04-26 18:04:29,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8083 [WARNING|trainer.py:803] 2025-04-26 18:04:29,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8044 [WARNING|trainer.py:803] 2025-04-26 18:04:30,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:30,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8084 8101 8045 [WARNING|trainer.py:803] 2025-04-26 18:04:32,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:32,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:04:32,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8085 8102 [WARNING|trainer.py:803] 2025-04-26 18:04:33,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8046 [WARNING|trainer.py:803] 2025-04-26 18:04:34,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:34,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8086 8103 8047 [WARNING|trainer.py:803] 2025-04-26 18:04:35,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:36,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:36,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8087 8104 8048 [WARNING|trainer.py:803] 2025-04-26 18:04:37,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:38,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:38,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8088 8049 [WARNING|trainer.py:803] 2025-04-26 18:04:39,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8105 [WARNING|trainer.py:803] 2025-04-26 18:04:40,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8089 [WARNING|trainer.py:803] 2025-04-26 18:04:40,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8050 [WARNING|trainer.py:803] 2025-04-26 18:04:41,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8106 [WARNING|trainer.py:803] 2025-04-26 18:04:41,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:42,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8051 8090 [WARNING|trainer.py:803] 2025-04-26 18:04:43,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:04:43,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8107 [WARNING|trainer.py:803] 2025-04-26 18:04:44,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8091 8052 [WARNING|trainer.py:803] 2025-04-26 18:04:45,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:45,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8108 8053 8092 [WARNING|trainer.py:803] 2025-04-26 18:04:46,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:04:46,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:04:47,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8054 8093 8109 [WARNING|trainer.py:803] 2025-04-26 18:04:48,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:48,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:48,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8055 8094 8110 [WARNING|trainer.py:803] 2025-04-26 18:04:50,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:50,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:50,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8056 8095 [WARNING|trainer.py:803] 2025-04-26 18:04:51,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8111 [WARNING|trainer.py:803] 2025-04-26 18:04:52,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8057 [WARNING|trainer.py:803] 2025-04-26 18:04:52,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8096 [WARNING|trainer.py:803] 2025-04-26 18:04:53,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:04:53,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8112 8058 8097 [WARNING|trainer.py:803] 2025-04-26 18:04:55,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:04:55,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:04:55,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8059 8113 8098 [WARNING|trainer.py:803] 2025-04-26 18:04:56,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:04:57,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8060 [WARNING|trainer.py:803] 2025-04-26 18:04:57,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8114 [WARNING|trainer.py:803] 2025-04-26 18:04:58,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8099 [WARNING|trainer.py:803] 2025-04-26 18:04:59,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:04:59,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8061 8100 [WARNING|trainer.py:803] 2025-04-26 18:05:00,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8115 [WARNING|trainer.py:803] 2025-04-26 18:05:00,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8062 [WARNING|trainer.py:803] 2025-04-26 18:05:01,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:05:01,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8101 8116 8063 [WARNING|trainer.py:803] 2025-04-26 18:05:03,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:03,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:03,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8102 8064 8117 [WARNING|trainer.py:803] 2025-04-26 18:05:05,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:05,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:05,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8065 8103 8118 [WARNING|trainer.py:803] 2025-04-26 18:05:07,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:07,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:07,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8066 8104 8119 [WARNING|trainer.py:803] 2025-04-26 18:05:09,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:05:09,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:09,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8067 [WARNING|trainer.py:803] 2025-04-26 18:05:10,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8105 8120 8068 [WARNING|trainer.py:803] 2025-04-26 18:05:11,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:11,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:12,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8121 8106 8069 [WARNING|trainer.py:803] 2025-04-26 18:05:13,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:05:13,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:14,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8107 8122 8070 [WARNING|trainer.py:803] 2025-04-26 18:05:15,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:16,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:16,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8071 8108 8123 [WARNING|trainer.py:803] 2025-04-26 18:05:17,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:18,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:05:18,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8072 8109 8124 [WARNING|trainer.py:803] 2025-04-26 18:05:19,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:20,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:20,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8073 8110 [WARNING|trainer.py:803] 2025-04-26 18:05:21,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8125 [WARNING|trainer.py:803] 2025-04-26 18:05:22,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:05:22,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8074 8111 8126 [WARNING|trainer.py:803] 2025-04-26 18:05:23,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:24,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:24,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8075 8127 8112 [WARNING|trainer.py:803] 2025-04-26 18:05:25,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:26,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:26,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8076 [WARNING|trainer.py:803] 2025-04-26 18:05:27,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8113 8128 8077 [WARNING|trainer.py:803] 2025-04-26 18:05:28,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:28,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:29,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8114 8129 8078 [WARNING|trainer.py:803] 2025-04-26 18:05:30,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:05:30,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:05:30,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8079 8115 8130 [WARNING|trainer.py:803] 2025-04-26 18:05:32,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:32,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:05:33,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8080 8116 [WARNING|trainer.py:803] 2025-04-26 18:05:34,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8131 [WARNING|trainer.py:803] 2025-04-26 18:05:34,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:35,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8081 8117 [WARNING|trainer.py:803] 2025-04-26 18:05:36,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8132 [WARNING|trainer.py:803] 2025-04-26 18:05:37,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8082 [WARNING|trainer.py:803] 2025-04-26 18:05:37,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:38,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8118 8133 8083 [WARNING|trainer.py:803] 2025-04-26 18:05:39,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:05:39,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:05:39,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8119 8084 8134 [WARNING|trainer.py:803] 2025-04-26 18:05:41,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:05:41,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:41,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8120 8085 8135 [WARNING|trainer.py:803] 2025-04-26 18:05:43,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:43,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:05:43,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8121 8086 8136 [WARNING|trainer.py:803] 2025-04-26 18:05:45,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:45,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:45,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8087 8122 8137 [WARNING|trainer.py:803] 2025-04-26 18:05:47,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:47,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:47,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8088 8123 [WARNING|trainer.py:803] 2025-04-26 18:05:49,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8138 [WARNING|trainer.py:803] 2025-04-26 18:05:49,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8089 [WARNING|trainer.py:803] 2025-04-26 18:05:50,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:50,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8124 8139 [WARNING|trainer.py:803] 2025-04-26 18:05:51,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:52,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8090 8125 [WARNING|trainer.py:803] 2025-04-26 18:05:53,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8140 [WARNING|trainer.py:803] 2025-04-26 18:05:53,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8091 [WARNING|trainer.py:803] 2025-04-26 18:05:54,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:54,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8126 8141 8092 [WARNING|trainer.py:803] 2025-04-26 18:05:55,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:56,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:05:56,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8127 8093 8142 [WARNING|trainer.py:803] 2025-04-26 18:05:58,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:05:58,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:05:58,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8094 8128 8143 [WARNING|trainer.py:803] 2025-04-26 18:06:00,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:06:00,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:00,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8095 8129 [WARNING|trainer.py:803] 2025-04-26 18:06:01,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8144 [WARNING|trainer.py:803] 2025-04-26 18:06:02,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8096 [WARNING|trainer.py:803] 2025-04-26 18:06:02,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:06:03,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8130 8145 8097 [WARNING|trainer.py:803] 2025-04-26 18:06:04,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:05,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:05,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8131 8146 8098 [WARNING|trainer.py:803] 2025-04-26 18:06:06,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:07,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:07,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8099 8132 8147 [WARNING|trainer.py:803] 2025-04-26 18:06:08,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:06:09,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:09,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8100 [WARNING|trainer.py:803] 2025-04-26 18:06:10,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8148 8133 [WARNING|trainer.py:803] 2025-04-26 18:06:11,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:11,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8101 8149 [WARNING|trainer.py:803] 2025-04-26 18:06:12,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8134 [WARNING|trainer.py:803] 2025-04-26 18:06:13,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:13,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8102 [WARNING|trainer.py:803] 2025-04-26 18:06:14,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8150 8135 [WARNING|trainer.py:803] 2025-04-26 18:06:15,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:15,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8103 [WARNING|trainer.py:803] 2025-04-26 18:06:16,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8136 8151 [WARNING|trainer.py:803] 2025-04-26 18:06:17,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:06:17,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8104 8137 [WARNING|trainer.py:803] 2025-04-26 18:06:18,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8152 [WARNING|trainer.py:803] 2025-04-26 18:06:19,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:19,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8105 [WARNING|trainer.py:803] 2025-04-26 18:06:21,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8138 8153 [WARNING|trainer.py:803] 2025-04-26 18:06:21,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:22,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8106 8139 [WARNING|trainer.py:803] 2025-04-26 18:06:23,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8154 [WARNING|trainer.py:803] 2025-04-26 18:06:23,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:06:24,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8107 [WARNING|trainer.py:803] 2025-04-26 18:06:25,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8140 8155 [WARNING|trainer.py:803] 2025-04-26 18:06:26,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:26,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8108 [WARNING|trainer.py:803] 2025-04-26 18:06:27,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8141 8156 [WARNING|trainer.py:803] 2025-04-26 18:06:28,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:28,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8109 [WARNING|trainer.py:803] 2025-04-26 18:06:29,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8157 8142 [WARNING|trainer.py:803] 2025-04-26 18:06:30,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:30,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8110 [WARNING|trainer.py:803] 2025-04-26 18:06:31,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8158 8143 [WARNING|trainer.py:803] 2025-04-26 18:06:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:32,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8111 [WARNING|trainer.py:803] 2025-04-26 18:06:33,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8159 8144 [WARNING|trainer.py:803] 2025-04-26 18:06:34,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:34,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8112 8160 [WARNING|trainer.py:803] 2025-04-26 18:06:35,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8145 [WARNING|trainer.py:803] 2025-04-26 18:06:36,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:36,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8113 8161 [WARNING|trainer.py:803] 2025-04-26 18:06:37,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8146 [WARNING|trainer.py:803] 2025-04-26 18:06:38,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:38,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8114 8162 [WARNING|trainer.py:803] 2025-04-26 18:06:39,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8147 [WARNING|trainer.py:803] 2025-04-26 18:06:40,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:06:41,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8115 8163 [WARNING|trainer.py:803] 2025-04-26 18:06:42,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8148 [WARNING|trainer.py:803] 2025-04-26 18:06:42,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:06:43,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8116 8164 [WARNING|trainer.py:803] 2025-04-26 18:06:44,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8149 [WARNING|trainer.py:803] 2025-04-26 18:06:44,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:06:45,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8117 8165 [WARNING|trainer.py:803] 2025-04-26 18:06:46,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8150 [WARNING|trainer.py:803] 2025-04-26 18:06:46,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:47,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8118 8166 [WARNING|trainer.py:803] 2025-04-26 18:06:48,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8151 [WARNING|trainer.py:803] 2025-04-26 18:06:48,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:49,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8119 8167 [WARNING|trainer.py:803] 2025-04-26 18:06:50,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:06:50,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8152 [WARNING|trainer.py:803] 2025-04-26 18:06:51,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8168 8120 [WARNING|trainer.py:803] 2025-04-26 18:06:52,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:06:52,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8153 [WARNING|trainer.py:803] 2025-04-26 18:06:53,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8121 8169 [WARNING|trainer.py:803] 2025-04-26 18:06:54,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:54,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8154 [WARNING|trainer.py:803] 2025-04-26 18:06:55,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8170 8122 [WARNING|trainer.py:803] 2025-04-26 18:06:56,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:06:57,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8155 [WARNING|trainer.py:803] 2025-04-26 18:06:57,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8171 8123 [WARNING|trainer.py:803] 2025-04-26 18:06:58,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:06:59,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8156 [WARNING|trainer.py:803] 2025-04-26 18:07:00,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8172 8124 [WARNING|trainer.py:803] 2025-04-26 18:07:01,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8157 [WARNING|trainer.py:803] 2025-04-26 18:07:01,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:02,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8173 8125 [WARNING|trainer.py:803] 2025-04-26 18:07:03,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8158 [WARNING|trainer.py:803] 2025-04-26 18:07:03,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:04,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8174 8126 [WARNING|trainer.py:803] 2025-04-26 18:07:05,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8159 [WARNING|trainer.py:803] 2025-04-26 18:07:05,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:06,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8127 8175 8160 [WARNING|trainer.py:803] 2025-04-26 18:07:07,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:07,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:08,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8176 8128 8161 [WARNING|trainer.py:803] 2025-04-26 18:07:09,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:09,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:10,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8177 8129 8162 [WARNING|trainer.py:803] 2025-04-26 18:07:11,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:12,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:12,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8178 8130 8163 [WARNING|trainer.py:803] 2025-04-26 18:07:13,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:14,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:14,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8179 8164 8131 [WARNING|trainer.py:803] 2025-04-26 18:07:15,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:16,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:07:16,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8180 8132 8165 [WARNING|trainer.py:803] 2025-04-26 18:07:18,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:07:18,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:07:18,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8181 8166 8133 [WARNING|trainer.py:803] 2025-04-26 18:07:20,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:20,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:20,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8182 8167 [WARNING|trainer.py:803] 2025-04-26 18:07:22,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8134 [WARNING|trainer.py:803] 2025-04-26 18:07:22,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:23,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8183 8168 8135 [WARNING|trainer.py:803] 2025-04-26 18:07:24,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:24,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:25,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8184 8169 8136 [WARNING|trainer.py:803] 2025-04-26 18:07:26,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:26,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:27,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8185 8170 8137 [WARNING|trainer.py:803] 2025-04-26 18:07:28,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:28,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:29,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8171 8186 8138 [WARNING|trainer.py:803] 2025-04-26 18:07:30,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:07:31,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:31,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8172 8187 8139 [WARNING|trainer.py:803] 2025-04-26 18:07:33,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:33,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:07:33,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8173 8188 8140 [WARNING|trainer.py:803] 2025-04-26 18:07:35,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:07:35,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:07:35,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8174 8189 8141 [WARNING|trainer.py:803] 2025-04-26 18:07:37,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:37,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:37,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8190 8175 8142 [WARNING|trainer.py:803] 2025-04-26 18:07:39,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:39,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:40,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8191 8176 8143 [WARNING|trainer.py:803] 2025-04-26 18:07:41,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:07:41,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:42,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8177 8192 8144 [WARNING|trainer.py:803] 2025-04-26 18:07:44,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:44,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:44,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8178 8193 8145 [WARNING|trainer.py:803] 2025-04-26 18:07:46,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:46,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:07:46,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8179 8194 8146 [WARNING|trainer.py:803] 2025-04-26 18:07:48,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:48,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:07:48,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8180 8195 8147 [WARNING|trainer.py:803] 2025-04-26 18:07:50,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:07:50,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:50,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8181 8196 8148 [WARNING|trainer.py:803] 2025-04-26 18:07:52,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:52,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:52,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8182 8197 8149 [WARNING|trainer.py:803] 2025-04-26 18:07:54,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:07:54,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:07:54,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8183 8198 8150 [WARNING|trainer.py:803] 2025-04-26 18:07:56,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:56,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:07:57,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8184 8199 8151 [WARNING|trainer.py:803] 2025-04-26 18:07:58,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:59,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:07:59,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8185 8200 8152 [WARNING|trainer.py:803] 2025-04-26 18:08:01,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:01,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:01,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8186 8201 8153 [WARNING|trainer.py:803] 2025-04-26 18:08:03,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:03,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:03,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8202 8187 8154 [WARNING|trainer.py:803] 2025-04-26 18:08:05,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:05,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:05,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8188 8203 8155 [WARNING|trainer.py:803] 2025-04-26 18:08:07,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:08:07,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:08:07,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8189 8156 8204 [WARNING|trainer.py:803] 2025-04-26 18:08:09,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:09,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:10,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8190 8157 8205 [WARNING|trainer.py:803] 2025-04-26 18:08:11,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:11,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:12,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8191 8158 8206 [WARNING|trainer.py:803] 2025-04-26 18:08:13,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:08:14,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:08:14,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8159 8192 8207 [WARNING|trainer.py:803] 2025-04-26 18:08:16,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:16,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:16,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8160 8193 8208 [WARNING|trainer.py:803] 2025-04-26 18:08:18,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:18,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:18,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8161 8194 8209 [WARNING|trainer.py:803] 2025-04-26 18:08:20,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:20,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:20,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8162 8195 8210 [WARNING|trainer.py:803] 2025-04-26 18:08:22,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:22,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:22,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8163 8196 8211 [WARNING|trainer.py:803] 2025-04-26 18:08:24,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:24,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:24,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8197 8164 8212 [WARNING|trainer.py:803] 2025-04-26 18:08:26,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:26,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:08:26,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8165 8213 8198 [WARNING|trainer.py:803] 2025-04-26 18:08:28,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:29,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:29,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8166 8214 8199 [WARNING|trainer.py:803] 2025-04-26 18:08:30,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:31,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:31,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8167 8215 8200 [WARNING|trainer.py:803] 2025-04-26 18:08:33,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:08:33,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:33,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8168 8216 8201 [WARNING|trainer.py:803] 2025-04-26 18:08:35,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:35,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:35,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8169 8217 8202 [WARNING|trainer.py:803] 2025-04-26 18:08:37,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:37,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:37,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8170 8203 8218 [WARNING|trainer.py:803] 2025-04-26 18:08:39,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:39,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:08:40,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8171 8204 8219 [WARNING|trainer.py:803] 2025-04-26 18:08:41,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:08:42,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:08:42,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8172 8205 8220 [WARNING|trainer.py:803] 2025-04-26 18:08:43,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:44,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:44,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8173 8206 8221 [WARNING|trainer.py:803] 2025-04-26 18:08:45,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:08:46,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:08:46,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8174 8207 [WARNING|trainer.py:803] 2025-04-26 18:08:47,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8222 [WARNING|trainer.py:803] 2025-04-26 18:08:48,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:08:49,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8175 8208 [WARNING|trainer.py:803] 2025-04-26 18:08:50,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8223 [WARNING|trainer.py:803] 2025-04-26 18:08:50,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:51,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8176 8209 [WARNING|trainer.py:803] 2025-04-26 18:08:52,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8224 [WARNING|trainer.py:803] 2025-04-26 18:08:52,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:08:53,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8177 8210 [WARNING|trainer.py:803] 2025-04-26 18:08:54,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8225 [WARNING|trainer.py:803] 2025-04-26 18:08:55,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:08:55,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8178 8211 [WARNING|trainer.py:803] 2025-04-26 18:08:56,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8226 [WARNING|trainer.py:803] 2025-04-26 18:08:56,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8179 [WARNING|trainer.py:803] 2025-04-26 18:08:57,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8212 [WARNING|trainer.py:803] 2025-04-26 18:08:58,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:08:58,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8227 8180 [WARNING|trainer.py:803] 2025-04-26 18:08:59,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8213 [WARNING|trainer.py:803] 2025-04-26 18:09:00,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:01,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8228 8181 [WARNING|trainer.py:803] 2025-04-26 18:09:01,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8214 [WARNING|trainer.py:803] 2025-04-26 18:09:02,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:03,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8229 8182 [WARNING|trainer.py:803] 2025-04-26 18:09:04,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8215 [WARNING|trainer.py:803] 2025-04-26 18:09:04,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:09:05,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8230 8183 [WARNING|trainer.py:803] 2025-04-26 18:09:06,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8216 [WARNING|trainer.py:803] 2025-04-26 18:09:07,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:07,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8231 [WARNING|trainer.py:803] 2025-04-26 18:09:08,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8184 8217 [WARNING|trainer.py:803] 2025-04-26 18:09:09,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:09,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8232 8185 [WARNING|trainer.py:803] 2025-04-26 18:09:10,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8218 [WARNING|trainer.py:803] 2025-04-26 18:09:11,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8233 [WARNING|trainer.py:803] 2025-04-26 18:09:11,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:12,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8186 8219 [WARNING|trainer.py:803] 2025-04-26 18:09:13,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8234 [WARNING|trainer.py:803] 2025-04-26 18:09:14,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:14,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8187 8220 [WARNING|trainer.py:803] 2025-04-26 18:09:15,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8235 [WARNING|trainer.py:803] 2025-04-26 18:09:16,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:17,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8188 8221 [WARNING|trainer.py:803] 2025-04-26 18:09:17,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8236 [WARNING|trainer.py:803] 2025-04-26 18:09:18,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8189 [WARNING|trainer.py:803] 2025-04-26 18:09:19,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:09:19,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8222 8237 [WARNING|trainer.py:803] 2025-04-26 18:09:20,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8190 [WARNING|trainer.py:803] 2025-04-26 18:09:21,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:09:21,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8223 8238 [WARNING|trainer.py:803] 2025-04-26 18:09:23,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8191 [WARNING|trainer.py:803] 2025-04-26 18:09:23,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:24,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8224 8239 [WARNING|trainer.py:803] 2025-04-26 18:09:25,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:25,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8192 [WARNING|trainer.py:803] 2025-04-26 18:09:26,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8225 8240 [WARNING|trainer.py:803] 2025-04-26 18:09:27,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:27,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8193 [WARNING|trainer.py:803] 2025-04-26 18:09:28,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8226 8241 [WARNING|trainer.py:803] 2025-04-26 18:09:29,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:29,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8194 [WARNING|trainer.py:803] 2025-04-26 18:09:30,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8227 8242 [WARNING|trainer.py:803] 2025-04-26 18:09:31,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:31,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8195 [WARNING|trainer.py:803] 2025-04-26 18:09:32,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8228 8243 [WARNING|trainer.py:803] 2025-04-26 18:09:33,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:09:33,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8196 [WARNING|trainer.py:803] 2025-04-26 18:09:34,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8244 8229 [WARNING|trainer.py:803] 2025-04-26 18:09:36,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8197 [WARNING|trainer.py:803] 2025-04-26 18:09:36,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:36,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8245 8230 [WARNING|trainer.py:803] 2025-04-26 18:09:38,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:38,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8198 8246 8231 [WARNING|trainer.py:803] 2025-04-26 18:09:39,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:09:40,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:40,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8199 8232 8247 [WARNING|trainer.py:803] 2025-04-26 18:09:41,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:42,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:42,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8200 8233 [WARNING|trainer.py:803] 2025-04-26 18:09:43,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8248 [WARNING|trainer.py:803] 2025-04-26 18:09:44,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:44,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8201 8234 8249 [WARNING|trainer.py:803] 2025-04-26 18:09:45,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:46,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:09:46,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8202 8250 8235 [WARNING|trainer.py:803] 2025-04-26 18:09:48,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:48,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:48,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8203 8251 [WARNING|trainer.py:803] 2025-04-26 18:09:50,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8236 [WARNING|trainer.py:803] 2025-04-26 18:09:50,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:51,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8204 8237 8252 [WARNING|trainer.py:803] 2025-04-26 18:09:52,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:53,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:09:53,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8205 8238 8253 [WARNING|trainer.py:803] 2025-04-26 18:09:54,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:55,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:55,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8206 8239 8254 [WARNING|trainer.py:803] 2025-04-26 18:09:57,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:09:57,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:09:57,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8207 8240 8255 [WARNING|trainer.py:803] 2025-04-26 18:09:59,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:09:59,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:09:59,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8208 8241 8256 [WARNING|trainer.py:803] 2025-04-26 18:10:01,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:01,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:01,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8209 8242 8257 [WARNING|trainer.py:803] 2025-04-26 18:10:03,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:10:03,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:04,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8210 8243 8258 [WARNING|trainer.py:803] 2025-04-26 18:10:05,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:05,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:06,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8211 8244 8259 [WARNING|trainer.py:803] 2025-04-26 18:10:07,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:07,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:08,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8212 8245 8260 [WARNING|trainer.py:803] 2025-04-26 18:10:09,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:10:10,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:10,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8213 8246 [WARNING|trainer.py:803] 2025-04-26 18:10:11,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8261 [WARNING|trainer.py:803] 2025-04-26 18:10:12,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:12,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8214 8247 8262 [WARNING|trainer.py:803] 2025-04-26 18:10:13,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:14,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:14,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8215 8248 8263 [WARNING|trainer.py:803] 2025-04-26 18:10:16,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:16,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:16,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8216 8249 [WARNING|trainer.py:803] 2025-04-26 18:10:18,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8264 [WARNING|trainer.py:803] 2025-04-26 18:10:18,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:19,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8217 8250 8265 [WARNING|trainer.py:803] 2025-04-26 18:10:20,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:20,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:21,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8218 8251 8266 [WARNING|trainer.py:803] 2025-04-26 18:10:22,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:22,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:23,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8219 8252 8267 [WARNING|trainer.py:803] 2025-04-26 18:10:25,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:10:25,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:25,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8253 8220 8268 [WARNING|trainer.py:803] 2025-04-26 18:10:27,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:27,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:27,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8254 8221 8269 [WARNING|trainer.py:803] 2025-04-26 18:10:29,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:29,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:29,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8270 8255 8222 [WARNING|trainer.py:803] 2025-04-26 18:10:31,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:10:31,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:32,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8271 8256 8223 [WARNING|trainer.py:803] 2025-04-26 18:10:33,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:10:34,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:10:34,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8272 8257 8224 [WARNING|trainer.py:803] 2025-04-26 18:10:35,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:36,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:10:36,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8273 8258 8225 [WARNING|trainer.py:803] 2025-04-26 18:10:38,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:38,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:38,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8274 8259 8226 [WARNING|trainer.py:803] 2025-04-26 18:10:40,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:10:40,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:10:40,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8275 8260 [WARNING|trainer.py:803] 2025-04-26 18:10:42,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8227 [WARNING|trainer.py:803] 2025-04-26 18:10:42,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:42,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8276 8261 8228 [WARNING|trainer.py:803] 2025-04-26 18:10:44,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:44,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:44,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8277 8262 8229 [WARNING|trainer.py:803] 2025-04-26 18:10:46,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:46,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:47,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8278 8263 8230 [WARNING|trainer.py:803] 2025-04-26 18:10:48,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:49,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:49,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8279 8231 8264 [WARNING|trainer.py:803] 2025-04-26 18:10:50,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:51,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:51,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8280 8232 8265 [WARNING|trainer.py:803] 2025-04-26 18:10:53,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:10:53,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:53,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8281 8233 8266 [WARNING|trainer.py:803] 2025-04-26 18:10:55,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:55,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:10:55,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8282 8267 8234 [WARNING|trainer.py:803] 2025-04-26 18:10:57,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:10:57,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:10:57,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8283 8268 8235 [WARNING|trainer.py:803] 2025-04-26 18:10:59,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:10:59,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:00,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8284 8269 8236 [WARNING|trainer.py:803] 2025-04-26 18:11:01,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:02,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:02,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8285 8270 8237 [WARNING|trainer.py:803] 2025-04-26 18:11:04,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:04,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:11:04,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8286 8271 8238 [WARNING|trainer.py:803] 2025-04-26 18:11:06,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:06,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:11:06,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8272 8287 8239 [WARNING|trainer.py:803] 2025-04-26 18:11:08,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:08,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:08,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8273 8288 8240 [WARNING|trainer.py:803] 2025-04-26 18:11:10,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:10,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:10,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8274 8289 8241 [WARNING|trainer.py:803] 2025-04-26 18:11:12,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:11:12,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:11:12,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8275 8290 8242 [WARNING|trainer.py:803] 2025-04-26 18:11:14,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:14,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:15,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8276 8291 8243 [WARNING|trainer.py:803] 2025-04-26 18:11:16,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:16,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:17,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8277 8244 8292 [WARNING|trainer.py:803] 2025-04-26 18:11:18,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:19,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:19,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8278 8245 8293 [WARNING|trainer.py:803] 2025-04-26 18:11:21,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:11:21,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:21,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8246 8279 8294 [WARNING|trainer.py:803] 2025-04-26 18:11:23,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:23,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:23,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8247 8280 8295 [WARNING|trainer.py:803] 2025-04-26 18:11:25,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:25,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:25,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8281 8248 8296 [WARNING|trainer.py:803] 2025-04-26 18:11:27,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:27,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:28,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8249 8282 8297 [WARNING|trainer.py:803] 2025-04-26 18:11:29,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:29,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:30,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8250 8283 8298 [WARNING|trainer.py:803] 2025-04-26 18:11:32,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:32,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:11:32,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8284 8251 8299 [WARNING|trainer.py:803] 2025-04-26 18:11:34,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:34,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:34,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8285 8252 8300 [WARNING|trainer.py:803] 2025-04-26 18:11:36,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:36,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:36,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8286 8253 8301 [WARNING|trainer.py:803] 2025-04-26 18:11:38,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:38,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:38,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8287 8254 8302 [WARNING|trainer.py:803] 2025-04-26 18:11:40,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:41,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:41,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8288 8303 8255 [WARNING|trainer.py:803] 2025-04-26 18:11:43,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:43,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:43,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8289 8304 8256 [WARNING|trainer.py:803] 2025-04-26 18:11:45,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:11:45,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:45,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8290 8305 8257 [WARNING|trainer.py:803] 2025-04-26 18:11:47,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:47,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:11:47,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8291 8306 8258 [WARNING|trainer.py:803] 2025-04-26 18:11:49,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:11:49,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:50,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8307 8292 8259 [WARNING|trainer.py:803] 2025-04-26 18:11:51,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:51,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:11:51,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8308 8260 8293 [WARNING|trainer.py:803] 2025-04-26 18:11:53,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:11:54,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:54,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8309 8261 8294 [WARNING|trainer.py:803] 2025-04-26 18:11:55,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:56,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:11:56,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8310 8262 [WARNING|trainer.py:803] 2025-04-26 18:11:57,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8295 [WARNING|trainer.py:803] 2025-04-26 18:11:58,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:11:58,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8311 [WARNING|trainer.py:803] 2025-04-26 18:11:59,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8263 8296 [WARNING|trainer.py:803] 2025-04-26 18:12:00,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8312 [WARNING|trainer.py:803] 2025-04-26 18:12:00,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:12:01,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8264 8297 8313 [WARNING|trainer.py:803] 2025-04-26 18:12:02,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:12:03,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:12:03,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8265 8298 8314 [WARNING|trainer.py:803] 2025-04-26 18:12:04,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:05,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:12:05,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8266 8299 8315 [WARNING|trainer.py:803] 2025-04-26 18:12:07,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:07,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:12:07,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8267 8316 8300 [WARNING|trainer.py:803] 2025-04-26 18:12:09,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:09,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:09,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8268 8317 8301 [WARNING|trainer.py:803] 2025-04-26 18:12:11,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:11,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:11,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8269 8318 8302 [WARNING|trainer.py:803] 2025-04-26 18:12:13,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:12:13,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:14,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8270 8319 8303 [WARNING|trainer.py:803] 2025-04-26 18:12:15,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:15,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:16,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8271 8320 8304 [WARNING|trainer.py:803] 2025-04-26 18:12:17,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:17,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:12:18,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8272 8321 8305 [WARNING|trainer.py:803] 2025-04-26 18:12:19,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:19,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:20,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8322 8273 8306 [WARNING|trainer.py:803] 2025-04-26 18:12:21,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:21,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:12:22,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8274 8323 8307 [WARNING|trainer.py:803] 2025-04-26 18:12:23,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:23,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:12:24,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8324 8275 8308 [WARNING|trainer.py:803] 2025-04-26 18:12:25,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:12:25,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:26,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8325 8276 8309 [WARNING|trainer.py:803] 2025-04-26 18:12:27,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:28,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:28,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8326 8277 8310 [WARNING|trainer.py:803] 2025-04-26 18:12:29,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:30,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:12:30,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8327 8311 8278 [WARNING|trainer.py:803] 2025-04-26 18:12:31,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:32,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:32,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8328 8312 8279 [WARNING|trainer.py:803] 2025-04-26 18:12:33,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:34,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:12:34,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8329 8313 [WARNING|trainer.py:803] 2025-04-26 18:12:35,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8280 [WARNING|trainer.py:803] 2025-04-26 18:12:36,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:36,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8330 8314 [WARNING|trainer.py:803] 2025-04-26 18:12:37,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8281 [WARNING|trainer.py:803] 2025-04-26 18:12:38,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8331 [WARNING|trainer.py:803] 2025-04-26 18:12:39,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8315 [WARNING|trainer.py:803] 2025-04-26 18:12:39,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8282 [WARNING|trainer.py:803] 2025-04-26 18:12:40,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8332 [WARNING|trainer.py:803] 2025-04-26 18:12:41,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8316 [WARNING|trainer.py:803] 2025-04-26 18:12:41,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:42,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8283 8333 [WARNING|trainer.py:803] 2025-04-26 18:12:43,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:43,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8317 [WARNING|trainer.py:803] 2025-04-26 18:12:44,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8284 8334 [WARNING|trainer.py:803] 2025-04-26 18:12:45,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:45,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8318 [WARNING|trainer.py:803] 2025-04-26 18:12:46,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8335 8285 [WARNING|trainer.py:803] 2025-04-26 18:12:47,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:12:47,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8319 [WARNING|trainer.py:803] 2025-04-26 18:12:48,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8336 8286 [WARNING|trainer.py:803] 2025-04-26 18:12:49,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:49,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8320 [WARNING|trainer.py:803] 2025-04-26 18:12:50,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8337 8287 [WARNING|trainer.py:803] 2025-04-26 18:12:51,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8321 [WARNING|trainer.py:803] 2025-04-26 18:12:52,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:12:52,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8338 8288 [WARNING|trainer.py:803] 2025-04-26 18:12:53,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8322 [WARNING|trainer.py:803] 2025-04-26 18:12:54,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:12:54,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8339 8289 [WARNING|trainer.py:803] 2025-04-26 18:12:55,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8323 [WARNING|trainer.py:803] 2025-04-26 18:12:56,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:12:56,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8340 8290 [WARNING|trainer.py:803] 2025-04-26 18:12:58,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8324 [WARNING|trainer.py:803] 2025-04-26 18:12:58,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:12:58,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8341 8291 [WARNING|trainer.py:803] 2025-04-26 18:13:00,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8325 [WARNING|trainer.py:803] 2025-04-26 18:13:00,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:00,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8342 8326 [WARNING|trainer.py:803] 2025-04-26 18:13:02,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8292 [WARNING|trainer.py:803] 2025-04-26 18:13:02,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:03,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8343 8327 [WARNING|trainer.py:803] 2025-04-26 18:13:04,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8293 [WARNING|trainer.py:803] 2025-04-26 18:13:04,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8344 [WARNING|trainer.py:803] 2025-04-26 18:13:05,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8328 [WARNING|trainer.py:803] 2025-04-26 18:13:06,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:06,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8294 8345 [WARNING|trainer.py:803] 2025-04-26 18:13:07,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:08,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8329 [WARNING|trainer.py:803] 2025-04-26 18:13:08,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8295 8346 8330 [WARNING|trainer.py:803] 2025-04-26 18:13:10,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:10,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:10,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8347 8296 8331 [WARNING|trainer.py:803] 2025-04-26 18:13:12,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:12,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:12,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8348 8297 8332 [WARNING|trainer.py:803] 2025-04-26 18:13:14,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:14,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:14,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8349 8333 8298 [WARNING|trainer.py:803] 2025-04-26 18:13:16,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:16,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:13:16,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8350 8334 8299 [WARNING|trainer.py:803] 2025-04-26 18:13:18,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:18,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:19,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8351 8335 8300 [WARNING|trainer.py:803] 2025-04-26 18:13:20,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:20,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:21,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8352 8336 [WARNING|trainer.py:803] 2025-04-26 18:13:22,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8301 [WARNING|trainer.py:803] 2025-04-26 18:13:22,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:23,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8353 8337 [WARNING|trainer.py:803] 2025-04-26 18:13:24,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8302 [WARNING|trainer.py:803] 2025-04-26 18:13:24,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:25,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8354 8338 [WARNING|trainer.py:803] 2025-04-26 18:13:26,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8303 [WARNING|trainer.py:803] 2025-04-26 18:13:27,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:27,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8355 8339 [WARNING|trainer.py:803] 2025-04-26 18:13:28,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8304 [WARNING|trainer.py:803] 2025-04-26 18:13:29,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:29,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8356 8340 [WARNING|trainer.py:803] 2025-04-26 18:13:30,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8305 [WARNING|trainer.py:803] 2025-04-26 18:13:31,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:31,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8357 8341 [WARNING|trainer.py:803] 2025-04-26 18:13:32,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8306 [WARNING|trainer.py:803] 2025-04-26 18:13:33,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:33,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8358 8342 [WARNING|trainer.py:803] 2025-04-26 18:13:35,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8307 [WARNING|trainer.py:803] 2025-04-26 18:13:35,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:36,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8359 8343 [WARNING|trainer.py:803] 2025-04-26 18:13:37,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8308 [WARNING|trainer.py:803] 2025-04-26 18:13:37,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:38,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8360 8344 [WARNING|trainer.py:803] 2025-04-26 18:13:39,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8309 [WARNING|trainer.py:803] 2025-04-26 18:13:39,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:40,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8361 8345 [WARNING|trainer.py:803] 2025-04-26 18:13:41,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8310 [WARNING|trainer.py:803] 2025-04-26 18:13:41,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:42,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8362 8346 [WARNING|trainer.py:803] 2025-04-26 18:13:43,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8311 [WARNING|trainer.py:803] 2025-04-26 18:13:43,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:44,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8363 8347 8312 [WARNING|trainer.py:803] 2025-04-26 18:13:45,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:45,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:46,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8364 8348 8313 [WARNING|trainer.py:803] 2025-04-26 18:13:47,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:47,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:48,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8365 8349 8314 [WARNING|trainer.py:803] 2025-04-26 18:13:49,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:49,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:50,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8366 8350 8315 [WARNING|trainer.py:803] 2025-04-26 18:13:51,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:51,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:13:52,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8367 8351 8316 [WARNING|trainer.py:803] 2025-04-26 18:13:53,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:53,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:54,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8368 8352 [WARNING|trainer.py:803] 2025-04-26 18:13:55,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8317 [WARNING|trainer.py:803] 2025-04-26 18:13:55,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:13:56,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8369 8353 [WARNING|trainer.py:803] 2025-04-26 18:13:57,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8318 [WARNING|trainer.py:803] 2025-04-26 18:13:57,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:13:58,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8370 8354 [WARNING|trainer.py:803] 2025-04-26 18:13:59,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8319 [WARNING|trainer.py:803] 2025-04-26 18:13:59,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:00,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8371 8355 [WARNING|trainer.py:803] 2025-04-26 18:14:01,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8320 [WARNING|trainer.py:803] 2025-04-26 18:14:01,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:02,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8372 8356 8321 [WARNING|trainer.py:803] 2025-04-26 18:14:03,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:04,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:14:04,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8373 8357 8322 [WARNING|trainer.py:803] 2025-04-26 18:14:05,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:06,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:14:06,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8374 8358 8323 [WARNING|trainer.py:803] 2025-04-26 18:14:08,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:08,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:14:08,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8359 8375 8324 [WARNING|trainer.py:803] 2025-04-26 18:14:10,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:14:10,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:10,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8360 8376 8325 [WARNING|trainer.py:803] 2025-04-26 18:14:12,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:12,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:12,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8361 8377 8326 [WARNING|trainer.py:803] 2025-04-26 18:14:14,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:14,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:14,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8362 8378 8327 [WARNING|trainer.py:803] 2025-04-26 18:14:16,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:16,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:16,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8363 8379 8328 [WARNING|trainer.py:803] 2025-04-26 18:14:18,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:18,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:18,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8364 8380 8329 [WARNING|trainer.py:803] 2025-04-26 18:14:20,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:20,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:20,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8365 8381 8330 [WARNING|trainer.py:803] 2025-04-26 18:14:22,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:14:22,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:22,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8366 8331 8382 [WARNING|trainer.py:803] 2025-04-26 18:14:24,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:24,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:24,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8367 8332 8383 [WARNING|trainer.py:803] 2025-04-26 18:14:26,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:26,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:26,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8333 8368 8384 [WARNING|trainer.py:803] 2025-04-26 18:14:28,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:28,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:28,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8369 8334 8385 [WARNING|trainer.py:803] 2025-04-26 18:14:30,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:30,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:14:31,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8335 8370 8386 [WARNING|trainer.py:803] 2025-04-26 18:14:32,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:32,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:33,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8336 8371 8387 [WARNING|trainer.py:803] 2025-04-26 18:14:34,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:35,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:35,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8337 8372 8388 [WARNING|trainer.py:803] 2025-04-26 18:14:37,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:37,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:37,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8338 8373 8389 [WARNING|trainer.py:803] 2025-04-26 18:14:39,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:39,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:39,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8339 8374 [WARNING|trainer.py:803] 2025-04-26 18:14:41,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8390 [WARNING|trainer.py:803] 2025-04-26 18:14:41,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:42,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8340 8375 8391 [WARNING|trainer.py:803] 2025-04-26 18:14:43,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:43,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:44,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8341 8376 [WARNING|trainer.py:803] 2025-04-26 18:14:45,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8392 [WARNING|trainer.py:803] 2025-04-26 18:14:45,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:14:46,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8342 8377 8393 [WARNING|trainer.py:803] 2025-04-26 18:14:47,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:47,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:48,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8378 8343 8394 [WARNING|trainer.py:803] 2025-04-26 18:14:49,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:49,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:50,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8379 8344 8395 [WARNING|trainer.py:803] 2025-04-26 18:14:51,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:51,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:52,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8345 8380 [WARNING|trainer.py:803] 2025-04-26 18:14:53,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:14:53,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8396 [WARNING|trainer.py:803] 2025-04-26 18:14:54,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8346 8381 8397 [WARNING|trainer.py:803] 2025-04-26 18:14:55,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:55,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:14:56,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8347 8382 [WARNING|trainer.py:803] 2025-04-26 18:14:57,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:14:57,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8398 [WARNING|trainer.py:803] 2025-04-26 18:14:58,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8348 8383 [WARNING|trainer.py:803] 2025-04-26 18:14:59,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8399 [WARNING|trainer.py:803] 2025-04-26 18:14:59,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:00,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8349 8384 8400 [WARNING|trainer.py:803] 2025-04-26 18:15:01,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:01,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:02,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8385 8350 8401 [WARNING|trainer.py:803] 2025-04-26 18:15:03,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:04,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:04,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8351 8386 8402 [WARNING|trainer.py:803] 2025-04-26 18:15:06,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:06,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:06,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8352 8403 8387 [WARNING|trainer.py:803] 2025-04-26 18:15:08,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:08,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:08,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8404 8353 8388 [WARNING|trainer.py:803] 2025-04-26 18:15:09,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:10,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:10,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8405 8354 [WARNING|trainer.py:803] 2025-04-26 18:15:11,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8389 [WARNING|trainer.py:803] 2025-04-26 18:15:12,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:12,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8406 8355 [WARNING|trainer.py:803] 2025-04-26 18:15:13,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8390 [WARNING|trainer.py:803] 2025-04-26 18:15:14,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8407 [WARNING|trainer.py:803] 2025-04-26 18:15:14,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:15,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8356 8391 8408 [WARNING|trainer.py:803] 2025-04-26 18:15:16,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:15:17,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:17,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8357 8409 8392 [WARNING|trainer.py:803] 2025-04-26 18:15:18,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:15:18,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:19,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8358 8410 8393 [WARNING|trainer.py:803] 2025-04-26 18:15:20,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:15:20,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:21,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8411 8359 8394 [WARNING|trainer.py:803] 2025-04-26 18:15:22,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:22,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:15:23,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8412 8360 8395 [WARNING|trainer.py:803] 2025-04-26 18:15:24,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:24,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:25,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8413 8361 [WARNING|trainer.py:803] 2025-04-26 18:15:26,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8396 [WARNING|trainer.py:803] 2025-04-26 18:15:26,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8414 [WARNING|trainer.py:803] 2025-04-26 18:15:27,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8362 [WARNING|trainer.py:803] 2025-04-26 18:15:27,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8397 [WARNING|trainer.py:803] 2025-04-26 18:15:28,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8415 [WARNING|trainer.py:803] 2025-04-26 18:15:29,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:15:29,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8363 8398 [WARNING|trainer.py:803] 2025-04-26 18:15:30,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8416 [WARNING|trainer.py:803] 2025-04-26 18:15:31,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:31,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8364 8417 8399 [WARNING|trainer.py:803] 2025-04-26 18:15:32,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:15:33,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:33,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8365 8418 8400 [WARNING|trainer.py:803] 2025-04-26 18:15:34,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:35,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:35,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8366 8419 8401 [WARNING|trainer.py:803] 2025-04-26 18:15:36,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:37,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:37,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8420 8367 8402 [WARNING|trainer.py:803] 2025-04-26 18:15:38,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:38,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:39,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8421 8403 8368 [WARNING|trainer.py:803] 2025-04-26 18:15:40,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:40,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:41,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8422 8404 8369 [WARNING|trainer.py:803] 2025-04-26 18:15:42,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:42,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:43,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8423 8405 8370 [WARNING|trainer.py:803] 2025-04-26 18:15:44,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:44,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:15:45,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8424 8406 [WARNING|trainer.py:803] 2025-04-26 18:15:46,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:46,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8371 [WARNING|trainer.py:803] 2025-04-26 18:15:47,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8407 8425 [WARNING|trainer.py:803] 2025-04-26 18:15:48,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:48,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8372 8426 8408 [WARNING|trainer.py:803] 2025-04-26 18:15:49,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:15:49,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:49,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8373 8427 8409 [WARNING|trainer.py:803] 2025-04-26 18:15:51,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:15:51,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:51,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8374 8428 8410 [WARNING|trainer.py:803] 2025-04-26 18:15:53,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:53,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:53,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8429 8411 8375 [WARNING|trainer.py:803] 2025-04-26 18:15:55,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:55,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:55,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8412 8430 8376 [WARNING|trainer.py:803] 2025-04-26 18:15:57,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:15:57,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:15:57,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8413 8431 8377 [WARNING|trainer.py:803] 2025-04-26 18:15:58,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:59,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:15:59,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8414 8432 [WARNING|trainer.py:803] 2025-04-26 18:16:00,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:00,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8378 8415 [WARNING|trainer.py:803] 2025-04-26 18:16:01,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8433 [WARNING|trainer.py:803] 2025-04-26 18:16:02,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:02,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8379 8416 8434 [WARNING|trainer.py:803] 2025-04-26 18:16:03,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:04,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:04,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8380 8417 8435 [WARNING|trainer.py:803] 2025-04-26 18:16:05,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:05,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:06,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8418 8381 8436 [WARNING|trainer.py:803] 2025-04-26 18:16:07,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:07,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:08,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8419 8437 8382 [WARNING|trainer.py:803] 2025-04-26 18:16:09,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:09,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:09,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8420 8438 8383 [WARNING|trainer.py:803] 2025-04-26 18:16:11,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:11,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:11,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8421 8439 [WARNING|trainer.py:803] 2025-04-26 18:16:13,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8384 [WARNING|trainer.py:803] 2025-04-26 18:16:13,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:16:14,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8422 8440 [WARNING|trainer.py:803] 2025-04-26 18:16:14,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:15,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8385 8423 [WARNING|trainer.py:803] 2025-04-26 18:16:16,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8441 [WARNING|trainer.py:803] 2025-04-26 18:16:16,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:17,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8386 8424 8442 [WARNING|trainer.py:803] 2025-04-26 18:16:18,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:18,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:18,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8425 8387 8443 [WARNING|trainer.py:803] 2025-04-26 18:16:20,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:20,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:20,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8426 8388 8444 [WARNING|trainer.py:803] 2025-04-26 18:16:22,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:16:22,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:22,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8427 8445 8389 [WARNING|trainer.py:803] 2025-04-26 18:16:24,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:24,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:24,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8428 8446 [WARNING|trainer.py:803] 2025-04-26 18:16:25,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8390 [WARNING|trainer.py:803] 2025-04-26 18:16:26,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8429 [WARNING|trainer.py:803] 2025-04-26 18:16:27,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8447 [WARNING|trainer.py:803] 2025-04-26 18:16:27,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:28,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8391 8430 8448 [WARNING|trainer.py:803] 2025-04-26 18:16:29,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:29,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:29,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8392 8431 8449 [WARNING|trainer.py:803] 2025-04-26 18:16:31,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:31,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:31,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8432 8393 8450 [WARNING|trainer.py:803] 2025-04-26 18:16:33,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:16:33,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:33,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8433 8394 8451 [WARNING|trainer.py:803] 2025-04-26 18:16:34,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:35,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:35,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8434 8452 8395 [WARNING|trainer.py:803] 2025-04-26 18:16:36,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:37,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:37,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8435 8453 [WARNING|trainer.py:803] 2025-04-26 18:16:38,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8396 [WARNING|trainer.py:803] 2025-04-26 18:16:39,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:39,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8436 8454 [WARNING|trainer.py:803] 2025-04-26 18:16:40,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8397 [WARNING|trainer.py:803] 2025-04-26 18:16:40,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8437 [WARNING|trainer.py:803] 2025-04-26 18:16:41,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8455 [WARNING|trainer.py:803] 2025-04-26 18:16:42,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:42,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8398 8438 8456 [WARNING|trainer.py:803] 2025-04-26 18:16:43,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:44,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:44,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8399 8439 8457 [WARNING|trainer.py:803] 2025-04-26 18:16:45,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:45,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:46,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8440 8458 8400 [WARNING|trainer.py:803] 2025-04-26 18:16:47,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:47,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:47,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8441 8459 8401 [WARNING|trainer.py:803] 2025-04-26 18:16:49,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:49,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:49,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8442 8460 8402 [WARNING|trainer.py:803] 2025-04-26 18:16:51,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:51,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:51,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8443 8461 8403 [WARNING|trainer.py:803] 2025-04-26 18:16:53,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:53,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:16:53,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8462 8444 8404 [WARNING|trainer.py:803] 2025-04-26 18:16:55,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:55,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:16:55,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8463 8445 8405 [WARNING|trainer.py:803] 2025-04-26 18:16:56,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:16:56,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:56,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8464 8446 8406 [WARNING|trainer.py:803] 2025-04-26 18:16:58,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:16:58,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:16:58,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8407 8465 8447 [WARNING|trainer.py:803] 2025-04-26 18:17:00,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:00,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:00,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8466 8448 8408 [WARNING|trainer.py:803] 2025-04-26 18:17:02,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:02,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:02,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8467 8449 8409 [WARNING|trainer.py:803] 2025-04-26 18:17:03,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:04,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:04,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8468 8410 8450 [WARNING|trainer.py:803] 2025-04-26 18:17:05,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:05,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:05,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8469 8411 8451 [WARNING|trainer.py:803] 2025-04-26 18:17:07,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:07,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:07,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8470 8412 8452 [WARNING|trainer.py:803] 2025-04-26 18:17:09,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:09,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:09,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8471 8413 8453 [WARNING|trainer.py:803] 2025-04-26 18:17:11,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:11,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:11,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8472 8454 8414 [WARNING|trainer.py:803] 2025-04-26 18:17:12,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:13,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:13,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8473 8455 8415 [WARNING|trainer.py:803] 2025-04-26 18:17:14,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:14,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:15,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8474 8456 8416 [WARNING|trainer.py:803] 2025-04-26 18:17:16,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:16,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:16,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8475 8457 8417 [WARNING|trainer.py:803] 2025-04-26 18:17:18,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:18,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:18,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8476 8458 8418 [WARNING|trainer.py:803] 2025-04-26 18:17:20,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:20,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:20,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8477 8459 8419 [WARNING|trainer.py:803] 2025-04-26 18:17:21,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:22,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:22,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8478 8460 8420 [WARNING|trainer.py:803] 2025-04-26 18:17:23,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:23,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:23,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8461 8479 8421 [WARNING|trainer.py:803] 2025-04-26 18:17:25,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:25,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:17:25,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8480 8462 8422 [WARNING|trainer.py:803] 2025-04-26 18:17:27,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:27,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:27,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8481 8463 8423 [WARNING|trainer.py:803] 2025-04-26 18:17:29,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:29,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:29,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8482 8464 8424 [WARNING|trainer.py:803] 2025-04-26 18:17:30,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:17:30,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:17:31,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8483 8465 8425 [WARNING|trainer.py:803] 2025-04-26 18:17:32,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:32,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:33,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8484 8466 8426 [WARNING|trainer.py:803] 2025-04-26 18:17:34,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:34,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:34,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8485 8467 8427 [WARNING|trainer.py:803] 2025-04-26 18:17:36,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:36,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:36,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8468 8486 8428 [WARNING|trainer.py:803] 2025-04-26 18:17:38,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:38,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:17:38,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8469 8487 8429 [WARNING|trainer.py:803] 2025-04-26 18:17:39,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:40,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:40,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8470 8488 8430 [WARNING|trainer.py:803] 2025-04-26 18:17:41,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:17:41,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:17:42,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8471 8489 [WARNING|trainer.py:803] 2025-04-26 18:17:43,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:43,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8431 [WARNING|trainer.py:803] 2025-04-26 18:17:44,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8490 8472 8432 [WARNING|trainer.py:803] 2025-04-26 18:17:45,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:45,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:45,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8473 8491 8433 [WARNING|trainer.py:803] 2025-04-26 18:17:46,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:47,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:47,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8474 8492 [WARNING|trainer.py:803] 2025-04-26 18:17:48,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8434 [WARNING|trainer.py:803] 2025-04-26 18:17:48,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:17:49,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8493 8475 8435 [WARNING|trainer.py:803] 2025-04-26 18:17:50,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:50,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:51,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8494 8476 8436 [WARNING|trainer.py:803] 2025-04-26 18:17:52,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:52,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:53,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8477 8495 8437 [WARNING|trainer.py:803] 2025-04-26 18:17:54,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:54,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:54,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8478 8496 8438 [WARNING|trainer.py:803] 2025-04-26 18:17:55,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:17:55,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:56,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8479 8497 8439 [WARNING|trainer.py:803] 2025-04-26 18:17:57,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:57,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:17:58,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8480 8498 8440 [WARNING|trainer.py:803] 2025-04-26 18:17:59,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:17:59,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:17:59,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8481 8499 8441 [WARNING|trainer.py:803] 2025-04-26 18:18:01,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:01,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:01,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8482 8500 8442 [WARNING|trainer.py:803] 2025-04-26 18:18:02,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:03,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:03,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8483 8501 8443 [WARNING|trainer.py:803] 2025-04-26 18:18:04,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:18:04,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:05,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8484 8502 8444 [WARNING|trainer.py:803] 2025-04-26 18:18:06,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:06,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:07,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8485 8503 8445 [WARNING|trainer.py:803] 2025-04-26 18:18:08,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:08,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:09,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8486 8504 8446 [WARNING|trainer.py:803] 2025-04-26 18:18:10,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:18:10,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:11,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8505 8487 8447 [WARNING|trainer.py:803] 2025-04-26 18:18:12,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:18:12,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:12,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8488 8506 8448 [WARNING|trainer.py:803] 2025-04-26 18:18:13,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:14,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:14,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8489 8507 8449 [WARNING|trainer.py:803] 2025-04-26 18:18:15,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:15,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:16,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8490 8508 [WARNING|trainer.py:803] 2025-04-26 18:18:17,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8450 [WARNING|trainer.py:803] 2025-04-26 18:18:17,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:18,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8491 8509 [WARNING|trainer.py:803] 2025-04-26 18:18:19,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8451 [WARNING|trainer.py:803] 2025-04-26 18:18:19,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:20,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8492 8510 [WARNING|trainer.py:803] 2025-04-26 18:18:20,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8452 [WARNING|trainer.py:803] 2025-04-26 18:18:21,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8493 [WARNING|trainer.py:803] 2025-04-26 18:18:21,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8511 [WARNING|trainer.py:803] 2025-04-26 18:18:22,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8453 [WARNING|trainer.py:803] 2025-04-26 18:18:23,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8494 [WARNING|trainer.py:803] 2025-04-26 18:18:23,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8512 [WARNING|trainer.py:803] 2025-04-26 18:18:24,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8454 [WARNING|trainer.py:803] 2025-04-26 18:18:24,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8495 [WARNING|trainer.py:803] 2025-04-26 18:18:25,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8513 [WARNING|trainer.py:803] 2025-04-26 18:18:26,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8455 [WARNING|trainer.py:803] 2025-04-26 18:18:26,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8496 [WARNING|trainer.py:803] 2025-04-26 18:18:27,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8514 [WARNING|trainer.py:803] 2025-04-26 18:18:28,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8456 [WARNING|trainer.py:803] 2025-04-26 18:18:28,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:28,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8497 8515 [WARNING|trainer.py:803] 2025-04-26 18:18:29,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8457 [WARNING|trainer.py:803] 2025-04-26 18:18:30,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:18:30,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8498 8516 8458 [WARNING|trainer.py:803] 2025-04-26 18:18:31,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:32,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:32,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8499 8517 8459 [WARNING|trainer.py:803] 2025-04-26 18:18:33,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:34,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:34,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8500 8518 8460 [WARNING|trainer.py:803] 2025-04-26 18:18:35,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:35,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:36,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8501 8461 8519 [WARNING|trainer.py:803] 2025-04-26 18:18:37,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:37,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:37,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8502 8462 8520 [WARNING|trainer.py:803] 2025-04-26 18:18:38,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:39,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:39,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8503 8463 [WARNING|trainer.py:803] 2025-04-26 18:18:40,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8521 [WARNING|trainer.py:803] 2025-04-26 18:18:41,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:41,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8504 8464 [WARNING|trainer.py:803] 2025-04-26 18:18:42,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8522 [WARNING|trainer.py:803] 2025-04-26 18:18:43,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:18:43,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8505 8465 8523 [WARNING|trainer.py:803] 2025-04-26 18:18:44,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:18:45,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:45,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8506 8466 8524 [WARNING|trainer.py:803] 2025-04-26 18:18:46,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:46,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:47,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8507 8467 [WARNING|trainer.py:803] 2025-04-26 18:18:48,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8525 [WARNING|trainer.py:803] 2025-04-26 18:18:48,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:18:49,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8508 8468 8526 [WARNING|trainer.py:803] 2025-04-26 18:18:50,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:50,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:50,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8509 8469 8527 [WARNING|trainer.py:803] 2025-04-26 18:18:51,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:18:51,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:52,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8510 8470 8528 [WARNING|trainer.py:803] 2025-04-26 18:18:53,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:53,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:54,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8511 8471 8529 [WARNING|trainer.py:803] 2025-04-26 18:18:55,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:18:55,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:56,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8512 8472 8530 [WARNING|trainer.py:803] 2025-04-26 18:18:57,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:57,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:57,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8513 8473 8531 [WARNING|trainer.py:803] 2025-04-26 18:18:59,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:18:59,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:18:59,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8474 8514 8532 [WARNING|trainer.py:803] 2025-04-26 18:19:01,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:01,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:01,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8515 8475 8533 [WARNING|trainer.py:803] 2025-04-26 18:19:02,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:19:02,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:03,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8516 8476 8534 [WARNING|trainer.py:803] 2025-04-26 18:19:04,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:04,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:05,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8517 8477 8535 [WARNING|trainer.py:803] 2025-04-26 18:19:06,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:06,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:07,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8478 8518 8536 [WARNING|trainer.py:803] 2025-04-26 18:19:08,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:08,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:08,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8479 8519 8537 [WARNING|trainer.py:803] 2025-04-26 18:19:10,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:10,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:10,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8480 8520 8538 [WARNING|trainer.py:803] 2025-04-26 18:19:11,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:12,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:12,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8481 8521 8539 [WARNING|trainer.py:803] 2025-04-26 18:19:13,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:13,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:14,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8482 8522 8540 [WARNING|trainer.py:803] 2025-04-26 18:19:15,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:19:15,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:19:15,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8483 8523 8541 [WARNING|trainer.py:803] 2025-04-26 18:19:17,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:17,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:17,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8484 8524 8542 [WARNING|trainer.py:803] 2025-04-26 18:19:19,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:19,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:19,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8485 8543 8525 [WARNING|trainer.py:803] 2025-04-26 18:19:21,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:21,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:21,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8486 8544 8526 [WARNING|trainer.py:803] 2025-04-26 18:19:22,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:19:23,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:23,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8487 8527 8545 [WARNING|trainer.py:803] 2025-04-26 18:19:24,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:25,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:25,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8488 8528 8546 [WARNING|trainer.py:803] 2025-04-26 18:19:26,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:19:26,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:26,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8489 8529 8547 [WARNING|trainer.py:803] 2025-04-26 18:19:28,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:19:28,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:28,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8490 8530 8548 [WARNING|trainer.py:803] 2025-04-26 18:19:30,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:30,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:30,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8491 8531 8549 [WARNING|trainer.py:803] 2025-04-26 18:19:31,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:32,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:32,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8492 8532 8550 [WARNING|trainer.py:803] 2025-04-26 18:19:33,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:34,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:34,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8493 8551 8533 [WARNING|trainer.py:803] 2025-04-26 18:19:35,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:36,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:36,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8494 [WARNING|trainer.py:803] 2025-04-26 18:19:37,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8552 8534 [WARNING|trainer.py:803] 2025-04-26 18:19:37,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:37,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8495 [WARNING|trainer.py:803] 2025-04-26 18:19:38,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8553 8535 8496 [WARNING|trainer.py:803] 2025-04-26 18:19:39,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:39,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:40,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8536 8554 8497 [WARNING|trainer.py:803] 2025-04-26 18:19:41,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:41,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:42,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8537 8555 8498 [WARNING|trainer.py:803] 2025-04-26 18:19:43,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:43,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:44,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8538 8556 8499 [WARNING|trainer.py:803] 2025-04-26 18:19:45,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:45,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:45,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8557 8539 8500 [WARNING|trainer.py:803] 2025-04-26 18:19:47,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:19:47,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:47,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8540 8558 8501 [WARNING|trainer.py:803] 2025-04-26 18:19:48,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:48,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:49,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8559 8541 8502 [WARNING|trainer.py:803] 2025-04-26 18:19:50,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:19:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:51,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8560 8542 8503 [WARNING|trainer.py:803] 2025-04-26 18:19:52,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:52,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:53,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8561 8543 8504 [WARNING|trainer.py:803] 2025-04-26 18:19:54,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:54,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:55,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8562 8544 8505 [WARNING|trainer.py:803] 2025-04-26 18:19:56,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:19:56,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:56,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8563 8545 [WARNING|trainer.py:803] 2025-04-26 18:19:57,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8506 [WARNING|trainer.py:803] 2025-04-26 18:19:58,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:19:58,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8564 8546 8507 [WARNING|trainer.py:803] 2025-04-26 18:19:59,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:19:59,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:20:00,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8565 8547 8508 [WARNING|trainer.py:803] 2025-04-26 18:20:01,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:01,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:02,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8566 8548 8509 [WARNING|trainer.py:803] 2025-04-26 18:20:03,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:03,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:04,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8567 8549 8510 [WARNING|trainer.py:803] 2025-04-26 18:20:05,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:05,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:06,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8568 8550 8511 [WARNING|trainer.py:803] 2025-04-26 18:20:07,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:07,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:07,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8569 8551 8512 [WARNING|trainer.py:803] 2025-04-26 18:20:08,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:09,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8570 [WARNING|trainer.py:803] 2025-04-26 18:20:09,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8552 [WARNING|trainer.py:803] 2025-04-26 18:20:10,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8513 [WARNING|trainer.py:803] 2025-04-26 18:20:11,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8571 [WARNING|trainer.py:803] 2025-04-26 18:20:11,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:12,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8553 8514 [WARNING|trainer.py:803] 2025-04-26 18:20:13,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8572 [WARNING|trainer.py:803] 2025-04-26 18:20:13,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:13,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8554 8515 [WARNING|trainer.py:803] 2025-04-26 18:20:14,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8573 [WARNING|trainer.py:803] 2025-04-26 18:20:15,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:20:15,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8555 8516 8574 [WARNING|trainer.py:803] 2025-04-26 18:20:16,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:16,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:17,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8556 8517 8575 [WARNING|trainer.py:803] 2025-04-26 18:20:18,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:18,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:19,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8557 8518 8576 [WARNING|trainer.py:803] 2025-04-26 18:20:20,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:20:20,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:20,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8558 8519 8577 [WARNING|trainer.py:803] 2025-04-26 18:20:22,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:22,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:22,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8559 8520 8578 [WARNING|trainer.py:803] 2025-04-26 18:20:24,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:24,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:24,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8560 8521 8579 [WARNING|trainer.py:803] 2025-04-26 18:20:25,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:26,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:26,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8561 8580 8522 [WARNING|trainer.py:803] 2025-04-26 18:20:27,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:28,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:28,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8562 8523 8581 [WARNING|trainer.py:803] 2025-04-26 18:20:29,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:29,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:30,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8563 8524 8582 [WARNING|trainer.py:803] 2025-04-26 18:20:31,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:31,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:32,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8564 8525 8583 [WARNING|trainer.py:803] 2025-04-26 18:20:33,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:33,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:34,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8565 8526 8584 [WARNING|trainer.py:803] 2025-04-26 18:20:35,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:35,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:35,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8566 8527 8585 [WARNING|trainer.py:803] 2025-04-26 18:20:36,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:37,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:37,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8567 8528 8586 [WARNING|trainer.py:803] 2025-04-26 18:20:38,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:39,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:39,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8568 8529 8587 [WARNING|trainer.py:803] 2025-04-26 18:20:40,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:41,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:41,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8569 8530 8588 [WARNING|trainer.py:803] 2025-04-26 18:20:42,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:42,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:43,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8570 8531 8589 [WARNING|trainer.py:803] 2025-04-26 18:20:44,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:44,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:44,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8571 8590 8532 [WARNING|trainer.py:803] 2025-04-26 18:20:45,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:20:46,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:20:46,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8572 8591 [WARNING|trainer.py:803] 2025-04-26 18:20:47,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8533 [WARNING|trainer.py:803] 2025-04-26 18:20:48,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:20:48,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8573 8592 [WARNING|trainer.py:803] 2025-04-26 18:20:49,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8534 [WARNING|trainer.py:803] 2025-04-26 18:20:50,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 8574 [WARNING|trainer.py:803] 2025-04-26 18:20:50,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8593 [WARNING|trainer.py:803] 2025-04-26 18:20:51,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8535 [WARNING|trainer.py:803] 2025-04-26 18:20:51,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8575 [WARNING|trainer.py:803] 2025-04-26 18:20:52,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8594 [WARNING|trainer.py:803] 2025-04-26 18:20:52,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8536 [WARNING|trainer.py:803] 2025-04-26 18:20:53,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8576 [WARNING|trainer.py:803] 2025-04-26 18:20:54,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8595 [WARNING|trainer.py:803] 2025-04-26 18:20:54,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8537 [WARNING|trainer.py:803] 2025-04-26 18:20:55,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8577 [WARNING|trainer.py:803] 2025-04-26 18:20:55,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8596 [WARNING|trainer.py:803] 2025-04-26 18:20:56,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8538 [WARNING|trainer.py:803] 2025-04-26 18:20:57,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8578 [WARNING|trainer.py:803] 2025-04-26 18:20:57,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:20:58,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8597 8539 8579 [WARNING|trainer.py:803] 2025-04-26 18:20:59,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:20:59,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:00,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8598 8540 [WARNING|trainer.py:803] 2025-04-26 18:21:00,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8580 [WARNING|trainer.py:803] 2025-04-26 18:21:01,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:01,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8599 8541 [WARNING|trainer.py:803] 2025-04-26 18:21:02,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8581 [WARNING|trainer.py:803] 2025-04-26 18:21:03,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8600 [WARNING|trainer.py:803] 2025-04-26 18:21:03,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8542 [WARNING|trainer.py:803] 2025-04-26 18:21:04,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8582 [WARNING|trainer.py:803] 2025-04-26 18:21:05,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8601 [WARNING|trainer.py:803] 2025-04-26 18:21:05,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8543 [WARNING|trainer.py:803] 2025-04-26 18:21:06,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8583 [WARNING|trainer.py:803] 2025-04-26 18:21:06,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8602 [WARNING|trainer.py:803] 2025-04-26 18:21:07,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:07,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8544 8584 8603 [WARNING|trainer.py:803] 2025-04-26 18:21:08,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:09,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:09,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8545 8585 8604 [WARNING|trainer.py:803] 2025-04-26 18:21:10,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:11,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:21:11,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8546 8605 8586 [WARNING|trainer.py:803] 2025-04-26 18:21:12,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:21:13,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:13,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8547 8606 8587 [WARNING|trainer.py:803] 2025-04-26 18:21:14,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:14,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:21:14,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8548 8607 8588 [WARNING|trainer.py:803] 2025-04-26 18:21:16,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:16,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:16,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8549 8608 8589 [WARNING|trainer.py:803] 2025-04-26 18:21:18,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:18,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:18,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8550 8609 8590 [WARNING|trainer.py:803] 2025-04-26 18:21:20,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:20,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:20,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8551 8610 8591 [WARNING|trainer.py:803] 2025-04-26 18:21:21,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:21:21,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:22,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8552 8611 8592 [WARNING|trainer.py:803] 2025-04-26 18:21:23,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:23,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:23,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8612 8553 8593 [WARNING|trainer.py:803] 2025-04-26 18:21:25,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:25,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:21:25,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8613 8554 8594 [WARNING|trainer.py:803] 2025-04-26 18:21:27,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:27,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:27,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8614 8595 8555 [WARNING|trainer.py:803] 2025-04-26 18:21:29,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:29,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:21:29,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8615 8556 8596 [WARNING|trainer.py:803] 2025-04-26 18:21:30,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:31,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:31,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8616 8557 8597 [WARNING|trainer.py:803] 2025-04-26 18:21:32,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:33,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:21:33,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8617 8598 8558 [WARNING|trainer.py:803] 2025-04-26 18:21:34,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:34,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:21:35,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8618 8599 8559 [WARNING|trainer.py:803] 2025-04-26 18:21:36,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:36,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:36,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8619 8600 8560 [WARNING|trainer.py:803] 2025-04-26 18:21:37,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:38,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:38,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8620 8601 [WARNING|trainer.py:803] 2025-04-26 18:21:39,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8561 [WARNING|trainer.py:803] 2025-04-26 18:21:40,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:40,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8621 8602 [WARNING|trainer.py:803] 2025-04-26 18:21:41,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8562 [WARNING|trainer.py:803] 2025-04-26 18:21:42,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8622 [WARNING|trainer.py:803] 2025-04-26 18:21:42,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8603 [WARNING|trainer.py:803] 2025-04-26 18:21:43,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8563 [WARNING|trainer.py:803] 2025-04-26 18:21:43,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8623 [WARNING|trainer.py:803] 2025-04-26 18:21:44,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8604 [WARNING|trainer.py:803] 2025-04-26 18:21:44,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8564 [WARNING|trainer.py:803] 2025-04-26 18:21:45,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8624 [WARNING|trainer.py:803] 2025-04-26 18:21:46,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8605 [WARNING|trainer.py:803] 2025-04-26 18:21:46,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8565 [WARNING|trainer.py:803] 2025-04-26 18:21:47,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8625 [WARNING|trainer.py:803] 2025-04-26 18:21:47,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8606 [WARNING|trainer.py:803] 2025-04-26 18:21:48,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8566 [WARNING|trainer.py:803] 2025-04-26 18:21:48,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8626 [WARNING|trainer.py:803] 2025-04-26 18:21:49,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8607 [WARNING|trainer.py:803] 2025-04-26 18:21:50,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8567 [WARNING|trainer.py:803] 2025-04-26 18:21:50,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8627 [WARNING|trainer.py:803] 2025-04-26 18:21:51,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8608 [WARNING|trainer.py:803] 2025-04-26 18:21:51,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8568 [WARNING|trainer.py:803] 2025-04-26 18:21:52,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8628 [WARNING|trainer.py:803] 2025-04-26 18:21:53,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:53,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8609 8569 8629 [WARNING|trainer.py:803] 2025-04-26 18:21:54,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:55,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:55,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8610 8570 8630 [WARNING|trainer.py:803] 2025-04-26 18:21:56,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:21:56,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:21:56,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8611 8631 8571 [WARNING|trainer.py:803] 2025-04-26 18:21:57,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:21:58,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:21:58,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8612 8632 8572 [WARNING|trainer.py:803] 2025-04-26 18:21:59,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:00,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:00,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8613 8633 8573 [WARNING|trainer.py:803] 2025-04-26 18:22:01,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:02,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:02,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8614 8634 8574 [WARNING|trainer.py:803] 2025-04-26 18:22:03,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:03,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:03,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8615 8635 8575 [WARNING|trainer.py:803] 2025-04-26 18:22:05,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:05,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:05,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8616 8636 8576 [WARNING|trainer.py:803] 2025-04-26 18:22:06,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:07,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:07,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8617 8637 8577 [WARNING|trainer.py:803] 2025-04-26 18:22:08,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:09,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:09,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8618 8578 8638 [WARNING|trainer.py:803] 2025-04-26 18:22:10,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:11,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:11,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8619 8639 8579 [WARNING|trainer.py:803] 2025-04-26 18:22:12,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:12,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:12,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8620 8640 8580 [WARNING|trainer.py:803] 2025-04-26 18:22:13,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:14,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:14,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8621 8641 [WARNING|trainer.py:803] 2025-04-26 18:22:15,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8581 [WARNING|trainer.py:803] 2025-04-26 18:22:16,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8622 [WARNING|trainer.py:803] 2025-04-26 18:22:16,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8642 [WARNING|trainer.py:803] 2025-04-26 18:22:17,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8582 [WARNING|trainer.py:803] 2025-04-26 18:22:17,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8623 [WARNING|trainer.py:803] 2025-04-26 18:22:18,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8643 [WARNING|trainer.py:803] 2025-04-26 18:22:19,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8583 [WARNING|trainer.py:803] 2025-04-26 18:22:19,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8624 [WARNING|trainer.py:803] 2025-04-26 18:22:20,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8644 [WARNING|trainer.py:803] 2025-04-26 18:22:21,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:21,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8584 8625 [WARNING|trainer.py:803] 2025-04-26 18:22:22,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8645 [WARNING|trainer.py:803] 2025-04-26 18:22:22,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:23,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8585 8626 8646 [WARNING|trainer.py:803] 2025-04-26 18:22:24,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:24,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:25,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8586 8627 8647 [WARNING|trainer.py:803] 2025-04-26 18:22:26,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:26,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:26,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8587 8628 8648 [WARNING|trainer.py:803] 2025-04-26 18:22:27,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:28,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:28,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8629 8588 8649 [WARNING|trainer.py:803] 2025-04-26 18:22:29,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:29,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:30,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8630 8589 8650 [WARNING|trainer.py:803] 2025-04-26 18:22:31,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:31,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:31,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8631 8590 8651 [WARNING|trainer.py:803] 2025-04-26 18:22:33,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:33,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:33,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8632 8591 8652 [WARNING|trainer.py:803] 2025-04-26 18:22:34,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:35,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:22:35,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8633 8592 8653 [WARNING|trainer.py:803] 2025-04-26 18:22:36,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:36,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:22:37,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8634 8593 8654 [WARNING|trainer.py:803] 2025-04-26 18:22:38,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:38,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:22:38,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8635 8655 8594 [WARNING|trainer.py:803] 2025-04-26 18:22:40,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:40,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:40,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8636 8656 8595 [WARNING|trainer.py:803] 2025-04-26 18:22:41,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:42,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:22:42,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8637 8657 8596 [WARNING|trainer.py:803] 2025-04-26 18:22:43,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:22:44,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:22:44,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8638 8658 [WARNING|trainer.py:803] 2025-04-26 18:22:45,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8597 [WARNING|trainer.py:803] 2025-04-26 18:22:45,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8639 [WARNING|trainer.py:803] 2025-04-26 18:22:46,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8659 [WARNING|trainer.py:803] 2025-04-26 18:22:47,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8598 [WARNING|trainer.py:803] 2025-04-26 18:22:47,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8640 [WARNING|trainer.py:803] 2025-04-26 18:22:48,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8660 [WARNING|trainer.py:803] 2025-04-26 18:22:48,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8599 [WARNING|trainer.py:803] 2025-04-26 18:22:49,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8641 [WARNING|trainer.py:803] 2025-04-26 18:22:49,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8661 [WARNING|trainer.py:803] 2025-04-26 18:22:50,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8600 [WARNING|trainer.py:803] 2025-04-26 18:22:50,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8642 [WARNING|trainer.py:803] 2025-04-26 18:22:51,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8662 [WARNING|trainer.py:803] 2025-04-26 18:22:52,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8601 [WARNING|trainer.py:803] 2025-04-26 18:22:52,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8643 [WARNING|trainer.py:803] 2025-04-26 18:22:53,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8663 [WARNING|trainer.py:803] 2025-04-26 18:22:54,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:54,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8602 8644 [WARNING|trainer.py:803] 2025-04-26 18:22:55,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8664 [WARNING|trainer.py:803] 2025-04-26 18:22:55,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:22:55,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8603 8645 [WARNING|trainer.py:803] 2025-04-26 18:22:56,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8665 [WARNING|trainer.py:803] 2025-04-26 18:22:57,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8604 [WARNING|trainer.py:803] 2025-04-26 18:22:57,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:22:58,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8666 8646 [WARNING|trainer.py:803] 2025-04-26 18:22:59,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8605 [WARNING|trainer.py:803] 2025-04-26 18:22:59,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:00,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8667 8647 8606 [WARNING|trainer.py:803] 2025-04-26 18:23:01,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:01,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:01,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8648 8668 [WARNING|trainer.py:803] 2025-04-26 18:23:02,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:23:02,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8607 [WARNING|trainer.py:803] 2025-04-26 18:23:03,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8649 8669 [WARNING|trainer.py:803] 2025-04-26 18:23:04,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:04,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8608 8650 [WARNING|trainer.py:803] 2025-04-26 18:23:05,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8670 [WARNING|trainer.py:803] 2025-04-26 18:23:06,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:06,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8609 8651 [WARNING|trainer.py:803] 2025-04-26 18:23:07,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8671 [WARNING|trainer.py:803] 2025-04-26 18:23:08,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:08,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 8610 :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:09,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8652 8672 [WARNING|trainer.py:803] 2025-04-26 18:23:10,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:10,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8611 8653 [WARNING|trainer.py:803] 2025-04-26 18:23:10,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8673 [WARNING|trainer.py:803] 2025-04-26 18:23:11,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:11,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8612 8654 [WARNING|trainer.py:803] 2025-04-26 18:23:12,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8674 [WARNING|trainer.py:803] 2025-04-26 18:23:13,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8613 [WARNING|trainer.py:803] 2025-04-26 18:23:13,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8655 [WARNING|trainer.py:803] 2025-04-26 18:23:14,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8675 [WARNING|trainer.py:803] 2025-04-26 18:23:15,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:15,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8614 8656 8676 [WARNING|trainer.py:803] 2025-04-26 18:23:16,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:17,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:23:17,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8615 8657 8677 [WARNING|trainer.py:803] 2025-04-26 18:23:18,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:18,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:23:18,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8616 8658 8678 [WARNING|trainer.py:803] 2025-04-26 18:23:19,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:20,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:23:20,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8617 8659 8679 [WARNING|trainer.py:803] 2025-04-26 18:23:21,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:22,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:23:22,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8618 8660 8680 [WARNING|trainer.py:803] 2025-04-26 18:23:23,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:23,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:24,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8619 8661 8681 [WARNING|trainer.py:803] 2025-04-26 18:23:25,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:25,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:25,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8620 8662 [WARNING|trainer.py:803] 2025-04-26 18:23:26,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8682 [WARNING|trainer.py:803] 2025-04-26 18:23:27,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:23:27,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8621 8663 [WARNING|trainer.py:803] 2025-04-26 18:23:28,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8683 [WARNING|trainer.py:803] 2025-04-26 18:23:29,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:23:29,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8622 8664 8684 [WARNING|trainer.py:803] 2025-04-26 18:23:30,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:30,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:23:31,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8623 8665 [WARNING|trainer.py:803] 2025-04-26 18:23:32,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8685 [WARNING|trainer.py:803] 2025-04-26 18:23:32,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:23:33,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8624 8666 [WARNING|trainer.py:803] 2025-04-26 18:23:33,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8686 [WARNING|trainer.py:803] 2025-04-26 18:23:34,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8625 [WARNING|trainer.py:803] 2025-04-26 18:23:34,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8667 [WARNING|trainer.py:803] 2025-04-26 18:23:35,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8687 [WARNING|trainer.py:803] 2025-04-26 18:23:36,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8626 [WARNING|trainer.py:803] 2025-04-26 18:23:36,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8668 [WARNING|trainer.py:803] 2025-04-26 18:23:37,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8688 [WARNING|trainer.py:803] 2025-04-26 18:23:37,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:23:38,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8627 8669 [WARNING|trainer.py:803] 2025-04-26 18:23:39,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8689 [WARNING|trainer.py:803] 2025-04-26 18:23:39,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8628 [WARNING|trainer.py:803] 2025-04-26 18:23:40,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8670 [WARNING|trainer.py:803] 2025-04-26 18:23:40,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8690 [WARNING|trainer.py:803] 2025-04-26 18:23:41,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8629 [WARNING|trainer.py:803] 2025-04-26 18:23:41,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8671 [WARNING|trainer.py:803] 2025-04-26 18:23:42,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8691 [WARNING|trainer.py:803] 2025-04-26 18:23:43,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8630 [WARNING|trainer.py:803] 2025-04-26 18:23:43,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8672 [WARNING|trainer.py:803] 2025-04-26 18:23:44,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8692 [WARNING|trainer.py:803] 2025-04-26 18:23:44,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8631 [WARNING|trainer.py:803] 2025-04-26 18:23:45,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8673 [WARNING|trainer.py:803] 2025-04-26 18:23:46,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8693 [WARNING|trainer.py:803] 2025-04-26 18:23:46,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8632 [WARNING|trainer.py:803] 2025-04-26 18:23:47,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8674 [WARNING|trainer.py:803] 2025-04-26 18:23:47,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8694 [WARNING|trainer.py:803] 2025-04-26 18:23:48,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8633 [WARNING|trainer.py:803] 2025-04-26 18:23:48,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:23:49,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8675 8695 [WARNING|trainer.py:803] 2025-04-26 18:23:50,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8634 [WARNING|trainer.py:803] 2025-04-26 18:23:50,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8676 [WARNING|trainer.py:803] 2025-04-26 18:23:51,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8696 [WARNING|trainer.py:803] 2025-04-26 18:23:51,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8635 [WARNING|trainer.py:803] 2025-04-26 18:23:52,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8677 [WARNING|trainer.py:803] 2025-04-26 18:23:52,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8697 [WARNING|trainer.py:803] 2025-04-26 18:23:53,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8636 [WARNING|trainer.py:803] 2025-04-26 18:23:54,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8678 [WARNING|trainer.py:803] 2025-04-26 18:23:54,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8698 [WARNING|trainer.py:803] 2025-04-26 18:23:55,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8637 [WARNING|trainer.py:803] 2025-04-26 18:23:56,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8679 [WARNING|trainer.py:803] 2025-04-26 18:23:56,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8699 [WARNING|trainer.py:803] 2025-04-26 18:23:57,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8638 [WARNING|trainer.py:803] 2025-04-26 18:23:57,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8680 [WARNING|trainer.py:803] 2025-04-26 18:23:58,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8700 [WARNING|trainer.py:803] 2025-04-26 18:23:58,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8639 [WARNING|trainer.py:803] 2025-04-26 18:23:59,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8681 [WARNING|trainer.py:803] 2025-04-26 18:23:59,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8701 [WARNING|trainer.py:803] 2025-04-26 18:24:00,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8640 [WARNING|trainer.py:803] 2025-04-26 18:24:00,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8702 8682 [WARNING|trainer.py:803] 2025-04-26 18:24:01,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:24:02,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:02,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8641 8703 [WARNING|trainer.py:803] 2025-04-26 18:24:03,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8683 [WARNING|trainer.py:803] 2025-04-26 18:24:03,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8704 8642 [WARNING|trainer.py:803] 2025-04-26 18:24:04,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:04,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:04,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8684 8705 [WARNING|trainer.py:803] 2025-04-26 18:24:05,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8643 [WARNING|trainer.py:803] 2025-04-26 18:24:05,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8706 [WARNING|trainer.py:803] 2025-04-26 18:24:06,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8685 [WARNING|trainer.py:803] 2025-04-26 18:24:07,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8644 [WARNING|trainer.py:803] 2025-04-26 18:24:07,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8707 [WARNING|trainer.py:803] 2025-04-26 18:24:08,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:08,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8686 8708 8645 [WARNING|trainer.py:803] 2025-04-26 18:24:09,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:09,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:10,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8709 8687 8646 [WARNING|trainer.py:803] 2025-04-26 18:24:11,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:11,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8710 [WARNING|trainer.py:803] 2025-04-26 18:24:11,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8688 [WARNING|trainer.py:803] 2025-04-26 18:24:12,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:12,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8647 8711 [WARNING|trainer.py:803] 2025-04-26 18:24:13,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:13,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8689 8712 [WARNING|trainer.py:803] 2025-04-26 18:24:14,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8648 [WARNING|trainer.py:803] 2025-04-26 18:24:15,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:15,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8713 8690 [WARNING|trainer.py:803] 2025-04-26 18:24:16,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8649 [WARNING|trainer.py:803] 2025-04-26 18:24:16,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8714 [WARNING|trainer.py:803] 2025-04-26 18:24:17,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8691 [WARNING|trainer.py:803] 2025-04-26 18:24:17,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8650 [WARNING|trainer.py:803] 2025-04-26 18:24:18,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8715 [WARNING|trainer.py:803] 2025-04-26 18:24:19,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:19,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8692 8716 8651 [WARNING|trainer.py:803] 2025-04-26 18:24:19,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:20,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:20,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8717 8693 [WARNING|trainer.py:803] 2025-04-26 18:24:21,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8652 [WARNING|trainer.py:803] 2025-04-26 18:24:21,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8718 [WARNING|trainer.py:803] 2025-04-26 18:24:22,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8694 [WARNING|trainer.py:803] 2025-04-26 18:24:22,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8653 8719 [WARNING|trainer.py:803] 2025-04-26 18:24:23,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:24,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:24,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8695 8720 8654 [WARNING|trainer.py:803] 2025-04-26 18:24:25,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:25,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:25,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8721 8696 [WARNING|trainer.py:803] 2025-04-26 18:24:26,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8655 [WARNING|trainer.py:803] 2025-04-26 18:24:26,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8722 [WARNING|trainer.py:803] 2025-04-26 18:24:27,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8697 [WARNING|trainer.py:803] 2025-04-26 18:24:28,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8656 8723 [WARNING|trainer.py:803] 2025-04-26 18:24:28,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:29,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:24:29,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8698 8724 8657 [WARNING|trainer.py:803] 2025-04-26 18:24:30,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:30,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:31,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8699 8725 8658 [WARNING|trainer.py:803] 2025-04-26 18:24:32,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:32,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8726 [WARNING|trainer.py:803] 2025-04-26 18:24:32,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8700 [WARNING|trainer.py:803] 2025-04-26 18:24:33,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8659 [WARNING|trainer.py:803] 2025-04-26 18:24:33,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8727 8701 [WARNING|trainer.py:803] 2025-04-26 18:24:34,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:24:34,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:35,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8728 8660 8702 [WARNING|trainer.py:803] 2025-04-26 18:24:36,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:36,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:24:36,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8729 8703 8661 [WARNING|trainer.py:803] 2025-04-26 18:24:37,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:37,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8730 [WARNING|trainer.py:803] 2025-04-26 18:24:38,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8704 [WARNING|trainer.py:803] 2025-04-26 18:24:38,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8662 [WARNING|trainer.py:803] 2025-04-26 18:24:39,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8731 8705 [WARNING|trainer.py:803] 2025-04-26 18:24:39,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:40,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:40,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8732 8663 8706 [WARNING|trainer.py:803] 2025-04-26 18:24:41,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:41,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:41,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8733 8707 8664 [WARNING|trainer.py:803] 2025-04-26 18:24:42,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:42,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8734 [WARNING|trainer.py:803] 2025-04-26 18:24:43,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8708 [WARNING|trainer.py:803] 2025-04-26 18:24:44,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8665 [WARNING|trainer.py:803] 2025-04-26 18:24:44,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8735 8709 [WARNING|trainer.py:803] 2025-04-26 18:24:44,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:45,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:45,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8736 8666 8710 [WARNING|trainer.py:803] 2025-04-26 18:24:46,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:46,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:46,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8737 8711 8667 [WARNING|trainer.py:803] 2025-04-26 18:24:47,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:48,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8738 [WARNING|trainer.py:803] 2025-04-26 18:24:48,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8712 [WARNING|trainer.py:803] 2025-04-26 18:24:49,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:49,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8668 8739 8713 [WARNING|trainer.py:803] 2025-04-26 18:24:50,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:50,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:50,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8740 8714 8669 [WARNING|trainer.py:803] 2025-04-26 18:24:51,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:51,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:52,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8741 8715 8670 [WARNING|trainer.py:803] 2025-04-26 18:24:53,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:53,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8742 8716 [WARNING|trainer.py:803] 2025-04-26 18:24:53,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:54,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:54,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8671 8743 8717 [WARNING|trainer.py:803] 2025-04-26 18:24:55,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:24:55,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:55,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8744 8718 8672 [WARNING|trainer.py:803] 2025-04-26 18:24:57,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:57,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:24:57,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8745 8719 8673 [WARNING|trainer.py:803] 2025-04-26 18:24:58,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:58,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8720 8746 [WARNING|trainer.py:803] 2025-04-26 18:24:59,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:24:59,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:24:59,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8674 8721 8747 [WARNING|trainer.py:803] 2025-04-26 18:25:00,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:01,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:01,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8722 8748 8675 [WARNING|trainer.py:803] 2025-04-26 18:25:02,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:02,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:02,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8723 8749 [WARNING|trainer.py:803] 2025-04-26 18:25:03,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8676 [WARNING|trainer.py:803] 2025-04-26 18:25:03,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8750 8724 [WARNING|trainer.py:803] 2025-04-26 18:25:04,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:05,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:05,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8677 8725 8751 [WARNING|trainer.py:803] 2025-04-26 18:25:06,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:06,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:06,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8726 8752 8678 [WARNING|trainer.py:803] 2025-04-26 18:25:07,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:07,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:08,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8727 8753 8679 [WARNING|trainer.py:803] 2025-04-26 18:25:09,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:09,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8728 8754 [WARNING|trainer.py:803] 2025-04-26 18:25:09,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:10,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:10,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8680 8729 8755 [WARNING|trainer.py:803] 2025-04-26 18:25:11,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:25:11,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:11,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8730 8756 8681 [WARNING|trainer.py:803] 2025-04-26 18:25:12,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:13,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:13,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8731 8757 8682 [WARNING|trainer.py:803] 2025-04-26 18:25:14,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:14,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8732 8758 [WARNING|trainer.py:803] 2025-04-26 18:25:15,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:15,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:15,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8683 8733 8759 [WARNING|trainer.py:803] 2025-04-26 18:25:16,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:16,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:16,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8734 8760 8684 [WARNING|trainer.py:803] 2025-04-26 18:25:18,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:18,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:18,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8735 8761 [WARNING|trainer.py:803] 2025-04-26 18:25:19,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8685 [WARNING|trainer.py:803] 2025-04-26 18:25:19,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8736 8762 [WARNING|trainer.py:803] 2025-04-26 18:25:20,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:25:20,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:20,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8737 8686 8763 [WARNING|trainer.py:803] 2025-04-26 18:25:22,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:22,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:25:22,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8738 8764 8687 [WARNING|trainer.py:803] 2025-04-26 18:25:23,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:23,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8739 [WARNING|trainer.py:803] 2025-04-26 18:25:24,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8765 [WARNING|trainer.py:803] 2025-04-26 18:25:24,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8688 [WARNING|trainer.py:803] 2025-04-26 18:25:24,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8740 8766 [WARNING|trainer.py:803] 2025-04-26 18:25:25,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:26,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:26,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8689 8741 8767 [WARNING|trainer.py:803] 2025-04-26 18:25:27,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:27,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:27,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8742 8768 8690 [WARNING|trainer.py:803] 2025-04-26 18:25:28,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:28,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:29,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8743 8769 8691 [WARNING|trainer.py:803] 2025-04-26 18:25:30,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:30,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8744 8770 [WARNING|trainer.py:803] 2025-04-26 18:25:30,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:31,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:31,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8692 8745 8771 [WARNING|trainer.py:803] 2025-04-26 18:25:32,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:32,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:32,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8772 8746 8693 [WARNING|trainer.py:803] 2025-04-26 18:25:34,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:34,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:34,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8773 8747 8694 [WARNING|trainer.py:803] 2025-04-26 18:25:35,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:35,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8774 8748 [WARNING|trainer.py:803] 2025-04-26 18:25:36,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:36,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:36,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8775 8695 8749 [WARNING|trainer.py:803] 2025-04-26 18:25:37,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:38,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:38,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8776 8750 8696 [WARNING|trainer.py:803] 2025-04-26 18:25:39,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:39,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8777 [WARNING|trainer.py:803] 2025-04-26 18:25:39,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8751 [WARNING|trainer.py:803] 2025-04-26 18:25:40,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8697 [WARNING|trainer.py:803] 2025-04-26 18:25:40,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8778 8752 [WARNING|trainer.py:803] 2025-04-26 18:25:41,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:41,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:42,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8779 8698 8753 [WARNING|trainer.py:803] 2025-04-26 18:25:43,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:43,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:43,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8780 8754 8699 [WARNING|trainer.py:803] 2025-04-26 18:25:44,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:44,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8781 [WARNING|trainer.py:803] 2025-04-26 18:25:45,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8755 [WARNING|trainer.py:803] 2025-04-26 18:25:45,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:46,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8700 8782 8756 [WARNING|trainer.py:803] 2025-04-26 18:25:47,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:47,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8701 [WARNING|trainer.py:803] 2025-04-26 18:25:47,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8783 8757 [WARNING|trainer.py:803] 2025-04-26 18:25:48,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:48,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:48,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8702 8784 8758 [WARNING|trainer.py:803] 2025-04-26 18:25:49,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:49,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:50,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8703 8785 8759 [WARNING|trainer.py:803] 2025-04-26 18:25:51,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:51,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:51,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8704 8786 8760 [WARNING|trainer.py:803] 2025-04-26 18:25:52,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:52,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:25:52,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8705 8787 8761 [WARNING|trainer.py:803] 2025-04-26 18:25:53,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:53,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:25:54,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8706 8788 8762 [WARNING|trainer.py:803] 2025-04-26 18:25:55,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:55,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:55,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8707 8789 8763 [WARNING|trainer.py:803] 2025-04-26 18:25:56,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:56,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:56,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8708 8790 8764 [WARNING|trainer.py:803] 2025-04-26 18:25:57,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:57,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:58,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8709 8791 8765 [WARNING|trainer.py:803] 2025-04-26 18:25:58,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:25:59,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8710 [WARNING|trainer.py:803] 2025-04-26 18:25:59,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8792 8766 [WARNING|trainer.py:803] 2025-04-26 18:26:00,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:00,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8711 [WARNING|trainer.py:803] 2025-04-26 18:26:00,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8793 8767 [WARNING|trainer.py:803] 2025-04-26 18:26:01,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:01,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8712 [WARNING|trainer.py:803] 2025-04-26 18:26:02,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8794 8768 [WARNING|trainer.py:803] 2025-04-26 18:26:02,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:03,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8713 [WARNING|trainer.py:803] 2025-04-26 18:26:03,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8795 8769 [WARNING|trainer.py:803] 2025-04-26 18:26:04,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8714 [WARNING|trainer.py:803] 2025-04-26 18:26:04,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:04,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8796 8770 [WARNING|trainer.py:803] 2025-04-26 18:26:05,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8715 [WARNING|trainer.py:803] 2025-04-26 18:26:05,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:26:06,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8797 8771 [WARNING|trainer.py:803] 2025-04-26 18:26:06,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8716 [WARNING|trainer.py:803] 2025-04-26 18:26:07,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:26:07,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8798 8772 [WARNING|trainer.py:803] 2025-04-26 18:26:07,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8717 [WARNING|trainer.py:803] 2025-04-26 18:26:08,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:08,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8799 8773 [WARNING|trainer.py:803] 2025-04-26 18:26:09,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8718 [WARNING|trainer.py:803] 2025-04-26 18:26:09,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:09,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8800 8774 [WARNING|trainer.py:803] 2025-04-26 18:26:10,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8719 [WARNING|trainer.py:803] 2025-04-26 18:26:11,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:26:11,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8801 8775 [WARNING|trainer.py:803] 2025-04-26 18:26:11,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8720 [WARNING|trainer.py:803] 2025-04-26 18:26:12,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:12,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:13,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8776 8802 8721 [WARNING|trainer.py:803] 2025-04-26 18:26:13,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:13,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8803 [WARNING|trainer.py:803] 2025-04-26 18:26:14,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8777 8722 [WARNING|trainer.py:803] 2025-04-26 18:26:15,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:15,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:15,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8778 8804 8723 [WARNING|trainer.py:803] 2025-04-26 18:26:16,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:16,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:17,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8805 8779 8724 [WARNING|trainer.py:803] 2025-04-26 18:26:17,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:17,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:18,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8806 8780 8725 [WARNING|trainer.py:803] 2025-04-26 18:26:19,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:26:19,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:19,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8807 8781 8726 [WARNING|trainer.py:803] 2025-04-26 18:26:20,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:26:20,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:21,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8808 8782 8727 [WARNING|trainer.py:803] 2025-04-26 18:26:21,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:22,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8809 [WARNING|trainer.py:803] 2025-04-26 18:26:22,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8783 8728 [WARNING|trainer.py:803] 2025-04-26 18:26:23,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:26:23,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8810 [WARNING|trainer.py:803] 2025-04-26 18:26:23,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8729 8784 [WARNING|trainer.py:803] 2025-04-26 18:26:24,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:24,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8811 [WARNING|trainer.py:803] 2025-04-26 18:26:24,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8785 8730 [WARNING|trainer.py:803] 2025-04-26 18:26:25,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8812 [WARNING|trainer.py:803] 2025-04-26 18:26:26,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:26,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8786 8731 [WARNING|trainer.py:803] 2025-04-26 18:26:27,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8813 [WARNING|trainer.py:803] 2025-04-26 18:26:27,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:26:27,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8787 8732 [WARNING|trainer.py:803] 2025-04-26 18:26:28,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8814 [WARNING|trainer.py:803] 2025-04-26 18:26:28,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:28,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8788 8733 [WARNING|trainer.py:803] 2025-04-26 18:26:29,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8815 [WARNING|trainer.py:803] 2025-04-26 18:26:30,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:30,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8789 8734 [WARNING|trainer.py:803] 2025-04-26 18:26:30,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8816 [WARNING|trainer.py:803] 2025-04-26 18:26:31,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:31,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8790 8735 [WARNING|trainer.py:803] 2025-04-26 18:26:32,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8817 [WARNING|trainer.py:803] 2025-04-26 18:26:32,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:32,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8736 8791 [WARNING|trainer.py:803] 2025-04-26 18:26:33,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8818 [WARNING|trainer.py:803] 2025-04-26 18:26:34,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:34,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8737 8792 [WARNING|trainer.py:803] 2025-04-26 18:26:35,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8819 [WARNING|trainer.py:803] 2025-04-26 18:26:35,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:35,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8738 8793 [WARNING|trainer.py:803] 2025-04-26 18:26:36,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8820 [WARNING|trainer.py:803] 2025-04-26 18:26:36,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:37,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8739 8794 [WARNING|trainer.py:803] 2025-04-26 18:26:37,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:38,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8821 [WARNING|trainer.py:803] 2025-04-26 18:26:38,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8740 8795 [WARNING|trainer.py:803] 2025-04-26 18:26:39,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:39,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8822 [WARNING|trainer.py:803] 2025-04-26 18:26:39,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8741 8796 [WARNING|trainer.py:803] 2025-04-26 18:26:40,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8823 [WARNING|trainer.py:803] 2025-04-26 18:26:40,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:40,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8742 8797 [WARNING|trainer.py:803] 2025-04-26 18:26:41,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:42,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8824 [WARNING|trainer.py:803] 2025-04-26 18:26:42,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8743 8798 [WARNING|trainer.py:803] 2025-04-26 18:26:43,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8825 [WARNING|trainer.py:803] 2025-04-26 18:26:43,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:43,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8744 8799 [WARNING|trainer.py:803] 2025-04-26 18:26:44,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8826 [WARNING|trainer.py:803] 2025-04-26 18:26:44,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:44,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8745 8800 [WARNING|trainer.py:803] 2025-04-26 18:26:45,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8827 [WARNING|trainer.py:803] 2025-04-26 18:26:46,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:46,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8746 8801 [WARNING|trainer.py:803] 2025-04-26 18:26:46,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8828 [WARNING|trainer.py:803] 2025-04-26 18:26:47,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:47,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8747 8802 [WARNING|trainer.py:803] 2025-04-26 18:26:48,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8829 [WARNING|trainer.py:803] 2025-04-26 18:26:48,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:49,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8748 8803 [WARNING|trainer.py:803] 2025-04-26 18:26:49,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8830 [WARNING|trainer.py:803] 2025-04-26 18:26:50,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:50,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8749 [WARNING|trainer.py:803] 2025-04-26 18:26:50,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8804 8831 [WARNING|trainer.py:803] 2025-04-26 18:26:51,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:51,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8750 [WARNING|trainer.py:803] 2025-04-26 18:26:52,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8805 8832 [WARNING|trainer.py:803] 2025-04-26 18:26:52,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:53,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8751 [WARNING|trainer.py:803] 2025-04-26 18:26:53,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8806 8833 [WARNING|trainer.py:803] 2025-04-26 18:26:54,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:54,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8752 [WARNING|trainer.py:803] 2025-04-26 18:26:54,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8807 8834 [WARNING|trainer.py:803] 2025-04-26 18:26:55,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:55,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8753 [WARNING|trainer.py:803] 2025-04-26 18:26:56,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8808 8835 [WARNING|trainer.py:803] 2025-04-26 18:26:56,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:57,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8754 [WARNING|trainer.py:803] 2025-04-26 18:26:57,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8809 8836 [WARNING|trainer.py:803] 2025-04-26 18:26:58,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:26:58,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8755 [WARNING|trainer.py:803] 2025-04-26 18:26:58,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8810 8837 [WARNING|trainer.py:803] 2025-04-26 18:26:59,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:26:59,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8756 [WARNING|trainer.py:803] 2025-04-26 18:27:00,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8811 8838 [WARNING|trainer.py:803] 2025-04-26 18:27:00,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:01,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8757 [WARNING|trainer.py:803] 2025-04-26 18:27:01,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8812 8839 [WARNING|trainer.py:803] 2025-04-26 18:27:02,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:02,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8758 [WARNING|trainer.py:803] 2025-04-26 18:27:02,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8813 8840 [WARNING|trainer.py:803] 2025-04-26 18:27:03,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:03,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8759 [WARNING|trainer.py:803] 2025-04-26 18:27:03,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8814 8841 [WARNING|trainer.py:803] 2025-04-26 18:27:04,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:05,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8760 [WARNING|trainer.py:803] 2025-04-26 18:27:05,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8815 8842 [WARNING|trainer.py:803] 2025-04-26 18:27:06,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:06,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:06,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8761 8816 8843 [WARNING|trainer.py:803] 2025-04-26 18:27:07,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:07,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:07,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8762 8817 8844 [WARNING|trainer.py:803] 2025-04-26 18:27:08,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:09,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:27:09,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8763 8818 8845 [WARNING|trainer.py:803] 2025-04-26 18:27:09,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8764 [WARNING|trainer.py:803] 2025-04-26 18:27:10,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:10,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8819 8846 [WARNING|trainer.py:803] 2025-04-26 18:27:11,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8765 [WARNING|trainer.py:803] 2025-04-26 18:27:11,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:11,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8847 8820 [WARNING|trainer.py:803] 2025-04-26 18:27:12,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:13,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8766 [WARNING|trainer.py:803] 2025-04-26 18:27:13,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8848 8821 [WARNING|trainer.py:803] 2025-04-26 18:27:13,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8767 [WARNING|trainer.py:803] 2025-04-26 18:27:14,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:14,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8849 8822 [WARNING|trainer.py:803] 2025-04-26 18:27:15,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:15,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8768 [WARNING|trainer.py:803] 2025-04-26 18:27:15,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8850 8823 [WARNING|trainer.py:803] 2025-04-26 18:27:16,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:17,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8769 [WARNING|trainer.py:803] 2025-04-26 18:27:17,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8851 8824 [WARNING|trainer.py:803] 2025-04-26 18:27:17,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:18,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8770 [WARNING|trainer.py:803] 2025-04-26 18:27:18,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8852 8825 [WARNING|trainer.py:803] 2025-04-26 18:27:19,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:19,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8771 [WARNING|trainer.py:803] 2025-04-26 18:27:19,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8853 8826 [WARNING|trainer.py:803] 2025-04-26 18:27:20,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8772 [WARNING|trainer.py:803] 2025-04-26 18:27:21,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:21,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8854 8827 [WARNING|trainer.py:803] 2025-04-26 18:27:21,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8773 [WARNING|trainer.py:803] 2025-04-26 18:27:22,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:22,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8855 8828 [WARNING|trainer.py:803] 2025-04-26 18:27:23,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8774 [WARNING|trainer.py:803] 2025-04-26 18:27:23,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:23,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8856 8829 [WARNING|trainer.py:803] 2025-04-26 18:27:24,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8775 [WARNING|trainer.py:803] 2025-04-26 18:27:25,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:25,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8857 8830 [WARNING|trainer.py:803] 2025-04-26 18:27:25,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8776 [WARNING|trainer.py:803] 2025-04-26 18:27:26,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:26,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8831 8858 [WARNING|trainer.py:803] 2025-04-26 18:27:27,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8777 [WARNING|trainer.py:803] 2025-04-26 18:27:27,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:27,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8859 8832 [WARNING|trainer.py:803] 2025-04-26 18:27:28,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8778 [WARNING|trainer.py:803] 2025-04-26 18:27:29,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:29,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8860 8833 [WARNING|trainer.py:803] 2025-04-26 18:27:29,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8779 [WARNING|trainer.py:803] 2025-04-26 18:27:30,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:30,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8861 8834 [WARNING|trainer.py:803] 2025-04-26 18:27:31,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8780 [WARNING|trainer.py:803] 2025-04-26 18:27:31,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:31,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8862 8835 [WARNING|trainer.py:803] 2025-04-26 18:27:32,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8781 [WARNING|trainer.py:803] 2025-04-26 18:27:32,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:33,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8863 8836 [WARNING|trainer.py:803] 2025-04-26 18:27:33,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8782 [WARNING|trainer.py:803] 2025-04-26 18:27:34,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:34,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8837 8864 [WARNING|trainer.py:803] 2025-04-26 18:27:34,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8783 [WARNING|trainer.py:803] 2025-04-26 18:27:35,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:35,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8838 8865 [WARNING|trainer.py:803] 2025-04-26 18:27:36,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8784 [WARNING|trainer.py:803] 2025-04-26 18:27:36,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:36,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8839 8866 [WARNING|trainer.py:803] 2025-04-26 18:27:37,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8785 [WARNING|trainer.py:803] 2025-04-26 18:27:38,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:38,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8840 8867 [WARNING|trainer.py:803] 2025-04-26 18:27:39,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8786 [WARNING|trainer.py:803] 2025-04-26 18:27:39,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:27:39,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8841 8868 [WARNING|trainer.py:803] 2025-04-26 18:27:40,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8787 [WARNING|trainer.py:803] 2025-04-26 18:27:40,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:41,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8842 8869 [WARNING|trainer.py:803] 2025-04-26 18:27:41,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8788 [WARNING|trainer.py:803] 2025-04-26 18:27:42,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:27:42,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8843 8870 [WARNING|trainer.py:803] 2025-04-26 18:27:42,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8789 [WARNING|trainer.py:803] 2025-04-26 18:27:43,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:43,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8844 8871 [WARNING|trainer.py:803] 2025-04-26 18:27:44,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8790 [WARNING|trainer.py:803] 2025-04-26 18:27:44,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:44,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8845 8872 [WARNING|trainer.py:803] 2025-04-26 18:27:45,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8791 [WARNING|trainer.py:803] 2025-04-26 18:27:46,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:46,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8846 8873 [WARNING|trainer.py:803] 2025-04-26 18:27:46,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:47,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8792 [WARNING|trainer.py:803] 2025-04-26 18:27:47,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8847 8874 [WARNING|trainer.py:803] 2025-04-26 18:27:48,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:48,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:48,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8793 8875 8848 [WARNING|trainer.py:803] 2025-04-26 18:27:49,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:50,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8794 [WARNING|trainer.py:803] 2025-04-26 18:27:50,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8876 8849 [WARNING|trainer.py:803] 2025-04-26 18:27:50,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:51,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8795 [WARNING|trainer.py:803] 2025-04-26 18:27:51,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8877 8850 [WARNING|trainer.py:803] 2025-04-26 18:27:52,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:27:52,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8796 [WARNING|trainer.py:803] 2025-04-26 18:27:52,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8878 8851 [WARNING|trainer.py:803] 2025-04-26 18:27:53,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:27:54,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8797 [WARNING|trainer.py:803] 2025-04-26 18:27:54,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8879 8852 [WARNING|trainer.py:803] 2025-04-26 18:27:54,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:55,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8798 [WARNING|trainer.py:803] 2025-04-26 18:27:55,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8880 8853 [WARNING|trainer.py:803] 2025-04-26 18:27:56,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:56,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8799 [WARNING|trainer.py:803] 2025-04-26 18:27:56,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8881 8854 [WARNING|trainer.py:803] 2025-04-26 18:27:57,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:27:58,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8800 [WARNING|trainer.py:803] 2025-04-26 18:27:58,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8882 8855 [WARNING|trainer.py:803] 2025-04-26 18:27:58,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:27:59,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8801 [WARNING|trainer.py:803] 2025-04-26 18:27:59,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8883 8856 [WARNING|trainer.py:803] 2025-04-26 18:28:00,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:00,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8802 [WARNING|trainer.py:803] 2025-04-26 18:28:00,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8884 8857 [WARNING|trainer.py:803] 2025-04-26 18:28:01,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:01,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8803 [WARNING|trainer.py:803] 2025-04-26 18:28:02,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8885 8858 [WARNING|trainer.py:803] 2025-04-26 18:28:02,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:03,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8804 [WARNING|trainer.py:803] 2025-04-26 18:28:03,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8886 8859 [WARNING|trainer.py:803] 2025-04-26 18:28:04,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:04,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8805 [WARNING|trainer.py:803] 2025-04-26 18:28:04,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8887 8860 [WARNING|trainer.py:803] 2025-04-26 18:28:05,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:05,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8806 [WARNING|trainer.py:803] 2025-04-26 18:28:06,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8888 8861 [WARNING|trainer.py:803] 2025-04-26 18:28:06,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:28:07,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8807 [WARNING|trainer.py:803] 2025-04-26 18:28:07,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8889 8862 [WARNING|trainer.py:803] 2025-04-26 18:28:08,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:28:08,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8808 [WARNING|trainer.py:803] 2025-04-26 18:28:08,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8890 8863 [WARNING|trainer.py:803] 2025-04-26 18:28:09,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:09,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:10,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8809 8891 8864 [WARNING|trainer.py:803] 2025-04-26 18:28:11,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:28:11,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:11,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8810 8892 8865 [WARNING|trainer.py:803] 2025-04-26 18:28:12,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:12,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:12,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8811 8893 8866 [WARNING|trainer.py:803] 2025-04-26 18:28:13,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:13,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:14,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8812 8894 8867 [WARNING|trainer.py:803] 2025-04-26 18:28:15,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:15,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:15,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8813 8895 8868 [WARNING|trainer.py:803] 2025-04-26 18:28:16,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:16,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:16,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8814 8896 8869 [WARNING|trainer.py:803] 2025-04-26 18:28:17,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:17,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:18,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8815 8897 8870 [WARNING|trainer.py:803] 2025-04-26 18:28:19,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:19,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:19,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8816 8898 8871 [WARNING|trainer.py:803] 2025-04-26 18:28:20,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:20,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:28:20,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8817 8899 8872 [WARNING|trainer.py:803] 2025-04-26 18:28:21,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:28:21,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:22,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8818 8900 8873 [WARNING|trainer.py:803] 2025-04-26 18:28:23,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:23,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:23,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8901 8819 8874 [WARNING|trainer.py:803] 2025-04-26 18:28:24,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:24,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:24,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8902 8820 8875 [WARNING|trainer.py:803] 2025-04-26 18:28:25,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:25,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:26,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8903 8821 8876 [WARNING|trainer.py:803] 2025-04-26 18:28:27,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:27,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:27,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8904 8877 8822 [WARNING|trainer.py:803] 2025-04-26 18:28:28,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:28,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:28,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8905 8823 8878 [WARNING|trainer.py:803] 2025-04-26 18:28:29,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:30,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:30,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8906 8879 8824 [WARNING|trainer.py:803] 2025-04-26 18:28:31,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:31,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:31,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8907 8825 8880 [WARNING|trainer.py:803] 2025-04-26 18:28:32,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:32,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:32,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8908 8826 8881 [WARNING|trainer.py:803] 2025-04-26 18:28:33,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:34,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:34,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8909 8882 8827 [WARNING|trainer.py:803] 2025-04-26 18:28:35,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:35,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:35,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8910 8883 8828 [WARNING|trainer.py:803] 2025-04-26 18:28:36,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:36,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:36,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8911 8884 8829 [WARNING|trainer.py:803] 2025-04-26 18:28:37,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:38,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:28:38,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8912 8885 8830 [WARNING|trainer.py:803] 2025-04-26 18:28:38,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:39,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:28:39,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8913 8886 8831 [WARNING|trainer.py:803] 2025-04-26 18:28:40,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:40,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:40,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8914 8887 8832 [WARNING|trainer.py:803] 2025-04-26 18:28:41,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:41,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:42,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8915 8888 8833 [WARNING|trainer.py:803] 2025-04-26 18:28:43,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:43,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:43,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8916 8889 8834 [WARNING|trainer.py:803] 2025-04-26 18:28:44,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:44,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:44,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8917 8890 8835 [WARNING|trainer.py:803] 2025-04-26 18:28:45,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:45,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:46,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8918 8891 8836 [WARNING|trainer.py:803] 2025-04-26 18:28:46,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:47,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:47,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8919 8837 8892 [WARNING|trainer.py:803] 2025-04-26 18:28:48,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:48,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:48,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8920 8838 8893 [WARNING|trainer.py:803] 2025-04-26 18:28:49,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:49,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:28:50,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8921 8839 8894 [WARNING|trainer.py:803] 2025-04-26 18:28:51,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:51,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:51,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8922 8840 8895 [WARNING|trainer.py:803] 2025-04-26 18:28:52,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:52,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:28:52,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8923 8841 8896 [WARNING|trainer.py:803] 2025-04-26 18:28:53,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:53,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:54,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8924 8842 8897 [WARNING|trainer.py:803] 2025-04-26 18:28:55,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:55,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:28:55,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8925 8898 8843 [WARNING|trainer.py:803] 2025-04-26 18:28:56,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:56,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:28:56,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8926 8899 8844 [WARNING|trainer.py:803] 2025-04-26 18:28:57,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:58,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:58,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8927 8900 8845 [WARNING|trainer.py:803] 2025-04-26 18:28:59,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:28:59,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:28:59,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8928 8901 8846 [WARNING|trainer.py:803] 2025-04-26 18:29:00,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:00,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:29:00,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8929 8847 8902 [WARNING|trainer.py:803] 2025-04-26 18:29:01,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:02,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:02,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8930 8903 8848 [WARNING|trainer.py:803] 2025-04-26 18:29:03,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:03,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:03,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8931 8904 8849 [WARNING|trainer.py:803] 2025-04-26 18:29:04,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:04,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:04,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8932 8850 8905 [WARNING|trainer.py:803] 2025-04-26 18:29:05,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:06,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:06,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8933 8851 8906 [WARNING|trainer.py:803] 2025-04-26 18:29:07,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:07,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:29:07,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8934 8907 8852 [WARNING|trainer.py:803] 2025-04-26 18:29:08,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:08,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:08,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8935 8908 8853 [WARNING|trainer.py:803] 2025-04-26 18:29:09,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:10,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:10,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8936 8854 8909 [WARNING|trainer.py:803] 2025-04-26 18:29:11,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:11,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:11,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8937 8855 8910 [WARNING|trainer.py:803] 2025-04-26 18:29:12,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:12,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:12,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8938 8856 8911 [WARNING|trainer.py:803] 2025-04-26 18:29:13,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:14,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:14,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8939 8912 8857 [WARNING|trainer.py:803] 2025-04-26 18:29:15,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:15,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:15,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8940 8913 8858 [WARNING|trainer.py:803] 2025-04-26 18:29:16,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:16,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:16,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8941 8914 8859 [WARNING|trainer.py:803] 2025-04-26 18:29:17,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:18,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:18,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8942 8915 8860 [WARNING|trainer.py:803] 2025-04-26 18:29:19,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:19,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:19,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8943 8916 8861 [WARNING|trainer.py:803] 2025-04-26 18:29:20,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:20,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:20,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8944 8862 8917 [WARNING|trainer.py:803] 2025-04-26 18:29:21,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:22,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:22,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8945 8918 8863 [WARNING|trainer.py:803] 2025-04-26 18:29:23,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:23,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:23,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8946 8919 8864 [WARNING|trainer.py:803] 2025-04-26 18:29:24,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:29:24,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:24,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8947 8920 8865 [WARNING|trainer.py:803] 2025-04-26 18:29:25,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:26,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:26,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8948 8866 8921 [WARNING|trainer.py:803] 2025-04-26 18:29:27,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:29:27,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:27,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8949 8867 8922 [WARNING|trainer.py:803] 2025-04-26 18:29:28,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:28,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:28,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8950 8868 8923 [WARNING|trainer.py:803] 2025-04-26 18:29:29,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:30,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:29:30,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8951 8869 8924 [WARNING|trainer.py:803] 2025-04-26 18:29:31,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:31,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:29:31,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8952 8870 8925 [WARNING|trainer.py:803] 2025-04-26 18:29:32,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:32,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:32,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8953 8871 8926 [WARNING|trainer.py:803] 2025-04-26 18:29:33,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:34,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:34,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8954 8872 8927 [WARNING|trainer.py:803] 2025-04-26 18:29:35,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:35,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8955 [WARNING|trainer.py:803] 2025-04-26 18:29:35,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8873 8928 [WARNING|trainer.py:803] 2025-04-26 18:29:36,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:29:36,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8956 [WARNING|trainer.py:803] 2025-04-26 18:29:36,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8874 8929 [WARNING|trainer.py:803] 2025-04-26 18:29:37,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:37,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8957 [WARNING|trainer.py:803] 2025-04-26 18:29:38,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8875 8930 [WARNING|trainer.py:803] 2025-04-26 18:29:39,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:39,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8958 [WARNING|trainer.py:803] 2025-04-26 18:29:39,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8876 8931 [WARNING|trainer.py:803] 2025-04-26 18:29:40,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:40,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8959 [WARNING|trainer.py:803] 2025-04-26 18:29:40,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8877 8932 [WARNING|trainer.py:803] 2025-04-26 18:29:41,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:41,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8960 [WARNING|trainer.py:803] 2025-04-26 18:29:42,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8878 8933 [WARNING|trainer.py:803] 2025-04-26 18:29:42,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:43,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8961 [WARNING|trainer.py:803] 2025-04-26 18:29:43,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8879 8934 [WARNING|trainer.py:803] 2025-04-26 18:29:44,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:44,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8962 [WARNING|trainer.py:803] 2025-04-26 18:29:44,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8880 8935 [WARNING|trainer.py:803] 2025-04-26 18:29:45,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:45,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8963 [WARNING|trainer.py:803] 2025-04-26 18:29:46,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8881 8936 [WARNING|trainer.py:803] 2025-04-26 18:29:46,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:47,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8964 [WARNING|trainer.py:803] 2025-04-26 18:29:47,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8882 8937 [WARNING|trainer.py:803] 2025-04-26 18:29:48,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:48,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8965 [WARNING|trainer.py:803] 2025-04-26 18:29:48,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8883 [WARNING|trainer.py:803] 2025-04-26 18:29:49,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8938 [WARNING|trainer.py:803] 2025-04-26 18:29:49,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8966 [WARNING|trainer.py:803] 2025-04-26 18:29:50,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8884 [WARNING|trainer.py:803] 2025-04-26 18:29:50,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8939 [WARNING|trainer.py:803] 2025-04-26 18:29:51,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8967 [WARNING|trainer.py:803] 2025-04-26 18:29:51,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8885 [WARNING|trainer.py:803] 2025-04-26 18:29:52,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8940 [WARNING|trainer.py:803] 2025-04-26 18:29:52,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8968 8886 [WARNING|trainer.py:803] 2025-04-26 18:29:53,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:53,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8941 [WARNING|trainer.py:803] 2025-04-26 18:29:53,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8969 8887 [WARNING|trainer.py:803] 2025-04-26 18:29:54,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:54,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8942 8970 [WARNING|trainer.py:803] 2025-04-26 18:29:55,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8888 [WARNING|trainer.py:803] 2025-04-26 18:29:55,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:55,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8943 [WARNING|trainer.py:803] 2025-04-26 18:29:56,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8971 8889 [WARNING|trainer.py:803] 2025-04-26 18:29:57,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:57,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8944 [WARNING|trainer.py:803] 2025-04-26 18:29:57,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8972 8890 [WARNING|trainer.py:803] 2025-04-26 18:29:58,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:29:58,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8945 [WARNING|trainer.py:803] 2025-04-26 18:29:59,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8973 8891 [WARNING|trainer.py:803] 2025-04-26 18:29:59,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:00,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8946 [WARNING|trainer.py:803] 2025-04-26 18:30:00,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8974 8892 [WARNING|trainer.py:803] 2025-04-26 18:30:01,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:30:01,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8947 [WARNING|trainer.py:803] 2025-04-26 18:30:01,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8975 8893 [WARNING|trainer.py:803] 2025-04-26 18:30:02,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:02,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8948 [WARNING|trainer.py:803] 2025-04-26 18:30:03,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8976 8894 [WARNING|trainer.py:803] 2025-04-26 18:30:03,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:30:04,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8949 [WARNING|trainer.py:803] 2025-04-26 18:30:04,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8977 8895 [WARNING|trainer.py:803] 2025-04-26 18:30:05,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:05,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8950 [WARNING|trainer.py:803] 2025-04-26 18:30:05,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8978 8896 [WARNING|trainer.py:803] 2025-04-26 18:30:06,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:06,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8951 [WARNING|trainer.py:803] 2025-04-26 18:30:07,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8979 8897 [WARNING|trainer.py:803] 2025-04-26 18:30:07,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:08,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8952 [WARNING|trainer.py:803] 2025-04-26 18:30:08,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8980 8898 [WARNING|trainer.py:803] 2025-04-26 18:30:09,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:09,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8953 [WARNING|trainer.py:803] 2025-04-26 18:30:09,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8981 8899 [WARNING|trainer.py:803] 2025-04-26 18:30:10,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:10,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8954 [WARNING|trainer.py:803] 2025-04-26 18:30:11,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8982 8900 [WARNING|trainer.py:803] 2025-04-26 18:30:11,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:12,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8955 [WARNING|trainer.py:803] 2025-04-26 18:30:12,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8983 8901 [WARNING|trainer.py:803] 2025-04-26 18:30:13,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:30:13,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8956 [WARNING|trainer.py:803] 2025-04-26 18:30:13,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8984 8902 [WARNING|trainer.py:803] 2025-04-26 18:30:14,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:14,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8957 [WARNING|trainer.py:803] 2025-04-26 18:30:14,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8985 8903 [WARNING|trainer.py:803] 2025-04-26 18:30:15,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:15,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8958 [WARNING|trainer.py:803] 2025-04-26 18:30:16,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8986 8904 [WARNING|trainer.py:803] 2025-04-26 18:30:16,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:17,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8959 [WARNING|trainer.py:803] 2025-04-26 18:30:17,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8987 8905 [WARNING|trainer.py:803] 2025-04-26 18:30:18,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:18,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8960 [WARNING|trainer.py:803] 2025-04-26 18:30:18,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8988 8906 [WARNING|trainer.py:803] 2025-04-26 18:30:19,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:20,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8961 [WARNING|trainer.py:803] 2025-04-26 18:30:20,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8989 8907 [WARNING|trainer.py:803] 2025-04-26 18:30:20,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:21,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8962 [WARNING|trainer.py:803] 2025-04-26 18:30:21,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8990 8908 [WARNING|trainer.py:803] 2025-04-26 18:30:22,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8963 [WARNING|trainer.py:803] 2025-04-26 18:30:22,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:22,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8991 8909 [WARNING|trainer.py:803] 2025-04-26 18:30:23,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:23,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8964 [WARNING|trainer.py:803] 2025-04-26 18:30:24,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8992 8910 [WARNING|trainer.py:803] 2025-04-26 18:30:24,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:25,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8965 [WARNING|trainer.py:803] 2025-04-26 18:30:25,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8993 8911 [WARNING|trainer.py:803] 2025-04-26 18:30:26,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:26,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8966 [WARNING|trainer.py:803] 2025-04-26 18:30:26,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8994 8912 [WARNING|trainer.py:803] 2025-04-26 18:30:27,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:27,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8967 [WARNING|trainer.py:803] 2025-04-26 18:30:28,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8995 8913 [WARNING|trainer.py:803] 2025-04-26 18:30:28,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:29,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8968 [WARNING|trainer.py:803] 2025-04-26 18:30:29,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8996 8914 [WARNING|trainer.py:803] 2025-04-26 18:30:30,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:30,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8969 [WARNING|trainer.py:803] 2025-04-26 18:30:30,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8997 [WARNING|trainer.py:803] 2025-04-26 18:30:31,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8915 [WARNING|trainer.py:803] 2025-04-26 18:30:31,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8970 8998 [WARNING|trainer.py:803] 2025-04-26 18:30:32,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:32,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8916 [WARNING|trainer.py:803] 2025-04-26 18:30:32,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8971 8999 [WARNING|trainer.py:803] 2025-04-26 18:30:33,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8917 [WARNING|trainer.py:803] 2025-04-26 18:30:34,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:34,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8972 9000 [WARNING|trainer.py:803] 2025-04-26 18:30:34,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:35,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8918 [WARNING|trainer.py:803] 2025-04-26 18:30:35,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.7871111111111111 New best accuracy: 0.7871111111111111. Saving model... 8973 [WARNING|trainer.py:803] 2025-04-26 18:30:36,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:36,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8919 8974 [WARNING|trainer.py:803] 2025-04-26 18:30:37,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8920 [WARNING|trainer.py:803] 2025-04-26 18:30:37,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8975 [WARNING|trainer.py:803] 2025-04-26 18:30:38,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:39,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8921 8976 [WARNING|trainer.py:803] 2025-04-26 18:30:40,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:40,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8922 8977 [WARNING|trainer.py:803] 2025-04-26 18:30:41,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:42,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8923 8978 [WARNING|trainer.py:803] 2025-04-26 18:30:42,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:43,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8924 8979 [WARNING|trainer.py:803] 2025-04-26 18:30:44,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8925 [WARNING|trainer.py:803] 2025-04-26 18:30:44,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8980 [WARNING|trainer.py:803] 2025-04-26 18:30:45,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8926 [WARNING|trainer.py:803] 2025-04-26 18:30:45,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8981 [WARNING|trainer.py:803] 2025-04-26 18:30:46,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8927 [WARNING|trainer.py:803] 2025-04-26 18:30:47,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8982 evaluate! [WARNING|trainer.py:803] 2025-04-26 18:30:48,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8928 [WARNING|trainer.py:803] 2025-04-26 18:30:48,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8983 [WARNING|trainer.py:803] 2025-04-26 18:30:49,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1 8929 [WARNING|trainer.py:803] 2025-04-26 18:30:50,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8984 [WARNING|trainer.py:803] 2025-04-26 18:30:50,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:30:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8930 [WARNING|trainer.py:803] 2025-04-26 18:30:51,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8985 [WARNING|trainer.py:803] 2025-04-26 18:30:52,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2 8931 [WARNING|trainer.py:803] 2025-04-26 18:30:52,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:30:53,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8986 [WARNING|trainer.py:803] 2025-04-26 18:30:53,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8932 [WARNING|trainer.py:803] 2025-04-26 18:30:54,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8987 [WARNING|trainer.py:803] 2025-04-26 18:30:54,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 8933 [WARNING|trainer.py:803] 2025-04-26 18:30:55,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:55,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8988 [WARNING|trainer.py:803] 2025-04-26 18:30:56,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8934 [WARNING|trainer.py:803] 2025-04-26 18:30:56,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8989 4 [WARNING|trainer.py:803] 2025-04-26 18:30:57,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8935 [WARNING|trainer.py:803] 2025-04-26 18:30:58,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:30:58,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8990 [WARNING|trainer.py:803] 2025-04-26 18:30:58,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8936 [WARNING|trainer.py:803] 2025-04-26 18:30:59,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5 8991 [WARNING|trainer.py:803] 2025-04-26 18:31:00,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:00,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8937 [WARNING|trainer.py:803] 2025-04-26 18:31:00,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8992 [WARNING|trainer.py:803] 2025-04-26 18:31:01,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:01,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8938 6 8993 [WARNING|trainer.py:803] 2025-04-26 18:31:02,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:03,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:03,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8939 8994 [WARNING|trainer.py:803] 2025-04-26 18:31:04,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:04,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7 8940 8995 [WARNING|trainer.py:803] 2025-04-26 18:31:05,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:05,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:05,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8941 8996 [WARNING|trainer.py:803] 2025-04-26 18:31:06,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8 [WARNING|trainer.py:803] 2025-04-26 18:31:07,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8942 8997 [WARNING|trainer.py:803] 2025-04-26 18:31:07,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 18:31:08,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:08,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8943 8998 [WARNING|trainer.py:803] 2025-04-26 18:31:09,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 9 [WARNING|trainer.py:803] 2025-04-26 18:31:09,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8944 8999 [WARNING|trainer.py:803] 2025-04-26 18:31:10,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:10,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:11,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8945 9000 [WARNING|trainer.py:803] 2025-04-26 18:31:12,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 10 [WARNING|trainer.py:803] 2025-04-26 18:31:12,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.7871111111111111 New best accuracy: 0.7871111111111111. Saving model... 8946 [WARNING|trainer.py:803] 2025-04-26 18:31:12,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:13,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8947 11 [WARNING|trainer.py:803] 2025-04-26 18:31:14,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8948 [WARNING|trainer.py:803] 2025-04-26 18:31:15,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:16,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8949 12 [WARNING|trainer.py:803] 2025-04-26 18:31:17,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:17,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8950 [WARNING|trainer.py:803] 2025-04-26 18:31:18,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8951 13 [WARNING|trainer.py:803] 2025-04-26 18:31:20,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:20,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8952 [WARNING|trainer.py:803] 2025-04-26 18:31:21,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 14 8953 [WARNING|trainer.py:803] 2025-04-26 18:31:22,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:22,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8954 [WARNING|trainer.py:803] 2025-04-26 18:31:23,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 15 8955 [WARNING|trainer.py:803] 2025-04-26 18:31:24,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes evaluate! [WARNING|trainer.py:803] 2025-04-26 18:31:25,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8956 16 [WARNING|trainer.py:803] 2025-04-26 18:31:26,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8957 1 [WARNING|trainer.py:803] 2025-04-26 18:31:27,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:27,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:31:27,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8958 17 [WARNING|trainer.py:803] 2025-04-26 18:31:29,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2 8959 [WARNING|trainer.py:803] 2025-04-26 18:31:29,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:30,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:30,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8960 18 [WARNING|trainer.py:803] 2025-04-26 18:31:31,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3 [WARNING|trainer.py:803] 2025-04-26 18:31:32,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8961 [WARNING|trainer.py:803] 2025-04-26 18:31:32,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:33,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8962 19 4 [WARNING|trainer.py:803] 2025-04-26 18:31:34,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:34,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8963 [WARNING|trainer.py:803] 2025-04-26 18:31:35,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:31:35,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 20 8964 5 [WARNING|trainer.py:803] 2025-04-26 18:31:37,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:37,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8965 [WARNING|trainer.py:803] 2025-04-26 18:31:37,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:38,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 21 8966 6 [WARNING|trainer.py:803] 2025-04-26 18:31:39,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:39,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:40,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8967 22 [WARNING|trainer.py:803] 2025-04-26 18:31:41,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8968 7 [WARNING|trainer.py:803] 2025-04-26 18:31:41,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:42,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:42,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8969 23 [WARNING|trainer.py:803] 2025-04-26 18:31:43,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8 8970 [WARNING|trainer.py:803] 2025-04-26 18:31:44,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:44,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:31:45,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8971 [WARNING|trainer.py:803] 2025-04-26 18:31:46,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 24 9 8972 [WARNING|trainer.py:803] 2025-04-26 18:31:47,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:31:47,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:47,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8973 25 10 [WARNING|trainer.py:803] 2025-04-26 18:31:49,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8974 [WARNING|trainer.py:803] 2025-04-26 18:31:49,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:50,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:50,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8975 26 11 [WARNING|trainer.py:803] 2025-04-26 18:31:51,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:52,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:31:52,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8976 [WARNING|trainer.py:803] 2025-04-26 18:31:53,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8977 27 12 [WARNING|trainer.py:803] 2025-04-26 18:31:54,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:31:54,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:31:54,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8978 [WARNING|trainer.py:803] 2025-04-26 18:31:56,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 13 8979 28 [WARNING|trainer.py:803] 2025-04-26 18:31:57,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:57,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:31:57,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8980 [WARNING|trainer.py:803] 2025-04-26 18:31:58,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 14 29 8981 [WARNING|trainer.py:803] 2025-04-26 18:31:59,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:31:59,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:00,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8982 15 [WARNING|trainer.py:803] 2025-04-26 18:32:01,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 30 8983 [WARNING|trainer.py:803] 2025-04-26 18:32:02,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:32:02,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:02,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8984 16 [WARNING|trainer.py:803] 2025-04-26 18:32:04,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 31 8985 [WARNING|trainer.py:803] 2025-04-26 18:32:04,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:32:05,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:05,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8986 17 32 [WARNING|trainer.py:803] 2025-04-26 18:32:06,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:07,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8987 [WARNING|trainer.py:803] 2025-04-26 18:32:07,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:32:08,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8988 18 33 [WARNING|trainer.py:803] 2025-04-26 18:32:09,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:09,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:09,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8989 [WARNING|trainer.py:803] 2025-04-26 18:32:10,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 19 8990 34 [WARNING|trainer.py:803] 2025-04-26 18:32:11,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:32:12,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:12,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8991 [WARNING|trainer.py:803] 2025-04-26 18:32:13,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 20 8992 35 [WARNING|trainer.py:803] 2025-04-26 18:32:14,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:14,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:15,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8993 21 [WARNING|trainer.py:803] 2025-04-26 18:32:16,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8994 [WARNING|trainer.py:803] 2025-04-26 18:32:16,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 36 [WARNING|trainer.py:803] 2025-04-26 18:32:17,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:17,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8995 22 [WARNING|trainer.py:803] 2025-04-26 18:32:18,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:19,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8996 37 [WARNING|trainer.py:803] 2025-04-26 18:32:19,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:19,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8997 23 [WARNING|trainer.py:803] 2025-04-26 18:32:21,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 38 8998 [WARNING|trainer.py:803] 2025-04-26 18:32:21,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:22,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:32:22,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8999 24 [WARNING|trainer.py:803] 2025-04-26 18:32:23,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 39 9000 [WARNING|trainer.py:803] 2025-04-26 18:32:24,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:32:24,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:25,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.7871111111111111 New best accuracy: 0.7871111111111111. Saving model... 25 40 [WARNING|trainer.py:803] 2025-04-26 18:32:27,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:32:27,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 41 26 [WARNING|trainer.py:803] 2025-04-26 18:32:29,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:32:29,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 27 42 [WARNING|trainer.py:803] 2025-04-26 18:32:32,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:32,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 43 28 [WARNING|trainer.py:803] 2025-04-26 18:32:34,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:32:34,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [INFO|trainer.py:3910] 2025-04-26 18:32:35,434 >> Saving model checkpoint to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast [INFO|configuration_utils.py:420] 2025-04-26 18:32:35,438 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/config.json [INFO|configuration_utils.py:909] 2025-04-26 18:32:35,439 >> Configuration saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/generation_config.json 29 44 [WARNING|trainer.py:803] 2025-04-26 18:32:37,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:37,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 30 45 [WARNING|trainer.py:803] 2025-04-26 18:32:39,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:39,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 31 46 [WARNING|trainer.py:803] 2025-04-26 18:32:42,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:32:42,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 32 47 [WARNING|trainer.py:803] 2025-04-26 18:32:44,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:32:45,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 33 48 [WARNING|trainer.py:803] 2025-04-26 18:32:46,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:32:47,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 34 49 [WARNING|trainer.py:803] 2025-04-26 18:32:49,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:32:49,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 35 50 [WARNING|trainer.py:803] 2025-04-26 18:32:52,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:32:52,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 36 [WARNING|trainer.py:803] 2025-04-26 18:32:54,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 51 [WARNING|trainer.py:803] 2025-04-26 18:32:55,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 37 [WARNING|trainer.py:803] 2025-04-26 18:32:56,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 52 [WARNING|trainer.py:803] 2025-04-26 18:32:57,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 38 [WARNING|trainer.py:803] 2025-04-26 18:32:59,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 53 [WARNING|trainer.py:803] 2025-04-26 18:33:00,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 39 [WARNING|trainer.py:803] 2025-04-26 18:33:01,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 54 [WARNING|trainer.py:803] 2025-04-26 18:33:02,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 40 [WARNING|trainer.py:803] 2025-04-26 18:33:04,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 55 [WARNING|trainer.py:803] 2025-04-26 18:33:05,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 41 [WARNING|trainer.py:803] 2025-04-26 18:33:06,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 56 [WARNING|trainer.py:803] 2025-04-26 18:33:07,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 42 57 [WARNING|trainer.py:803] 2025-04-26 18:33:09,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:33:09,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 43 58 [WARNING|trainer.py:803] 2025-04-26 18:33:11,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:33:11,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 59 44 [WARNING|trainer.py:803] 2025-04-26 18:33:14,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:33:14,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 60 45 [WARNING|trainer.py:803] 2025-04-26 18:33:16,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:33:16,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 61 [WARNING|trainer.py:803] 2025-04-26 18:33:18,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 46 [WARNING|trainer.py:803] 2025-04-26 18:33:19,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 62 47 [WARNING|trainer.py:803] 2025-04-26 18:33:21,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:21,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 63 48 [WARNING|trainer.py:803] 2025-04-26 18:33:23,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:24,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 64 [INFO|modeling_utils.py:2996] 2025-04-26 18:33:25,484 >> The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 4 checkpoint shards. You can find where each parameters has been saved in the index located at /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/model.safetensors.index.json. [INFO|tokenization_utils_base.py:2491] 2025-04-26 18:33:25,487 >> tokenizer config file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/tokenizer_config.json [INFO|tokenization_utils_base.py:2500] 2025-04-26 18:33:25,487 >> Special tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/special_tokens_map.json [INFO|tokenization_utils_base.py:2553] 2025-04-26 18:33:25,487 >> added tokens file saved in /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/added_tokens.json [WARNING|trainer.py:803] 2025-04-26 18:33:25,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 49 [WARNING|trainer.py:803] 2025-04-26 18:33:26,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 04/26/2025 18:33:27 - INFO - __main__ - Saved LoRA weights to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/lora_weights.pth evaluate! 
65 [WARNING|trainer.py:803] 2025-04-26 18:33:28,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 50 1 [WARNING|trainer.py:803] 2025-04-26 18:33:29,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:30,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 66 [WARNING|trainer.py:803] 2025-04-26 18:33:31,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2 51 [WARNING|trainer.py:803] 2025-04-26 18:33:32,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:33:32,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 67 3 [WARNING|trainer.py:803] 2025-04-26 18:33:34,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:34,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 52 [WARNING|trainer.py:803] 2025-04-26 18:33:35,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4 68 [WARNING|trainer.py:803] 2025-04-26 18:33:36,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:33:36,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 53 [WARNING|trainer.py:803] 2025-04-26 18:33:38,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5 69 [WARNING|trainer.py:803] 2025-04-26 18:33:39,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:39,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
YesYes 54 6 [WARNING|trainer.py:803] 2025-04-26 18:33:40,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:41,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 70 7 55 [WARNING|trainer.py:803] 2025-04-26 18:33:42,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:43,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:43,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 71 8 56 [WARNING|trainer.py:803] 2025-04-26 18:33:45,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:33:45,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:33:46,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 9 72 57 [WARNING|trainer.py:803] 2025-04-26 18:33:48,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:48,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:48,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 10 58 73 [WARNING|trainer.py:803] 2025-04-26 18:33:50,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:33:51,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes [WARNING|trainer.py:803] 2025-04-26 18:33:51,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 11 [WARNING|trainer.py:803] 2025-04-26 18:33:52,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 59 74 [WARNING|trainer.py:803] 2025-04-26 18:33:53,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:33:53,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 12 [WARNING|trainer.py:803] 2025-04-26 18:33:54,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 60 13 [WARNING|trainer.py:803] 2025-04-26 18:33:56,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 75 [WARNING|trainer.py:803] 2025-04-26 18:33:57,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:57,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 61 14 [WARNING|trainer.py:803] 2025-04-26 18:33:58,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 76 [WARNING|trainer.py:803] 2025-04-26 18:33:59,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:33:59,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 15 62 [WARNING|trainer.py:803] 2025-04-26 18:34:01,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:01,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 77 [WARNING|trainer.py:803] 2025-04-26 18:34:02,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 16 63 [WARNING|trainer.py:803] 2025-04-26 18:34:03,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:04,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 78 17 [WARNING|trainer.py:803] 2025-04-26 18:34:05,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 64 [WARNING|trainer.py:803] 2025-04-26 18:34:05,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:06,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 79 18 [WARNING|trainer.py:803] 2025-04-26 18:34:07,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:08,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 65 [WARNING|trainer.py:803] 2025-04-26 18:34:09,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 19 80 [WARNING|trainer.py:803] 2025-04-26 18:34:10,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 66 [WARNING|trainer.py:803] 2025-04-26 18:34:10,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:11,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 20 81 [WARNING|trainer.py:803] 2025-04-26 18:34:12,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:34:12,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 67 21 [WARNING|trainer.py:803] 2025-04-26 18:34:14,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:14,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 82 [WARNING|trainer.py:803] 2025-04-26 18:34:15,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 22 68 [WARNING|trainer.py:803] 2025-04-26 18:34:16,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:16,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 83 23 69 [WARNING|trainer.py:803] 2025-04-26 18:34:18,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:18,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:34:19,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 84 24 70 [WARNING|trainer.py:803] 2025-04-26 18:34:21,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:21,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:21,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 25 85 71 [WARNING|trainer.py:803] 2025-04-26 18:34:23,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:23,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:24,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 26 86 [WARNING|trainer.py:803] 2025-04-26 18:34:25,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 72 [WARNING|trainer.py:803] 2025-04-26 18:34:26,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:34:26,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 27 [WARNING|trainer.py:803] 2025-04-26 18:34:28,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 87 73 [WARNING|trainer.py:803] 2025-04-26 18:34:29,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:34:29,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 28 88 [WARNING|trainer.py:803] 2025-04-26 18:34:30,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 74 [WARNING|trainer.py:803] 2025-04-26 18:34:31,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 29 [WARNING|trainer.py:803] 2025-04-26 18:34:31,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 89 [WARNING|trainer.py:803] 2025-04-26 18:34:32,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:33,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 30 75 90 [WARNING|trainer.py:803] 2025-04-26 18:34:35,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:35,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:34:35,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 31 76 [WARNING|trainer.py:803] 2025-04-26 18:34:37,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:37,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 91 32 [WARNING|trainer.py:803] 2025-04-26 18:34:38,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:34:39,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 77 [WARNING|trainer.py:803] 2025-04-26 18:34:40,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 92 33 [WARNING|trainer.py:803] 2025-04-26 18:34:41,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:34:41,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 78 [WARNING|trainer.py:803] 2025-04-26 18:34:42,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 93 34 [WARNING|trainer.py:803] 2025-04-26 18:34:43,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:43,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 79 94 [WARNING|trainer.py:803] 2025-04-26 18:34:45,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 35 [WARNING|trainer.py:803] 2025-04-26 18:34:46,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:34:46,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 80 36 95 [WARNING|trainer.py:803] 2025-04-26 18:34:47,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:48,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:48,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 81 37 96 [WARNING|trainer.py:803] 2025-04-26 18:34:50,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:50,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:34:50,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 82 38 97 [WARNING|trainer.py:803] 2025-04-26 18:34:52,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:34:52,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:53,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 39 [WARNING|trainer.py:803] 2025-04-26 18:34:55,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 83 98 [WARNING|trainer.py:803] 2025-04-26 18:34:55,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 40 [WARNING|trainer.py:803] 2025-04-26 18:34:56,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:34:57,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 84 99 41 [WARNING|trainer.py:803] 2025-04-26 18:34:58,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:58,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:34:59,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 100 85 42 [WARNING|trainer.py:803] 2025-04-26 18:35:01,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:01,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:01,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 43 86 101 [WARNING|trainer.py:803] 2025-04-26 18:35:03,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:35:03,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:04,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 44 87 102 [WARNING|trainer.py:803] 2025-04-26 18:35:06,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:35:06,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:06,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 45 88 103 [WARNING|trainer.py:803] 2025-04-26 18:35:08,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:08,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:09,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 89 46 [WARNING|trainer.py:803] 2025-04-26 18:35:10,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 104 [WARNING|trainer.py:803] 2025-04-26 18:35:11,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:12,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 90 47 [WARNING|trainer.py:803] 2025-04-26 18:35:12,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:13,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 105 [WARNING|trainer.py:803] 2025-04-26 18:35:14,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 48 91 [WARNING|trainer.py:803] 2025-04-26 18:35:15,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:35:15,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 106 49 [WARNING|trainer.py:803] 2025-04-26 18:35:17,440 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. Yes 92 [WARNING|trainer.py:803] 2025-04-26 18:35:18,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:18,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 107 50 93 [WARNING|trainer.py:803] 2025-04-26 18:35:20,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:20,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:20,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 108 51 94 [WARNING|trainer.py:803] 2025-04-26 18:35:22,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:23,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:23,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 109 95 [WARNING|trainer.py:803] 2025-04-26 18:35:24,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 52 [WARNING|trainer.py:803] 2025-04-26 18:35:25,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:25,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 53 110 96 [WARNING|trainer.py:803] 2025-04-26 18:35:27,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:35:28,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:28,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 54 111 97 [WARNING|trainer.py:803] 2025-04-26 18:35:29,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:30,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:30,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 55 112 [WARNING|trainer.py:803] 2025-04-26 18:35:32,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 98 [WARNING|trainer.py:803] 2025-04-26 18:35:33,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 56 [WARNING|trainer.py:803] 2025-04-26 18:35:33,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:34,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 113 99 57 [WARNING|trainer.py:803] 2025-04-26 18:35:35,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:35,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:36,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 114 58 100 [WARNING|trainer.py:803] 2025-04-26 18:35:37,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:35:38,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:35:38,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 115 59 [WARNING|trainer.py:803] 2025-04-26 18:35:40,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:40,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 101 [WARNING|trainer.py:803] 2025-04-26 18:35:41,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 60 116 [WARNING|trainer.py:803] 2025-04-26 18:35:42,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 102 [WARNING|trainer.py:803] 2025-04-26 18:35:43,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 61 [WARNING|trainer.py:803] 2025-04-26 18:35:43,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:44,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 117 [WARNING|trainer.py:803] 2025-04-26 18:35:45,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 103 62 [WARNING|trainer.py:803] 2025-04-26 18:35:46,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:47,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 118 63 104 [WARNING|trainer.py:803] 2025-04-26 18:35:48,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:35:49,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:49,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 64 119 [WARNING|trainer.py:803] 2025-04-26 18:35:51,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 105 [WARNING|trainer.py:803] 2025-04-26 18:35:51,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:35:52,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 65 [WARNING|trainer.py:803] 2025-04-26 18:35:53,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 120 106 [WARNING|trainer.py:803] 2025-04-26 18:35:54,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 66 [WARNING|trainer.py:803] 2025-04-26 18:35:54,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:55,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 121 107 67 [WARNING|trainer.py:803] 2025-04-26 18:35:57,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:35:57,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:35:57,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 108 68 122 [WARNING|trainer.py:803] 2025-04-26 18:36:00,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:00,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:00,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 109 69 123 [WARNING|trainer.py:803] 2025-04-26 18:36:02,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:02,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:02,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 70 110 124 [WARNING|trainer.py:803] 2025-04-26 18:36:05,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:05,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:05,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 71 111 [WARNING|trainer.py:803] 2025-04-26 18:36:07,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 125 [WARNING|trainer.py:803] 2025-04-26 18:36:08,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:36:08,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 72 112 [WARNING|trainer.py:803] 2025-04-26 18:36:09,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 126 [WARNING|trainer.py:803] 2025-04-26 18:36:10,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:10,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 73 113 [WARNING|trainer.py:803] 2025-04-26 18:36:12,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:36:12,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 127 74 [WARNING|trainer.py:803] 2025-04-26 18:36:14,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:14,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 114 [WARNING|trainer.py:803] 2025-04-26 18:36:15,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 128 [WARNING|trainer.py:803] 2025-04-26 18:36:16,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 75 115 [WARNING|trainer.py:803] 2025-04-26 18:36:17,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:17,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 129 76 [WARNING|trainer.py:803] 2025-04-26 18:36:18,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 116 [WARNING|trainer.py:803] 2025-04-26 18:36:19,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:36:20,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 130 77 [WARNING|trainer.py:803] 2025-04-26 18:36:21,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 117 [WARNING|trainer.py:803] 2025-04-26 18:36:22,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 131 [WARNING|trainer.py:803] 2025-04-26 18:36:22,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 78 [WARNING|trainer.py:803] 2025-04-26 18:36:23,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:24,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 118 79 132 [WARNING|trainer.py:803] 2025-04-26 18:36:26,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:26,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:26,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 119 80 133 [WARNING|trainer.py:803] 2025-04-26 18:36:28,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:28,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:29,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 81 134 [WARNING|trainer.py:803] 2025-04-26 18:36:30,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 120 [WARNING|trainer.py:803] 2025-04-26 18:36:31,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:31,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 82 [WARNING|trainer.py:803] 2025-04-26 18:36:33,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 135 121 [WARNING|trainer.py:803] 2025-04-26 18:36:34,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:36:34,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 83 [WARNING|trainer.py:803] 2025-04-26 18:36:36,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 136 122 [WARNING|trainer.py:803] 2025-04-26 18:36:37,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 84 [WARNING|trainer.py:803] 2025-04-26 18:36:37,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:38,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 137 123 85 [WARNING|trainer.py:803] 2025-04-26 18:36:40,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:40,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:36:40,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 138 86 124 [WARNING|trainer.py:803] 2025-04-26 18:36:42,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:43,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:43,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 139 87 125 [WARNING|trainer.py:803] 2025-04-26 18:36:45,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:36:45,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:45,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 88 140 [WARNING|trainer.py:803] 2025-04-26 18:36:47,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 126 [WARNING|trainer.py:803] 2025-04-26 18:36:47,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 89 [WARNING|trainer.py:803] 2025-04-26 18:36:48,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:49,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 141 90 [WARNING|trainer.py:803] 2025-04-26 18:36:50,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 127 [WARNING|trainer.py:803] 2025-04-26 18:36:51,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:36:51,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 142 91 128 [WARNING|trainer.py:803] 2025-04-26 18:36:53,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:53,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:36:53,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 92 143 129 [WARNING|trainer.py:803] 2025-04-26 18:36:56,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:36:56,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:56,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 93 144 130 [WARNING|trainer.py:803] 2025-04-26 18:36:58,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:36:58,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:36:59,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 94 131 [WARNING|trainer.py:803] 2025-04-26 18:37:00,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 145 [WARNING|trainer.py:803] 2025-04-26 18:37:01,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 95 [WARNING|trainer.py:803] 2025-04-26 18:37:01,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:37:02,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 132 146 [WARNING|trainer.py:803] 2025-04-26 18:37:04,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 96 [WARNING|trainer.py:803] 2025-04-26 18:37:04,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:04,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 133 147 97 [WARNING|trainer.py:803] 2025-04-26 18:37:06,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:37:06,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:37:07,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 134 98 [WARNING|trainer.py:803] 2025-04-26 18:37:09,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 148 [WARNING|trainer.py:803] 2025-04-26 18:37:09,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:10,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 99 135 [WARNING|trainer.py:803] 2025-04-26 18:37:11,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 149 [WARNING|trainer.py:803] 2025-04-26 18:37:12,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:37:12,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 100 136 [WARNING|trainer.py:803] 2025-04-26 18:37:14,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 150 [WARNING|trainer.py:803] 2025-04-26 18:37:14,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:37:15,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 101 [WARNING|trainer.py:803] 2025-04-26 18:37:16,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 137 151 [WARNING|trainer.py:803] 2025-04-26 18:37:17,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:17,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 102 [WARNING|trainer.py:803] 2025-04-26 18:37:18,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 138 152 [WARNING|trainer.py:803] 2025-04-26 18:37:20,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 103 [WARNING|trainer.py:803] 2025-04-26 18:37:20,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:37:21,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 139 153 [WARNING|trainer.py:803] 2025-04-26 18:37:23,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 104 [WARNING|trainer.py:803] 2025-04-26 18:37:23,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:24,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 140 154 [WARNING|trainer.py:803] 2025-04-26 18:37:25,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 105 [WARNING|trainer.py:803] 2025-04-26 18:37:25,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:26,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 141 155 [WARNING|trainer.py:803] 2025-04-26 18:37:28,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 106 [WARNING|trainer.py:803] 2025-04-26 18:37:28,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:29,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 142 156 107 [WARNING|trainer.py:803] 2025-04-26 18:37:31,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:31,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:31,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 157 108 143 [WARNING|trainer.py:803] 2025-04-26 18:37:33,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:37:33,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:33,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 109 144 [WARNING|trainer.py:803] 2025-04-26 18:37:35,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 158 [WARNING|trainer.py:803] 2025-04-26 18:37:36,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:37:36,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 110 159 [WARNING|trainer.py:803] 2025-04-26 18:37:38,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 145 [WARNING|trainer.py:803] 2025-04-26 18:37:38,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:39,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 111 160 [WARNING|trainer.py:803] 2025-04-26 18:37:40,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 146 [WARNING|trainer.py:803] 2025-04-26 18:37:41,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:41,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 112 [WARNING|trainer.py:803] 2025-04-26 18:37:43,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 161 147 [WARNING|trainer.py:803] 2025-04-26 18:37:44,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 113 [WARNING|trainer.py:803] 2025-04-26 18:37:44,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:37:45,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 162 [WARNING|trainer.py:803] 2025-04-26 18:37:46,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 114 148 [WARNING|trainer.py:803] 2025-04-26 18:37:47,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:37:47,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 163 115 [WARNING|trainer.py:803] 2025-04-26 18:37:49,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 149 [WARNING|trainer.py:803] 2025-04-26 18:37:49,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:37:50,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 116 164 [WARNING|trainer.py:803] 2025-04-26 18:37:52,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:52,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 150 [WARNING|trainer.py:803] 2025-04-26 18:37:53,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 117 165 [WARNING|trainer.py:803] 2025-04-26 18:37:54,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:37:54,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 151 [WARNING|trainer.py:803] 2025-04-26 18:37:55,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 166 118 [WARNING|trainer.py:803] 2025-04-26 18:37:57,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:57,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 152 [WARNING|trainer.py:803] 2025-04-26 18:37:58,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 119 167 [WARNING|trainer.py:803] 2025-04-26 18:37:59,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:37:59,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 153 [WARNING|trainer.py:803] 2025-04-26 18:38:01,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 120 168 [WARNING|trainer.py:803] 2025-04-26 18:38:02,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:38:02,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 154 [WARNING|trainer.py:803] 2025-04-26 18:38:03,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 121 [WARNING|trainer.py:803] 2025-04-26 18:38:05,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 169 155 [WARNING|trainer.py:803] 2025-04-26 18:38:06,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:38:06,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 122 [WARNING|trainer.py:803] 2025-04-26 18:38:07,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 170 156 [WARNING|trainer.py:803] 2025-04-26 18:38:08,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:38:09,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 123 [WARNING|trainer.py:803] 2025-04-26 18:38:09,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 171 157 [WARNING|trainer.py:803] 2025-04-26 18:38:11,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:38:11,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 124 172 [WARNING|trainer.py:803] 2025-04-26 18:38:12,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:13,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 158 125 [WARNING|trainer.py:803] 2025-04-26 18:38:14,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:14,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 173 [WARNING|trainer.py:803] 2025-04-26 18:38:15,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 126 159 [WARNING|trainer.py:803] 2025-04-26 18:38:17,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:17,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 174 [WARNING|trainer.py:803] 2025-04-26 18:38:18,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 160 127 175 [WARNING|trainer.py:803] 2025-04-26 18:38:19,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:20,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:20,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 128 161 [WARNING|trainer.py:803] 2025-04-26 18:38:22,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:22,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 176 [WARNING|trainer.py:803] 2025-04-26 18:38:23,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 129 162 [WARNING|trainer.py:803] 2025-04-26 18:38:24,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:24,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 177 130 [WARNING|trainer.py:803] 2025-04-26 18:38:25,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:38:26,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 163 [WARNING|trainer.py:803] 2025-04-26 18:38:27,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 131 178 [WARNING|trainer.py:803] 2025-04-26 18:38:28,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:28,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 164 [WARNING|trainer.py:803] 2025-04-26 18:38:30,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 132 179 [WARNING|trainer.py:803] 2025-04-26 18:38:31,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:38:31,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 165 133 [WARNING|trainer.py:803] 2025-04-26 18:38:32,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:38:33,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 180 166 [WARNING|trainer.py:803] 2025-04-26 18:38:34,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 134 [WARNING|trainer.py:803] 2025-04-26 18:38:35,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:35,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 181 [WARNING|trainer.py:803] 2025-04-26 18:38:37,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 135 167 [WARNING|trainer.py:803] 2025-04-26 18:38:38,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:38:38,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 182 136 [WARNING|trainer.py:803] 2025-04-26 18:38:39,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 168 [WARNING|trainer.py:803] 2025-04-26 18:38:40,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:38:41,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 183 [WARNING|trainer.py:803] 2025-04-26 18:38:42,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 137 [WARNING|trainer.py:803] 2025-04-26 18:38:43,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 169 184 [WARNING|trainer.py:803] 2025-04-26 18:38:44,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 138 [WARNING|trainer.py:803] 2025-04-26 18:38:45,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:45,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 170 185 [WARNING|trainer.py:803] 2025-04-26 18:38:46,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 139 [WARNING|trainer.py:803] 2025-04-26 18:38:47,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:38:47,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 171 [WARNING|trainer.py:803] 2025-04-26 18:38:49,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 140 186 [WARNING|trainer.py:803] 2025-04-26 18:38:50,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:50,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 172 [WARNING|trainer.py:803] 2025-04-26 18:38:51,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 141 187 [WARNING|trainer.py:803] 2025-04-26 18:38:52,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:52,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 173 [WARNING|trainer.py:803] 2025-04-26 18:38:54,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 142 [WARNING|trainer.py:803] 2025-04-26 18:38:55,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 188 174 [WARNING|trainer.py:803] 2025-04-26 18:38:56,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:56,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 143 [WARNING|trainer.py:803] 2025-04-26 18:38:57,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 189 175 [WARNING|trainer.py:803] 2025-04-26 18:38:58,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:38:58,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 144 [WARNING|trainer.py:803] 2025-04-26 18:39:00,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 190 176 [WARNING|trainer.py:803] 2025-04-26 18:39:01,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:01,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 145 191 [WARNING|trainer.py:803] 2025-04-26 18:39:03,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 177 [WARNING|trainer.py:803] 2025-04-26 18:39:03,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:04,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 146 [WARNING|trainer.py:803] 2025-04-26 18:39:05,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 192 178 [WARNING|trainer.py:803] 2025-04-26 18:39:06,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 147 [WARNING|trainer.py:803] 2025-04-26 18:39:07,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:39:07,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 193 179 [WARNING|trainer.py:803] 2025-04-26 18:39:09,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 148 [WARNING|trainer.py:803] 2025-04-26 18:39:09,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:10,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 194 [WARNING|trainer.py:803] 2025-04-26 18:39:11,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 149 180 [WARNING|trainer.py:803] 2025-04-26 18:39:12,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:13,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 195 [WARNING|trainer.py:803] 2025-04-26 18:39:14,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 150 181 [WARNING|trainer.py:803] 2025-04-26 18:39:15,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:15,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 196 151 [WARNING|trainer.py:803] 2025-04-26 18:39:16,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 182 [WARNING|trainer.py:803] 2025-04-26 18:39:17,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:18,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 197 152 [WARNING|trainer.py:803] 2025-04-26 18:39:19,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 183 [WARNING|trainer.py:803] 2025-04-26 18:39:20,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:39:20,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 198 153 [WARNING|trainer.py:803] 2025-04-26 18:39:22,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 184 [WARNING|trainer.py:803] 2025-04-26 18:39:22,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:23,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 199 154 [WARNING|trainer.py:803] 2025-04-26 18:39:24,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:25,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 185 [WARNING|trainer.py:803] 2025-04-26 18:39:26,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 200 155 [WARNING|trainer.py:803] 2025-04-26 18:39:27,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:27,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 186 [WARNING|trainer.py:803] 2025-04-26 18:39:28,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 156 201 [WARNING|trainer.py:803] 2025-04-26 18:39:29,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:29,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 187 157 [WARNING|trainer.py:803] 2025-04-26 18:39:31,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:39:31,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 202 [WARNING|trainer.py:803] 2025-04-26 18:39:32,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 188 158 203 [WARNING|trainer.py:803] 2025-04-26 18:39:34,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:34,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:35,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 159 189 [WARNING|trainer.py:803] 2025-04-26 18:39:36,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 204 [WARNING|trainer.py:803] 2025-04-26 18:39:37,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:37,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 160 190 [WARNING|trainer.py:803] 2025-04-26 18:39:39,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:39,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 205 [WARNING|trainer.py:803] 2025-04-26 18:39:40,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 161 191 [WARNING|trainer.py:803] 2025-04-26 18:39:41,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:39:42,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 206 162 [WARNING|trainer.py:803] 2025-04-26 18:39:43,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:39:43,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 192 [WARNING|trainer.py:803] 2025-04-26 18:39:44,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 207 163 [WARNING|trainer.py:803] 2025-04-26 18:39:45,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:46,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 193 [WARNING|trainer.py:803] 2025-04-26 18:39:47,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 208 164 [WARNING|trainer.py:803] 2025-04-26 18:39:48,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:39:48,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 194 [WARNING|trainer.py:803] 2025-04-26 18:39:50,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 165 209 [WARNING|trainer.py:803] 2025-04-26 18:39:51,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:51,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 195 210 166 [WARNING|trainer.py:803] 2025-04-26 18:39:52,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:53,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:53,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 211 196 [WARNING|trainer.py:803] 2025-04-26 18:39:54,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 167 [WARNING|trainer.py:803] 2025-04-26 18:39:55,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:55,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 212 [WARNING|trainer.py:803] 2025-04-26 18:39:56,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 197 168 213 [WARNING|trainer.py:803] 2025-04-26 18:39:58,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:58,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:39:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 198 214 [WARNING|trainer.py:803] 2025-04-26 18:40:00,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:40:00,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 169 [WARNING|trainer.py:803] 2025-04-26 18:40:01,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 215 199 [WARNING|trainer.py:803] 2025-04-26 18:40:02,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:40:02,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 170 216 [WARNING|trainer.py:803] 2025-04-26 18:40:03,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 200 [WARNING|trainer.py:803] 2025-04-26 18:40:04,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 171 [WARNING|trainer.py:803] 2025-04-26 18:40:05,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 217 [WARNING|trainer.py:803] 2025-04-26 18:40:05,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:06,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 172 201 [WARNING|trainer.py:803] 2025-04-26 18:40:07,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 218 [WARNING|trainer.py:803] 2025-04-26 18:40:08,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:08,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 173 219 [WARNING|trainer.py:803] 2025-04-26 18:40:10,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 202 [WARNING|trainer.py:803] 2025-04-26 18:40:10,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:11,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 174 220 [WARNING|trainer.py:803] 2025-04-26 18:40:12,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:40:12,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 203 175 [WARNING|trainer.py:803] 2025-04-26 18:40:13,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 221 [WARNING|trainer.py:803] 2025-04-26 18:40:14,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:14,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 204 176 222 [WARNING|trainer.py:803] 2025-04-26 18:40:16,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:16,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:16,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 223 205 177 [WARNING|trainer.py:803] 2025-04-26 18:40:18,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:19,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:19,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 224 206 [WARNING|trainer.py:803] 2025-04-26 18:40:20,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 178 [WARNING|trainer.py:803] 2025-04-26 18:40:21,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:21,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 225 [WARNING|trainer.py:803] 2025-04-26 18:40:22,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 207 179 226 [WARNING|trainer.py:803] 2025-04-26 18:40:24,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:24,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:24,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 227 208 180 [WARNING|trainer.py:803] 2025-04-26 18:40:26,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:27,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:40:27,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 228 181 209 [WARNING|trainer.py:803] 2025-04-26 18:40:28,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:29,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:40:29,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 229 210 [WARNING|trainer.py:803] 2025-04-26 18:40:30,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 182 [WARNING|trainer.py:803] 2025-04-26 18:40:31,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:31,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 230 211 [WARNING|trainer.py:803] 2025-04-26 18:40:32,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 183 [WARNING|trainer.py:803] 2025-04-26 18:40:33,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 231 [WARNING|trainer.py:803] 2025-04-26 18:40:34,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 212 [WARNING|trainer.py:803] 2025-04-26 18:40:34,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 184 [WARNING|trainer.py:803] 2025-04-26 18:40:35,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 232 [WARNING|trainer.py:803] 2025-04-26 18:40:36,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 213 [WARNING|trainer.py:803] 2025-04-26 18:40:36,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:37,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 233 185 214 [WARNING|trainer.py:803] 2025-04-26 18:40:38,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:38,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:39,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 234 186 215 [WARNING|trainer.py:803] 2025-04-26 18:40:40,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:41,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:41,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 235 187 216 [WARNING|trainer.py:803] 2025-04-26 18:40:42,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:43,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:40:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 236 [WARNING|trainer.py:803] 2025-04-26 18:40:44,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 217 [WARNING|trainer.py:803] 2025-04-26 18:40:45,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 188 237 [WARNING|trainer.py:803] 2025-04-26 18:40:46,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:46,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 218 [WARNING|trainer.py:803] 2025-04-26 18:40:47,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 238 189 [WARNING|trainer.py:803] 2025-04-26 18:40:48,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 219 [WARNING|trainer.py:803] 2025-04-26 18:40:49,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:40:49,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 239 190 [WARNING|trainer.py:803] 2025-04-26 18:40:50,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 220 [WARNING|trainer.py:803] 2025-04-26 18:40:51,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:40:51,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 240 [WARNING|trainer.py:803] 2025-04-26 18:40:52,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 221 191 [WARNING|trainer.py:803] 2025-04-26 18:40:53,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:40:53,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 241 [WARNING|trainer.py:803] 2025-04-26 18:40:54,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 222 192 [WARNING|trainer.py:803] 2025-04-26 18:40:55,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 242 [WARNING|trainer.py:803] 2025-04-26 18:40:56,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:40:56,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 223 193 [WARNING|trainer.py:803] 2025-04-26 18:40:57,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 243 [WARNING|trainer.py:803] 2025-04-26 18:40:58,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:40:58,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 224 [WARNING|trainer.py:803] 2025-04-26 18:40:59,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 244 194 [WARNING|trainer.py:803] 2025-04-26 18:41:00,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:00,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 225 245 [WARNING|trainer.py:803] 2025-04-26 18:41:01,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 195 [WARNING|trainer.py:803] 2025-04-26 18:41:02,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 226 [WARNING|trainer.py:803] 2025-04-26 18:41:03,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 246 [WARNING|trainer.py:803] 2025-04-26 18:41:03,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 196 [WARNING|trainer.py:803] 2025-04-26 18:41:04,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 227 [WARNING|trainer.py:803] 2025-04-26 18:41:05,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 247 [WARNING|trainer.py:803] 2025-04-26 18:41:05,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:06,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 228 197 248 [WARNING|trainer.py:803] 2025-04-26 18:41:07,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:41:07,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:08,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 229 198 249 [WARNING|trainer.py:803] 2025-04-26 18:41:09,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:09,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:41:10,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 230 199 250 [WARNING|trainer.py:803] 2025-04-26 18:41:11,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:12,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:12,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 231 251 [WARNING|trainer.py:803] 2025-04-26 18:41:13,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 200 [WARNING|trainer.py:803] 2025-04-26 18:41:14,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:14,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 232 [WARNING|trainer.py:803] 2025-04-26 18:41:15,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 252 201 [WARNING|trainer.py:803] 2025-04-26 18:41:16,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 233 [WARNING|trainer.py:803] 2025-04-26 18:41:17,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:17,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 253 [WARNING|trainer.py:803] 2025-04-26 18:41:18,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 234 202 [WARNING|trainer.py:803] 2025-04-26 18:41:19,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 254 [WARNING|trainer.py:803] 2025-04-26 18:41:19,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:20,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 235 203 [WARNING|trainer.py:803] 2025-04-26 18:41:21,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 255 [WARNING|trainer.py:803] 2025-04-26 18:41:21,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:22,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 236 [WARNING|trainer.py:803] 2025-04-26 18:41:23,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 204 256 [WARNING|trainer.py:803] 2025-04-26 18:41:24,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 237 [WARNING|trainer.py:803] 2025-04-26 18:41:24,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:25,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 205 257 238 [WARNING|trainer.py:803] 2025-04-26 18:41:26,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:26,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:27,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 239 206 258 [WARNING|trainer.py:803] 2025-04-26 18:41:29,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:29,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:29,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 240 207 259 [WARNING|trainer.py:803] 2025-04-26 18:41:31,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:31,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:31,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 241 260 208 [WARNING|trainer.py:803] 2025-04-26 18:41:33,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:33,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:33,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 242 209 [WARNING|trainer.py:803] 2025-04-26 18:41:35,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 261 [WARNING|trainer.py:803] 2025-04-26 18:41:35,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:36,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 243 210 [WARNING|trainer.py:803] 2025-04-26 18:41:37,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 262 [WARNING|trainer.py:803] 2025-04-26 18:41:37,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:38,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 244 211 [WARNING|trainer.py:803] 2025-04-26 18:41:38,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 263 [WARNING|trainer.py:803] 2025-04-26 18:41:39,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:40,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 245 212 [WARNING|trainer.py:803] 2025-04-26 18:41:40,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 264 [WARNING|trainer.py:803] 2025-04-26 18:41:41,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:41:41,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 213 246 [WARNING|trainer.py:803] 2025-04-26 18:41:42,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:41:42,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 265 214 247 [WARNING|trainer.py:803] 2025-04-26 18:41:44,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:44,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:44,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 215 266 248 [WARNING|trainer.py:803] 2025-04-26 18:41:46,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:41:46,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:46,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 216 267 249 [WARNING|trainer.py:803] 2025-04-26 18:41:48,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:41:48,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:48,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 217 [WARNING|trainer.py:803] 2025-04-26 18:41:50,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 268 250 218 [WARNING|trainer.py:803] 2025-04-26 18:41:51,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:51,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:41:51,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 251 269 219 [WARNING|trainer.py:803] 2025-04-26 18:41:53,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:53,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:53,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 252 220 [WARNING|trainer.py:803] 2025-04-26 18:41:54,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 270 [WARNING|trainer.py:803] 2025-04-26 18:41:55,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:41:55,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 253 221 [WARNING|trainer.py:803] 2025-04-26 18:41:56,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:57,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 271 254 [WARNING|trainer.py:803] 2025-04-26 18:41:58,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 222 [WARNING|trainer.py:803] 2025-04-26 18:41:58,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:41:59,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 272 255 223 [WARNING|trainer.py:803] 2025-04-26 18:42:00,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:00,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:01,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 224 273 256 [WARNING|trainer.py:803] 2025-04-26 18:42:02,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:03,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:03,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 225 [WARNING|trainer.py:803] 2025-04-26 18:42:04,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 257 274 [WARNING|trainer.py:803] 2025-04-26 18:42:05,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:05,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 226 [WARNING|trainer.py:803] 2025-04-26 18:42:06,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 258 227 275 [WARNING|trainer.py:803] 2025-04-26 18:42:08,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:08,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:08,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 228 259 276 [WARNING|trainer.py:803] 2025-04-26 18:42:10,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:10,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:10,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 229 260 [WARNING|trainer.py:803] 2025-04-26 18:42:11,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 277 [WARNING|trainer.py:803] 2025-04-26 18:42:12,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 230 [WARNING|trainer.py:803] 2025-04-26 18:42:13,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:13,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 261 231 278 [WARNING|trainer.py:803] 2025-04-26 18:42:14,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:15,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:15,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 262 232 [WARNING|trainer.py:803] 2025-04-26 18:42:16,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 279 [WARNING|trainer.py:803] 2025-04-26 18:42:17,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:17,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 233 263 [WARNING|trainer.py:803] 2025-04-26 18:42:18,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:18,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 280 234 264 [WARNING|trainer.py:803] 2025-04-26 18:42:20,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:20,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:20,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 235 281 265 [WARNING|trainer.py:803] 2025-04-26 18:42:22,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:22,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:23,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 236 [WARNING|trainer.py:803] 2025-04-26 18:42:24,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 282 266 237 [WARNING|trainer.py:803] 2025-04-26 18:42:25,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:25,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:25,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 283 238 267 [WARNING|trainer.py:803] 2025-04-26 18:42:27,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:27,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:27,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 239 268 284 [WARNING|trainer.py:803] 2025-04-26 18:42:29,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:30,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:30,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 240 [WARNING|trainer.py:803] 2025-04-26 18:42:31,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 269 285 241 [WARNING|trainer.py:803] 2025-04-26 18:42:32,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:32,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:33,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 270 242 286 [WARNING|trainer.py:803] 2025-04-26 18:42:34,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:34,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:35,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 243 271 [WARNING|trainer.py:803] 2025-04-26 18:42:36,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:37,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 244 287 [WARNING|trainer.py:803] 2025-04-26 18:42:38,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:38,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 272 245 [WARNING|trainer.py:803] 2025-04-26 18:42:39,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 288 [WARNING|trainer.py:803] 2025-04-26 18:42:40,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:40,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 246 273 [WARNING|trainer.py:803] 2025-04-26 18:42:42,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:42:42,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 289 247 [WARNING|trainer.py:803] 2025-04-26 18:42:43,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:42:43,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 274 [WARNING|trainer.py:803] 2025-04-26 18:42:44,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 248 290 [WARNING|trainer.py:803] 2025-04-26 18:42:45,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:42:45,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 275 249 [WARNING|trainer.py:803] 2025-04-26 18:42:47,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:47,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 291 [WARNING|trainer.py:803] 2025-04-26 18:42:48,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 250 276 [WARNING|trainer.py:803] 2025-04-26 18:42:49,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:42:49,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 292 251 [WARNING|trainer.py:803] 2025-04-26 18:42:50,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:51,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 277 252 [WARNING|trainer.py:803] 2025-04-26 18:42:52,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 293 [WARNING|trainer.py:803] 2025-04-26 18:42:52,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:53,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 278 253 [WARNING|trainer.py:803] 2025-04-26 18:42:54,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:54,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 294 254 [WARNING|trainer.py:803] 2025-04-26 18:42:55,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 279 [WARNING|trainer.py:803] 2025-04-26 18:42:56,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:56,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 255 295 [WARNING|trainer.py:803] 2025-04-26 18:42:58,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 280 [WARNING|trainer.py:803] 2025-04-26 18:42:58,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:42:59,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 256 296 [WARNING|trainer.py:803] 2025-04-26 18:43:00,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 281 [WARNING|trainer.py:803] 2025-04-26 18:43:00,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 257 [WARNING|trainer.py:803] 2025-04-26 18:43:01,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:02,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 297 [WARNING|trainer.py:803] 2025-04-26 18:43:03,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 282 258 [WARNING|trainer.py:803] 2025-04-26 18:43:04,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:04,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 298 283 [WARNING|trainer.py:803] 2025-04-26 18:43:05,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 259 [WARNING|trainer.py:803] 2025-04-26 18:43:06,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:06,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 260 299 [WARNING|trainer.py:803] 2025-04-26 18:43:08,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 284 [WARNING|trainer.py:803] 2025-04-26 18:43:08,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:09,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 261 300 [WARNING|trainer.py:803] 2025-04-26 18:43:10,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:43:10,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 285 301 262 [WARNING|trainer.py:803] 2025-04-26 18:43:11,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:11,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:12,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 302 263 [WARNING|trainer.py:803] 2025-04-26 18:43:13,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 286 [WARNING|trainer.py:803] 2025-04-26 18:43:13,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 303 [WARNING|trainer.py:803] 2025-04-26 18:43:14,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:14,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 264 304 [WARNING|trainer.py:803] 2025-04-26 18:43:15,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:16,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 305 287 265 [WARNING|trainer.py:803] 2025-04-26 18:43:17,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:17,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:17,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 306 [WARNING|trainer.py:803] 2025-04-26 18:43:18,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 266 288 307 [WARNING|trainer.py:803] 2025-04-26 18:43:19,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:19,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:20,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 308 267 289 [WARNING|trainer.py:803] 2025-04-26 18:43:21,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:21,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 309 [WARNING|trainer.py:803] 2025-04-26 18:43:22,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:43:22,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 268 310 [WARNING|trainer.py:803] 2025-04-26 18:43:23,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:24,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 290 311 [WARNING|trainer.py:803] 2025-04-26 18:43:25,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 269 [WARNING|trainer.py:803] 2025-04-26 18:43:25,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 312 [WARNING|trainer.py:803] 2025-04-26 18:43:25,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 291 [WARNING|trainer.py:803] 2025-04-26 18:43:26,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 270 313 [WARNING|trainer.py:803] 2025-04-26 18:43:27,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:28,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:28,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 314 292 271 [WARNING|trainer.py:803] 2025-04-26 18:43:29,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 315 [WARNING|trainer.py:803] 2025-04-26 18:43:29,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:30,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:30,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 316 293 272 [WARNING|trainer.py:803] 2025-04-26 18:43:32,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:32,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:32,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 317 [WARNING|trainer.py:803] 2025-04-26 18:43:33,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 273 318 294 [WARNING|trainer.py:803] 2025-04-26 18:43:34,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:34,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:35,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 319 274 [WARNING|trainer.py:803] 2025-04-26 18:43:36,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 320 [WARNING|trainer.py:803] 2025-04-26 18:43:36,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 295 [WARNING|trainer.py:803] 2025-04-26 18:43:37,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:37,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 321 275 [WARNING|trainer.py:803] 2025-04-26 18:43:38,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:39,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 296 322 [WARNING|trainer.py:803] 2025-04-26 18:43:39,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:43:40,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 276 323 [WARNING|trainer.py:803] 2025-04-26 18:43:41,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:41,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 297 324 277 [WARNING|trainer.py:803] 2025-04-26 18:43:42,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:43:42,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:43:43,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 325 298 [WARNING|trainer.py:803] 2025-04-26 18:43:44,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 278 326 [WARNING|trainer.py:803] 2025-04-26 18:43:44,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:45,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:45,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 327 279 299 [WARNING|trainer.py:803] 2025-04-26 18:43:46,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:47,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 328 [WARNING|trainer.py:803] 2025-04-26 18:43:47,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:43:48,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 280 329 300 [WARNING|trainer.py:803] 2025-04-26 18:43:49,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:43:49,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 330 301 281 [WARNING|trainer.py:803] 2025-04-26 18:43:51,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:51,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 331 [WARNING|trainer.py:803] 2025-04-26 18:43:51,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 302 [WARNING|trainer.py:803] 2025-04-26 18:43:52,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:43:52,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 332 282 303 [WARNING|trainer.py:803] 2025-04-26 18:43:53,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:43:53,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:54,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 333 304 [WARNING|trainer.py:803] 2025-04-26 18:43:55,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 283 [WARNING|trainer.py:803] 2025-04-26 18:43:55,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 334 305 [WARNING|trainer.py:803] 2025-04-26 18:43:56,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:56,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:56,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 335 306 284 [WARNING|trainer.py:803] 2025-04-26 18:43:57,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:43:58,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 336 [WARNING|trainer.py:803] 2025-04-26 18:43:58,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 307 [WARNING|trainer.py:803] 2025-04-26 18:43:59,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:43:59,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 337 285 308 [WARNING|trainer.py:803] 2025-04-26 18:44:00,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:00,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:00,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 338 309 [WARNING|trainer.py:803] 2025-04-26 18:44:01,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 339 [WARNING|trainer.py:803] 2025-04-26 18:44:02,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 286 310 [WARNING|trainer.py:803] 2025-04-26 18:44:03,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:03,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:03,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 340 311 [WARNING|trainer.py:803] 2025-04-26 18:44:04,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 341 [WARNING|trainer.py:803] 2025-04-26 18:44:05,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 287 312 [WARNING|trainer.py:803] 2025-04-26 18:44:05,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:05,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:06,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 342 313 288 [WARNING|trainer.py:803] 2025-04-26 18:44:07,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 343 [WARNING|trainer.py:803] 2025-04-26 18:44:07,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:07,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 314 [WARNING|trainer.py:803] 2025-04-26 18:44:08,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:09,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 344 289 315 [WARNING|trainer.py:803] 2025-04-26 18:44:09,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:10,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 18:44:10,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 345 316 [WARNING|trainer.py:803] 2025-04-26 18:44:11,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:11,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 290 346 317 [WARNING|trainer.py:803] 2025-04-26 18:44:12,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:12,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:13,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 347 318 291 [WARNING|trainer.py:803] 2025-04-26 18:44:14,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:14,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 348 [WARNING|trainer.py:803] 2025-04-26 18:44:14,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 319 [WARNING|trainer.py:803] 2025-04-26 18:44:15,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:15,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 349 292 320 [WARNING|trainer.py:803] 2025-04-26 18:44:16,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:16,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:17,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 350 321 293 [WARNING|trainer.py:803] 2025-04-26 18:44:18,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:18,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 351 [WARNING|trainer.py:803] 2025-04-26 18:44:18,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 322 [WARNING|trainer.py:803] 2025-04-26 18:44:19,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:19,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 352 323 294 [WARNING|trainer.py:803] 2025-04-26 18:44:20,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:21,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 353 [WARNING|trainer.py:803] 2025-04-26 18:44:21,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 324 [WARNING|trainer.py:803] 2025-04-26 18:44:22,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 354 [WARNING|trainer.py:803] 2025-04-26 18:44:22,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 295 325 [WARNING|trainer.py:803] 2025-04-26 18:44:23,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:23,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 355 [WARNING|trainer.py:803] 2025-04-26 18:44:24,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 326 296 [WARNING|trainer.py:803] 2025-04-26 18:44:24,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 356 [WARNING|trainer.py:803] 2025-04-26 18:44:25,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:25,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 327 [WARNING|trainer.py:803] 2025-04-26 18:44:26,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 357 [WARNING|trainer.py:803] 2025-04-26 18:44:26,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 297 328 [WARNING|trainer.py:803] 2025-04-26 18:44:27,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:27,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 358 [WARNING|trainer.py:803] 2025-04-26 18:44:28,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 329 [WARNING|trainer.py:803] 2025-04-26 18:44:28,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 298 359 [WARNING|trainer.py:803] 2025-04-26 18:44:29,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:44:30,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:30,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 330 360 [WARNING|trainer.py:803] 2025-04-26 18:44:31,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:44:31,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 331 299 361 [WARNING|trainer.py:803] 2025-04-26 18:44:32,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:32,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:32,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 332 362 300 [WARNING|trainer.py:803] 2025-04-26 18:44:33,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:34,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 333 363 [WARNING|trainer.py:803] 2025-04-26 18:44:34,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 301 [WARNING|trainer.py:803] 2025-04-26 18:44:35,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:35,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 334 [WARNING|trainer.py:803] 2025-04-26 18:44:35,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 364 302 [WARNING|trainer.py:803] 2025-04-26 18:44:36,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:36,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:36,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 335 365 303 [WARNING|trainer.py:803] 2025-04-26 18:44:37,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:44:38,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:38,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 336 304 366 [WARNING|trainer.py:803] 2025-04-26 18:44:39,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:39,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:44:39,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 305 337 367 [WARNING|trainer.py:803] 2025-04-26 18:44:40,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:40,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:40,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 306 338 368 [WARNING|trainer.py:803] 2025-04-26 18:44:41,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 307 [WARNING|trainer.py:803] 2025-04-26 18:44:42,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:42,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 339 369 [WARNING|trainer.py:803] 2025-04-26 18:44:42,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 308 [WARNING|trainer.py:803] 2025-04-26 18:44:43,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:43,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 340 [WARNING|trainer.py:803] 2025-04-26 18:44:43,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 370 309 [WARNING|trainer.py:803] 2025-04-26 18:44:44,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:44,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:45,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 341 310 371 [WARNING|trainer.py:803] 2025-04-26 18:44:46,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:46,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:46,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 342 311 372 [WARNING|trainer.py:803] 2025-04-26 18:44:47,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:47,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 312 [WARNING|trainer.py:803] 2025-04-26 18:44:47,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 343 373 [WARNING|trainer.py:803] 2025-04-26 18:44:48,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:48,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 313 [WARNING|trainer.py:803] 2025-04-26 18:44:49,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 344 374 [WARNING|trainer.py:803] 2025-04-26 18:44:49,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 314 [WARNING|trainer.py:803] 2025-04-26 18:44:50,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:50,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 345 [WARNING|trainer.py:803] 2025-04-26 18:44:50,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 375 315 [WARNING|trainer.py:803] 2025-04-26 18:44:51,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:51,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:51,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 346 316 376 [WARNING|trainer.py:803] 2025-04-26 18:44:52,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:53,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:53,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 317 347 377 [WARNING|trainer.py:803] 2025-04-26 18:44:54,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:54,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 318 [WARNING|trainer.py:803] 2025-04-26 18:44:54,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 348 378 [WARNING|trainer.py:803] 2025-04-26 18:44:55,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:44:55,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 319 [WARNING|trainer.py:803] 2025-04-26 18:44:55,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 349 379 [WARNING|trainer.py:803] 2025-04-26 18:44:56,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 320 [WARNING|trainer.py:803] 2025-04-26 18:44:56,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:57,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 350 [WARNING|trainer.py:803] 2025-04-26 18:44:57,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 380 321 [WARNING|trainer.py:803] 2025-04-26 18:44:58,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:58,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:58,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 351 381 322 [WARNING|trainer.py:803] 2025-04-26 18:44:59,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:44:59,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:44:59,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 352 323 382 [WARNING|trainer.py:803] 2025-04-26 18:45:01,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:01,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:01,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 324 353 383 [WARNING|trainer.py:803] 2025-04-26 18:45:02,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:02,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:02,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 325 354 384 [WARNING|trainer.py:803] 2025-04-26 18:45:03,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 326 [WARNING|trainer.py:803] 2025-04-26 18:45:03,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:04,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 355 385 [WARNING|trainer.py:803] 2025-04-26 18:45:04,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 327 [WARNING|trainer.py:803] 2025-04-26 18:45:05,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:05,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 356 [WARNING|trainer.py:803] 2025-04-26 18:45:05,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 386 328 [WARNING|trainer.py:803] 2025-04-26 18:45:06,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:06,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 357 [WARNING|trainer.py:803] 2025-04-26 18:45:07,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 329 387 [WARNING|trainer.py:803] 2025-04-26 18:45:07,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:08,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:08,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 358 330 388 [WARNING|trainer.py:803] 2025-04-26 18:45:09,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:09,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:09,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 359 331 389 [WARNING|trainer.py:803] 2025-04-26 18:45:10,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:45:10,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:10,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 332 360 390 [WARNING|trainer.py:803] 2025-04-26 18:45:11,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:12,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 333 [WARNING|trainer.py:803] 2025-04-26 18:45:12,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 361 391 [WARNING|trainer.py:803] 2025-04-26 18:45:13,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:13,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 334 [WARNING|trainer.py:803] 2025-04-26 18:45:13,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 362 392 [WARNING|trainer.py:803] 2025-04-26 18:45:14,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:14,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 335 [WARNING|trainer.py:803] 2025-04-26 18:45:15,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 363 [WARNING|trainer.py:803] 2025-04-26 18:45:15,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 393 336 [WARNING|trainer.py:803] 2025-04-26 18:45:15,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:16,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 364 [WARNING|trainer.py:803] 2025-04-26 18:45:16,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 394 337 [WARNING|trainer.py:803] 2025-04-26 18:45:17,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:17,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:17,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 365 338 395 [WARNING|trainer.py:803] 2025-04-26 18:45:18,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:18,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:19,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 339 366 396 [WARNING|trainer.py:803] 2025-04-26 18:45:19,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:19,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 340 [WARNING|trainer.py:803] 2025-04-26 18:45:20,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 367 397 [WARNING|trainer.py:803] 2025-04-26 18:45:21,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:21,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 341 [WARNING|trainer.py:803] 2025-04-26 18:45:21,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 368 [WARNING|trainer.py:803] 2025-04-26 18:45:22,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 398 [WARNING|trainer.py:803] 2025-04-26 18:45:22,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 342 369 [WARNING|trainer.py:803] 2025-04-26 18:45:23,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:23,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 399 343 [WARNING|trainer.py:803] 2025-04-26 18:45:24,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:24,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 370 [WARNING|trainer.py:803] 2025-04-26 18:45:24,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 344 400 [WARNING|trainer.py:803] 2025-04-26 18:45:25,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:25,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:25,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 371 345 401 [WARNING|trainer.py:803] 2025-04-26 18:45:26,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:45:26,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:27,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 372 346 402 [WARNING|trainer.py:803] 2025-04-26 18:45:28,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:28,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 347 [WARNING|trainer.py:803] 2025-04-26 18:45:28,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 373 403 [WARNING|trainer.py:803] 2025-04-26 18:45:29,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:29,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 348 [WARNING|trainer.py:803] 2025-04-26 18:45:30,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 374 404 [WARNING|trainer.py:803] 2025-04-26 18:45:30,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 349 [WARNING|trainer.py:803] 2025-04-26 18:45:30,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:31,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 375 [WARNING|trainer.py:803] 2025-04-26 18:45:31,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 405 350 [WARNING|trainer.py:803] 2025-04-26 18:45:32,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:32,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 376 [WARNING|trainer.py:803] 2025-04-26 18:45:32,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 351 406 [WARNING|trainer.py:803] 2025-04-26 18:45:33,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:34,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:34,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 377 352 407 [WARNING|trainer.py:803] 2025-04-26 18:45:35,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:35,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:35,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 378 353 408 [WARNING|trainer.py:803] 2025-04-26 18:45:36,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:36,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 354 [WARNING|trainer.py:803] 2025-04-26 18:45:36,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 379 409 [WARNING|trainer.py:803] 2025-04-26 18:45:37,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:37,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 355 [WARNING|trainer.py:803] 2025-04-26 18:45:38,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 380 410 [WARNING|trainer.py:803] 2025-04-26 18:45:38,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 356 [WARNING|trainer.py:803] 2025-04-26 18:45:39,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:39,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 381 [WARNING|trainer.py:803] 2025-04-26 18:45:39,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 411 357 [WARNING|trainer.py:803] 2025-04-26 18:45:40,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:40,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 382 [WARNING|trainer.py:803] 2025-04-26 18:45:41,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 358 412 [WARNING|trainer.py:803] 2025-04-26 18:45:41,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:42,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:42,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 383 359 413 [WARNING|trainer.py:803] 2025-04-26 18:45:43,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:43,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:43,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 360 384 414 [WARNING|trainer.py:803] 2025-04-26 18:45:44,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:44,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 361 [WARNING|trainer.py:803] 2025-04-26 18:45:44,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 385 415 [WARNING|trainer.py:803] 2025-04-26 18:45:45,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 362 [WARNING|trainer.py:803] 2025-04-26 18:45:46,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:46,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 386 [WARNING|trainer.py:803] 2025-04-26 18:45:46,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 416 363 [WARNING|trainer.py:803] 2025-04-26 18:45:47,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:47,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:47,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 387 364 417 [WARNING|trainer.py:803] 2025-04-26 18:45:48,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:48,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:49,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 388 365 418 [WARNING|trainer.py:803] 2025-04-26 18:45:50,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:50,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 366 [WARNING|trainer.py:803] 2025-04-26 18:45:50,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 389 419 [WARNING|trainer.py:803] 2025-04-26 18:45:51,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:51,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 367 [WARNING|trainer.py:803] 2025-04-26 18:45:51,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 390 [WARNING|trainer.py:803] 2025-04-26 18:45:52,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 420 368 [WARNING|trainer.py:803] 2025-04-26 18:45:52,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 391 [WARNING|trainer.py:803] 2025-04-26 18:45:53,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:45:53,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 369 421 [WARNING|trainer.py:803] 2025-04-26 18:45:54,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:54,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:45:54,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 392 370 422 [WARNING|trainer.py:803] 2025-04-26 18:45:55,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:45:55,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 371 [WARNING|trainer.py:803] 2025-04-26 18:45:56,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 393 423 [WARNING|trainer.py:803] 2025-04-26 18:45:56,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:45:56,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 372 [WARNING|trainer.py:803] 2025-04-26 18:45:57,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 394 424 [WARNING|trainer.py:803] 2025-04-26 18:45:58,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:45:58,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 373 [WARNING|trainer.py:803] 2025-04-26 18:45:58,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 395 [WARNING|trainer.py:803] 2025-04-26 18:45:59,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 425 374 [WARNING|trainer.py:803] 2025-04-26 18:45:59,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 396 [WARNING|trainer.py:803] 2025-04-26 18:46:00,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:00,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 375 426 [WARNING|trainer.py:803] 2025-04-26 18:46:01,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:01,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 397 [WARNING|trainer.py:803] 2025-04-26 18:46:01,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 376 427 [WARNING|trainer.py:803] 2025-04-26 18:46:02,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:02,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 377 398 [WARNING|trainer.py:803] 2025-04-26 18:46:03,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 428 [WARNING|trainer.py:803] 2025-04-26 18:46:03,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:03,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 378 399 [WARNING|trainer.py:803] 2025-04-26 18:46:04,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:04,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 429 379 [WARNING|trainer.py:803] 2025-04-26 18:46:05,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:05,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 400 [WARNING|trainer.py:803] 2025-04-26 18:46:05,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 430 380 [WARNING|trainer.py:803] 2025-04-26 18:46:06,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:07,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 401 [WARNING|trainer.py:803] 2025-04-26 18:46:07,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 381 431 [WARNING|trainer.py:803] 2025-04-26 18:46:08,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:08,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:08,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 402 382 432 [WARNING|trainer.py:803] 2025-04-26 18:46:09,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:09,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:09,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 383 403 433 [WARNING|trainer.py:803] 2025-04-26 18:46:10,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:10,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:46:11,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 384 404 434 [WARNING|trainer.py:803] 2025-04-26 18:46:11,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:46:12,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 385 [WARNING|trainer.py:803] 2025-04-26 18:46:12,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 405 435 [WARNING|trainer.py:803] 2025-04-26 18:46:13,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:13,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 386 [WARNING|trainer.py:803] 2025-04-26 18:46:13,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 406 [WARNING|trainer.py:803] 2025-04-26 18:46:14,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 436 387 [WARNING|trainer.py:803] 2025-04-26 18:46:14,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:15,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 407 [WARNING|trainer.py:803] 2025-04-26 18:46:15,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 437 388 [WARNING|trainer.py:803] 2025-04-26 18:46:16,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:46:16,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:16,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 408 389 438 [WARNING|trainer.py:803] 2025-04-26 18:46:17,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:17,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:17,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 390 409 439 [WARNING|trainer.py:803] 2025-04-26 18:46:18,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:19,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 391 [WARNING|trainer.py:803] 2025-04-26 18:46:19,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 410 440 [WARNING|trainer.py:803] 2025-04-26 18:46:20,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:20,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 392 [WARNING|trainer.py:803] 2025-04-26 18:46:20,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 411 441 [WARNING|trainer.py:803] 2025-04-26 18:46:21,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 393 [WARNING|trainer.py:803] 2025-04-26 18:46:21,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:21,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 412 [WARNING|trainer.py:803] 2025-04-26 18:46:22,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 442 394 [WARNING|trainer.py:803] 2025-04-26 18:46:23,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:46:23,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:23,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 413 395 443 [WARNING|trainer.py:803] 2025-04-26 18:46:24,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:24,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:24,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 414 396 444 [WARNING|trainer.py:803] 2025-04-26 18:46:25,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:25,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 397 [WARNING|trainer.py:803] 2025-04-26 18:46:26,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 415 445 [WARNING|trainer.py:803] 2025-04-26 18:46:27,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:27,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 398 [WARNING|trainer.py:803] 2025-04-26 18:46:27,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 416 446 [WARNING|trainer.py:803] 2025-04-26 18:46:28,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 399 [WARNING|trainer.py:803] 2025-04-26 18:46:28,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:28,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 417 [WARNING|trainer.py:803] 2025-04-26 18:46:29,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 447 400 [WARNING|trainer.py:803] 2025-04-26 18:46:30,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:30,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:30,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 418 401 448 [WARNING|trainer.py:803] 2025-04-26 18:46:31,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:31,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:31,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 419 402 449 [WARNING|trainer.py:803] 2025-04-26 18:46:32,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:32,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:33,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 403 420 450 [WARNING|trainer.py:803] 2025-04-26 18:46:34,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:46:34,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 404 [WARNING|trainer.py:803] 2025-04-26 18:46:34,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 421 451 [WARNING|trainer.py:803] 2025-04-26 18:46:35,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 405 [WARNING|trainer.py:803] 2025-04-26 18:46:35,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:35,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 422 452 [WARNING|trainer.py:803] 2025-04-26 18:46:36,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 406 [WARNING|trainer.py:803] 2025-04-26 18:46:36,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:37,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 423 [WARNING|trainer.py:803] 2025-04-26 18:46:37,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 453 407 [WARNING|trainer.py:803] 2025-04-26 18:46:38,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:38,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:38,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 424 454 408 [WARNING|trainer.py:803] 2025-04-26 18:46:39,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:39,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:39,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 409 425 455 [WARNING|trainer.py:803] 2025-04-26 18:46:41,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:41,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:41,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 410 426 456 [WARNING|trainer.py:803] 2025-04-26 18:46:42,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 411 [WARNING|trainer.py:803] 2025-04-26 18:46:42,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:42,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 457 427 [WARNING|trainer.py:803] 2025-04-26 18:46:43,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 412 [WARNING|trainer.py:803] 2025-04-26 18:46:43,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:46:43,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:44,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 428 458 413 [WARNING|trainer.py:803] 2025-04-26 18:46:45,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:45,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:46:45,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 459 429 414 [WARNING|trainer.py:803] 2025-04-26 18:46:46,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:46,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:46,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 415 430 460 [WARNING|trainer.py:803] 2025-04-26 18:46:47,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:48,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:48,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 416 431 461 [WARNING|trainer.py:803] 2025-04-26 18:46:49,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:49,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 417 [WARNING|trainer.py:803] 2025-04-26 18:46:49,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 432 462 [WARNING|trainer.py:803] 2025-04-26 18:46:50,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 418 [WARNING|trainer.py:803] 2025-04-26 18:46:50,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:50,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 433 463 [WARNING|trainer.py:803] 2025-04-26 18:46:51,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 419 [WARNING|trainer.py:803] 2025-04-26 18:46:52,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:52,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 434 [WARNING|trainer.py:803] 2025-04-26 18:46:52,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 464 420 [WARNING|trainer.py:803] 2025-04-26 18:46:53,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:53,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:46:53,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 435 465 421 [WARNING|trainer.py:803] 2025-04-26 18:46:54,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:55,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:46:55,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 436 422 466 [WARNING|trainer.py:803] 2025-04-26 18:46:56,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:56,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:56,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 423 437 467 [WARNING|trainer.py:803] 2025-04-26 18:46:57,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:57,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:46:57,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 424 438 468 [WARNING|trainer.py:803] 2025-04-26 18:46:58,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:46:59,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 425 [WARNING|trainer.py:803] 2025-04-26 18:46:59,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 439 469 [WARNING|trainer.py:803] 2025-04-26 18:47:00,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 426 [WARNING|trainer.py:803] 2025-04-26 18:47:00,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:00,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 440 470 [WARNING|trainer.py:803] 2025-04-26 18:47:01,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 427 [WARNING|trainer.py:803] 2025-04-26 18:47:01,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:01,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 441 [WARNING|trainer.py:803] 2025-04-26 18:47:02,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 471 428 [WARNING|trainer.py:803] 2025-04-26 18:47:03,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:03,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:03,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 442 472 429 [WARNING|trainer.py:803] 2025-04-26 18:47:04,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:04,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:04,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 430 443 473 [WARNING|trainer.py:803] 2025-04-26 18:47:05,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:05,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:06,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 431 474 444 [WARNING|trainer.py:803] 2025-04-26 18:47:07,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:07,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:07,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 432 475 445 [WARNING|trainer.py:803] 2025-04-26 18:47:08,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 433 [WARNING|trainer.py:803] 2025-04-26 18:47:08,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:08,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 446 476 [WARNING|trainer.py:803] 2025-04-26 18:47:09,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 434 [WARNING|trainer.py:803] 2025-04-26 18:47:10,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:10,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:10,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 477 447 435 [WARNING|trainer.py:803] 2025-04-26 18:47:11,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:11,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:11,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 478 448 436 [WARNING|trainer.py:803] 2025-04-26 18:47:12,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:12,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:13,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 437 479 449 [WARNING|trainer.py:803] 2025-04-26 18:47:14,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:14,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:14,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 438 480 450 [WARNING|trainer.py:803] 2025-04-26 18:47:15,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 439 [WARNING|trainer.py:803] 2025-04-26 18:47:15,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:15,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 481 451 [WARNING|trainer.py:803] 2025-04-26 18:47:16,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 440 [WARNING|trainer.py:803] 2025-04-26 18:47:16,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:17,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 482 [WARNING|trainer.py:803] 2025-04-26 18:47:17,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 452 441 [WARNING|trainer.py:803] 2025-04-26 18:47:18,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:18,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:18,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 483 453 442 [WARNING|trainer.py:803] 2025-04-26 18:47:19,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:19,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:19,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 484 443 454 [WARNING|trainer.py:803] 2025-04-26 18:47:21,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:21,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:21,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 444 485 455 [WARNING|trainer.py:803] 2025-04-26 18:47:22,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:22,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:22,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 445 486 456 [WARNING|trainer.py:803] 2025-04-26 18:47:23,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 446 [WARNING|trainer.py:803] 2025-04-26 18:47:23,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:23,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 487 457 [WARNING|trainer.py:803] 2025-04-26 18:47:24,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 447 [WARNING|trainer.py:803] 2025-04-26 18:47:25,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:25,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:25,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 488 458 448 [WARNING|trainer.py:803] 2025-04-26 18:47:26,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:26,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:26,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 489 449 459 [WARNING|trainer.py:803] 2025-04-26 18:47:27,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:28,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:28,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 450 490 460 [WARNING|trainer.py:803] 2025-04-26 18:47:29,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:29,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 451 [WARNING|trainer.py:803] 2025-04-26 18:47:29,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 491 461 [WARNING|trainer.py:803] 2025-04-26 18:47:30,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 452 [WARNING|trainer.py:803] 2025-04-26 18:47:30,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:31,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 492 [WARNING|trainer.py:803] 2025-04-26 18:47:31,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 462 453 [WARNING|trainer.py:803] 2025-04-26 18:47:32,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:32,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:32,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 493 454 463 [WARNING|trainer.py:803] 2025-04-26 18:47:33,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:33,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:33,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 494 455 464 [WARNING|trainer.py:803] 2025-04-26 18:47:34,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:35,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 456 [WARNING|trainer.py:803] 2025-04-26 18:47:35,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 495 465 [WARNING|trainer.py:803] 2025-04-26 18:47:36,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:36,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 457 [WARNING|trainer.py:803] 2025-04-26 18:47:36,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 496 466 [WARNING|trainer.py:803] 2025-04-26 18:47:37,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:37,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 458 [WARNING|trainer.py:803] 2025-04-26 18:47:38,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 497 [WARNING|trainer.py:803] 2025-04-26 18:47:38,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 467 459 [WARNING|trainer.py:803] 2025-04-26 18:47:39,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:39,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 498 [WARNING|trainer.py:803] 2025-04-26 18:47:39,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 468 460 [WARNING|trainer.py:803] 2025-04-26 18:47:40,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:40,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 499 [WARNING|trainer.py:803] 2025-04-26 18:47:40,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 461 469 [WARNING|trainer.py:803] 2025-04-26 18:47:41,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:42,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:42,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 500 462 470 [WARNING|trainer.py:803] 2025-04-26 18:47:43,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:43,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:47:43,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 463 501 471 [WARNING|trainer.py:803] 2025-04-26 18:47:44,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:44,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 464 [WARNING|trainer.py:803] 2025-04-26 18:47:44,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 502 472 [WARNING|trainer.py:803] 2025-04-26 18:47:45,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:45,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 465 [WARNING|trainer.py:803] 2025-04-26 18:47:46,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 503 473 [WARNING|trainer.py:803] 2025-04-26 18:47:46,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:47,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 466 [WARNING|trainer.py:803] 2025-04-26 18:47:47,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 504 474 [WARNING|trainer.py:803] 2025-04-26 18:47:47,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 467 [WARNING|trainer.py:803] 2025-04-26 18:47:48,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:48,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 505 [WARNING|trainer.py:803] 2025-04-26 18:47:49,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 475 468 [WARNING|trainer.py:803] 2025-04-26 18:47:50,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:50,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:50,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 506 469 476 [WARNING|trainer.py:803] 2025-04-26 18:47:51,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:51,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:51,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 470 507 477 [WARNING|trainer.py:803] 2025-04-26 18:47:52,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:52,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:52,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 471 508 478 [WARNING|trainer.py:803] 2025-04-26 18:47:53,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 472 [WARNING|trainer.py:803] 2025-04-26 18:47:54,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:54,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 509 479 [WARNING|trainer.py:803] 2025-04-26 18:47:54,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 473 [WARNING|trainer.py:803] 2025-04-26 18:47:55,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:55,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 510 [WARNING|trainer.py:803] 2025-04-26 18:47:55,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 480 474 [WARNING|trainer.py:803] 2025-04-26 18:47:56,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:56,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:47:57,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 511 475 481 [WARNING|trainer.py:803] 2025-04-26 18:47:58,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:47:58,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:47:58,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 476 512 482 [WARNING|trainer.py:803] 2025-04-26 18:47:59,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:47:59,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 477 [WARNING|trainer.py:803] 2025-04-26 18:47:59,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 513 483 [WARNING|trainer.py:803] 2025-04-26 18:48:00,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:00,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 478 [WARNING|trainer.py:803] 2025-04-26 18:48:01,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 514 484 [WARNING|trainer.py:803] 2025-04-26 18:48:01,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 479 [WARNING|trainer.py:803] 2025-04-26 18:48:02,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:02,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 515 [WARNING|trainer.py:803] 2025-04-26 18:48:02,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 485 480 [WARNING|trainer.py:803] 2025-04-26 18:48:03,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 516 [WARNING|trainer.py:803] 2025-04-26 18:48:03,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:04,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 481 486 [WARNING|trainer.py:803] 2025-04-26 18:48:04,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 517 [WARNING|trainer.py:803] 2025-04-26 18:48:05,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:05,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 482 487 [WARNING|trainer.py:803] 2025-04-26 18:48:05,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:06,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 518 [WARNING|trainer.py:803] 2025-04-26 18:48:06,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 483 488 [WARNING|trainer.py:803] 2025-04-26 18:48:07,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:07,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 519 484 [WARNING|trainer.py:803] 2025-04-26 18:48:07,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 489 [WARNING|trainer.py:803] 2025-04-26 18:48:08,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:48:08,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 485 520 [WARNING|trainer.py:803] 2025-04-26 18:48:09,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 490 [WARNING|trainer.py:803] 2025-04-26 18:48:09,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:48:09,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 486 521 [WARNING|trainer.py:803] 2025-04-26 18:48:10,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:10,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:11,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 491 487 522 [WARNING|trainer.py:803] 2025-04-26 18:48:12,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:12,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:12,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 488 492 523 [WARNING|trainer.py:803] 2025-04-26 18:48:13,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:13,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 489 [WARNING|trainer.py:803] 2025-04-26 18:48:13,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 493 524 [WARNING|trainer.py:803] 2025-04-26 18:48:14,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 490 [WARNING|trainer.py:803] 2025-04-26 18:48:15,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:15,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 494 525 [WARNING|trainer.py:803] 2025-04-26 18:48:15,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 491 [WARNING|trainer.py:803] 2025-04-26 18:48:16,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:16,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 495 [WARNING|trainer.py:803] 2025-04-26 18:48:16,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 526 492 [WARNING|trainer.py:803] 2025-04-26 18:48:17,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:17,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:18,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 527 496 493 [WARNING|trainer.py:803] 2025-04-26 18:48:19,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:19,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:19,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 528 494 497 [WARNING|trainer.py:803] 2025-04-26 18:48:20,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:20,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:20,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 529 495 498 [WARNING|trainer.py:803] 2025-04-26 18:48:21,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:21,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:22,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 496 530 499 [WARNING|trainer.py:803] 2025-04-26 18:48:22,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:23,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 497 [WARNING|trainer.py:803] 2025-04-26 18:48:23,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 531 500 [WARNING|trainer.py:803] 2025-04-26 18:48:24,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:24,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 498 [WARNING|trainer.py:803] 2025-04-26 18:48:24,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 532 501 [WARNING|trainer.py:803] 2025-04-26 18:48:25,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 499 [WARNING|trainer.py:803] 2025-04-26 18:48:25,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:26,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 533 [WARNING|trainer.py:803] 2025-04-26 18:48:26,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 502 500 [WARNING|trainer.py:803] 2025-04-26 18:48:26,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:27,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 534 [WARNING|trainer.py:803] 2025-04-26 18:48:27,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 501 503 [WARNING|trainer.py:803] 2025-04-26 18:48:28,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:28,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:28,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 535 502 504 [WARNING|trainer.py:803] 2025-04-26 18:48:29,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:29,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 503 536 [WARNING|trainer.py:803] 2025-04-26 18:48:30,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 505 [WARNING|trainer.py:803] 2025-04-26 18:48:30,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:30,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 504 [WARNING|trainer.py:803] 2025-04-26 18:48:31,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 537 506 [WARNING|trainer.py:803] 2025-04-26 18:48:32,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:32,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 505 [WARNING|trainer.py:803] 2025-04-26 18:48:32,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 538 507 [WARNING|trainer.py:803] 2025-04-26 18:48:33,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 506 [WARNING|trainer.py:803] 2025-04-26 18:48:33,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 539 [WARNING|trainer.py:803] 2025-04-26 18:48:34,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:34,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 508 507 [WARNING|trainer.py:803] 2025-04-26 18:48:34,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 540 [WARNING|trainer.py:803] 2025-04-26 18:48:35,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:35,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 508 509 [WARNING|trainer.py:803] 2025-04-26 18:48:36,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:36,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 541 [WARNING|trainer.py:803] 2025-04-26 18:48:36,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 509 510 [WARNING|trainer.py:803] 2025-04-26 18:48:37,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:37,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 542 510 [WARNING|trainer.py:803] 2025-04-26 18:48:38,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 511 [WARNING|trainer.py:803] 2025-04-26 18:48:38,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:38,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 511 543 [WARNING|trainer.py:803] 2025-04-26 18:48:39,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 512 [WARNING|trainer.py:803] 2025-04-26 18:48:39,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:40,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 512 544 [WARNING|trainer.py:803] 2025-04-26 18:48:40,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:41,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 513 513 [WARNING|trainer.py:803] 2025-04-26 18:48:41,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 545 [WARNING|trainer.py:803] 2025-04-26 18:48:42,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:42,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 514 514 [WARNING|trainer.py:803] 2025-04-26 18:48:42,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 546 [WARNING|trainer.py:803] 2025-04-26 18:48:43,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:48:43,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 515 515 [WARNING|trainer.py:803] 2025-04-26 18:48:44,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:44,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 547 516 [WARNING|trainer.py:803] 2025-04-26 18:48:44,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 516 [WARNING|trainer.py:803] 2025-04-26 18:48:45,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:48:45,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 517 548 [WARNING|trainer.py:803] 2025-04-26 18:48:46,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 517 [WARNING|trainer.py:803] 2025-04-26 18:48:46,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:46,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 518 549 [WARNING|trainer.py:803] 2025-04-26 18:48:47,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:47,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 518 [WARNING|trainer.py:803] 2025-04-26 18:48:48,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 519 550 [WARNING|trainer.py:803] 2025-04-26 18:48:48,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:48,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 520 519 [WARNING|trainer.py:803] 2025-04-26 18:48:49,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 551 [WARNING|trainer.py:803] 2025-04-26 18:48:50,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:50,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 521 [WARNING|trainer.py:803] 2025-04-26 18:48:50,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 520 [WARNING|trainer.py:803] 2025-04-26 18:48:51,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 552 522 [WARNING|trainer.py:803] 2025-04-26 18:48:51,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 521 [WARNING|trainer.py:803] 2025-04-26 18:48:51,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:52,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 553 523 [WARNING|trainer.py:803] 2025-04-26 18:48:52,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 522 [WARNING|trainer.py:803] 2025-04-26 18:48:53,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:53,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 524 554 [WARNING|trainer.py:803] 2025-04-26 18:48:54,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:54,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:54,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 523 525 555 [WARNING|trainer.py:803] 2025-04-26 18:48:55,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:55,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:55,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 526 524 556 [WARNING|trainer.py:803] 2025-04-26 18:48:56,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:56,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 527 [WARNING|trainer.py:803] 2025-04-26 18:48:57,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 525 557 [WARNING|trainer.py:803] 2025-04-26 18:48:57,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:48:58,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 528 [WARNING|trainer.py:803] 2025-04-26 18:48:58,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 526 [WARNING|trainer.py:803] 2025-04-26 18:48:58,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 558 529 [WARNING|trainer.py:803] 2025-04-26 18:48:59,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:48:59,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 527 [WARNING|trainer.py:803] 2025-04-26 18:49:00,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 530 559 [WARNING|trainer.py:803] 2025-04-26 18:49:00,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:01,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:01,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 528 531 560 [WARNING|trainer.py:803] 2025-04-26 18:49:02,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:02,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 532 [WARNING|trainer.py:803] 2025-04-26 18:49:02,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 529 561 [WARNING|trainer.py:803] 2025-04-26 18:49:03,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:03,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 533 [WARNING|trainer.py:803] 2025-04-26 18:49:03,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 530 562 [WARNING|trainer.py:803] 2025-04-26 18:49:04,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 534 [WARNING|trainer.py:803] 2025-04-26 18:49:04,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:05,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 531 [WARNING|trainer.py:803] 2025-04-26 18:49:05,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 563 535 [WARNING|trainer.py:803] 2025-04-26 18:49:06,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:06,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:49:06,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 532 536 564 [WARNING|trainer.py:803] 2025-04-26 18:49:07,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:07,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:07,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 533 537 565 [WARNING|trainer.py:803] 2025-04-26 18:49:08,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:08,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 538 [WARNING|trainer.py:803] 2025-04-26 18:49:09,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 534 566 [WARNING|trainer.py:803] 2025-04-26 18:49:10,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:49:10,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 539 [WARNING|trainer.py:803] 2025-04-26 18:49:10,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 535 567 [WARNING|trainer.py:803] 2025-04-26 18:49:11,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 540 [WARNING|trainer.py:803] 2025-04-26 18:49:11,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:11,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 536 [WARNING|trainer.py:803] 2025-04-26 18:49:12,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 568 541 [WARNING|trainer.py:803] 2025-04-26 18:49:12,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:13,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 537 [WARNING|trainer.py:803] 2025-04-26 18:49:13,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 542 569 [WARNING|trainer.py:803] 2025-04-26 18:49:14,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:14,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:14,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 538 543 570 [WARNING|trainer.py:803] 2025-04-26 18:49:15,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:49:15,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 544 [WARNING|trainer.py:803] 2025-04-26 18:49:15,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 539 571 [WARNING|trainer.py:803] 2025-04-26 18:49:16,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:16,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 545 [WARNING|trainer.py:803] 2025-04-26 18:49:17,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 540 [WARNING|trainer.py:803] 2025-04-26 18:49:17,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 572 546 [WARNING|trainer.py:803] 2025-04-26 18:49:18,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:18,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 541 [WARNING|trainer.py:803] 2025-04-26 18:49:18,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 547 573 [WARNING|trainer.py:803] 2025-04-26 18:49:19,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:19,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:19,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 542 548 574 [WARNING|trainer.py:803] 2025-04-26 18:49:20,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:21,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 549 [WARNING|trainer.py:803] 2025-04-26 18:49:21,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 543 575 [WARNING|trainer.py:803] 2025-04-26 18:49:22,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:22,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 550 544 [WARNING|trainer.py:803] 2025-04-26 18:49:22,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:23,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 576 551 [WARNING|trainer.py:803] 2025-04-26 18:49:23,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 545 [WARNING|trainer.py:803] 2025-04-26 18:49:24,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:24,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 552 577 [WARNING|trainer.py:803] 2025-04-26 18:49:24,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 546 [WARNING|trainer.py:803] 2025-04-26 18:49:25,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:25,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 553 578 [WARNING|trainer.py:803] 2025-04-26 18:49:26,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:26,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 554 [WARNING|trainer.py:803] 2025-04-26 18:49:26,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 547 579 [WARNING|trainer.py:803] 2025-04-26 18:49:27,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:27,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 555 [WARNING|trainer.py:803] 2025-04-26 18:49:28,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 548 580 [WARNING|trainer.py:803] 2025-04-26 18:49:28,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:28,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 556 549 [WARNING|trainer.py:803] 2025-04-26 18:49:29,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:29,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 581 557 [WARNING|trainer.py:803] 2025-04-26 18:49:30,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 550 [WARNING|trainer.py:803] 2025-04-26 18:49:30,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:49:30,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 558 582 [WARNING|trainer.py:803] 2025-04-26 18:49:31,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 551 [WARNING|trainer.py:803] 2025-04-26 18:49:31,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:32,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 559 583 [WARNING|trainer.py:803] 2025-04-26 18:49:32,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:33,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 552 560 [WARNING|trainer.py:803] 2025-04-26 18:49:33,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 584 [WARNING|trainer.py:803] 2025-04-26 18:49:34,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:34,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 561 553 [WARNING|trainer.py:803] 2025-04-26 18:49:34,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 585 [WARNING|trainer.py:803] 2025-04-26 18:49:35,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:35,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 562 554 [WARNING|trainer.py:803] 2025-04-26 18:49:36,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:49:36,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 586 563 [WARNING|trainer.py:803] 2025-04-26 18:49:36,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 555 [WARNING|trainer.py:803] 2025-04-26 18:49:37,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:37,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 564 587 [WARNING|trainer.py:803] 2025-04-26 18:49:38,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 556 [WARNING|trainer.py:803] 2025-04-26 18:49:38,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 565 [WARNING|trainer.py:803] 2025-04-26 18:49:38,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 588 [WARNING|trainer.py:803] 2025-04-26 18:49:39,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:39,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 557 566 [WARNING|trainer.py:803] 2025-04-26 18:49:40,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 589 [WARNING|trainer.py:803] 2025-04-26 18:49:40,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:49:40,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 567 558 [WARNING|trainer.py:803] 2025-04-26 18:49:41,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:49:41,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 568 [WARNING|trainer.py:803] 2025-04-26 18:49:42,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 590 559 [WARNING|trainer.py:803] 2025-04-26 18:49:42,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:49:43,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 569 [WARNING|trainer.py:803] 2025-04-26 18:49:43,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 591 560 [WARNING|trainer.py:803] 2025-04-26 18:49:44,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 570 [WARNING|trainer.py:803] 2025-04-26 18:49:44,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:44,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 592 [WARNING|trainer.py:803] 2025-04-26 18:49:45,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 561 571 [WARNING|trainer.py:803] 2025-04-26 18:49:45,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:46,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 593 [WARNING|trainer.py:803] 2025-04-26 18:49:46,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 562 572 [WARNING|trainer.py:803] 2025-04-26 18:49:47,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:47,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 594 [WARNING|trainer.py:803] 2025-04-26 18:49:47,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 573 563 [WARNING|trainer.py:803] 2025-04-26 18:49:48,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:48,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:48,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 595 574 564 [WARNING|trainer.py:803] 2025-04-26 18:49:49,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:49,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:50,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 575 596 565 [WARNING|trainer.py:803] 2025-04-26 18:49:51,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:51,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 576 [WARNING|trainer.py:803] 2025-04-26 18:49:51,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 597 566 [WARNING|trainer.py:803] 2025-04-26 18:49:52,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 577 [WARNING|trainer.py:803] 2025-04-26 18:49:52,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:52,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 598 [WARNING|trainer.py:803] 2025-04-26 18:49:53,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 567 578 [WARNING|trainer.py:803] 2025-04-26 18:49:53,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:54,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 599 [WARNING|trainer.py:803] 2025-04-26 18:49:54,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 568 579 [WARNING|trainer.py:803] 2025-04-26 18:49:55,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:55,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:49:55,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 600 580 569 [WARNING|trainer.py:803] 2025-04-26 18:49:56,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:56,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:56,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 581 570 [WARNING|trainer.py:803] 2025-04-26 18:49:57,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 601 582 [WARNING|trainer.py:803] 2025-04-26 18:49:58,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:49:58,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 571 [WARNING|trainer.py:803] 2025-04-26 18:49:58,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 583 [WARNING|trainer.py:803] 2025-04-26 18:49:59,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:49:59,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 602 572 584 [WARNING|trainer.py:803] 2025-04-26 18:50:00,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:00,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:01,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 585 573 [WARNING|trainer.py:803] 2025-04-26 18:50:02,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 603 [WARNING|trainer.py:803] 2025-04-26 18:50:02,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 586 574 [WARNING|trainer.py:803] 2025-04-26 18:50:03,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:03,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:03,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 587 575 604 [WARNING|trainer.py:803] 2025-04-26 18:50:04,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 588 [WARNING|trainer.py:803] 2025-04-26 18:50:05,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:05,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 576 [WARNING|trainer.py:803] 2025-04-26 18:50:05,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 589 [WARNING|trainer.py:803] 2025-04-26 18:50:06,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 605 577 [WARNING|trainer.py:803] 2025-04-26 18:50:06,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 590 [WARNING|trainer.py:803] 2025-04-26 18:50:07,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:07,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 578 [WARNING|trainer.py:803] 2025-04-26 18:50:08,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 591 606 [WARNING|trainer.py:803] 2025-04-26 18:50:09,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:09,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 579 [WARNING|trainer.py:803] 2025-04-26 18:50:09,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 592 [WARNING|trainer.py:803] 2025-04-26 18:50:10,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:10,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 580 593 607 [WARNING|trainer.py:803] 2025-04-26 18:50:11,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:11,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 594 [WARNING|trainer.py:803] 2025-04-26 18:50:12,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 581 [WARNING|trainer.py:803] 2025-04-26 18:50:12,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:13,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 595 608 582 [WARNING|trainer.py:803] 2025-04-26 18:50:14,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:14,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 596 [WARNING|trainer.py:803] 2025-04-26 18:50:14,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 583 [WARNING|trainer.py:803] 2025-04-26 18:50:15,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 597 [WARNING|trainer.py:803] 2025-04-26 18:50:15,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 609 584 [WARNING|trainer.py:803] 2025-04-26 18:50:16,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 598 [WARNING|trainer.py:803] 2025-04-26 18:50:16,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:17,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:17,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 585 599 610 [WARNING|trainer.py:803] 2025-04-26 18:50:18,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:18,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:18,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 600 586 [WARNING|trainer.py:803] 2025-04-26 18:50:19,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:19,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 611 587 601 [WARNING|trainer.py:803] 2025-04-26 18:50:20,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:21,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:21,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 588 602 612 [WARNING|trainer.py:803] 2025-04-26 18:50:22,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 589 [WARNING|trainer.py:803] 2025-04-26 18:50:23,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:23,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:24,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 590 603 613 [WARNING|trainer.py:803] 2025-04-26 18:50:25,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:25,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:26,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 591 604 [WARNING|trainer.py:803] 2025-04-26 18:50:26,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 614 592 [WARNING|trainer.py:803] 2025-04-26 18:50:27,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:28,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:28,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 605 593 [WARNING|trainer.py:803] 2025-04-26 18:50:29,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 615 [WARNING|trainer.py:803] 2025-04-26 18:50:29,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 594 [WARNING|trainer.py:803] 2025-04-26 18:50:30,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 606 [WARNING|trainer.py:803] 2025-04-26 18:50:31,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:31,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 595 616 [WARNING|trainer.py:803] 2025-04-26 18:50:32,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:32,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 607 596 [WARNING|trainer.py:803] 2025-04-26 18:50:33,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:33,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 617 597 608 [WARNING|trainer.py:803] 2025-04-26 18:50:34,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:35,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:35,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 598 618 [WARNING|trainer.py:803] 2025-04-26 18:50:36,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 609 [WARNING|trainer.py:803] 2025-04-26 18:50:36,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 599 [WARNING|trainer.py:803] 2025-04-26 18:50:37,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:37,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 600 619 610 [WARNING|trainer.py:803] 2025-04-26 18:50:39,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:39,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:50:39,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 620 611 601 [WARNING|trainer.py:803] 2025-04-26 18:50:41,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:41,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:50:41,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 621 602 612 [WARNING|trainer.py:803] 2025-04-26 18:50:43,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:43,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:50:43,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 622 603 613 [WARNING|trainer.py:803] 2025-04-26 18:50:45,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:45,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:46,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 614 604 623 [WARNING|trainer.py:803] 2025-04-26 18:50:48,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:48,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:48,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 615 605 [WARNING|trainer.py:803] 2025-04-26 18:50:50,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 624 [WARNING|trainer.py:803] 2025-04-26 18:50:50,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:50,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 616 606 [WARNING|trainer.py:803] 2025-04-26 18:50:51,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 625 [WARNING|trainer.py:803] 2025-04-26 18:50:52,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 617 [WARNING|trainer.py:803] 2025-04-26 18:50:53,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:50:53,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 607 626 [WARNING|trainer.py:803] 2025-04-26 18:50:54,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 618 [WARNING|trainer.py:803] 2025-04-26 18:50:55,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:55,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 608 619 627 [WARNING|trainer.py:803] 2025-04-26 18:50:57,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:50:57,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:50:57,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 609 620 628 [WARNING|trainer.py:803] 2025-04-26 18:50:59,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:50:59,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:00,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 621 610 629 [WARNING|trainer.py:803] 2025-04-26 18:51:01,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:01,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:02,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 622 611 [WARNING|trainer.py:803] 2025-04-26 18:51:03,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 630 [WARNING|trainer.py:803] 2025-04-26 18:51:04,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:51:04,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 623 612 631 [WARNING|trainer.py:803] 2025-04-26 18:51:05,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:06,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:06,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 624 [WARNING|trainer.py:803] 2025-04-26 18:51:08,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 632 613 [WARNING|trainer.py:803] 2025-04-26 18:51:09,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:09,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 625 [WARNING|trainer.py:803] 2025-04-26 18:51:10,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 633 614 [WARNING|trainer.py:803] 2025-04-26 18:51:11,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:11,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 626 [WARNING|trainer.py:803] 2025-04-26 18:51:12,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 615 634 [WARNING|trainer.py:803] 2025-04-26 18:51:13,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 627 [WARNING|trainer.py:803] 2025-04-26 18:51:13,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:14,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 616 635 628 [WARNING|trainer.py:803] 2025-04-26 18:51:15,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:51:16,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:16,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 617 636 629 [WARNING|trainer.py:803] 2025-04-26 18:51:17,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:51:18,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:18,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 618 637 630 [WARNING|trainer.py:803] 2025-04-26 18:51:20,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:51:20,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:20,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 631 619 638 [WARNING|trainer.py:803] 2025-04-26 18:51:22,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:22,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:51:22,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 632 620 639 [WARNING|trainer.py:803] 2025-04-26 18:51:24,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:24,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:24,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 633 621 [WARNING|trainer.py:803] 2025-04-26 18:51:26,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 640 [WARNING|trainer.py:803] 2025-04-26 18:51:26,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:27,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 634 641 622 [WARNING|trainer.py:803] 2025-04-26 18:51:28,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:29,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:29,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 635 [WARNING|trainer.py:803] 2025-04-26 18:51:30,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 642 623 [WARNING|trainer.py:803] 2025-04-26 18:51:31,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 636 [WARNING|trainer.py:803] 2025-04-26 18:51:31,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:32,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 643 624 637 [WARNING|trainer.py:803] 2025-04-26 18:51:34,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:51:34,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:34,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 638 625 644 [WARNING|trainer.py:803] 2025-04-26 18:51:36,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:36,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:36,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 639 626 645 [WARNING|trainer.py:803] 2025-04-26 18:51:38,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:38,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:38,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 640 646 [WARNING|trainer.py:803] 2025-04-26 18:51:40,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 627 [WARNING|trainer.py:803] 2025-04-26 18:51:41,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:41,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 641 [WARNING|trainer.py:803] 2025-04-26 18:51:42,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 628 647 [WARNING|trainer.py:803] 2025-04-26 18:51:43,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 642 [WARNING|trainer.py:803] 2025-04-26 18:51:43,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:44,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 629 648 [WARNING|trainer.py:803] 2025-04-26 18:51:45,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 643 [WARNING|trainer.py:803] 2025-04-26 18:51:45,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:46,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 649 630 644 [WARNING|trainer.py:803] 2025-04-26 18:51:47,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:48,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:48,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 631 650 645 [WARNING|trainer.py:803] 2025-04-26 18:51:50,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:50,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:50,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 632 651 646 [WARNING|trainer.py:803] 2025-04-26 18:51:52,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:52,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:51:52,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 633 652 647 [WARNING|trainer.py:803] 2025-04-26 18:51:54,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:54,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:54,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 648 653 634 [WARNING|trainer.py:803] 2025-04-26 18:51:56,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:51:56,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:57,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 649 635 654 [WARNING|trainer.py:803] 2025-04-26 18:51:58,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:51:59,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:51:59,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 650 636 655 [WARNING|trainer.py:803] 2025-04-26 18:52:01,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:01,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:01,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 651 [WARNING|trainer.py:803] 2025-04-26 18:52:02,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 637 656 [WARNING|trainer.py:803] 2025-04-26 18:52:03,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:03,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 652 [WARNING|trainer.py:803] 2025-04-26 18:52:04,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 638 657 [WARNING|trainer.py:803] 2025-04-26 18:52:06,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 653 [WARNING|trainer.py:803] 2025-04-26 18:52:06,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:06,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 658 639 [WARNING|trainer.py:803] 2025-04-26 18:52:08,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 654 [WARNING|trainer.py:803] 2025-04-26 18:52:08,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:09,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 640 659 655 [WARNING|trainer.py:803] 2025-04-26 18:52:10,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:10,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:11,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 641 660 656 [WARNING|trainer.py:803] 2025-04-26 18:52:12,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:12,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:12,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 661 657 642 [WARNING|trainer.py:803] 2025-04-26 18:52:14,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:14,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:14,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 658 643 662 [WARNING|trainer.py:803] 2025-04-26 18:52:16,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:52:17,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:52:17,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 659 [WARNING|trainer.py:803] 2025-04-26 18:52:18,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 644 663 660 [WARNING|trainer.py:803] 2025-04-26 18:52:19,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:20,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:52:20,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 645 661 664 [WARNING|trainer.py:803] 2025-04-26 18:52:22,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:22,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:22,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 646 665 662 [WARNING|trainer.py:803] 2025-04-26 18:52:24,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:24,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:25,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 666 647 663 [WARNING|trainer.py:803] 2025-04-26 18:52:27,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:52:27,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:27,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 667 648 664 [WARNING|trainer.py:803] 2025-04-26 18:52:29,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:29,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:29,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 665 668 649 [WARNING|trainer.py:803] 2025-04-26 18:52:31,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:31,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:31,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 666 669 650 [WARNING|trainer.py:803] 2025-04-26 18:52:33,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:33,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:52:33,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 667 651 670 [WARNING|trainer.py:803] 2025-04-26 18:52:35,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:35,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:52:36,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 668 [WARNING|trainer.py:803] 2025-04-26 18:52:37,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 652 671 [WARNING|trainer.py:803] 2025-04-26 18:52:38,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 669 [WARNING|trainer.py:803] 2025-04-26 18:52:38,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:52:39,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 653 672 [WARNING|trainer.py:803] 2025-04-26 18:52:40,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 670 [WARNING|trainer.py:803] 2025-04-26 18:52:40,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:41,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 673 654 671 [WARNING|trainer.py:803] 2025-04-26 18:52:42,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:42,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:43,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 674 655 672 [WARNING|trainer.py:803] 2025-04-26 18:52:45,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:45,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:52:45,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 656 673 675 [WARNING|trainer.py:803] 2025-04-26 18:52:47,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:47,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:52:47,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 674 657 676 [WARNING|trainer.py:803] 2025-04-26 18:52:49,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:49,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:49,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 658 675 677 [WARNING|trainer.py:803] 2025-04-26 18:52:51,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:51,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:52,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 676 659 678 [WARNING|trainer.py:803] 2025-04-26 18:52:53,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:52:54,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:54,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 677 660 [WARNING|trainer.py:803] 2025-04-26 18:52:55,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 679 [WARNING|trainer.py:803] 2025-04-26 18:52:56,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:52:56,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 678 661 [WARNING|trainer.py:803] 2025-04-26 18:52:57,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 680 [WARNING|trainer.py:803] 2025-04-26 18:52:58,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 679 [WARNING|trainer.py:803] 2025-04-26 18:52:58,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:52:59,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 681 662 680 [WARNING|trainer.py:803] 2025-04-26 18:53:00,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:01,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:01,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 682 681 663 [WARNING|trainer.py:803] 2025-04-26 18:53:02,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:03,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:03,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 682 683 [WARNING|trainer.py:803] 2025-04-26 18:53:05,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 664 [WARNING|trainer.py:803] 2025-04-26 18:53:05,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:06,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 683 665 [WARNING|trainer.py:803] 2025-04-26 18:53:07,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 684 [WARNING|trainer.py:803] 2025-04-26 18:53:08,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:08,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 684 666 685 [WARNING|trainer.py:803] 2025-04-26 18:53:10,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:10,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:11,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 685 667 [WARNING|trainer.py:803] 2025-04-26 18:53:12,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:12,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 686 [WARNING|trainer.py:803] 2025-04-26 18:53:13,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 686 668 [WARNING|trainer.py:803] 2025-04-26 18:53:14,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:53:14,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 687 [WARNING|trainer.py:803] 2025-04-26 18:53:16,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 687 669 [WARNING|trainer.py:803] 2025-04-26 18:53:16,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:17,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 688 688 [WARNING|trainer.py:803] 2025-04-26 18:53:18,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 670 [WARNING|trainer.py:803] 2025-04-26 18:53:18,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:19,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 689 689 [WARNING|trainer.py:803] 2025-04-26 18:53:20,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:20,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 671 690 690 [WARNING|trainer.py:803] 2025-04-26 18:53:22,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:22,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:22,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 672 691 691 [WARNING|trainer.py:803] 2025-04-26 18:53:24,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:24,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:53:25,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 673 [WARNING|trainer.py:803] 2025-04-26 18:53:26,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 692 692 [WARNING|trainer.py:803] 2025-04-26 18:53:27,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 674 [WARNING|trainer.py:803] 2025-04-26 18:53:27,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 693 [WARNING|trainer.py:803] 2025-04-26 18:53:28,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 693 [WARNING|trainer.py:803] 2025-04-26 18:53:29,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:53:29,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 694 675 694 [WARNING|trainer.py:803] 2025-04-26 18:53:30,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:30,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:31,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 676 695 [WARNING|trainer.py:803] 2025-04-26 18:53:33,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:33,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 695 [WARNING|trainer.py:803] 2025-04-26 18:53:34,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 696 677 [WARNING|trainer.py:803] 2025-04-26 18:53:35,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:35,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 696 [WARNING|trainer.py:803] 2025-04-26 18:53:36,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 678 697 [WARNING|trainer.py:803] 2025-04-26 18:53:37,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:37,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 697 679 [WARNING|trainer.py:803] 2025-04-26 18:53:38,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 698 [WARNING|trainer.py:803] 2025-04-26 18:53:39,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:53:40,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 698 680 699 [WARNING|trainer.py:803] 2025-04-26 18:53:41,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:41,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:53:41,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 700 681 699 [WARNING|trainer.py:803] 2025-04-26 18:53:43,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:53:44,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:44,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 682 701 700 [WARNING|trainer.py:803] 2025-04-26 18:53:46,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:46,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:53:46,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 702 701 683 [WARNING|trainer.py:803] 2025-04-26 18:53:48,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:48,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 18:53:49,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 703 [WARNING|trainer.py:803] 2025-04-26 18:53:50,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 702 [WARNING|trainer.py:803] 2025-04-26 18:53:51,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 684 704 [WARNING|trainer.py:803] 2025-04-26 18:53:52,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:53:52,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 703 705 [WARNING|trainer.py:803] 2025-04-26 18:53:53,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 685 [WARNING|trainer.py:803] 2025-04-26 18:53:54,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:53:54,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 704 706 [WARNING|trainer.py:803] 2025-04-26 18:53:55,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 686 [WARNING|trainer.py:803] 2025-04-26 18:53:56,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 705 [WARNING|trainer.py:803] 2025-04-26 18:53:57,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 707 [WARNING|trainer.py:803] 2025-04-26 18:53:57,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 687 [WARNING|trainer.py:803] 2025-04-26 18:53:58,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:53:59,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 706 708 [WARNING|trainer.py:803] 2025-04-26 18:54:00,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 688 [WARNING|trainer.py:803] 2025-04-26 18:54:00,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:54:01,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 709 707 [WARNING|trainer.py:803] 2025-04-26 18:54:02,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:02,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 689 [WARNING|trainer.py:803] 2025-04-26 18:54:03,480 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. Yes 710 708 690 [WARNING|trainer.py:803] 2025-04-26 18:54:04,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:05,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:54:05,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 711 709 [WARNING|trainer.py:803] 2025-04-26 18:54:07,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:07,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 691 712 [WARNING|trainer.py:803] 2025-04-26 18:54:08,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:54:08,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 710 [WARNING|trainer.py:803] 2025-04-26 18:54:09,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 692 713 [WARNING|trainer.py:803] 2025-04-26 18:54:10,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 711 [WARNING|trainer.py:803] 2025-04-26 18:54:11,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 693 [WARNING|trainer.py:803] 2025-04-26 18:54:12,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 714 [WARNING|trainer.py:803] 2025-04-26 18:54:12,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 712 [WARNING|trainer.py:803] 2025-04-26 18:54:13,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 694 [WARNING|trainer.py:803] 2025-04-26 18:54:14,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:14,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 715 [WARNING|trainer.py:803] 2025-04-26 18:54:15,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 713 695 716 [WARNING|trainer.py:803] 2025-04-26 18:54:17,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:54:17,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:17,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 696 714 717 [WARNING|trainer.py:803] 2025-04-26 18:54:19,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:19,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:20,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 715 697 718 [WARNING|trainer.py:803] 2025-04-26 18:54:22,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:22,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:54:22,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 716 719 698 [WARNING|trainer.py:803] 2025-04-26 18:54:24,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:24,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:25,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 720 717 699 [WARNING|trainer.py:803] 2025-04-26 18:54:26,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:54:27,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:27,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 721 700 718 [WARNING|trainer.py:803] 2025-04-26 18:54:28,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:54:29,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:29,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 722 [WARNING|trainer.py:803] 2025-04-26 18:54:30,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 719 701 [WARNING|trainer.py:803] 2025-04-26 18:54:31,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:31,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 723 [WARNING|trainer.py:803] 2025-04-26 18:54:32,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 720 702 [WARNING|trainer.py:803] 2025-04-26 18:54:34,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:54:34,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 724 703 [WARNING|trainer.py:803] 2025-04-26 18:54:35,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 721 [WARNING|trainer.py:803] 2025-04-26 18:54:36,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 18:54:36,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 725 704 [WARNING|trainer.py:803] 2025-04-26 18:54:38,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 722 [WARNING|trainer.py:803] 2025-04-26 18:54:38,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:38,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 726 705 [WARNING|trainer.py:803] 2025-04-26 18:54:40,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 723 [WARNING|trainer.py:803] 2025-04-26 18:54:40,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:41,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 727 [WARNING|trainer.py:803] 2025-04-26 18:54:42,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 706 728 724 [WARNING|trainer.py:803] 2025-04-26 18:54:43,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:54:44,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:44,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 707 729 [WARNING|trainer.py:803] 2025-04-26 18:54:45,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 725 [WARNING|trainer.py:803] 2025-04-26 18:54:46,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:46,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 730 708 [WARNING|trainer.py:803] 2025-04-26 18:54:48,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:48,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 726 [WARNING|trainer.py:803] 2025-04-26 18:54:49,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 731 709 [WARNING|trainer.py:803] 2025-04-26 18:54:49,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:50,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 727 732 [WARNING|trainer.py:803] 2025-04-26 18:54:51,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:51,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 710 728 [WARNING|trainer.py:803] 2025-04-26 18:54:52,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 733 [WARNING|trainer.py:803] 2025-04-26 18:54:53,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:54:54,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 711 729 [WARNING|trainer.py:803] 2025-04-26 18:54:55,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 734 [WARNING|trainer.py:803] 2025-04-26 18:54:55,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:56,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 712 730 [WARNING|trainer.py:803] 2025-04-26 18:54:57,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 735 [WARNING|trainer.py:803] 2025-04-26 18:54:58,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:54:58,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 731 713 736 [WARNING|trainer.py:803] 2025-04-26 18:55:00,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:00,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:00,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 732 714 737 [WARNING|trainer.py:803] 2025-04-26 18:55:02,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:02,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:03,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 738 733 715 [WARNING|trainer.py:803] 2025-04-26 18:55:04,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:04,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:05,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 739 734 716 [WARNING|trainer.py:803] 2025-04-26 18:55:07,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:07,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:07,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 740 735 [WARNING|trainer.py:803] 2025-04-26 18:55:09,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 717 [WARNING|trainer.py:803] 2025-04-26 18:55:10,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:10,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 741 [WARNING|trainer.py:803] 2025-04-26 18:55:11,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 736 718 742 [WARNING|trainer.py:803] 2025-04-26 18:55:12,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:12,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:13,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 737 719 743 [WARNING|trainer.py:803] 2025-04-26 18:55:15,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:15,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:15,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 738 744 720 [WARNING|trainer.py:803] 2025-04-26 18:55:17,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:17,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:17,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 745 739 721 [WARNING|trainer.py:803] 2025-04-26 18:55:19,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:19,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:19,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 746 740 [WARNING|trainer.py:803] 2025-04-26 18:55:21,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 722 [WARNING|trainer.py:803] 2025-04-26 18:55:21,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:22,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 747 741 [WARNING|trainer.py:803] 2025-04-26 18:55:23,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 723 [WARNING|trainer.py:803] 2025-04-26 18:55:24,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:24,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 748 742 [WARNING|trainer.py:803] 2025-04-26 18:55:25,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:26,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 724 749 [WARNING|trainer.py:803] 2025-04-26 18:55:27,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:27,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 743 [WARNING|trainer.py:803] 2025-04-26 18:55:28,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 750 725 [WARNING|trainer.py:803] 2025-04-26 18:55:29,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:30,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 744 [WARNING|trainer.py:803] 2025-04-26 18:55:31,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 751 726 [WARNING|trainer.py:803] 2025-04-26 18:55:31,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:32,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 745 752 [WARNING|trainer.py:803] 2025-04-26 18:55:33,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:33,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 727 753 [WARNING|trainer.py:803] 2025-04-26 18:55:34,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 746 [WARNING|trainer.py:803] 2025-04-26 18:55:35,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:55:35,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 728 754 [WARNING|trainer.py:803] 2025-04-26 18:55:36,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 747 [WARNING|trainer.py:803] 2025-04-26 18:55:37,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:38,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 729 755 [WARNING|trainer.py:803] 2025-04-26 18:55:39,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:39,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 748 756 [WARNING|trainer.py:803] 2025-04-26 18:55:40,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 730 [WARNING|trainer.py:803] 2025-04-26 18:55:41,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:42,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 749 757 [WARNING|trainer.py:803] 2025-04-26 18:55:43,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 731 [WARNING|trainer.py:803] 2025-04-26 18:55:43,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:55:44,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 750 758 732 [WARNING|trainer.py:803] 2025-04-26 18:55:45,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:46,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:46,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 751 759 733 [WARNING|trainer.py:803] 2025-04-26 18:55:48,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:48,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:48,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 752 760 [WARNING|trainer.py:803] 2025-04-26 18:55:50,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 734 [WARNING|trainer.py:803] 2025-04-26 18:55:50,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:51,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 753 761 [WARNING|trainer.py:803] 2025-04-26 18:55:52,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:55:52,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 735 762 [WARNING|trainer.py:803] 2025-04-26 18:55:53,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 754 [WARNING|trainer.py:803] 2025-04-26 18:55:54,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:54,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 736 763 755 [WARNING|trainer.py:803] 2025-04-26 18:55:56,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:56,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:55:56,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 737 764 756 [WARNING|trainer.py:803] 2025-04-26 18:55:58,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:55:58,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:55:59,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 738 765 757 [WARNING|trainer.py:803] 2025-04-26 18:56:00,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:56:01,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:01,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 766 739 [WARNING|trainer.py:803] 2025-04-26 18:56:03,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 758 [WARNING|trainer.py:803] 2025-04-26 18:56:03,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:56:04,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 767 740 759 [WARNING|trainer.py:803] 2025-04-26 18:56:05,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:05,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:06,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 768 741 [WARNING|trainer.py:803] 2025-04-26 18:56:07,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:07,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 760 [WARNING|trainer.py:803] 2025-04-26 18:56:08,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 742 769 [WARNING|trainer.py:803] 2025-04-26 18:56:10,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:10,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 761 770 [WARNING|trainer.py:803] 2025-04-26 18:56:11,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 743 [WARNING|trainer.py:803] 2025-04-26 18:56:12,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:12,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 762 771 [WARNING|trainer.py:803] 2025-04-26 18:56:13,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:13,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 744 [WARNING|trainer.py:803] 2025-04-26 18:56:14,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 763 772 [WARNING|trainer.py:803] 2025-04-26 18:56:15,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:16,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 745 773 [WARNING|trainer.py:803] 2025-04-26 18:56:17,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 764 [WARNING|trainer.py:803] 2025-04-26 18:56:18,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:56:18,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 746 774 [WARNING|trainer.py:803] 2025-04-26 18:56:19,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 765 [WARNING|trainer.py:803] 2025-04-26 18:56:20,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:56:20,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 775 747 [WARNING|trainer.py:803] 2025-04-26 18:56:21,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:22,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 766 776 [WARNING|trainer.py:803] 2025-04-26 18:56:23,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 748 [WARNING|trainer.py:803] 2025-04-26 18:56:23,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:24,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 767 777 [WARNING|trainer.py:803] 2025-04-26 18:56:25,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:26,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 749 [WARNING|trainer.py:803] 2025-04-26 18:56:27,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 768 778 [WARNING|trainer.py:803] 2025-04-26 18:56:28,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:28,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 750 [WARNING|trainer.py:803] 2025-04-26 18:56:29,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 779 [WARNING|trainer.py:803] 2025-04-26 18:56:30,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 769 751 [WARNING|trainer.py:803] 2025-04-26 18:56:31,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 780 [WARNING|trainer.py:803] 2025-04-26 18:56:31,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 770 [WARNING|trainer.py:803] 2025-04-26 18:56:32,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 752 [WARNING|trainer.py:803] 2025-04-26 18:56:33,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 781 [WARNING|trainer.py:803] 2025-04-26 18:56:34,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 771 [WARNING|trainer.py:803] 2025-04-26 18:56:34,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:35,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 753 782 [WARNING|trainer.py:803] 2025-04-26 18:56:36,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:56:36,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 772 [WARNING|trainer.py:803] 2025-04-26 18:56:37,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 754 783 [WARNING|trainer.py:803] 2025-04-26 18:56:38,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:56:38,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 773 755 784 [WARNING|trainer.py:803] 2025-04-26 18:56:40,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:56:40,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:56:40,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 774 785 756 [WARNING|trainer.py:803] 2025-04-26 18:56:42,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:56:42,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:42,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 775 786 [WARNING|trainer.py:803] 2025-04-26 18:56:44,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 757 [WARNING|trainer.py:803] 2025-04-26 18:56:45,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:45,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 776 787 [WARNING|trainer.py:803] 2025-04-26 18:56:46,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:47,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 758 777 [WARNING|trainer.py:803] 2025-04-26 18:56:48,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:48,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 788 759 [WARNING|trainer.py:803] 2025-04-26 18:56:49,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:56:50,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 778 789 [WARNING|trainer.py:803] 2025-04-26 18:56:51,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:51,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 760 779 790 [WARNING|trainer.py:803] 2025-04-26 18:56:52,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:53,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:56:53,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 761 791 [WARNING|trainer.py:803] 2025-04-26 18:56:55,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 780 [WARNING|trainer.py:803] 2025-04-26 18:56:55,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:56:56,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 762 792 [WARNING|trainer.py:803] 2025-04-26 18:56:57,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:56:57,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 781 [WARNING|trainer.py:803] 2025-04-26 18:56:58,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 763 793 [WARNING|trainer.py:803] 2025-04-26 18:56:59,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 782 [WARNING|trainer.py:803] 2025-04-26 18:56:59,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:00,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 794 764 [WARNING|trainer.py:803] 2025-04-26 18:57:02,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 783 [WARNING|trainer.py:803] 2025-04-26 18:57:02,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 795 [WARNING|trainer.py:803] 2025-04-26 18:57:03,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:03,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 765 784 [WARNING|trainer.py:803] 2025-04-26 18:57:04,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 796 [WARNING|trainer.py:803] 2025-04-26 18:57:05,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:06,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 766 785 [WARNING|trainer.py:803] 2025-04-26 18:57:06,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 797 [WARNING|trainer.py:803] 2025-04-26 18:57:07,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:08,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 767 786 798 [WARNING|trainer.py:803] 2025-04-26 18:57:09,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:10,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:10,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 768 799 787 [WARNING|trainer.py:803] 2025-04-26 18:57:12,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:57:12,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:12,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 800 769 [WARNING|trainer.py:803] 2025-04-26 18:57:14,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 788 [WARNING|trainer.py:803] 2025-04-26 18:57:15,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:15,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 801 770 789 [WARNING|trainer.py:803] 2025-04-26 18:57:16,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:17,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:17,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 802 771 790 [WARNING|trainer.py:803] 2025-04-26 18:57:18,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:19,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:19,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 803 772 791 [WARNING|trainer.py:803] 2025-04-26 18:57:21,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:21,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:57:21,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 804 [WARNING|trainer.py:803] 2025-04-26 18:57:22,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 773 792 [WARNING|trainer.py:803] 2025-04-26 18:57:23,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 805 [WARNING|trainer.py:803] 2025-04-26 18:57:24,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:24,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 774 793 806 [WARNING|trainer.py:803] 2025-04-26 18:57:26,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:57:26,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:26,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 775 [WARNING|trainer.py:803] 2025-04-26 18:57:28,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 794 807 [WARNING|trainer.py:803] 2025-04-26 18:57:29,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:29,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 776 795 808 [WARNING|trainer.py:803] 2025-04-26 18:57:30,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:31,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:31,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 777 809 796 [WARNING|trainer.py:803] 2025-04-26 18:57:32,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:32,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:33,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 810 778 [WARNING|trainer.py:803] 2025-04-26 18:57:34,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 797 [WARNING|trainer.py:803] 2025-04-26 18:57:35,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:35,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 811 779 [WARNING|trainer.py:803] 2025-04-26 18:57:37,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 798 [WARNING|trainer.py:803] 2025-04-26 18:57:37,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 812 [WARNING|trainer.py:803] 2025-04-26 18:57:38,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:38,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 780 799 813 [WARNING|trainer.py:803] 2025-04-26 18:57:40,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:40,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 18:57:40,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 781 800 814 [WARNING|trainer.py:803] 2025-04-26 18:57:42,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:43,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:43,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 782 815 801 [WARNING|trainer.py:803] 2025-04-26 18:57:44,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:44,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:45,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 816 783 [WARNING|trainer.py:803] 2025-04-26 18:57:46,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:47,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 802 [WARNING|trainer.py:803] 2025-04-26 18:57:48,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 817 784 [WARNING|trainer.py:803] 2025-04-26 18:57:48,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:49,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 803 818 [WARNING|trainer.py:803] 2025-04-26 18:57:50,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:50,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 785 819 804 [WARNING|trainer.py:803] 2025-04-26 18:57:51,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:52,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:52,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 786 805 820 [WARNING|trainer.py:803] 2025-04-26 18:57:54,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:57:54,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:57:54,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 821 787 806 [WARNING|trainer.py:803] 2025-04-26 18:57:56,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:56,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:57:57,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 822 788 [WARNING|trainer.py:803] 2025-04-26 18:57:58,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 807 [WARNING|trainer.py:803] 2025-04-26 18:57:59,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:57:59,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 823 789 [WARNING|trainer.py:803] 2025-04-26 18:58:01,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 808 [WARNING|trainer.py:803] 2025-04-26 18:58:01,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:02,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 824 [WARNING|trainer.py:803] 2025-04-26 18:58:02,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 790 809 825 [WARNING|trainer.py:803] 2025-04-26 18:58:03,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:04,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:04,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 791 810 826 [WARNING|trainer.py:803] 2025-04-26 18:58:05,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:06,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:06,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 792 827 811 [WARNING|trainer.py:803] 2025-04-26 18:58:08,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:08,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:08,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 828 812 793 [WARNING|trainer.py:803] 2025-04-26 18:58:10,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:10,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:10,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 829 813 [WARNING|trainer.py:803] 2025-04-26 18:58:12,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 794 [WARNING|trainer.py:803] 2025-04-26 18:58:12,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 830 [WARNING|trainer.py:803] 2025-04-26 18:58:13,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:14,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 795 814 [WARNING|trainer.py:803] 2025-04-26 18:58:15,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 831 [WARNING|trainer.py:803] 2025-04-26 18:58:15,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:16,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 796 815 832 [WARNING|trainer.py:803] 2025-04-26 18:58:17,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:17,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:18,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 816 797 833 [WARNING|trainer.py:803] 2025-04-26 18:58:19,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:20,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:20,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 834 817 798 [WARNING|trainer.py:803] 2025-04-26 18:58:22,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:22,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:22,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 835 818 799 [WARNING|trainer.py:803] 2025-04-26 18:58:23,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:24,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:24,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 836 819 [WARNING|trainer.py:803] 2025-04-26 18:58:25,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 800 [WARNING|trainer.py:803] 2025-04-26 18:58:26,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:27,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 837 820 [WARNING|trainer.py:803] 2025-04-26 18:58:28,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:28,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 801 838 [WARNING|trainer.py:803] 2025-04-26 18:58:29,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 821 [WARNING|trainer.py:803] 2025-04-26 18:58:30,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:31,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 839 802 [WARNING|trainer.py:803] 2025-04-26 18:58:32,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:32,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 822 840 [WARNING|trainer.py:803] 2025-04-26 18:58:33,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 803 [WARNING|trainer.py:803] 2025-04-26 18:58:34,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:34,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 823 841 [WARNING|trainer.py:803] 2025-04-26 18:58:35,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 804 [WARNING|trainer.py:803] 2025-04-26 18:58:36,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:36,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 824 842 [WARNING|trainer.py:803] 2025-04-26 18:58:38,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 18:58:38,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 805 [WARNING|trainer.py:803] 2025-04-26 18:58:39,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 843 825 [WARNING|trainer.py:803] 2025-04-26 18:58:39,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:40,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 806 844 [WARNING|trainer.py:803] 2025-04-26 18:58:41,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 826 [WARNING|trainer.py:803] 2025-04-26 18:58:41,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:42,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 845 807 827 [WARNING|trainer.py:803] 2025-04-26 18:58:43,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:43,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:44,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 846 808 828 [WARNING|trainer.py:803] 2025-04-26 18:58:45,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:46,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:46,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 847 [WARNING|trainer.py:803] 2025-04-26 18:58:47,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 809 829 [WARNING|trainer.py:803] 2025-04-26 18:58:48,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:48,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 848 [WARNING|trainer.py:803] 2025-04-26 18:58:49,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 810 830 [WARNING|trainer.py:803] 2025-04-26 18:58:50,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 849 [WARNING|trainer.py:803] 2025-04-26 18:58:50,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:58:51,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 811 831 850 [WARNING|trainer.py:803] 2025-04-26 18:58:52,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:53,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:53,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 812 832 851 [WARNING|trainer.py:803] 2025-04-26 18:58:55,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:58:55,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:55,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 813 852 833 [WARNING|trainer.py:803] 2025-04-26 18:58:57,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:58:57,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:57,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 853 834 814 [WARNING|trainer.py:803] 2025-04-26 18:58:59,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:59,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:58:59,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 854 [WARNING|trainer.py:803] 2025-04-26 18:59:01,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 815 835 [WARNING|trainer.py:803] 2025-04-26 18:59:01,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:02,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 855 [WARNING|trainer.py:803] 2025-04-26 18:59:02,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 836 816 856 [WARNING|trainer.py:803] 2025-04-26 18:59:04,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:04,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:04,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 857 817 837 [WARNING|trainer.py:803] 2025-04-26 18:59:06,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:06,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:07,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 818 858 838 [WARNING|trainer.py:803] 2025-04-26 18:59:08,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:08,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:59:09,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 859 819 839 [WARNING|trainer.py:803] 2025-04-26 18:59:10,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:59:10,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:11,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 860 820 840 [WARNING|trainer.py:803] 2025-04-26 18:59:12,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:13,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:13,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 861 [WARNING|trainer.py:803] 2025-04-26 18:59:14,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 821 841 [WARNING|trainer.py:803] 2025-04-26 18:59:15,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 862 [WARNING|trainer.py:803] 2025-04-26 18:59:15,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:16,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 822 842 863 [WARNING|trainer.py:803] 2025-04-26 18:59:17,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:17,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:18,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 864 843 823 [WARNING|trainer.py:803] 2025-04-26 18:59:20,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:20,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:20,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 865 844 824 [WARNING|trainer.py:803] 2025-04-26 18:59:22,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:22,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:22,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 866 845 825 [WARNING|trainer.py:803] 2025-04-26 18:59:23,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:24,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:24,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 867 846 [WARNING|trainer.py:803] 2025-04-26 18:59:25,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 826 [WARNING|trainer.py:803] 2025-04-26 18:59:26,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 868 [WARNING|trainer.py:803] 2025-04-26 18:59:26,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:27,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 847 827 869 [WARNING|trainer.py:803] 2025-04-26 18:59:28,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:28,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:29,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 828 848 870 [WARNING|trainer.py:803] 2025-04-26 18:59:30,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:31,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:31,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 829 849 871 [WARNING|trainer.py:803] 2025-04-26 18:59:33,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:33,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:33,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 830 850 872 [WARNING|trainer.py:803] 2025-04-26 18:59:35,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:35,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:35,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 831 851 873 [WARNING|trainer.py:803] 2025-04-26 18:59:37,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:37,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:37,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 874 852 832 [WARNING|trainer.py:803] 2025-04-26 18:59:39,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:39,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:40,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 875 853 833 [WARNING|trainer.py:803] 2025-04-26 18:59:42,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:42,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:42,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 876 854 834 [WARNING|trainer.py:803] 2025-04-26 18:59:44,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:44,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 18:59:44,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 877 855 835 [WARNING|trainer.py:803] 2025-04-26 18:59:45,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:46,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:46,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 878 [WARNING|trainer.py:803] 2025-04-26 18:59:47,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 856 836 [WARNING|trainer.py:803] 2025-04-26 18:59:48,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 879 [WARNING|trainer.py:803] 2025-04-26 18:59:48,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:49,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 857 880 [WARNING|trainer.py:803] 2025-04-26 18:59:50,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 837 [WARNING|trainer.py:803] 2025-04-26 18:59:51,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 18:59:51,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 881 858 838 [WARNING|trainer.py:803] 2025-04-26 18:59:53,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:53,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:59:53,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 859 882 839 [WARNING|trainer.py:803] 2025-04-26 18:59:55,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 18:59:55,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:55,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 883 860 840 [WARNING|trainer.py:803] 2025-04-26 18:59:57,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 18:59:57,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 18:59:58,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 884 861 [WARNING|trainer.py:803] 2025-04-26 18:59:59,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 841 [WARNING|trainer.py:803] 2025-04-26 18:59:59,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 885 [WARNING|trainer.py:803] 2025-04-26 19:00:00,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:00,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 862 [WARNING|trainer.py:803] 2025-04-26 19:00:01,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 886 842 [WARNING|trainer.py:803] 2025-04-26 19:00:02,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:03,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 863 [WARNING|trainer.py:803] 2025-04-26 19:00:03,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 843 887 [WARNING|trainer.py:803] 2025-04-26 19:00:05,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 864 [WARNING|trainer.py:803] 2025-04-26 19:00:05,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:06,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 844 888 [WARNING|trainer.py:803] 2025-04-26 19:00:07,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:07,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 865 [WARNING|trainer.py:803] 2025-04-26 19:00:08,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 889 845 [WARNING|trainer.py:803] 2025-04-26 19:00:09,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:09,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 866 [WARNING|trainer.py:803] 2025-04-26 19:00:10,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 846 890 [WARNING|trainer.py:803] 2025-04-26 19:00:11,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:11,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 867 [WARNING|trainer.py:803] 2025-04-26 19:00:12,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 847 891 [WARNING|trainer.py:803] 2025-04-26 19:00:13,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:13,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 868 [WARNING|trainer.py:803] 2025-04-26 19:00:14,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 892 848 [WARNING|trainer.py:803] 2025-04-26 19:00:15,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 869 [WARNING|trainer.py:803] 2025-04-26 19:00:15,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 893 [WARNING|trainer.py:803] 2025-04-26 19:00:16,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 849 [WARNING|trainer.py:803] 2025-04-26 19:00:17,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 870 [WARNING|trainer.py:803] 2025-04-26 19:00:18,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 894 [WARNING|trainer.py:803] 2025-04-26 19:00:18,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:19,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 850 871 895 [WARNING|trainer.py:803] 2025-04-26 19:00:20,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:20,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:21,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 851 [WARNING|trainer.py:803] 2025-04-26 19:00:22,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 896 872 [WARNING|trainer.py:803] 2025-04-26 19:00:23,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:00:23,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 852 [WARNING|trainer.py:803] 2025-04-26 19:00:24,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 897 873 [WARNING|trainer.py:803] 2025-04-26 19:00:25,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:25,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 853 [WARNING|trainer.py:803] 2025-04-26 19:00:26,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 898 874 [WARNING|trainer.py:803] 2025-04-26 19:00:27,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:28,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 854 899 [WARNING|trainer.py:803] 2025-04-26 19:00:29,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 875 [WARNING|trainer.py:803] 2025-04-26 19:00:29,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 855 [WARNING|trainer.py:803] 2025-04-26 19:00:30,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 900 [WARNING|trainer.py:803] 2025-04-26 19:00:31,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:31,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 876 856 901 [WARNING|trainer.py:803] 2025-04-26 19:00:32,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:33,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 19:00:33,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 877 902 857 [WARNING|trainer.py:803] 2025-04-26 19:00:34,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:34,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:35,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 878 903 [WARNING|trainer.py:803] 2025-04-26 19:00:36,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 858 [WARNING|trainer.py:803] 2025-04-26 19:00:37,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 904 [WARNING|trainer.py:803] 2025-04-26 19:00:38,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 879 [WARNING|trainer.py:803] 2025-04-26 19:00:38,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:38,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 859 905 [WARNING|trainer.py:803] 2025-04-26 19:00:40,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 880 [WARNING|trainer.py:803] 2025-04-26 19:00:40,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 906 [WARNING|trainer.py:803] 2025-04-26 19:00:41,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 860 [WARNING|trainer.py:803] 2025-04-26 19:00:41,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:42,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 907 881 [WARNING|trainer.py:803] 2025-04-26 19:00:43,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:43,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 861 908 [WARNING|trainer.py:803] 2025-04-26 19:00:44,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:44,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 882 909 [WARNING|trainer.py:803] 2025-04-26 19:00:45,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 862 [WARNING|trainer.py:803] 2025-04-26 19:00:46,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:46,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 883 910 [WARNING|trainer.py:803] 2025-04-26 19:00:47,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 863 [WARNING|trainer.py:803] 2025-04-26 19:00:47,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 911 [WARNING|trainer.py:803] 2025-04-26 19:00:48,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 884 [WARNING|trainer.py:803] 2025-04-26 19:00:49,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 864 [WARNING|trainer.py:803] 2025-04-26 19:00:49,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 912 [WARNING|trainer.py:803] 2025-04-26 19:00:50,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:51,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 885 913 [WARNING|trainer.py:803] 2025-04-26 19:00:51,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 865 [WARNING|trainer.py:803] 2025-04-26 19:00:52,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:52,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 914 886 [WARNING|trainer.py:803] 2025-04-26 19:00:54,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:54,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 866 915 [WARNING|trainer.py:803] 2025-04-26 19:00:55,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:00:55,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 887 867 916 [WARNING|trainer.py:803] 2025-04-26 19:00:56,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:00:57,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:57,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 917 868 888 [WARNING|trainer.py:803] 2025-04-26 19:00:58,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:00:59,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:00:59,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 918 [WARNING|trainer.py:803] 2025-04-26 19:01:00,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 889 869 919 [WARNING|trainer.py:803] 2025-04-26 19:01:01,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:01,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:01,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 920 870 890 [WARNING|trainer.py:803] 2025-04-26 19:01:03,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:03,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:01:03,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 921 871 [WARNING|trainer.py:803] 2025-04-26 19:01:04,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 891 [WARNING|trainer.py:803] 2025-04-26 19:01:05,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 922 [WARNING|trainer.py:803] 2025-04-26 19:01:06,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:06,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 923 892 872 [WARNING|trainer.py:803] 2025-04-26 19:01:08,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:08,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:08,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 924 893 873 [WARNING|trainer.py:803] 2025-04-26 19:01:10,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:10,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:10,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 925 894 874 [WARNING|trainer.py:803] 2025-04-26 19:01:11,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:12,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:12,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 926 [WARNING|trainer.py:803] 2025-04-26 19:01:13,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 927 895 875 [WARNING|trainer.py:803] 2025-04-26 19:01:15,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:15,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:15,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 928 876 896 [WARNING|trainer.py:803] 2025-04-26 19:01:16,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:17,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:01:17,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 929 [WARNING|trainer.py:803] 2025-04-26 19:01:18,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 877 897 930 [WARNING|trainer.py:803] 2025-04-26 19:01:19,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:01:19,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:19,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 878 931 898 [WARNING|trainer.py:803] 2025-04-26 19:01:21,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:21,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:22,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 932 879 [WARNING|trainer.py:803] 2025-04-26 19:01:23,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:23,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 899 933 [WARNING|trainer.py:803] 2025-04-26 19:01:24,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:24,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 880 934 900 [WARNING|trainer.py:803] 2025-04-26 19:01:25,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:26,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:26,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 935 881 901 [WARNING|trainer.py:803] 2025-04-26 19:01:27,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:27,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:01:28,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 936 902 882 [WARNING|trainer.py:803] 2025-04-26 19:01:29,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:30,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:30,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 937 [WARNING|trainer.py:803] 2025-04-26 19:01:31,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 883 903 938 [WARNING|trainer.py:803] 2025-04-26 19:01:32,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:32,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:01:32,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 939 904 884 [WARNING|trainer.py:803] 2025-04-26 19:01:34,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:34,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:34,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 940 905 [WARNING|trainer.py:803] 2025-04-26 19:01:35,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 885 [WARNING|trainer.py:803] 2025-04-26 19:01:35,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 941 [WARNING|trainer.py:803] 2025-04-26 19:01:36,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 906 [WARNING|trainer.py:803] 2025-04-26 19:01:37,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:37,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 886 942 907 [WARNING|trainer.py:803] 2025-04-26 19:01:38,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:39,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:39,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 943 908 887 [WARNING|trainer.py:803] 2025-04-26 19:01:40,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:41,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 19:01:41,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 944 909 [WARNING|trainer.py:803] 2025-04-26 19:01:42,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:42,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 888 945 910 [WARNING|trainer.py:803] 2025-04-26 19:01:43,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:44,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:44,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 946 889 911 [WARNING|trainer.py:803] 2025-04-26 19:01:45,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:45,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:46,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 947 912 890 [WARNING|trainer.py:803] 2025-04-26 19:01:47,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 948 [WARNING|trainer.py:803] 2025-04-26 19:01:48,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:48,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:49,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 913 949 891 [WARNING|trainer.py:803] 2025-04-26 19:01:50,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:01:50,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:50,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 914 950 [WARNING|trainer.py:803] 2025-04-26 19:01:51,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 892 [WARNING|trainer.py:803] 2025-04-26 19:01:52,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:52,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 915 951 [WARNING|trainer.py:803] 2025-04-26 19:01:53,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 893 [WARNING|trainer.py:803] 2025-04-26 19:01:54,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 916 [WARNING|trainer.py:803] 2025-04-26 19:01:54,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 952 [WARNING|trainer.py:803] 2025-04-26 19:01:55,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:55,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 894 917 953 [WARNING|trainer.py:803] 2025-04-26 19:01:56,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:57,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:57,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 954 918 895 [WARNING|trainer.py:803] 2025-04-26 19:01:58,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:01:58,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:01:59,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 955 919 [WARNING|trainer.py:803] 2025-04-26 19:02:00,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:02:00,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 896 956 920 [WARNING|trainer.py:803] 2025-04-26 19:02:01,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:02:02,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:02,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 957 897 921 [WARNING|trainer.py:803] 2025-04-26 19:02:03,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:02:04,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:04,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 958 [WARNING|trainer.py:803] 2025-04-26 19:02:05,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 922 898 959 [WARNING|trainer.py:803] 2025-04-26 19:02:05,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:02:06,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:06,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 923 960 [WARNING|trainer.py:803] 2025-04-26 19:02:07,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 899 [WARNING|trainer.py:803] 2025-04-26 19:02:08,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 961 [WARNING|trainer.py:803] 2025-04-26 19:02:08,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 924 [WARNING|trainer.py:803] 2025-04-26 19:02:09,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 900 [WARNING|trainer.py:803] 2025-04-26 19:02:10,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 962 [WARNING|trainer.py:803] 2025-04-26 19:02:10,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 925 [WARNING|trainer.py:803] 2025-04-26 19:02:11,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 963 901 [WARNING|trainer.py:803] 2025-04-26 19:02:11,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:12,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:12,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 926 964 902 [WARNING|trainer.py:803] 2025-04-26 19:02:13,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:14,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:14,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 927 965 [WARNING|trainer.py:803] 2025-04-26 19:02:15,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:02:15,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 903 966 928 [WARNING|trainer.py:803] 2025-04-26 19:02:16,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:17,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:17,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 904 967 929 [WARNING|trainer.py:803] 2025-04-26 19:02:18,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:02:18,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:02:19,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 968 905 930 [WARNING|trainer.py:803] 2025-04-26 19:02:20,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:20,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:20,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 969 906 [WARNING|trainer.py:803] 2025-04-26 19:02:21,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 931 [WARNING|trainer.py:803] 2025-04-26 19:02:21,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:22,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 907 970 932 [WARNING|trainer.py:803] 2025-04-26 19:02:23,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:23,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 971 [WARNING|trainer.py:803] 2025-04-26 19:02:24,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 908 [WARNING|trainer.py:803] 2025-04-26 19:02:25,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 933 [WARNING|trainer.py:803] 2025-04-26 19:02:25,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 972 [WARNING|trainer.py:803] 2025-04-26 19:02:26,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 909 [WARNING|trainer.py:803] 2025-04-26 19:02:26,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 934 [WARNING|trainer.py:803] 2025-04-26 19:02:27,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 973 [WARNING|trainer.py:803] 2025-04-26 19:02:27,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 910 [WARNING|trainer.py:803] 2025-04-26 19:02:28,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:29,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 935 974 [WARNING|trainer.py:803] 2025-04-26 19:02:29,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:29,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 911 [WARNING|trainer.py:803] 2025-04-26 19:02:30,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 936 975 [WARNING|trainer.py:803] 2025-04-26 19:02:31,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 912 [WARNING|trainer.py:803] 2025-04-26 19:02:32,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 937 [WARNING|trainer.py:803] 2025-04-26 19:02:32,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 976 [WARNING|trainer.py:803] 2025-04-26 19:02:33,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 913 [WARNING|trainer.py:803] 2025-04-26 19:02:33,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 938 [WARNING|trainer.py:803] 2025-04-26 19:02:34,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 977 [WARNING|trainer.py:803] 2025-04-26 19:02:35,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 914 [WARNING|trainer.py:803] 2025-04-26 19:02:35,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 939 978 [WARNING|trainer.py:803] 2025-04-26 19:02:36,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:36,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:36,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 915 979 [WARNING|trainer.py:803] 2025-04-26 19:02:37,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 940 [WARNING|trainer.py:803] 2025-04-26 19:02:38,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:02:38,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 916 941 980 [WARNING|trainer.py:803] 2025-04-26 19:02:39,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:40,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 917 981 [WARNING|trainer.py:803] 2025-04-26 19:02:41,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 942 [WARNING|trainer.py:803] 2025-04-26 19:02:41,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 918 [WARNING|trainer.py:803] 2025-04-26 19:02:42,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 982 [WARNING|trainer.py:803] 2025-04-26 19:02:42,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 943 919 [WARNING|trainer.py:803] 2025-04-26 19:02:44,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 983 [WARNING|trainer.py:803] 2025-04-26 19:02:44,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 944 [WARNING|trainer.py:803] 2025-04-26 19:02:45,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 920 [WARNING|trainer.py:803] 2025-04-26 19:02:46,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:46,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 984 945 [WARNING|trainer.py:803] 2025-04-26 19:02:47,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 921 [WARNING|trainer.py:803] 2025-04-26 19:02:47,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 985 [WARNING|trainer.py:803] 2025-04-26 19:02:48,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 946 [WARNING|trainer.py:803] 2025-04-26 19:02:48,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 922 [WARNING|trainer.py:803] 2025-04-26 19:02:49,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:50,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 986 [WARNING|trainer.py:803] 2025-04-26 19:02:50,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 923 947 987 [WARNING|trainer.py:803] 2025-04-26 19:02:51,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:51,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:52,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 948 988 924 [WARNING|trainer.py:803] 2025-04-26 19:02:53,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:02:54,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 19:02:54,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 949 989 925 [WARNING|trainer.py:803] 2025-04-26 19:02:55,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:55,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:55,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 990 950 926 [WARNING|trainer.py:803] 2025-04-26 19:02:56,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:57,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:57,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 991 951 927 [WARNING|trainer.py:803] 2025-04-26 19:02:58,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:02:59,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:02:59,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 992 [WARNING|trainer.py:803] 2025-04-26 19:03:00,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 952 928 993 [WARNING|trainer.py:803] 2025-04-26 19:03:01,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:03:01,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:01,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 953 929 994 [WARNING|trainer.py:803] 2025-04-26 19:03:03,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:03:03,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:03,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 954 930 995 [WARNING|trainer.py:803] 2025-04-26 19:03:04,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:04,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:05,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 955 931 996 [WARNING|trainer.py:803] 2025-04-26 19:03:06,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:06,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:06,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 997 956 932 [WARNING|trainer.py:803] 2025-04-26 19:03:08,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:08,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:08,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 998 957 933 [WARNING|trainer.py:803] 2025-04-26 19:03:10,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:10,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:10,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 999 958 934 [WARNING|trainer.py:803] 2025-04-26 19:03:11,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:11,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:12,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1000 959 935 [WARNING|trainer.py:803] 2025-04-26 19:03:13,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:13,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:13,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1001 960 936 [WARNING|trainer.py:803] 2025-04-26 19:03:15,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:15,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:15,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1002 961 937 [WARNING|trainer.py:803] 2025-04-26 19:03:16,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:03:17,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1003 [WARNING|trainer.py:803] 2025-04-26 19:03:17,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 962 [WARNING|trainer.py:803] 2025-04-26 19:03:18,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 938 [WARNING|trainer.py:803] 2025-04-26 19:03:18,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:19,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1004 963 939 [WARNING|trainer.py:803] 2025-04-26 19:03:20,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:20,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1005 [WARNING|trainer.py:803] 2025-04-26 19:03:20,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 964 [WARNING|trainer.py:803] 2025-04-26 19:03:21,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 940 [WARNING|trainer.py:803] 2025-04-26 19:03:22,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1006 [WARNING|trainer.py:803] 2025-04-26 19:03:22,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 965 [WARNING|trainer.py:803] 2025-04-26 19:03:23,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 941 [WARNING|trainer.py:803] 2025-04-26 19:03:23,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1007 [WARNING|trainer.py:803] 2025-04-26 19:03:24,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 966 [WARNING|trainer.py:803] 2025-04-26 19:03:24,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1008 [WARNING|trainer.py:803] 2025-04-26 19:03:25,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 942 [WARNING|trainer.py:803] 2025-04-26 19:03:26,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 967 [WARNING|trainer.py:803] 2025-04-26 19:03:26,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1009 [WARNING|trainer.py:803] 2025-04-26 19:03:27,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 943 [WARNING|trainer.py:803] 2025-04-26 19:03:27,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 968 [WARNING|trainer.py:803] 2025-04-26 19:03:28,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1010 [WARNING|trainer.py:803] 2025-04-26 19:03:29,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 944 [WARNING|trainer.py:803] 2025-04-26 19:03:29,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 969 [WARNING|trainer.py:803] 2025-04-26 19:03:30,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1011 [WARNING|trainer.py:803] 2025-04-26 19:03:31,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 945 [WARNING|trainer.py:803] 2025-04-26 19:03:31,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1012 [WARNING|trainer.py:803] 2025-04-26 19:03:31,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 970 [WARNING|trainer.py:803] 2025-04-26 19:03:32,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 946 [WARNING|trainer.py:803] 2025-04-26 19:03:33,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1013 [WARNING|trainer.py:803] 2025-04-26 19:03:33,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 971 [WARNING|trainer.py:803] 2025-04-26 19:03:34,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:34,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 947 1014 972 [WARNING|trainer.py:803] 2025-04-26 19:03:36,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:36,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:03:36,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1015 948 973 [WARNING|trainer.py:803] 2025-04-26 19:03:37,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:38,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:03:38,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1016 949 974 [WARNING|trainer.py:803] 2025-04-26 19:03:39,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:39,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:03:40,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1017 [WARNING|trainer.py:803] 2025-04-26 19:03:41,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 950 975 [WARNING|trainer.py:803] 2025-04-26 19:03:41,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1018 [WARNING|trainer.py:803] 2025-04-26 19:03:42,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 951 [WARNING|trainer.py:803] 2025-04-26 19:03:43,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 976 1019 [WARNING|trainer.py:803] 2025-04-26 19:03:44,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:44,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:44,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 952 977 [WARNING|trainer.py:803] 2025-04-26 19:03:45,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1020 [WARNING|trainer.py:803] 2025-04-26 19:03:46,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 953 [WARNING|trainer.py:803] 2025-04-26 19:03:46,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 978 [WARNING|trainer.py:803] 2025-04-26 19:03:47,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1021 [WARNING|trainer.py:803] 2025-04-26 19:03:47,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 954 [WARNING|trainer.py:803] 2025-04-26 19:03:48,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 979 [WARNING|trainer.py:803] 2025-04-26 19:03:49,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:49,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1022 955 [WARNING|trainer.py:803] 2025-04-26 19:03:50,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 980 [WARNING|trainer.py:803] 2025-04-26 19:03:51,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1023 [WARNING|trainer.py:803] 2025-04-26 19:03:51,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 956 [WARNING|trainer.py:803] 2025-04-26 19:03:52,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 981 [WARNING|trainer.py:803] 2025-04-26 19:03:52,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:53,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1024 957 982 [WARNING|trainer.py:803] 2025-04-26 19:03:54,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:54,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:55,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1025 958 [WARNING|trainer.py:803] 2025-04-26 19:03:56,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:56,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 983 1026 959 [WARNING|trainer.py:803] 2025-04-26 19:03:57,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:57,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:03:58,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 984 960 1027 [WARNING|trainer.py:803] 2025-04-26 19:03:59,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:03:59,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:03:59,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1028 985 961 [WARNING|trainer.py:803] 2025-04-26 19:04:01,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:01,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:01,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1029 962 [WARNING|trainer.py:803] 2025-04-26 19:04:02,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 986 [WARNING|trainer.py:803] 2025-04-26 19:04:03,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1030 [WARNING|trainer.py:803] 2025-04-26 19:04:03,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 963 987 [WARNING|trainer.py:803] 2025-04-26 19:04:04,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:05,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1031 [WARNING|trainer.py:803] 2025-04-26 19:04:05,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 964 [WARNING|trainer.py:803] 2025-04-26 19:04:06,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 988 [WARNING|trainer.py:803] 2025-04-26 19:04:06,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1032 [WARNING|trainer.py:803] 2025-04-26 19:04:07,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:04:07,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 965 989 1033 [WARNING|trainer.py:803] 2025-04-26 19:04:08,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:08,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:09,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 966 990 1034 [WARNING|trainer.py:803] 2025-04-26 19:04:10,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:10,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:10,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 967 991 1035 [WARNING|trainer.py:803] 2025-04-26 19:04:11,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:12,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:12,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 968 1036 992 [WARNING|trainer.py:803] 2025-04-26 19:04:13,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:14,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:14,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 969 1037 993 [WARNING|trainer.py:803] 2025-04-26 19:04:15,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:16,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:16,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1038 970 994 [WARNING|trainer.py:803] 2025-04-26 19:04:17,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:17,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:17,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1039 971 [WARNING|trainer.py:803] 2025-04-26 19:04:19,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 995 [WARNING|trainer.py:803] 2025-04-26 19:04:19,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1040 [WARNING|trainer.py:803] 2025-04-26 19:04:20,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 972 [WARNING|trainer.py:803] 2025-04-26 19:04:20,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 996 1041 [WARNING|trainer.py:803] 2025-04-26 19:04:21,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:22,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:22,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 973 997 [WARNING|trainer.py:803] 2025-04-26 19:04:23,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1042 [WARNING|trainer.py:803] 2025-04-26 19:04:23,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 974 [WARNING|trainer.py:803] 2025-04-26 19:04:24,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:24,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1043 998 [WARNING|trainer.py:803] 2025-04-26 19:04:25,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:25,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 975 1044 999 [WARNING|trainer.py:803] 2025-04-26 19:04:27,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:27,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:27,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 976 1000 1045 [WARNING|trainer.py:803] 2025-04-26 19:04:29,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:29,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:29,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 977 1046 1001 [WARNING|trainer.py:803] 2025-04-26 19:04:30,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:31,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:31,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 978 1047 1002 [WARNING|trainer.py:803] 2025-04-26 19:04:32,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:33,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:33,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 979 1003 1048 [WARNING|trainer.py:803] 2025-04-26 19:04:34,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:35,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:04:35,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 980 1049 1004 [WARNING|trainer.py:803] 2025-04-26 19:04:36,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:36,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:37,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 981 1050 1005 [WARNING|trainer.py:803] 2025-04-26 19:04:38,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:38,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:38,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 982 1051 [WARNING|trainer.py:803] 2025-04-26 19:04:40,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1006 [WARNING|trainer.py:803] 2025-04-26 19:04:40,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:40,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1052 983 1007 [WARNING|trainer.py:803] 2025-04-26 19:04:41,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:42,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:42,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1053 1008 984 [WARNING|trainer.py:803] 2025-04-26 19:04:43,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:44,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1054 [WARNING|trainer.py:803] 2025-04-26 19:04:44,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1009 [WARNING|trainer.py:803] 2025-04-26 19:04:45,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 985 [WARNING|trainer.py:803] 2025-04-26 19:04:46,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1055 [WARNING|trainer.py:803] 2025-04-26 19:04:46,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:47,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1010 1056 986 [WARNING|trainer.py:803] 2025-04-26 19:04:48,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:48,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:04:48,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1011 1057 987 [WARNING|trainer.py:803] 2025-04-26 19:04:49,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:50,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:04:50,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1012 988 [WARNING|trainer.py:803] 2025-04-26 19:04:51,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1058 [WARNING|trainer.py:803] 2025-04-26 19:04:52,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:04:52,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1013 989 1059 [WARNING|trainer.py:803] 2025-04-26 19:04:53,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:53,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:53,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1014 990 1060 [WARNING|trainer.py:803] 2025-04-26 19:04:55,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:55,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:56,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1015 991 1061 [WARNING|trainer.py:803] 2025-04-26 19:04:57,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:57,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:04:57,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1016 1062 992 [WARNING|trainer.py:803] 2025-04-26 19:04:59,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:04:59,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:04:59,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1063 993 1017 [WARNING|trainer.py:803] 2025-04-26 19:05:00,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:01,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:01,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1064 994 1018 [WARNING|trainer.py:803] 2025-04-26 19:05:02,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:02,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1065 [WARNING|trainer.py:803] 2025-04-26 19:05:03,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 995 [WARNING|trainer.py:803] 2025-04-26 19:05:04,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1019 [WARNING|trainer.py:803] 2025-04-26 19:05:05,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1066 [WARNING|trainer.py:803] 2025-04-26 19:05:05,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 996 [WARNING|trainer.py:803] 2025-04-26 19:05:05,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:06,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1020 1067 997 [WARNING|trainer.py:803] 2025-04-26 19:05:07,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:08,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:08,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1021 1068 998 [WARNING|trainer.py:803] 2025-04-26 19:05:09,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:09,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:10,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 999 1022 1069 [WARNING|trainer.py:803] 2025-04-26 19:05:11,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:05:12,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:12,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1070 1000 1023 [WARNING|trainer.py:803] 2025-04-26 19:05:13,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:13,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:13,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1071 [WARNING|trainer.py:803] 2025-04-26 19:05:14,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1001 1024 1072 [WARNING|trainer.py:803] 2025-04-26 19:05:15,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:16,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:05:16,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1002 1073 1025 [WARNING|trainer.py:803] 2025-04-26 19:05:17,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:17,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:18,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1074 1003 1026 [WARNING|trainer.py:803] 2025-04-26 19:05:19,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:19,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:05:19,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1075 [WARNING|trainer.py:803] 2025-04-26 19:05:20,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1004 1027 [WARNING|trainer.py:803] 2025-04-26 19:05:21,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1076 [WARNING|trainer.py:803] 2025-04-26 19:05:22,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1005 [WARNING|trainer.py:803] 2025-04-26 19:05:22,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1028 1077 [WARNING|trainer.py:803] 2025-04-26 19:05:23,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:23,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:24,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1006 1029 1078 [WARNING|trainer.py:803] 2025-04-26 19:05:25,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:05:25,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:25,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1007 1030 1079 [WARNING|trainer.py:803] 2025-04-26 19:05:27,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:05:27,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:27,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1008 1031 1080 [WARNING|trainer.py:803] 2025-04-26 19:05:28,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:29,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:29,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1009 1032 1081 [WARNING|trainer.py:803] 2025-04-26 19:05:30,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:30,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:31,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1033 1010 1082 [WARNING|trainer.py:803] 2025-04-26 19:05:32,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:32,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:32,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1034 1083 1011 [WARNING|trainer.py:803] 2025-04-26 19:05:34,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:34,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:34,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1084 1012 1035 [WARNING|trainer.py:803] 2025-04-26 19:05:36,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:36,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:36,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1085 1013 1036 [WARNING|trainer.py:803] 2025-04-26 19:05:37,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:37,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1086 [WARNING|trainer.py:803] 2025-04-26 19:05:38,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:39,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1014 1037 [WARNING|trainer.py:803] 2025-04-26 19:05:39,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1087 [WARNING|trainer.py:803] 2025-04-26 19:05:40,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:40,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1015 1038 [WARNING|trainer.py:803] 2025-04-26 19:05:41,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1088 [WARNING|trainer.py:803] 2025-04-26 19:05:42,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1016 [WARNING|trainer.py:803] 2025-04-26 19:05:42,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1039 [WARNING|trainer.py:803] 2025-04-26 19:05:43,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1089 [WARNING|trainer.py:803] 2025-04-26 19:05:44,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:44,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1017 1040 1090 [WARNING|trainer.py:803] 2025-04-26 19:05:45,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:05:45,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:46,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1041 1091 1018 [WARNING|trainer.py:803] 2025-04-26 19:05:47,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:47,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:05:47,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1092 1042 1019 [WARNING|trainer.py:803] 2025-04-26 19:05:49,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:49,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:05:49,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1093 1043 1020 [WARNING|trainer.py:803] 2025-04-26 19:05:51,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:51,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:52,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1094 1044 [WARNING|trainer.py:803] 2025-04-26 19:05:53,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1021 [WARNING|trainer.py:803] 2025-04-26 19:05:53,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1095 [WARNING|trainer.py:803] 2025-04-26 19:05:54,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:54,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1045 1022 1096 [WARNING|trainer.py:803] 2025-04-26 19:05:56,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:05:56,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:56,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1046 1023 1097 [WARNING|trainer.py:803] 2025-04-26 19:05:57,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:58,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:05:58,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1047 1098 1024 [WARNING|trainer.py:803] 2025-04-26 19:05:59,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:05:59,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:00,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1099 1048 [WARNING|trainer.py:803] 2025-04-26 19:06:01,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1025 [WARNING|trainer.py:803] 2025-04-26 19:06:01,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1100 [WARNING|trainer.py:803] 2025-04-26 19:06:02,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1049 [WARNING|trainer.py:803] 2025-04-26 19:06:03,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1026 [WARNING|trainer.py:803] 2025-04-26 19:06:03,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1101 [WARNING|trainer.py:803] 2025-04-26 19:06:04,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1050 [WARNING|trainer.py:803] 2025-04-26 19:06:04,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1027 [WARNING|trainer.py:803] 2025-04-26 19:06:05,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1102 [WARNING|trainer.py:803] 2025-04-26 19:06:06,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1051 [WARNING|trainer.py:803] 2025-04-26 19:06:06,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1028 1103 [WARNING|trainer.py:803] 2025-04-26 19:06:07,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:06:07,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:08,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1052 1029 1104 [WARNING|trainer.py:803] 2025-04-26 19:06:09,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:09,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:09,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1105 1030 1053 [WARNING|trainer.py:803] 2025-04-26 19:06:11,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:06:11,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:06:11,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1106 1031 1054 [WARNING|trainer.py:803] 2025-04-26 19:06:12,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:06:13,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1107 [WARNING|trainer.py:803] 2025-04-26 19:06:13,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1032 [WARNING|trainer.py:803] 2025-04-26 19:06:14,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1055 [WARNING|trainer.py:803] 2025-04-26 19:06:14,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:15,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1108 1033 [WARNING|trainer.py:803] 2025-04-26 19:06:16,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1056 [WARNING|trainer.py:803] 2025-04-26 19:06:16,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:06:17,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1109 1034 [WARNING|trainer.py:803] 2025-04-26 19:06:18,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1057 [WARNING|trainer.py:803] 2025-04-26 19:06:18,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1110 [WARNING|trainer.py:803] 2025-04-26 19:06:18,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:06:19,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1035 1111 [WARNING|trainer.py:803] 2025-04-26 19:06:20,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1058 [WARNING|trainer.py:803] 2025-04-26 19:06:21,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:21,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1036 1112 1059 [WARNING|trainer.py:803] 2025-04-26 19:06:22,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:06:22,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:23,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1113 1037 [WARNING|trainer.py:803] 2025-04-26 19:06:24,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:24,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1114 1060 1038 [WARNING|trainer.py:803] 2025-04-26 19:06:25,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:25,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1115 [WARNING|trainer.py:803] 2025-04-26 19:06:26,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1061 [WARNING|trainer.py:803] 2025-04-26 19:06:26,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1039 [WARNING|trainer.py:803] 2025-04-26 19:06:27,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1116 [WARNING|trainer.py:803] 2025-04-26 19:06:27,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1062 [WARNING|trainer.py:803] 2025-04-26 19:06:28,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1040 [WARNING|trainer.py:803] 2025-04-26 19:06:29,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1117 [WARNING|trainer.py:803] 2025-04-26 19:06:29,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1063 [WARNING|trainer.py:803] 2025-04-26 19:06:30,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1041 [WARNING|trainer.py:803] 2025-04-26 19:06:30,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1118 [WARNING|trainer.py:803] 2025-04-26 19:06:31,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:06:31,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1064 1119 1042 [WARNING|trainer.py:803] 2025-04-26 19:06:32,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:33,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:33,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1065 1120 [WARNING|trainer.py:803] 2025-04-26 19:06:34,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1043 [WARNING|trainer.py:803] 2025-04-26 19:06:35,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:35,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1066 1121 [WARNING|trainer.py:803] 2025-04-26 19:06:36,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1044 [WARNING|trainer.py:803] 2025-04-26 19:06:36,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:06:37,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1122 1067 [WARNING|trainer.py:803] 2025-04-26 19:06:38,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:39,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1045 [WARNING|trainer.py:803] 2025-04-26 19:06:39,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1123 1068 1046 [WARNING|trainer.py:803] 2025-04-26 19:06:40,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:41,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1124 [WARNING|trainer.py:803] 2025-04-26 19:06:41,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:42,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1047 1069 1125 [WARNING|trainer.py:803] 2025-04-26 19:06:43,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:43,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:06:43,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1070 1126 1048 [WARNING|trainer.py:803] 2025-04-26 19:06:45,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:45,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:45,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1127 1071 1049 [WARNING|trainer.py:803] 2025-04-26 19:06:46,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:46,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1128 [WARNING|trainer.py:803] 2025-04-26 19:06:47,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1072 [WARNING|trainer.py:803] 2025-04-26 19:06:48,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:48,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1050 1129 [WARNING|trainer.py:803] 2025-04-26 19:06:49,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1073 [WARNING|trainer.py:803] 2025-04-26 19:06:49,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:50,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1130 1051 1074 [WARNING|trainer.py:803] 2025-04-26 19:06:51,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:51,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1131 [WARNING|trainer.py:803] 2025-04-26 19:06:52,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1052 [WARNING|trainer.py:803] 2025-04-26 19:06:52,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1075 [WARNING|trainer.py:803] 2025-04-26 19:06:53,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1132 [WARNING|trainer.py:803] 2025-04-26 19:06:53,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:06:54,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1053 1133 1076 [WARNING|trainer.py:803] 2025-04-26 19:06:55,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:55,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:06:55,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1054 1134 1077 [WARNING|trainer.py:803] 2025-04-26 19:06:57,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:57,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:57,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1135 1055 1078 [WARNING|trainer.py:803] 2025-04-26 19:06:58,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:06:59,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1136 [WARNING|trainer.py:803] 2025-04-26 19:06:59,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1056 [WARNING|trainer.py:803] 2025-04-26 19:07:00,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1079 [WARNING|trainer.py:803] 2025-04-26 19:07:00,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1137 [WARNING|trainer.py:803] 2025-04-26 19:07:01,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1057 [WARNING|trainer.py:803] 2025-04-26 19:07:02,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:02,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1138 1080 [WARNING|trainer.py:803] 2025-04-26 19:07:03,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:03,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1058 1139 1081 [WARNING|trainer.py:803] 2025-04-26 19:07:05,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:05,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:05,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1140 1059 1082 [WARNING|trainer.py:803] 2025-04-26 19:07:06,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:06,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:07,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1141 [WARNING|trainer.py:803] 2025-04-26 19:07:08,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1060 1083 1142 [WARNING|trainer.py:803] 2025-04-26 19:07:09,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:09,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:09,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1061 1084 1143 [WARNING|trainer.py:803] 2025-04-26 19:07:11,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:11,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:11,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1062 1144 1085 [WARNING|trainer.py:803] 2025-04-26 19:07:12,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:12,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:13,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1145 1063 1086 [WARNING|trainer.py:803] 2025-04-26 19:07:14,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:14,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:14,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1146 1087 [WARNING|trainer.py:803] 2025-04-26 19:07:15,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1064 1147 [WARNING|trainer.py:803] 2025-04-26 19:07:16,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:16,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:17,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1065 1088 1148 [WARNING|trainer.py:803] 2025-04-26 19:07:18,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:18,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:18,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1149 1066 1089 [WARNING|trainer.py:803] 2025-04-26 19:07:20,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:20,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:20,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1150 1090 [WARNING|trainer.py:803] 2025-04-26 19:07:21,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1067 1151 [WARNING|trainer.py:803] 2025-04-26 19:07:22,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:22,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:23,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1091 1152 1068 [WARNING|trainer.py:803] 2025-04-26 19:07:24,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:07:24,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:25,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1092 1153 [WARNING|trainer.py:803] 2025-04-26 19:07:26,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:26,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1069 1154 [WARNING|trainer.py:803] 2025-04-26 19:07:27,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:27,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1093 1070 1155 [WARNING|trainer.py:803] 2025-04-26 19:07:28,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:29,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:29,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1094 1071 1156 [WARNING|trainer.py:803] 2025-04-26 19:07:30,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:30,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:30,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1157 1095 1072 [WARNING|trainer.py:803] 2025-04-26 19:07:32,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:32,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:32,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1158 1073 1096 [WARNING|trainer.py:803] 2025-04-26 19:07:34,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:34,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:34,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1159 1074 [WARNING|trainer.py:803] 2025-04-26 19:07:35,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1097 [WARNING|trainer.py:803] 2025-04-26 19:07:36,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1160 [WARNING|trainer.py:803] 2025-04-26 19:07:36,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1075 [WARNING|trainer.py:803] 2025-04-26 19:07:36,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1098 1161 [WARNING|trainer.py:803] 2025-04-26 19:07:37,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:38,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:38,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1076 1162 1099 [WARNING|trainer.py:803] 2025-04-26 19:07:39,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:39,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:07:39,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1163 1077 1100 [WARNING|trainer.py:803] 2025-04-26 19:07:41,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:41,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:41,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1164 1078 [WARNING|trainer.py:803] 2025-04-26 19:07:42,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1101 [WARNING|trainer.py:803] 2025-04-26 19:07:43,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1165 [WARNING|trainer.py:803] 2025-04-26 19:07:44,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:44,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1079 1166 1102 [WARNING|trainer.py:803] 2025-04-26 19:07:45,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:46,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:46,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1167 1080 1103 [WARNING|trainer.py:803] 2025-04-26 19:07:47,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:47,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:48,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1168 1081 1104 [WARNING|trainer.py:803] 2025-04-26 19:07:49,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:49,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1169 [WARNING|trainer.py:803] 2025-04-26 19:07:49,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:50,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1082 1105 1170 [WARNING|trainer.py:803] 2025-04-26 19:07:51,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:51,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:52,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1106 1083 1171 [WARNING|trainer.py:803] 2025-04-26 19:07:53,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:07:53,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:53,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1107 1084 1172 [WARNING|trainer.py:803] 2025-04-26 19:07:54,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:55,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:55,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1173 1085 1108 [WARNING|trainer.py:803] 2025-04-26 19:07:57,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:57,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:07:57,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1086 1174 1109 [WARNING|trainer.py:803] 2025-04-26 19:07:58,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:58,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:07:59,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1087 1175 1110 [WARNING|trainer.py:803] 2025-04-26 19:08:00,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:00,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:00,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1176 1111 [WARNING|trainer.py:803] 2025-04-26 19:08:02,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1088 [WARNING|trainer.py:803] 2025-04-26 19:08:02,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1177 [WARNING|trainer.py:803] 2025-04-26 19:08:03,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1112 [WARNING|trainer.py:803] 2025-04-26 19:08:03,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1089 1178 [WARNING|trainer.py:803] 2025-04-26 19:08:04,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:04,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:05,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1113 1179 1090 [WARNING|trainer.py:803] 2025-04-26 19:08:06,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:06,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:06,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1114 1180 1091 [WARNING|trainer.py:803] 2025-04-26 19:08:07,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:08,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:08,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1115 1181 [WARNING|trainer.py:803] 2025-04-26 19:08:09,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1092 [WARNING|trainer.py:803] 2025-04-26 19:08:09,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1116 [WARNING|trainer.py:803] 2025-04-26 19:08:10,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1182 [WARNING|trainer.py:803] 2025-04-26 19:08:11,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:11,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1093 1183 1117 [WARNING|trainer.py:803] 2025-04-26 19:08:12,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:13,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:13,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1184 1094 1118 [WARNING|trainer.py:803] 2025-04-26 19:08:14,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:14,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:08:15,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1185 1095 1119 [WARNING|trainer.py:803] 2025-04-26 19:08:16,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:16,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:16,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1186 1096 [WARNING|trainer.py:803] 2025-04-26 19:08:18,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1120 [WARNING|trainer.py:803] 2025-04-26 19:08:18,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1187 [WARNING|trainer.py:803] 2025-04-26 19:08:19,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:19,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1097 1121 1188 [WARNING|trainer.py:803] 2025-04-26 19:08:20,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:20,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:21,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1098 1189 1122 [WARNING|trainer.py:803] 2025-04-26 19:08:22,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:23,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:23,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1099 [WARNING|trainer.py:803] 2025-04-26 19:08:24,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1190 1123 [WARNING|trainer.py:803] 2025-04-26 19:08:25,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1100 [WARNING|trainer.py:803] 2025-04-26 19:08:25,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1191 [WARNING|trainer.py:803] 2025-04-26 19:08:26,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1124 [WARNING|trainer.py:803] 2025-04-26 19:08:26,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:27,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1192 1101 1125 [WARNING|trainer.py:803] 2025-04-26 19:08:28,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:28,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1193 [WARNING|trainer.py:803] 2025-04-26 19:08:28,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:29,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1126 1102 1194 [WARNING|trainer.py:803] 2025-04-26 19:08:30,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:30,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:31,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1127 1103 1195 [WARNING|trainer.py:803] 2025-04-26 19:08:32,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:32,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:32,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1128 1104 1196 [WARNING|trainer.py:803] 2025-04-26 19:08:33,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:34,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:34,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1129 1197 1105 [WARNING|trainer.py:803] 2025-04-26 19:08:35,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:35,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:35,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1130 1198 1106 [WARNING|trainer.py:803] 2025-04-26 19:08:37,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:37,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:37,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1131 1199 1107 [WARNING|trainer.py:803] 2025-04-26 19:08:38,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:08:39,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:39,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1200 1132 [WARNING|trainer.py:803] 2025-04-26 19:08:40,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:40,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1108 1133 [WARNING|trainer.py:803] 2025-04-26 19:08:41,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1201 [WARNING|trainer.py:803] 2025-04-26 19:08:42,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1109 [WARNING|trainer.py:803] 2025-04-26 19:08:42,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1134 [WARNING|trainer.py:803] 2025-04-26 19:08:43,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:44,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1202 1110 1135 [WARNING|trainer.py:803] 2025-04-26 19:08:45,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:45,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:08:45,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1203 1111 1136 [WARNING|trainer.py:803] 2025-04-26 19:08:47,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:47,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:47,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1112 1204 1137 [WARNING|trainer.py:803] 2025-04-26 19:08:49,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:49,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:08:49,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1113 1138 1205 [WARNING|trainer.py:803] 2025-04-26 19:08:50,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:08:51,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:08:51,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1114 1139 [WARNING|trainer.py:803] 2025-04-26 19:08:52,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1206 [WARNING|trainer.py:803] 2025-04-26 19:08:52,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1115 [WARNING|trainer.py:803] 2025-04-26 19:08:53,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1140 [WARNING|trainer.py:803] 2025-04-26 19:08:54,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1207 [WARNING|trainer.py:803] 2025-04-26 19:08:54,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1116 [WARNING|trainer.py:803] 2025-04-26 19:08:55,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1141 [WARNING|trainer.py:803] 2025-04-26 19:08:56,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:56,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1208 1117 1142 [WARNING|trainer.py:803] 2025-04-26 19:08:57,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:08:58,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:08:58,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1209 1143 1118 [WARNING|trainer.py:803] 2025-04-26 19:08:59,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:00,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:00,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1210 1144 1119 [WARNING|trainer.py:803] 2025-04-26 19:09:01,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:01,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:01,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1145 1211 1120 [WARNING|trainer.py:803] 2025-04-26 19:09:03,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:03,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:04,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1146 1212 1121 [WARNING|trainer.py:803] 2025-04-26 19:09:05,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:05,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:05,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1147 1122 1213 [WARNING|trainer.py:803] 2025-04-26 19:09:07,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:07,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:07,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1148 [WARNING|trainer.py:803] 2025-04-26 19:09:08,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1214 1123 1149 [WARNING|trainer.py:803] 2025-04-26 19:09:09,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:10,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:10,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1124 1215 1150 [WARNING|trainer.py:803] 2025-04-26 19:09:11,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:11,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:12,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1125 1151 1216 [WARNING|trainer.py:803] 2025-04-26 19:09:13,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:14,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:14,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1126 1152 [WARNING|trainer.py:803] 2025-04-26 19:09:15,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1217 [WARNING|trainer.py:803] 2025-04-26 19:09:15,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1127 [WARNING|trainer.py:803] 2025-04-26 19:09:16,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1153 [WARNING|trainer.py:803] 2025-04-26 19:09:16,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1218 [WARNING|trainer.py:803] 2025-04-26 19:09:17,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1128 [WARNING|trainer.py:803] 2025-04-26 19:09:18,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1154 [WARNING|trainer.py:803] 2025-04-26 19:09:18,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1219 [WARNING|trainer.py:803] 2025-04-26 19:09:19,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1129 [WARNING|trainer.py:803] 2025-04-26 19:09:20,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:09:20,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1155 1130 [WARNING|trainer.py:803] 2025-04-26 19:09:21,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1220 [WARNING|trainer.py:803] 2025-04-26 19:09:22,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1156 [WARNING|trainer.py:803] 2025-04-26 19:09:22,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:22,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1131 1221 [WARNING|trainer.py:803] 2025-04-26 19:09:23,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1157 [WARNING|trainer.py:803] 2025-04-26 19:09:23,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1132 [WARNING|trainer.py:803] 2025-04-26 19:09:24,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1222 [WARNING|trainer.py:803] 2025-04-26 19:09:25,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1158 [WARNING|trainer.py:803] 2025-04-26 19:09:26,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1133 [WARNING|trainer.py:803] 2025-04-26 19:09:26,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:27,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1159 1223 1134 [WARNING|trainer.py:803] 2025-04-26 19:09:28,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:28,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:28,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1160 1135 [WARNING|trainer.py:803] 2025-04-26 19:09:29,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1224 [WARNING|trainer.py:803] 2025-04-26 19:09:30,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:30,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1161 1136 [WARNING|trainer.py:803] 2025-04-26 19:09:31,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1225 [WARNING|trainer.py:803] 2025-04-26 19:09:32,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1162 [WARNING|trainer.py:803] 2025-04-26 19:09:32,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:33,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1137 1226 1163 [WARNING|trainer.py:803] 2025-04-26 19:09:34,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:34,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:35,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1138 1164 1227 [WARNING|trainer.py:803] 2025-04-26 19:09:36,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:36,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:36,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1139 1165 [WARNING|trainer.py:803] 2025-04-26 19:09:37,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:38,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1228 1140 1166 [WARNING|trainer.py:803] 2025-04-26 19:09:39,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:39,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:40,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1229 1141 1167 [WARNING|trainer.py:803] 2025-04-26 19:09:41,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:09:41,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:41,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1142 1168 1230 [WARNING|trainer.py:803] 2025-04-26 19:09:43,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:43,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:43,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1143 1169 1231 [WARNING|trainer.py:803] 2025-04-26 19:09:44,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:45,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:45,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1144 1170 1232 [WARNING|trainer.py:803] 2025-04-26 19:09:46,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:09:47,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1145 [WARNING|trainer.py:803] 2025-04-26 19:09:47,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1171 [WARNING|trainer.py:803] 2025-04-26 19:09:48,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1233 [WARNING|trainer.py:803] 2025-04-26 19:09:48,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1146 [WARNING|trainer.py:803] 2025-04-26 19:09:49,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1172 [WARNING|trainer.py:803] 2025-04-26 19:09:50,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:50,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1234 1147 1173 [WARNING|trainer.py:803] 2025-04-26 19:09:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:09:52,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:52,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1148 1235 1174 [WARNING|trainer.py:803] 2025-04-26 19:09:53,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:54,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:09:54,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1149 1236 [WARNING|trainer.py:803] 2025-04-26 19:09:55,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1175 [WARNING|trainer.py:803] 2025-04-26 19:09:56,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:09:56,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1150 [WARNING|trainer.py:803] 2025-04-26 19:09:57,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1176 1237 [WARNING|trainer.py:803] 2025-04-26 19:09:58,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1151 [WARNING|trainer.py:803] 2025-04-26 19:09:58,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1177 [WARNING|trainer.py:803] 2025-04-26 19:09:59,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1238 [WARNING|trainer.py:803] 2025-04-26 19:09:59,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1152 [WARNING|trainer.py:803] 2025-04-26 19:10:00,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1178 [WARNING|trainer.py:803] 2025-04-26 19:10:00,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1239 [WARNING|trainer.py:803] 2025-04-26 19:10:01,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1153 [WARNING|trainer.py:803] 2025-04-26 19:10:02,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1179 [WARNING|trainer.py:803] 2025-04-26 19:10:02,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:03,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1154 1240 1180 [WARNING|trainer.py:803] 2025-04-26 19:10:04,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:04,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:04,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1155 1241 1181 [WARNING|trainer.py:803] 2025-04-26 19:10:06,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:06,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:10:06,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1156 1182 [WARNING|trainer.py:803] 2025-04-26 19:10:07,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1242 [WARNING|trainer.py:803] 2025-04-26 19:10:08,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1157 [WARNING|trainer.py:803] 2025-04-26 19:10:08,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:09,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1183 1243 [WARNING|trainer.py:803] 2025-04-26 19:10:10,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1158 [WARNING|trainer.py:803] 2025-04-26 19:10:11,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1184 [WARNING|trainer.py:803] 2025-04-26 19:10:11,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:12,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1159 1244 [WARNING|trainer.py:803] 2025-04-26 19:10:13,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1185 [WARNING|trainer.py:803] 2025-04-26 19:10:13,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1160 [WARNING|trainer.py:803] 2025-04-26 19:10:13,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1245 [WARNING|trainer.py:803] 2025-04-26 19:10:14,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1186 [WARNING|trainer.py:803] 2025-04-26 19:10:15,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1161 [WARNING|trainer.py:803] 2025-04-26 19:10:15,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:16,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1246 1187 1162 [WARNING|trainer.py:803] 2025-04-26 19:10:17,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:17,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:18,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1188 1247 1163 [WARNING|trainer.py:803] 2025-04-26 19:10:19,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:19,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:20,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1189 1164 1248 [WARNING|trainer.py:803] 2025-04-26 19:10:21,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:21,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:10:22,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1165 1190 1249 [WARNING|trainer.py:803] 2025-04-26 19:10:23,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:10:23,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:10:24,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1166 1191 1250 [WARNING|trainer.py:803] 2025-04-26 19:10:25,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:25,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:25,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1167 1192 [WARNING|trainer.py:803] 2025-04-26 19:10:27,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:27,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1251 1168 1193 [WARNING|trainer.py:803] 2025-04-26 19:10:28,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:28,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:29,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1169 1252 1194 [WARNING|trainer.py:803] 2025-04-26 19:10:30,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:10:30,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:30,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1170 1253 1195 [WARNING|trainer.py:803] 2025-04-26 19:10:32,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:32,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:32,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1171 1196 1254 [WARNING|trainer.py:803] 2025-04-26 19:10:34,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:34,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:10:34,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1172 1197 [WARNING|trainer.py:803] 2025-04-26 19:10:35,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1255 [WARNING|trainer.py:803] 2025-04-26 19:10:36,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:36,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1173 1198 [WARNING|trainer.py:803] 2025-04-26 19:10:37,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:38,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1256 1174 [WARNING|trainer.py:803] 2025-04-26 19:10:38,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1199 [WARNING|trainer.py:803] 2025-04-26 19:10:39,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:39,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1257 1200 [WARNING|trainer.py:803] 2025-04-26 19:10:40,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1175 [WARNING|trainer.py:803] 2025-04-26 19:10:41,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:41,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1258 1176 [WARNING|trainer.py:803] 2025-04-26 19:10:43,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1201 [WARNING|trainer.py:803] 2025-04-26 19:10:43,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:44,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1259 1177 [WARNING|trainer.py:803] 2025-04-26 19:10:45,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:10:45,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1202 1178 [WARNING|trainer.py:803] 2025-04-26 19:10:46,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1260 [WARNING|trainer.py:803] 2025-04-26 19:10:46,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:47,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1179 1203 1261 [WARNING|trainer.py:803] 2025-04-26 19:10:48,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:48,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:49,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1180 1204 [WARNING|trainer.py:803] 2025-04-26 19:10:50,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1262 [WARNING|trainer.py:803] 2025-04-26 19:10:51,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:10:51,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1181 [WARNING|trainer.py:803] 2025-04-26 19:10:52,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1205 1263 1182 [WARNING|trainer.py:803] 2025-04-26 19:10:53,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:53,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:54,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1264 1206 1183 [WARNING|trainer.py:803] 2025-04-26 19:10:55,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:55,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:10:55,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1184 1265 1207 [WARNING|trainer.py:803] 2025-04-26 19:10:57,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:57,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:57,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1185 1266 1208 [WARNING|trainer.py:803] 2025-04-26 19:10:59,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:10:59,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:10:59,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1186 1267 [WARNING|trainer.py:803] 2025-04-26 19:11:01,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1209 [WARNING|trainer.py:803] 2025-04-26 19:11:02,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:11:02,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1187 1268 [WARNING|trainer.py:803] 2025-04-26 19:11:03,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1210 [WARNING|trainer.py:803] 2025-04-26 19:11:04,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1188 [WARNING|trainer.py:803] 2025-04-26 19:11:04,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:11:05,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1269 1211 1189 [WARNING|trainer.py:803] 2025-04-26 19:11:06,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:11:06,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:06,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1270 1212 1190 [WARNING|trainer.py:803] 2025-04-26 19:11:08,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:09,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:09,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1271 1191 1213 [WARNING|trainer.py:803] 2025-04-26 19:11:10,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:11,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:11,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1272 1192 [WARNING|trainer.py:803] 2025-04-26 19:11:12,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:12,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1214 1193 [WARNING|trainer.py:803] 2025-04-26 19:11:13,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1273 [WARNING|trainer.py:803] 2025-04-26 19:11:14,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:14,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1215 1194 [WARNING|trainer.py:803] 2025-04-26 19:11:15,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1274 [WARNING|trainer.py:803] 2025-04-26 19:11:16,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:16,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1195 1216 [WARNING|trainer.py:803] 2025-04-26 19:11:18,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:18,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1275 1196 [WARNING|trainer.py:803] 2025-04-26 19:11:19,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1217 [WARNING|trainer.py:803] 2025-04-26 19:11:19,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1276 [WARNING|trainer.py:803] 2025-04-26 19:11:20,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1197 [WARNING|trainer.py:803] 2025-04-26 19:11:21,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:21,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1218 1277 1198 [WARNING|trainer.py:803] 2025-04-26 19:11:22,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:23,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:23,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1219 1278 1199 [WARNING|trainer.py:803] 2025-04-26 19:11:24,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:24,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:25,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1200 1220 1279 [WARNING|trainer.py:803] 2025-04-26 19:11:26,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:26,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:27,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1221 1201 1280 [WARNING|trainer.py:803] 2025-04-26 19:11:29,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:29,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:29,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1222 1281 1202 [WARNING|trainer.py:803] 2025-04-26 19:11:31,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:31,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:31,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1282 1223 1203 [WARNING|trainer.py:803] 2025-04-26 19:11:33,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:33,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:34,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1283 1204 1224 [WARNING|trainer.py:803] 2025-04-26 19:11:35,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:36,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:36,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1284 1205 1225 [WARNING|trainer.py:803] 2025-04-26 19:11:38,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:38,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:38,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1285 1206 1226 [WARNING|trainer.py:803] 2025-04-26 19:11:40,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:40,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:40,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1286 1207 1227 [WARNING|trainer.py:803] 2025-04-26 19:11:42,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:42,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:42,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1208 1287 1228 [WARNING|trainer.py:803] 2025-04-26 19:11:45,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:45,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:45,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1288 1209 1229 [WARNING|trainer.py:803] 2025-04-26 19:11:47,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:47,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:47,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1289 1210 [WARNING|trainer.py:803] 2025-04-26 19:11:49,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1230 [WARNING|trainer.py:803] 2025-04-26 19:11:49,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1290 [WARNING|trainer.py:803] 2025-04-26 19:11:50,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1211 [WARNING|trainer.py:803] 2025-04-26 19:11:51,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1231 [WARNING|trainer.py:803] 2025-04-26 19:11:51,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1291 [WARNING|trainer.py:803] 2025-04-26 19:11:52,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:53,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1212 1232 [WARNING|trainer.py:803] 2025-04-26 19:11:54,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1292 [WARNING|trainer.py:803] 2025-04-26 19:11:54,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:11:55,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1213 1233 1293 [WARNING|trainer.py:803] 2025-04-26 19:11:56,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:11:57,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:11:57,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1214 1294 1234 [WARNING|trainer.py:803] 2025-04-26 19:11:58,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:11:59,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:11:59,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1215 1295 1235 [WARNING|trainer.py:803] 2025-04-26 19:12:01,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:01,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:01,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1296 1216 1236 [WARNING|trainer.py:803] 2025-04-26 19:12:03,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:03,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:03,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1217 1297 1237 [WARNING|trainer.py:803] 2025-04-26 19:12:05,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:05,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:06,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1218 1298 [WARNING|trainer.py:803] 2025-04-26 19:12:07,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1238 [WARNING|trainer.py:803] 2025-04-26 19:12:07,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:08,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1299 1219 [WARNING|trainer.py:803] 2025-04-26 19:12:09,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:09,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1239 [WARNING|trainer.py:803] 2025-04-26 19:12:10,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1220 1300 [WARNING|trainer.py:803] 2025-04-26 19:12:11,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:12:12,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1240 [WARNING|trainer.py:803] 2025-04-26 19:12:13,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1221 1301 [WARNING|trainer.py:803] 2025-04-26 19:12:14,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1241 [WARNING|trainer.py:803] 2025-04-26 19:12:14,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1222 [WARNING|trainer.py:803] 2025-04-26 19:12:15,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1302 [WARNING|trainer.py:803] 2025-04-26 19:12:16,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:12:16,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1242 1303 [WARNING|trainer.py:803] 2025-04-26 19:12:18,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1223 [WARNING|trainer.py:803] 2025-04-26 19:12:18,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:12:18,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1243 1304 [WARNING|trainer.py:803] 2025-04-26 19:12:20,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1224 [WARNING|trainer.py:803] 2025-04-26 19:12:20,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:21,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1244 1305 [WARNING|trainer.py:803] 2025-04-26 19:12:22,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1225 [WARNING|trainer.py:803] 2025-04-26 19:12:22,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:23,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1306 1245 [WARNING|trainer.py:803] 2025-04-26 19:12:25,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1226 [WARNING|trainer.py:803] 2025-04-26 19:12:25,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:12:25,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1307 1246 [WARNING|trainer.py:803] 2025-04-26 19:12:27,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1227 [WARNING|trainer.py:803] 2025-04-26 19:12:27,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:27,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1308 [WARNING|trainer.py:803] 2025-04-26 19:12:29,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1247 1228 [WARNING|trainer.py:803] 2025-04-26 19:12:29,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1309 [WARNING|trainer.py:803] 2025-04-26 19:12:30,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:31,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1248 1229 1310 [WARNING|trainer.py:803] 2025-04-26 19:12:32,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:32,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:33,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1249 1311 [WARNING|trainer.py:803] 2025-04-26 19:12:34,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1230 [WARNING|trainer.py:803] 2025-04-26 19:12:35,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:35,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1250 1312 [WARNING|trainer.py:803] 2025-04-26 19:12:36,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1231 [WARNING|trainer.py:803] 2025-04-26 19:12:37,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:37,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1251 1313 1232 [WARNING|trainer.py:803] 2025-04-26 19:12:39,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:39,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:39,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1314 1252 1233 [WARNING|trainer.py:803] 2025-04-26 19:12:41,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:41,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:12:42,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1315 1253 1234 [WARNING|trainer.py:803] 2025-04-26 19:12:43,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:43,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:44,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1316 1254 1235 [WARNING|trainer.py:803] 2025-04-26 19:12:45,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:46,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:46,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1317 1255 [WARNING|trainer.py:803] 2025-04-26 19:12:48,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1236 [WARNING|trainer.py:803] 2025-04-26 19:12:48,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:48,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1318 1256 [WARNING|trainer.py:803] 2025-04-26 19:12:50,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1237 [WARNING|trainer.py:803] 2025-04-26 19:12:50,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:51,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1319 1257 [WARNING|trainer.py:803] 2025-04-26 19:12:52,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1238 [WARNING|trainer.py:803] 2025-04-26 19:12:52,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:53,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1320 1258 [WARNING|trainer.py:803] 2025-04-26 19:12:54,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1239 [WARNING|trainer.py:803] 2025-04-26 19:12:55,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:12:55,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1321 1259 1240 [WARNING|trainer.py:803] 2025-04-26 19:12:57,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:12:57,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:12:58,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1322 1260 [WARNING|trainer.py:803] 2025-04-26 19:12:59,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1241 [WARNING|trainer.py:803] 2025-04-26 19:13:00,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:00,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1323 1261 [WARNING|trainer.py:803] 2025-04-26 19:13:01,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:13:02,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1242 1324 [WARNING|trainer.py:803] 2025-04-26 19:13:03,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1262 [WARNING|trainer.py:803] 2025-04-26 19:13:03,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:04,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1243 1325 [WARNING|trainer.py:803] 2025-04-26 19:13:05,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1263 [WARNING|trainer.py:803] 2025-04-26 19:13:06,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:06,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1244 1326 [WARNING|trainer.py:803] 2025-04-26 19:13:07,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1264 [WARNING|trainer.py:803] 2025-04-26 19:13:08,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:08,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1245 1327 [WARNING|trainer.py:803] 2025-04-26 19:13:10,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:13:10,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1265 [WARNING|trainer.py:803] 2025-04-26 19:13:11,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1328 1246 [WARNING|trainer.py:803] 2025-04-26 19:13:12,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:12,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1266 1329 [WARNING|trainer.py:803] 2025-04-26 19:13:13,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1247 [WARNING|trainer.py:803] 2025-04-26 19:13:14,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1267 [WARNING|trainer.py:803] 2025-04-26 19:13:15,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1330 [WARNING|trainer.py:803] 2025-04-26 19:13:15,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:16,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1248 [WARNING|trainer.py:803] 2025-04-26 19:13:17,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1268 1331 [WARNING|trainer.py:803] 2025-04-26 19:13:18,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:18,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1249 [WARNING|trainer.py:803] 2025-04-26 19:13:19,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1332 1269 [WARNING|trainer.py:803] 2025-04-26 19:13:20,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:13:20,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1250 [WARNING|trainer.py:803] 2025-04-26 19:13:21,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1333 1270 [WARNING|trainer.py:803] 2025-04-26 19:13:22,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:23,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1251 1334 1271 [WARNING|trainer.py:803] 2025-04-26 19:13:24,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:24,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:25,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1252 1335 [WARNING|trainer.py:803] 2025-04-26 19:13:26,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1272 [WARNING|trainer.py:803] 2025-04-26 19:13:26,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:13:27,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1253 1336 [WARNING|trainer.py:803] 2025-04-26 19:13:28,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1273 [WARNING|trainer.py:803] 2025-04-26 19:13:29,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:29,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1337 1254 1274 [WARNING|trainer.py:803] 2025-04-26 19:13:31,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:31,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:31,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1338 1255 [WARNING|trainer.py:803] 2025-04-26 19:13:33,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:33,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1275 [WARNING|trainer.py:803] 2025-04-26 19:13:34,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1339 1256 [WARNING|trainer.py:803] 2025-04-26 19:13:35,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1276 [WARNING|trainer.py:803] 2025-04-26 19:13:35,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:36,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1340 1257 [WARNING|trainer.py:803] 2025-04-26 19:13:37,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1277 [WARNING|trainer.py:803] 2025-04-26 19:13:37,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:38,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1341 1258 [WARNING|trainer.py:803] 2025-04-26 19:13:39,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1278 [WARNING|trainer.py:803] 2025-04-26 19:13:40,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:40,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1342 [WARNING|trainer.py:803] 2025-04-26 19:13:41,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1259 1279 [WARNING|trainer.py:803] 2025-04-26 19:13:42,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1343 [WARNING|trainer.py:803] 2025-04-26 19:13:43,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:43,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1260 1280 1344 [WARNING|trainer.py:803] 2025-04-26 19:13:45,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:45,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:45,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1261 1281 [WARNING|trainer.py:803] 2025-04-26 19:13:47,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1345 [WARNING|trainer.py:803] 2025-04-26 19:13:47,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:48,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1262 1282 1346 [WARNING|trainer.py:803] 2025-04-26 19:13:49,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:13:50,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:13:50,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1263 1347 [WARNING|trainer.py:803] 2025-04-26 19:13:51,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1283 [WARNING|trainer.py:803] 2025-04-26 19:13:52,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:52,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1264 1348 [WARNING|trainer.py:803] 2025-04-26 19:13:53,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1284 [WARNING|trainer.py:803] 2025-04-26 19:13:54,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:13:55,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1265 1349 [WARNING|trainer.py:803] 2025-04-26 19:13:56,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:13:56,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1285 [WARNING|trainer.py:803] 2025-04-26 19:13:57,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1266 [WARNING|trainer.py:803] 2025-04-26 19:13:58,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1350 1286 [WARNING|trainer.py:803] 2025-04-26 19:14:00,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:00,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1267 [WARNING|trainer.py:803] 2025-04-26 19:14:01,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1287 1351 1268 [WARNING|trainer.py:803] 2025-04-26 19:14:02,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:03,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:03,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1288 [WARNING|trainer.py:803] 2025-04-26 19:14:04,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1269 1352 [WARNING|trainer.py:803] 2025-04-26 19:14:05,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1289 [WARNING|trainer.py:803] 2025-04-26 19:14:06,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:07,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1270 1353 1290 [WARNING|trainer.py:803] 2025-04-26 19:14:08,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:08,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:09,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1271 1354 1291 [WARNING|trainer.py:803] 2025-04-26 19:14:10,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:14:11,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:11,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1272 1355 1292 [WARNING|trainer.py:803] 2025-04-26 19:14:13,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:13,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:13,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1273 1293 [WARNING|trainer.py:803] 2025-04-26 19:14:15,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1356 [WARNING|trainer.py:803] 2025-04-26 19:14:15,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:16,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1274 1294 [WARNING|trainer.py:803] 2025-04-26 19:14:17,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:18,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1357 [WARNING|trainer.py:803] 2025-04-26 19:14:19,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1275 1295 [WARNING|trainer.py:803] 2025-04-26 19:14:19,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:20,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1276 1358 1296 [WARNING|trainer.py:803] 2025-04-26 19:14:22,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:22,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:22,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1277 1359 1297 [WARNING|trainer.py:803] 2025-04-26 19:14:24,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:24,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:14:24,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1278 [WARNING|trainer.py:803] 2025-04-26 19:14:26,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1298 1360 [WARNING|trainer.py:803] 2025-04-26 19:14:27,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:27,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1279 [WARNING|trainer.py:803] 2025-04-26 19:14:28,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1299 1361 [WARNING|trainer.py:803] 2025-04-26 19:14:29,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:30,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1280 [WARNING|trainer.py:803] 2025-04-26 19:14:31,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1300 1362 [WARNING|trainer.py:803] 2025-04-26 19:14:32,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:32,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1281 [WARNING|trainer.py:803] 2025-04-26 19:14:33,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1301 1363 1282 [WARNING|trainer.py:803] 2025-04-26 19:14:35,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:35,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:35,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1302 [WARNING|trainer.py:803] 2025-04-26 19:14:37,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1283 1364 [WARNING|trainer.py:803] 2025-04-26 19:14:38,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1303 [WARNING|trainer.py:803] 2025-04-26 19:14:38,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:39,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1284 1365 [WARNING|trainer.py:803] 2025-04-26 19:14:40,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1304 [WARNING|trainer.py:803] 2025-04-26 19:14:41,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:41,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1285 1305 1366 [WARNING|trainer.py:803] 2025-04-26 19:14:43,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:43,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:44,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1286 1306 [WARNING|trainer.py:803] 2025-04-26 19:14:45,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1367 [WARNING|trainer.py:803] 2025-04-26 19:14:46,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:14:46,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1287 1307 1368 [WARNING|trainer.py:803] 2025-04-26 19:14:48,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:48,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:14:48,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1288 1308 [WARNING|trainer.py:803] 2025-04-26 19:14:50,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:14:50,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1369 1289 1309 [WARNING|trainer.py:803] 2025-04-26 19:14:52,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:14:52,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:52,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1370 1290 1310 [WARNING|trainer.py:803] 2025-04-26 19:14:54,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:14:54,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:14:55,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1291 1371 1311 [WARNING|trainer.py:803] 2025-04-26 19:14:56,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:14:57,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:57,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1372 1292 1312 [WARNING|trainer.py:803] 2025-04-26 19:14:59,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:59,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:14:59,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1293 1373 1313 [WARNING|trainer.py:803] 2025-04-26 19:15:01,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:15:01,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:02,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1294 1314 1374 [WARNING|trainer.py:803] 2025-04-26 19:15:03,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:04,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:04,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1295 1315 1375 [WARNING|trainer.py:803] 2025-04-26 19:15:06,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:06,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:06,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1296 1316 [WARNING|trainer.py:803] 2025-04-26 19:15:08,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1376 [WARNING|trainer.py:803] 2025-04-26 19:15:08,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:09,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1297 1317 [WARNING|trainer.py:803] 2025-04-26 19:15:10,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1377 [WARNING|trainer.py:803] 2025-04-26 19:15:11,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 19:15:11,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1298 1318 [WARNING|trainer.py:803] 2025-04-26 19:15:13,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:13,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1378 1299 [WARNING|trainer.py:803] 2025-04-26 19:15:15,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1319 [WARNING|trainer.py:803] 2025-04-26 19:15:15,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:16,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1300 1379 1320 [WARNING|trainer.py:803] 2025-04-26 19:15:17,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:17,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:18,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1301 1380 1321 [WARNING|trainer.py:803] 2025-04-26 19:15:20,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:20,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:21,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1302 1322 1381 [WARNING|trainer.py:803] 2025-04-26 19:15:22,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:15:23,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:23,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1303 1323 [WARNING|trainer.py:803] 2025-04-26 19:15:25,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1382 [WARNING|trainer.py:803] 2025-04-26 19:15:25,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:15:26,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1304 1324 [WARNING|trainer.py:803] 2025-04-26 19:15:27,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1383 [WARNING|trainer.py:803] 2025-04-26 19:15:28,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1305 [WARNING|trainer.py:803] 2025-04-26 19:15:28,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:15:29,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1325 1384 [WARNING|trainer.py:803] 2025-04-26 19:15:30,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:30,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1306 [WARNING|trainer.py:803] 2025-04-26 19:15:31,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1326 [WARNING|trainer.py:803] 2025-04-26 19:15:32,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1307 1385 [WARNING|trainer.py:803] 2025-04-26 19:15:34,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:15:34,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1327 [WARNING|trainer.py:803] 2025-04-26 19:15:35,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1308 1386 [WARNING|trainer.py:803] 2025-04-26 19:15:36,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1328 [WARNING|trainer.py:803] 2025-04-26 19:15:36,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:37,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1309 [WARNING|trainer.py:803] 2025-04-26 19:15:38,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1329 1387 [WARNING|trainer.py:803] 2025-04-26 19:15:39,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:39,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1310 [WARNING|trainer.py:803] 2025-04-26 19:15:40,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1330 [WARNING|trainer.py:803] 2025-04-26 19:15:41,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1311 1388 [WARNING|trainer.py:803] 2025-04-26 19:15:43,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1331 [WARNING|trainer.py:803] 2025-04-26 19:15:43,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:43,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1312 1389 [WARNING|trainer.py:803] 2025-04-26 19:15:45,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1332 [WARNING|trainer.py:803] 2025-04-26 19:15:45,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:46,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1313 1390 1333 [WARNING|trainer.py:803] 2025-04-26 19:15:47,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:48,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:48,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1314 [WARNING|trainer.py:803] 2025-04-26 19:15:49,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1334 1391 [WARNING|trainer.py:803] 2025-04-26 19:15:50,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:15:51,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1315 [WARNING|trainer.py:803] 2025-04-26 19:15:52,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1335 [WARNING|trainer.py:803] 2025-04-26 19:15:53,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1392 1316 [WARNING|trainer.py:803] 2025-04-26 19:15:54,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:15:54,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1336 [WARNING|trainer.py:803] 2025-04-26 19:15:55,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1317 1393 [WARNING|trainer.py:803] 2025-04-26 19:15:56,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1337 [WARNING|trainer.py:803] 2025-04-26 19:15:57,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:15:57,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1318 1394 1338 [WARNING|trainer.py:803] 2025-04-26 19:15:59,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:00,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:00,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1319 1339 [WARNING|trainer.py:803] 2025-04-26 19:16:01,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1395 [WARNING|trainer.py:803] 2025-04-26 19:16:02,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:03,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1320 1340 [WARNING|trainer.py:803] 2025-04-26 19:16:04,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:05,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1396 1321 1341 [WARNING|trainer.py:803] 2025-04-26 19:16:06,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:16:06,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:16:07,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1322 1342 1397 [WARNING|trainer.py:803] 2025-04-26 19:16:09,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:09,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:10,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1323 1343 [WARNING|trainer.py:803] 2025-04-26 19:16:11,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:11,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1398 1324 1344 [WARNING|trainer.py:803] 2025-04-26 19:16:13,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:16:14,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:14,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1399 1345 [WARNING|trainer.py:803] 2025-04-26 19:16:15,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1325 [WARNING|trainer.py:803] 2025-04-26 19:16:16,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:16,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1400 1346 [WARNING|trainer.py:803] 2025-04-26 19:16:18,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1326 [WARNING|trainer.py:803] 2025-04-26 19:16:18,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:18,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1401 [WARNING|trainer.py:803] 2025-04-26 19:16:19,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1347 1327 1402 [WARNING|trainer.py:803] 2025-04-26 19:16:21,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:16:21,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:21,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1328 1348 1403 [WARNING|trainer.py:803] 2025-04-26 19:16:23,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:23,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:23,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1329 1349 1404 [WARNING|trainer.py:803] 2025-04-26 19:16:25,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:25,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:16:25,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1330 1405 [WARNING|trainer.py:803] 2025-04-26 19:16:27,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:27,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1350 1406 1331 [WARNING|trainer.py:803] 2025-04-26 19:16:29,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:29,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:29,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1407 1332 1351 [WARNING|trainer.py:803] 2025-04-26 19:16:31,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:32,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:16:32,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1408 1333 [WARNING|trainer.py:803] 2025-04-26 19:16:34,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:34,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1352 1409 [WARNING|trainer.py:803] 2025-04-26 19:16:35,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:36,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1334 [WARNING|trainer.py:803] 2025-04-26 19:16:37,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1410 [WARNING|trainer.py:803] 2025-04-26 19:16:37,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1353 1335 [WARNING|trainer.py:803] 2025-04-26 19:16:38,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1411 [WARNING|trainer.py:803] 2025-04-26 19:16:39,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:40,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1354 1336 [WARNING|trainer.py:803] 2025-04-26 19:16:41,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1412 [WARNING|trainer.py:803] 2025-04-26 19:16:41,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:42,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1413 1355 1337 [WARNING|trainer.py:803] 2025-04-26 19:16:43,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:44,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:16:44,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1414 1338 1356 [WARNING|trainer.py:803] 2025-04-26 19:16:46,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:16:46,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:46,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1415 [WARNING|trainer.py:803] 2025-04-26 19:16:47,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1339 1357 1416 [WARNING|trainer.py:803] 2025-04-26 19:16:49,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:16:49,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:16:49,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1340 1417 [WARNING|trainer.py:803] 2025-04-26 19:16:51,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:51,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1358 1341 1418 [WARNING|trainer.py:803] 2025-04-26 19:16:52,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:16:53,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:53,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1359 1419 1342 [WARNING|trainer.py:803] 2025-04-26 19:16:55,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:16:55,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:56,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1420 1343 [WARNING|trainer.py:803] 2025-04-26 19:16:57,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1360 [WARNING|trainer.py:803] 2025-04-26 19:16:58,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1421 [WARNING|trainer.py:803] 2025-04-26 19:16:58,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:16:59,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1344 [WARNING|trainer.py:803] 2025-04-26 19:17:00,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1422 1361 [WARNING|trainer.py:803] 2025-04-26 19:17:01,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:01,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1345 1423 [WARNING|trainer.py:803] 2025-04-26 19:17:02,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1362 [WARNING|trainer.py:803] 2025-04-26 19:17:03,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:04,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1346 1424 [WARNING|trainer.py:803] 2025-04-26 19:17:05,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:05,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1363 1425 1347 [WARNING|trainer.py:803] 2025-04-26 19:17:07,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:17:07,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:07,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1426 1348 [WARNING|trainer.py:803] 2025-04-26 19:17:09,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:10,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1364 1427 [WARNING|trainer.py:803] 2025-04-26 19:17:11,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:11,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1349 1428 [WARNING|trainer.py:803] 2025-04-26 19:17:12,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1365 [WARNING|trainer.py:803] 2025-04-26 19:17:13,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:13,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1429 1350 [WARNING|trainer.py:803] 2025-04-26 19:17:15,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1366 [WARNING|trainer.py:803] 2025-04-26 19:17:16,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:17:16,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1430 [WARNING|trainer.py:803] 2025-04-26 19:17:17,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1367 1431 1351 [WARNING|trainer.py:803] 2025-04-26 19:17:19,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:17:19,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:19,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1432 1368 [WARNING|trainer.py:803] 2025-04-26 19:17:21,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:21,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1352 1433 [WARNING|trainer.py:803] 2025-04-26 19:17:22,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:23,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1369 1434 1353 [WARNING|trainer.py:803] 2025-04-26 19:17:25,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:25,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:25,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1435 1370 1354 [WARNING|trainer.py:803] 2025-04-26 19:17:27,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:27,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:28,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1436 [WARNING|trainer.py:803] 2025-04-26 19:17:29,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1371 1355 [WARNING|trainer.py:803] 2025-04-26 19:17:30,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1437 [WARNING|trainer.py:803] 2025-04-26 19:17:31,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:31,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1372 1438 [WARNING|trainer.py:803] 2025-04-26 19:17:32,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1356 [WARNING|trainer.py:803] 2025-04-26 19:17:33,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:17:33,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1439 1373 [WARNING|trainer.py:803] 2025-04-26 19:17:35,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:35,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1357 [WARNING|trainer.py:803] 2025-04-26 19:17:36,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1440 1374 [WARNING|trainer.py:803] 2025-04-26 19:17:37,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:38,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1441 1358 [WARNING|trainer.py:803] 2025-04-26 19:17:39,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:39,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1375 [WARNING|trainer.py:803] 2025-04-26 19:17:40,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1442 1359 [WARNING|trainer.py:803] 2025-04-26 19:17:41,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:42,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1443 1376 [WARNING|trainer.py:803] 2025-04-26 19:17:43,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:43,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1360 1444 1377 [WARNING|trainer.py:803] 2025-04-26 19:17:45,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:17:45,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:17:46,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1445 1361 [WARNING|trainer.py:803] 2025-04-26 19:17:47,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:48,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1446 1378 [WARNING|trainer.py:803] 2025-04-26 19:17:49,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:50,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1362 1447 [WARNING|trainer.py:803] 2025-04-26 19:17:51,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:17:51,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1379 [WARNING|trainer.py:803] 2025-04-26 19:17:52,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1448 1363 [WARNING|trainer.py:803] 2025-04-26 19:17:53,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:54,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1449 1380 [WARNING|trainer.py:803] 2025-04-26 19:17:55,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:17:56,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1364 1450 [WARNING|trainer.py:803] 2025-04-26 19:17:58,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:17:58,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1381 [WARNING|trainer.py:803] 2025-04-26 19:17:59,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1451 1365 [WARNING|trainer.py:803] 2025-04-26 19:18:00,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:00,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1382 1452 [WARNING|trainer.py:803] 2025-04-26 19:18:01,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:18:02,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1366 1453 [WARNING|trainer.py:803] 2025-04-26 19:18:03,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1383 [WARNING|trainer.py:803] 2025-04-26 19:18:04,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:04,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1454 1367 1384 [WARNING|trainer.py:803] 2025-04-26 19:18:06,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:06,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:18:06,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1455 1368 [WARNING|trainer.py:803] 2025-04-26 19:18:08,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:18:08,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1456 1385 [WARNING|trainer.py:803] 2025-04-26 19:18:10,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:18:10,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1457 1369 [WARNING|trainer.py:803] 2025-04-26 19:18:12,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1386 [WARNING|trainer.py:803] 2025-04-26 19:18:12,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:18:13,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1458 1370 [WARNING|trainer.py:803] 2025-04-26 19:18:14,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:18:14,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1387 1459 [WARNING|trainer.py:803] 2025-04-26 19:18:16,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1371 [WARNING|trainer.py:803] 2025-04-26 19:18:16,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:18:17,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1460 [WARNING|trainer.py:803] 2025-04-26 19:18:18,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1372 1388 1461 [WARNING|trainer.py:803] 2025-04-26 19:18:19,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:19,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:20,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1462 1373 1389 [WARNING|trainer.py:803] 2025-04-26 19:18:22,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:18:22,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:22,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1463 1374 [WARNING|trainer.py:803] 2025-04-26 19:18:24,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1390 [WARNING|trainer.py:803] 2025-04-26 19:18:25,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1464 [WARNING|trainer.py:803] 2025-04-26 19:18:25,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:18:26,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1375 1465 [WARNING|trainer.py:803] 2025-04-26 19:18:27,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1391 [WARNING|trainer.py:803] 2025-04-26 19:18:28,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:18:28,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1466 1376 [WARNING|trainer.py:803] 2025-04-26 19:18:30,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:30,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1392 1467 [WARNING|trainer.py:803] 2025-04-26 19:18:31,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:32,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1377 1468 [WARNING|trainer.py:803] 2025-04-26 19:18:33,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:18:34,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1393 1469 [WARNING|trainer.py:803] 2025-04-26 19:18:35,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:18:36,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1378 [WARNING|trainer.py:803] 2025-04-26 19:18:37,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1470 1394 [WARNING|trainer.py:803] 2025-04-26 19:18:38,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:18:38,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1379 1471 [WARNING|trainer.py:803] 2025-04-26 19:18:39,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:40,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1395 1472 [WARNING|trainer.py:803] 2025-04-26 19:18:41,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:18:42,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1380 [WARNING|trainer.py:803] 2025-04-26 19:18:43,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1473 [WARNING|trainer.py:803] 2025-04-26 19:18:44,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1396 1381 [WARNING|trainer.py:803] 2025-04-26 19:18:45,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1474 [WARNING|trainer.py:803] 2025-04-26 19:18:46,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:18:46,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1475 1397 1382 [WARNING|trainer.py:803] 2025-04-26 19:18:48,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:49,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 19:18:49,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo YesYes 1476 [WARNING|trainer.py:803] 2025-04-26 19:18:50,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1383 [WARNING|trainer.py:803] 2025-04-26 19:18:51,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1398 1477 [WARNING|trainer.py:803] 2025-04-26 19:18:52,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:18:52,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1384 1478 1399 [WARNING|trainer.py:803] 2025-04-26 19:18:53,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:54,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:18:54,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1479 1400 [WARNING|trainer.py:803] 2025-04-26 19:18:56,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1385 [WARNING|trainer.py:803] 2025-04-26 19:18:57,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:18:57,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1480 1401 [WARNING|trainer.py:803] 2025-04-26 19:18:58,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1386 [WARNING|trainer.py:803] 2025-04-26 19:18:59,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1481 [WARNING|trainer.py:803] 2025-04-26 19:19:00,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1402 [WARNING|trainer.py:803] 2025-04-26 19:19:00,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:01,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1482 1387 1403 [WARNING|trainer.py:803] 2025-04-26 19:19:03,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:03,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:03,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1483 1404 [WARNING|trainer.py:803] 2025-04-26 19:19:05,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:06,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1388 1484 [WARNING|trainer.py:803] 2025-04-26 19:19:07,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:19:07,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1405 [WARNING|trainer.py:803] 2025-04-26 19:19:08,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1485 1389 [WARNING|trainer.py:803] 2025-04-26 19:19:09,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1406 [WARNING|trainer.py:803] 2025-04-26 19:19:09,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:19:10,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1486 [WARNING|trainer.py:803] 2025-04-26 19:19:11,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1407 1390 [WARNING|trainer.py:803] 2025-04-26 19:19:12,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:12,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1487 1408 [WARNING|trainer.py:803] 2025-04-26 19:19:14,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1391 [WARNING|trainer.py:803] 2025-04-26 19:19:14,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1488 [WARNING|trainer.py:803] 2025-04-26 19:19:15,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:19:15,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1409 1489 [WARNING|trainer.py:803] 2025-04-26 19:19:17,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:19:17,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1392 1410 [WARNING|trainer.py:803] 2025-04-26 19:19:18,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:19:19,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1490 [WARNING|trainer.py:803] 2025-04-26 19:19:20,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1411 1393 [WARNING|trainer.py:803] 2025-04-26 19:19:21,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1491 [WARNING|trainer.py:803] 2025-04-26 19:19:22,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:22,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1412 [WARNING|trainer.py:803] 2025-04-26 19:19:23,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1492 1394 [WARNING|trainer.py:803] 2025-04-26 19:19:24,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1413 [WARNING|trainer.py:803] 2025-04-26 19:19:25,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1493 [WARNING|trainer.py:803] 2025-04-26 19:19:25,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:26,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1414 1494 1395 [WARNING|trainer.py:803] 2025-04-26 19:19:28,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:28,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:28,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1415 1495 [WARNING|trainer.py:803] 2025-04-26 19:19:30,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:19:30,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1396 1496 1416 [WARNING|trainer.py:803] 2025-04-26 19:19:32,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:19:32,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:19:32,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1497 1417 [WARNING|trainer.py:803] 2025-04-26 19:19:34,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:34,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1397 1498 [WARNING|trainer.py:803] 2025-04-26 19:19:36,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1418 [WARNING|trainer.py:803] 2025-04-26 19:19:36,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:36,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1499 1419 1398 [WARNING|trainer.py:803] 2025-04-26 19:19:38,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:39,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:19:39,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1500 1420 [WARNING|trainer.py:803] 2025-04-26 19:19:40,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1399 [WARNING|trainer.py:803] 2025-04-26 19:19:41,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:19:41,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1421 1501 [WARNING|trainer.py:803] 2025-04-26 19:19:43,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1400 [WARNING|trainer.py:803] 2025-04-26 19:19:43,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:19:44,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1422 1401 [WARNING|trainer.py:803] 2025-04-26 19:19:45,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:46,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1502 1423 [WARNING|trainer.py:803] 2025-04-26 19:19:47,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1402 [WARNING|trainer.py:803] 2025-04-26 19:19:47,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:48,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1424 1503 1403 [WARNING|trainer.py:803] 2025-04-26 19:19:49,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:50,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:50,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1425 1404 [WARNING|trainer.py:803] 2025-04-26 19:19:52,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1504 [WARNING|trainer.py:803] 2025-04-26 19:19:52,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:19:53,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1426 [WARNING|trainer.py:803] 2025-04-26 19:19:54,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1405 [WARNING|trainer.py:803] 2025-04-26 19:19:55,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1427 1505 [WARNING|trainer.py:803] 2025-04-26 19:19:56,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1406 [WARNING|trainer.py:803] 2025-04-26 19:19:56,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:19:57,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1428 [WARNING|trainer.py:803] 2025-04-26 19:19:58,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1407 1506 [WARNING|trainer.py:803] 2025-04-26 19:19:59,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1429 [WARNING|trainer.py:803] 2025-04-26 19:19:59,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:00,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1408 [WARNING|trainer.py:803] 2025-04-26 19:20:01,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1507 1430 [WARNING|trainer.py:803] 2025-04-26 19:20:02,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:20:02,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1409 1431 [WARNING|trainer.py:803] 2025-04-26 19:20:04,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:05,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1508 1410 [WARNING|trainer.py:803] 2025-04-26 19:20:06,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:06,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1432 [WARNING|trainer.py:803] 2025-04-26 19:20:07,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1411 1509 [WARNING|trainer.py:803] 2025-04-26 19:20:08,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1433 [WARNING|trainer.py:803] 2025-04-26 19:20:09,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:09,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1412 [WARNING|trainer.py:803] 2025-04-26 19:20:10,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1434 1510 [WARNING|trainer.py:803] 2025-04-26 19:20:11,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1413 [WARNING|trainer.py:803] 2025-04-26 19:20:12,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:12,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1435 [WARNING|trainer.py:803] 2025-04-26 19:20:13,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1414 1511 1436 [WARNING|trainer.py:803] 2025-04-26 19:20:15,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:15,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:16,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1415 1437 [WARNING|trainer.py:803] 2025-04-26 19:20:17,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1512 [WARNING|trainer.py:803] 2025-04-26 19:20:18,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:18,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1416 1438 [WARNING|trainer.py:803] 2025-04-26 19:20:19,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:20:20,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1513 1417 1439 [WARNING|trainer.py:803] 2025-04-26 19:20:21,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:20:21,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:22,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1418 1440 [WARNING|trainer.py:803] 2025-04-26 19:20:24,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1514 [WARNING|trainer.py:803] 2025-04-26 19:20:24,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:25,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1419 1441 [WARNING|trainer.py:803] 2025-04-26 19:20:26,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:27,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1515 1420 [WARNING|trainer.py:803] 2025-04-26 19:20:28,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:28,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1442 [WARNING|trainer.py:803] 2025-04-26 19:20:29,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1421 [WARNING|trainer.py:803] 2025-04-26 19:20:30,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1443 1516 [WARNING|trainer.py:803] 2025-04-26 19:20:31,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:31,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1422 [WARNING|trainer.py:803] 2025-04-26 19:20:32,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1444 [WARNING|trainer.py:803] 2025-04-26 19:20:33,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1423 1517 [WARNING|trainer.py:803] 2025-04-26 19:20:35,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:35,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1445 [WARNING|trainer.py:803] 2025-04-26 19:20:35,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1424 [WARNING|trainer.py:803] 2025-04-26 19:20:37,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1446 1518 [WARNING|trainer.py:803] 2025-04-26 19:20:38,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:38,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1425 [WARNING|trainer.py:803] 2025-04-26 19:20:39,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1447 [WARNING|trainer.py:803] 2025-04-26 19:20:40,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1519 1426 [WARNING|trainer.py:803] 2025-04-26 19:20:41,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:41,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1448 [WARNING|trainer.py:803] 2025-04-26 19:20:42,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1427 [WARNING|trainer.py:803] 2025-04-26 19:20:43,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1520 1449 [WARNING|trainer.py:803] 2025-04-26 19:20:44,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:44,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1428 [WARNING|trainer.py:803] 2025-04-26 19:20:45,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1450 1521 [WARNING|trainer.py:803] 2025-04-26 19:20:47,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1429 [WARNING|trainer.py:803] 2025-04-26 19:20:47,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:48,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1451 [WARNING|trainer.py:803] 2025-04-26 19:20:49,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1430 [WARNING|trainer.py:803] 2025-04-26 19:20:50,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1522 1452 [WARNING|trainer.py:803] 2025-04-26 19:20:51,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:20:51,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1431 [WARNING|trainer.py:803] 2025-04-26 19:20:52,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1453 [WARNING|trainer.py:803] 2025-04-26 19:20:53,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1523 1432 [WARNING|trainer.py:803] 2025-04-26 19:20:54,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:20:54,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1454 1433 [WARNING|trainer.py:803] 2025-04-26 19:20:55,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:20:56,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1524 1455 [WARNING|trainer.py:803] 2025-04-26 19:20:57,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1434 [WARNING|trainer.py:803] 2025-04-26 19:20:58,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:20:58,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1456 1435 [WARNING|trainer.py:803] 2025-04-26 19:21:00,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1525 [WARNING|trainer.py:803] 2025-04-26 19:21:01,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:21:01,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1457 [WARNING|trainer.py:803] 2025-04-26 19:21:02,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1436 [WARNING|trainer.py:803] 2025-04-26 19:21:03,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1526 1458 [WARNING|trainer.py:803] 2025-04-26 19:21:04,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1437 [WARNING|trainer.py:803] 2025-04-26 19:21:04,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:21:05,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1459 1438 1527 [WARNING|trainer.py:803] 2025-04-26 19:21:07,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:21:07,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:08,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1460 1439 [WARNING|trainer.py:803] 2025-04-26 19:21:09,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:21:09,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1461 1528 1440 [WARNING|trainer.py:803] 2025-04-26 19:21:11,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:11,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:21:12,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1462 1441 [WARNING|trainer.py:803] 2025-04-26 19:21:13,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1529 [WARNING|trainer.py:803] 2025-04-26 19:21:14,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:14,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1463 [WARNING|trainer.py:803] 2025-04-26 19:21:16,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1442 [WARNING|trainer.py:803] 2025-04-26 19:21:16,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1530 1464 [WARNING|trainer.py:803] 2025-04-26 19:21:17,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1443 [WARNING|trainer.py:803] 2025-04-26 19:21:18,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:19,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1465 1531 1444 [WARNING|trainer.py:803] 2025-04-26 19:21:20,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:21,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:21,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1466 1445 [WARNING|trainer.py:803] 2025-04-26 19:21:22,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:23,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1532 1467 [WARNING|trainer.py:803] 2025-04-26 19:21:24,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1446 [WARNING|trainer.py:803] 2025-04-26 19:21:24,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:21:25,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1468 1447 [WARNING|trainer.py:803] 2025-04-26 19:21:26,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1533 [WARNING|trainer.py:803] 2025-04-26 19:21:27,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:27,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1469 [WARNING|trainer.py:803] 2025-04-26 19:21:28,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1448 [WARNING|trainer.py:803] 2025-04-26 19:21:29,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1534 1470 [WARNING|trainer.py:803] 2025-04-26 19:21:30,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:31,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1449 1471 [WARNING|trainer.py:803] 2025-04-26 19:21:32,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:21:33,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1450 1535 [WARNING|trainer.py:803] 2025-04-26 19:21:34,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1472 [WARNING|trainer.py:803] 2025-04-26 19:21:34,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:35,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1451 [WARNING|trainer.py:803] 2025-04-26 19:21:36,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1473 1536 [WARNING|trainer.py:803] 2025-04-26 19:21:37,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1452 [WARNING|trainer.py:803] 2025-04-26 19:21:38,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:38,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1474 [WARNING|trainer.py:803] 2025-04-26 19:21:39,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1453 1537 [WARNING|trainer.py:803] 2025-04-26 19:21:41,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:41,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1475 1454 [WARNING|trainer.py:803] 2025-04-26 19:21:42,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:43,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1538 1476 1455 [WARNING|trainer.py:803] 2025-04-26 19:21:44,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:21:44,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:21:45,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1477 1456 [WARNING|trainer.py:803] 2025-04-26 19:21:46,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1539 [WARNING|trainer.py:803] 2025-04-26 19:21:47,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:21:48,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1478 1457 [WARNING|trainer.py:803] 2025-04-26 19:21:49,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:49,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1479 1540 [WARNING|trainer.py:803] 2025-04-26 19:21:51,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:21:51,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1458 [WARNING|trainer.py:803] 2025-04-26 19:21:52,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1480 [WARNING|trainer.py:803] 2025-04-26 19:21:53,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1541 1459 [WARNING|trainer.py:803] 2025-04-26 19:21:54,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:21:54,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1481 1460 [WARNING|trainer.py:803] 2025-04-26 19:21:55,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:21:56,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1542 1482 [WARNING|trainer.py:803] 2025-04-26 19:21:57,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1461 [WARNING|trainer.py:803] 2025-04-26 19:21:58,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:21:58,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1483 1462 1543 [WARNING|trainer.py:803] 2025-04-26 19:22:00,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:00,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:00,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1484 1463 [WARNING|trainer.py:803] 2025-04-26 19:22:02,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:03,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1544 [WARNING|trainer.py:803] 2025-04-26 19:22:04,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1485 1464 [WARNING|trainer.py:803] 2025-04-26 19:22:05,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:05,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1545 1465 1486 [WARNING|trainer.py:803] 2025-04-26 19:22:07,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:07,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:07,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1466 1487 [WARNING|trainer.py:803] 2025-04-26 19:22:09,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1546 [WARNING|trainer.py:803] 2025-04-26 19:22:10,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:10,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1467 1488 [WARNING|trainer.py:803] 2025-04-26 19:22:12,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:12,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1547 1468 1489 [WARNING|trainer.py:803] 2025-04-26 19:22:14,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:14,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:14,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1469 [WARNING|trainer.py:803] 2025-04-26 19:22:16,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1548 1490 [WARNING|trainer.py:803] 2025-04-26 19:22:17,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:17,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1470 [WARNING|trainer.py:803] 2025-04-26 19:22:18,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1491 1549 [WARNING|trainer.py:803] 2025-04-26 19:22:19,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1471 [WARNING|trainer.py:803] 2025-04-26 19:22:20,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:20,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1492 [WARNING|trainer.py:803] 2025-04-26 19:22:21,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1472 [WARNING|trainer.py:803] 2025-04-26 19:22:22,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1550 1493 [WARNING|trainer.py:803] 2025-04-26 19:22:23,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:23,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1473 1494 [WARNING|trainer.py:803] 2025-04-26 19:22:25,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:25,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1551 1474 [WARNING|trainer.py:803] 2025-04-26 19:22:26,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:27,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1495 [WARNING|trainer.py:803] 2025-04-26 19:22:28,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1475 1552 1496 [WARNING|trainer.py:803] 2025-04-26 19:22:29,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:30,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:30,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1476 1497 1553 [WARNING|trainer.py:803] 2025-04-26 19:22:32,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:32,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:32,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1477 1498 [WARNING|trainer.py:803] 2025-04-26 19:22:34,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:34,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1554 1478 [WARNING|trainer.py:803] 2025-04-26 19:22:36,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1499 [WARNING|trainer.py:803] 2025-04-26 19:22:36,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:37,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1479 1555 [WARNING|trainer.py:803] 2025-04-26 19:22:38,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1500 [WARNING|trainer.py:803] 2025-04-26 19:22:39,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:39,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1480 [WARNING|trainer.py:803] 2025-04-26 19:22:40,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1556 1501 1481 [WARNING|trainer.py:803] 2025-04-26 19:22:42,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:42,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:43,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1482 1557 1502 [WARNING|trainer.py:803] 2025-04-26 19:22:45,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:22:45,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:22:45,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1483 [WARNING|trainer.py:803] 2025-04-26 19:22:48,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1558 1503 [WARNING|trainer.py:803] 2025-04-26 19:22:48,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:49,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1484 [WARNING|trainer.py:803] 2025-04-26 19:22:50,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1559 1504 1485 [WARNING|trainer.py:803] 2025-04-26 19:22:52,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:52,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:52,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1486 1560 1505 [WARNING|trainer.py:803] 2025-04-26 19:22:55,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:55,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:22:55,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1487 1561 [WARNING|trainer.py:803] 2025-04-26 19:22:57,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:58,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1506 1488 [WARNING|trainer.py:803] 2025-04-26 19:22:59,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:22:59,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1489 1562 1507 [WARNING|trainer.py:803] 2025-04-26 19:23:01,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:01,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:02,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1490 1563 [WARNING|trainer.py:803] 2025-04-26 19:23:04,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:23:04,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1508 [WARNING|trainer.py:803] 2025-04-26 19:23:05,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1491 [WARNING|trainer.py:803] 2025-04-26 19:23:06,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1564 [WARNING|trainer.py:803] 2025-04-26 19:23:07,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1492 1509 [WARNING|trainer.py:803] 2025-04-26 19:23:09,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:09,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1565 1493 [WARNING|trainer.py:803] 2025-04-26 19:23:10,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:11,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1510 [WARNING|trainer.py:803] 2025-04-26 19:23:12,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1494 [WARNING|trainer.py:803] 2025-04-26 19:23:13,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1566 [WARNING|trainer.py:803] 2025-04-26 19:23:14,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1511 1495 [WARNING|trainer.py:803] 2025-04-26 19:23:15,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:15,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1567 1496 [WARNING|trainer.py:803] 2025-04-26 19:23:17,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:17,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1512 [WARNING|trainer.py:803] 2025-04-26 19:23:19,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1497 1568 [WARNING|trainer.py:803] 2025-04-26 19:23:20,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:20,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1513 1498 [WARNING|trainer.py:803] 2025-04-26 19:23:22,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:22,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1569 1499 [WARNING|trainer.py:803] 2025-04-26 19:23:24,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:23:24,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1514 [WARNING|trainer.py:803] 2025-04-26 19:23:25,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1500 1570 [WARNING|trainer.py:803] 2025-04-26 19:23:26,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:23:27,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1515 [WARNING|trainer.py:803] 2025-04-26 19:23:29,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1501 1571 [WARNING|trainer.py:803] 2025-04-26 19:23:30,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:30,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1516 [WARNING|trainer.py:803] 2025-04-26 19:23:32,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1502 1572 [WARNING|trainer.py:803] 2025-04-26 19:23:33,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:34,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1517 [WARNING|trainer.py:803] 2025-04-26 19:23:35,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1503 1573 [WARNING|trainer.py:803] 2025-04-26 19:23:36,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:37,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1518 [WARNING|trainer.py:803] 2025-04-26 19:23:39,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1504 1574 [WARNING|trainer.py:803] 2025-04-26 19:23:40,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:40,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1519 [WARNING|trainer.py:803] 2025-04-26 19:23:42,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1505 1575 [WARNING|trainer.py:803] 2025-04-26 19:23:43,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:23:43,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1520 [WARNING|trainer.py:803] 2025-04-26 19:23:45,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1576 1506 [WARNING|trainer.py:803] 2025-04-26 19:23:46,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:23:47,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1521 [WARNING|trainer.py:803] 2025-04-26 19:23:49,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1577 1507 [WARNING|trainer.py:803] 2025-04-26 19:23:50,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:50,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1522 1578 [WARNING|trainer.py:803] 2025-04-26 19:23:52,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1508 [WARNING|trainer.py:803] 2025-04-26 19:23:53,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:53,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1523 1579 [WARNING|trainer.py:803] 2025-04-26 19:23:56,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1509 [WARNING|trainer.py:803] 2025-04-26 19:23:56,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:23:57,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1524 1580 1510 [WARNING|trainer.py:803] 2025-04-26 19:23:59,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:23:59,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:24:00,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1581 1525 1511 [WARNING|trainer.py:803] 2025-04-26 19:24:03,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:03,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:24:03,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1582 1526 1512 [WARNING|trainer.py:803] 2025-04-26 19:24:06,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:24:06,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:06,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1583 1513 1527 [WARNING|trainer.py:803] 2025-04-26 19:24:10,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:10,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:10,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1584 1514 1528 [WARNING|trainer.py:803] 2025-04-26 19:24:13,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:13,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:13,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1515 1585 1529 [WARNING|trainer.py:803] 2025-04-26 19:24:16,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:24:16,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:17,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1516 1530 1586 [WARNING|trainer.py:803] 2025-04-26 19:24:20,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:20,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:20,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1531 1517 1587 [WARNING|trainer.py:803] 2025-04-26 19:24:24,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:24,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:24,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1588 1518 1532 [WARNING|trainer.py:803] 2025-04-26 19:24:27,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:24:27,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:27,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1589 1519 [WARNING|trainer.py:803] 2025-04-26 19:24:30,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1533 [WARNING|trainer.py:803] 2025-04-26 19:24:30,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:24:31,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1590 1520 [WARNING|trainer.py:803] 2025-04-26 19:24:33,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1534 [WARNING|trainer.py:803] 2025-04-26 19:24:34,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:24:34,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1591 1521 [WARNING|trainer.py:803] 2025-04-26 19:24:36,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:24:37,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1535 [WARNING|trainer.py:803] 2025-04-26 19:24:38,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1592 1522 [WARNING|trainer.py:803] 2025-04-26 19:24:40,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:41,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1536 [WARNING|trainer.py:803] 2025-04-26 19:24:41,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1593 1523 [WARNING|trainer.py:803] 2025-04-26 19:24:43,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1537 [WARNING|trainer.py:803] 2025-04-26 19:24:44,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:24:45,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1594 [WARNING|trainer.py:803] 2025-04-26 19:24:47,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1524 1538 [WARNING|trainer.py:803] 2025-04-26 19:24:48,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:48,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1595 [WARNING|trainer.py:803] 2025-04-26 19:24:50,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1525 1539 [WARNING|trainer.py:803] 2025-04-26 19:24:51,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:24:52,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1596 [WARNING|trainer.py:803] 2025-04-26 19:24:53,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1526 1540 [WARNING|trainer.py:803] 2025-04-26 19:24:55,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:24:55,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1597 [WARNING|trainer.py:803] 2025-04-26 19:24:56,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1527 1541 [WARNING|trainer.py:803] 2025-04-26 19:24:58,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:24:58,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1598 [WARNING|trainer.py:803] 2025-04-26 19:24:59,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1542 1528 [WARNING|trainer.py:803] 2025-04-26 19:25:02,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:25:02,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1599 [WARNING|trainer.py:803] 2025-04-26 19:25:03,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1529 1543 [WARNING|trainer.py:803] 2025-04-26 19:25:05,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:05,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1600 [WARNING|trainer.py:803] 2025-04-26 19:25:06,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1530 1544 [WARNING|trainer.py:803] 2025-04-26 19:25:08,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:08,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1601 [WARNING|trainer.py:803] 2025-04-26 19:25:09,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1531 1545 1602 [WARNING|trainer.py:803] 2025-04-26 19:25:12,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:12,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:13,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1546 1532 1603 [WARNING|trainer.py:803] 2025-04-26 19:25:15,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:15,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:25:16,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1547 1533 1604 [WARNING|trainer.py:803] 2025-04-26 19:25:19,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:25:19,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:25:19,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1548 1605 1534 [WARNING|trainer.py:803] 2025-04-26 19:25:22,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:25:22,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:22,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1549 1606 [WARNING|trainer.py:803] 2025-04-26 19:25:25,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1535 [WARNING|trainer.py:803] 2025-04-26 19:25:25,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:26,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1550 1607 1536 [WARNING|trainer.py:803] 2025-04-26 19:25:29,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:29,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:25:30,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1551 1608 [WARNING|trainer.py:803] 2025-04-26 19:25:32,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1537 [WARNING|trainer.py:803] 2025-04-26 19:25:33,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:25:33,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1552 1609 [WARNING|trainer.py:803] 2025-04-26 19:25:36,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1538 [WARNING|trainer.py:803] 2025-04-26 19:25:36,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:36,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1553 1610 [WARNING|trainer.py:803] 2025-04-26 19:25:39,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1539 [WARNING|trainer.py:803] 2025-04-26 19:25:39,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:40,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1554 1611 [WARNING|trainer.py:803] 2025-04-26 19:25:42,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1540 [WARNING|trainer.py:803] 2025-04-26 19:25:43,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:25:43,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1555 1612 [WARNING|trainer.py:803] 2025-04-26 19:25:46,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1541 [WARNING|trainer.py:803] 2025-04-26 19:25:46,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:47,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1556 1613 [WARNING|trainer.py:803] 2025-04-26 19:25:49,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:25:49,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1542 [WARNING|trainer.py:803] 2025-04-26 19:25:50,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1557 1614 [WARNING|trainer.py:803] 2025-04-26 19:25:52,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:25:52,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1543 [WARNING|trainer.py:803] 2025-04-26 19:25:53,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1615 1558 [WARNING|trainer.py:803] 2025-04-26 19:25:56,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:25:56,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1544 [WARNING|trainer.py:803] 2025-04-26 19:25:57,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1559 1616 [WARNING|trainer.py:803] 2025-04-26 19:25:59,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1545 [WARNING|trainer.py:803] 2025-04-26 19:25:59,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:26:00,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1560 1617 [WARNING|trainer.py:803] 2025-04-26 19:26:02,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1546 [WARNING|trainer.py:803] 2025-04-26 19:26:03,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:03,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1561 1618 [WARNING|trainer.py:803] 2025-04-26 19:26:06,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1547 [WARNING|trainer.py:803] 2025-04-26 19:26:06,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:07,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1619 1562 [WARNING|trainer.py:803] 2025-04-26 19:26:09,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1548 [WARNING|trainer.py:803] 2025-04-26 19:26:09,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:10,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1563 1549 1620 [WARNING|trainer.py:803] 2025-04-26 19:26:13,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:26:13,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:26:13,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1564 [WARNING|trainer.py:803] 2025-04-26 19:26:16,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1621 1550 [WARNING|trainer.py:803] 2025-04-26 19:26:17,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:17,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1565 1622 [WARNING|trainer.py:803] 2025-04-26 19:26:19,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1551 [WARNING|trainer.py:803] 2025-04-26 19:26:20,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:20,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1566 1623 1552 [WARNING|trainer.py:803] 2025-04-26 19:26:23,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:23,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:26:23,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1567 1624 1553 [WARNING|trainer.py:803] 2025-04-26 19:26:26,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:26,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:26:27,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1554 1568 1625 [WARNING|trainer.py:803] 2025-04-26 19:26:30,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:26:30,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:26:30,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1555 1626 1569 [WARNING|trainer.py:803] 2025-04-26 19:26:33,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:33,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:33,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1556 1570 1627 [WARNING|trainer.py:803] 2025-04-26 19:26:37,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:26:37,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:37,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1628 1557 1571 [WARNING|trainer.py:803] 2025-04-26 19:26:40,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:26:40,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:26:40,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1629 1558 1572 [WARNING|trainer.py:803] 2025-04-26 19:26:43,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:26:43,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:44,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1630 1559 1573 [WARNING|trainer.py:803] 2025-04-26 19:26:46,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:47,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:47,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1631 1560 1574 [WARNING|trainer.py:803] 2025-04-26 19:26:50,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:50,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:50,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1561 1632 1575 [WARNING|trainer.py:803] 2025-04-26 19:26:53,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:26:53,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:26:54,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1633 1562 1576 [WARNING|trainer.py:803] 2025-04-26 19:26:56,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:26:57,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:26:57,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1634 1563 1577 [WARNING|trainer.py:803] 2025-04-26 19:27:00,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:27:00,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:00,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1564 1635 1578 [WARNING|trainer.py:803] 2025-04-26 19:27:03,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:27:03,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:27:03,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1565 1579 1636 [WARNING|trainer.py:803] 2025-04-26 19:27:06,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:07,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:07,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1637 1566 1580 [WARNING|trainer.py:803] 2025-04-26 19:27:10,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:27:10,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:10,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1638 1581 1567 [WARNING|trainer.py:803] 2025-04-26 19:27:13,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:13,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:14,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1639 1582 1568 [WARNING|trainer.py:803] 2025-04-26 19:27:16,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:17,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:27:17,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1640 [WARNING|trainer.py:803] 2025-04-26 19:27:19,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1569 1583 [WARNING|trainer.py:803] 2025-04-26 19:27:21,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:21,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1641 [WARNING|trainer.py:803] 2025-04-26 19:27:23,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1570 1584 [WARNING|trainer.py:803] 2025-04-26 19:27:24,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:24,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1642 [WARNING|trainer.py:803] 2025-04-26 19:27:26,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1571 1585 [WARNING|trainer.py:803] 2025-04-26 19:27:27,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:27:27,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1643 [WARNING|trainer.py:803] 2025-04-26 19:27:30,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1572 1586 [WARNING|trainer.py:803] 2025-04-26 19:27:31,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:32,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1644 [WARNING|trainer.py:803] 2025-04-26 19:27:33,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1573 [WARNING|trainer.py:803] 2025-04-26 19:27:34,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1587 [WARNING|trainer.py:803] 2025-04-26 19:27:35,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1645 [WARNING|trainer.py:803] 2025-04-26 19:27:36,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1574 [WARNING|trainer.py:803] 2025-04-26 19:27:37,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1588 [WARNING|trainer.py:803] 2025-04-26 19:27:38,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1646 [WARNING|trainer.py:803] 2025-04-26 19:27:39,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1575 [WARNING|trainer.py:803] 2025-04-26 19:27:41,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1589 [WARNING|trainer.py:803] 2025-04-26 19:27:42,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1647 [WARNING|trainer.py:803] 2025-04-26 19:27:43,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1576 [WARNING|trainer.py:803] 2025-04-26 19:27:44,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1590 [WARNING|trainer.py:803] 2025-04-26 19:27:45,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1648 [WARNING|trainer.py:803] 2025-04-26 19:27:46,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1577 [WARNING|trainer.py:803] 2025-04-26 19:27:47,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1591 1649 [WARNING|trainer.py:803] 2025-04-26 19:27:48,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:27:48,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1578 1650 [WARNING|trainer.py:803] 2025-04-26 19:27:51,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:27:51,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1592 [WARNING|trainer.py:803] 2025-04-26 19:27:52,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1579 1651 [WARNING|trainer.py:803] 2025-04-26 19:27:54,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:54,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1593 [WARNING|trainer.py:803] 2025-04-26 19:27:55,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1652 1580 [WARNING|trainer.py:803] 2025-04-26 19:27:57,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:27:57,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1594 [WARNING|trainer.py:803] 2025-04-26 19:27:59,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1653 [WARNING|trainer.py:803] 2025-04-26 19:28:00,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1581 [WARNING|trainer.py:803] 2025-04-26 19:28:01,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1654 1595 [WARNING|trainer.py:803] 2025-04-26 19:28:02,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:28:03,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1582 [WARNING|trainer.py:803] 2025-04-26 19:28:04,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1655 [WARNING|trainer.py:803] 2025-04-26 19:28:05,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1596 [WARNING|trainer.py:803] 2025-04-26 19:28:06,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1656 1583 [WARNING|trainer.py:803] 2025-04-26 19:28:07,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:08,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1597 [WARNING|trainer.py:803] 2025-04-26 19:28:09,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1657 [WARNING|trainer.py:803] 2025-04-26 19:28:10,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1584 [WARNING|trainer.py:803] 2025-04-26 19:28:11,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1598 1658 [WARNING|trainer.py:803] 2025-04-26 19:28:12,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:13,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1585 1659 [WARNING|trainer.py:803] 2025-04-26 19:28:14,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1599 [WARNING|trainer.py:803] 2025-04-26 19:28:15,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:28:16,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1660 1586 [WARNING|trainer.py:803] 2025-04-26 19:28:18,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:28:18,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1600 [WARNING|trainer.py:803] 2025-04-26 19:28:19,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1661 [WARNING|trainer.py:803] 2025-04-26 19:28:21,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1587 [WARNING|trainer.py:803] 2025-04-26 19:28:21,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1601 1662 [WARNING|trainer.py:803] 2025-04-26 19:28:23,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:23,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1588 [WARNING|trainer.py:803] 2025-04-26 19:28:25,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1663 1602 [WARNING|trainer.py:803] 2025-04-26 19:28:26,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:28:26,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1589 1664 [WARNING|trainer.py:803] 2025-04-26 19:28:28,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:28:28,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1603 [WARNING|trainer.py:803] 2025-04-26 19:28:29,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1665 1590 [WARNING|trainer.py:803] 2025-04-26 19:28:31,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:31,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1604 1666 [WARNING|trainer.py:803] 2025-04-26 19:28:33,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:28:33,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1591 [WARNING|trainer.py:803] 2025-04-26 19:28:35,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1605 1667 [WARNING|trainer.py:803] 2025-04-26 19:28:36,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:36,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1592 1668 [WARNING|trainer.py:803] 2025-04-26 19:28:38,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:39,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1606 [WARNING|trainer.py:803] 2025-04-26 19:28:39,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1669 1593 [WARNING|trainer.py:803] 2025-04-26 19:28:41,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:28:42,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1607 1670 [WARNING|trainer.py:803] 2025-04-26 19:28:43,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:28:44,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1594 [WARNING|trainer.py:803] 2025-04-26 19:28:45,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1671 1608 [WARNING|trainer.py:803] 2025-04-26 19:28:46,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:28:47,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1595 1672 [WARNING|trainer.py:803] 2025-04-26 19:28:49,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:28:49,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1609 [WARNING|trainer.py:803] 2025-04-26 19:28:50,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1673 1596 [WARNING|trainer.py:803] 2025-04-26 19:28:51,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:28:52,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1674 1610 [WARNING|trainer.py:803] 2025-04-26 19:28:54,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:28:54,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1597 [WARNING|trainer.py:803] 2025-04-26 19:28:55,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1675 [WARNING|trainer.py:803] 2025-04-26 19:28:56,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1611 [WARNING|trainer.py:803] 2025-04-26 19:28:57,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1598 1676 [WARNING|trainer.py:803] 2025-04-26 19:28:59,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:28:59,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1612 [WARNING|trainer.py:803] 2025-04-26 19:29:00,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1677 1599 [WARNING|trainer.py:803] 2025-04-26 19:29:01,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:02,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1613 1678 [WARNING|trainer.py:803] 2025-04-26 19:29:04,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:04,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1600 [WARNING|trainer.py:803] 2025-04-26 19:29:05,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1679 1614 [WARNING|trainer.py:803] 2025-04-26 19:29:06,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:29:07,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1601 1680 [WARNING|trainer.py:803] 2025-04-26 19:29:09,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:09,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1615 [WARNING|trainer.py:803] 2025-04-26 19:29:11,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1681 1602 [WARNING|trainer.py:803] 2025-04-26 19:29:12,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:12,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1682 1616 [WARNING|trainer.py:803] 2025-04-26 19:29:14,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:29:14,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1603 [WARNING|trainer.py:803] 2025-04-26 19:29:15,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1683 [WARNING|trainer.py:803] 2025-04-26 19:29:17,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1617 [WARNING|trainer.py:803] 2025-04-26 19:29:18,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1604 1684 [WARNING|trainer.py:803] 2025-04-26 19:29:19,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:19,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1618 1605 1685 [WARNING|trainer.py:803] 2025-04-26 19:29:21,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:22,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:22,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1619 1686 [WARNING|trainer.py:803] 2025-04-26 19:29:25,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1606 [WARNING|trainer.py:803] 2025-04-26 19:29:25,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:29:26,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1687 [WARNING|trainer.py:803] 2025-04-26 19:29:27,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1620 1607 [WARNING|trainer.py:803] 2025-04-26 19:29:29,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1688 [WARNING|trainer.py:803] 2025-04-26 19:29:29,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:29:30,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1621 1689 1608 [WARNING|trainer.py:803] 2025-04-26 19:29:32,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:33,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:33,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1690 1622 1609 [WARNING|trainer.py:803] 2025-04-26 19:29:35,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:36,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:36,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1691 [WARNING|trainer.py:803] 2025-04-26 19:29:38,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1623 1610 [WARNING|trainer.py:803] 2025-04-26 19:29:39,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1692 [WARNING|trainer.py:803] 2025-04-26 19:29:40,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:29:41,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1624 [WARNING|trainer.py:803] 2025-04-26 19:29:42,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1693 1611 [WARNING|trainer.py:803] 2025-04-26 19:29:43,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:43,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1694 1612 1625 [WARNING|trainer.py:803] 2025-04-26 19:29:46,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:46,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:29:46,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1695 [WARNING|trainer.py:803] 2025-04-26 19:29:49,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1626 1613 [WARNING|trainer.py:803] 2025-04-26 19:29:50,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:50,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1696 [WARNING|trainer.py:803] 2025-04-26 19:29:51,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1614 1627 [WARNING|trainer.py:803] 2025-04-26 19:29:53,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1697 [WARNING|trainer.py:803] 2025-04-26 19:29:54,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:54,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1628 1615 1698 [WARNING|trainer.py:803] 2025-04-26 19:29:57,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:29:57,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:29:57,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1699 1629 [WARNING|trainer.py:803] 2025-04-26 19:29:59,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1616 [WARNING|trainer.py:803] 2025-04-26 19:30:00,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:01,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1700 [WARNING|trainer.py:803] 2025-04-26 19:30:02,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1630 1617 [WARNING|trainer.py:803] 2025-04-26 19:30:03,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1701 [WARNING|trainer.py:803] 2025-04-26 19:30:04,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:04,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1631 1702 [WARNING|trainer.py:803] 2025-04-26 19:30:07,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1618 [WARNING|trainer.py:803] 2025-04-26 19:30:07,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:07,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1703 1632 [WARNING|trainer.py:803] 2025-04-26 19:30:10,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1619 [WARNING|trainer.py:803] 2025-04-26 19:30:10,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:30:11,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1704 [WARNING|trainer.py:803] 2025-04-26 19:30:12,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1633 [WARNING|trainer.py:803] 2025-04-26 19:30:13,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1705 1620 [WARNING|trainer.py:803] 2025-04-26 19:30:14,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:30:15,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1706 1634 [WARNING|trainer.py:803] 2025-04-26 19:30:17,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:30:17,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1621 [WARNING|trainer.py:803] 2025-04-26 19:30:18,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1707 [WARNING|trainer.py:803] 2025-04-26 19:30:20,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1635 [WARNING|trainer.py:803] 2025-04-26 19:30:20,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1622 1708 [WARNING|trainer.py:803] 2025-04-26 19:30:22,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:22,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1636 1709 [WARNING|trainer.py:803] 2025-04-26 19:30:24,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1623 [WARNING|trainer.py:803] 2025-04-26 19:30:25,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:25,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1710 1637 [WARNING|trainer.py:803] 2025-04-26 19:30:27,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:30:27,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1624 [WARNING|trainer.py:803] 2025-04-26 19:30:28,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1711 [WARNING|trainer.py:803] 2025-04-26 19:30:30,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1638 [WARNING|trainer.py:803] 2025-04-26 19:30:31,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1625 1712 [WARNING|trainer.py:803] 2025-04-26 19:30:32,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:30:32,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1639 1713 [WARNING|trainer.py:803] 2025-04-26 19:30:34,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:30:34,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1626 [WARNING|trainer.py:803] 2025-04-26 19:30:35,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1640 1714 [WARNING|trainer.py:803] 2025-04-26 19:30:37,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:37,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1627 1715 [WARNING|trainer.py:803] 2025-04-26 19:30:39,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:40,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1641 [WARNING|trainer.py:803] 2025-04-26 19:30:41,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1716 1628 [WARNING|trainer.py:803] 2025-04-26 19:30:42,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:42,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1642 1717 [WARNING|trainer.py:803] 2025-04-26 19:30:44,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:30:44,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1629 [WARNING|trainer.py:803] 2025-04-26 19:30:46,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1718 [WARNING|trainer.py:803] 2025-04-26 19:30:47,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1643 [WARNING|trainer.py:803] 2025-04-26 19:30:48,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1630 1719 [WARNING|trainer.py:803] 2025-04-26 19:30:49,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:30:49,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1644 1720 [WARNING|trainer.py:803] 2025-04-26 19:30:51,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1631 [WARNING|trainer.py:803] 2025-04-26 19:30:52,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:30:52,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1721 1645 [WARNING|trainer.py:803] 2025-04-26 19:30:54,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:30:55,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1632 [WARNING|trainer.py:803] 2025-04-26 19:30:56,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1722 1646 [WARNING|trainer.py:803] 2025-04-26 19:30:57,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:30:58,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1633 1723 [WARNING|trainer.py:803] 2025-04-26 19:30:59,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:31:00,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1647 [WARNING|trainer.py:803] 2025-04-26 19:31:01,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1724 1634 [WARNING|trainer.py:803] 2025-04-26 19:31:02,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:31:03,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1648 1725 [WARNING|trainer.py:803] 2025-04-26 19:31:04,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:31:05,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1635 [WARNING|trainer.py:803] 2025-04-26 19:31:06,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1726 1649 [WARNING|trainer.py:803] 2025-04-26 19:31:07,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:31:07,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1727 1636 1650 [WARNING|trainer.py:803] 2025-04-26 19:31:10,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:10,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:10,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1728 [WARNING|trainer.py:803] 2025-04-26 19:31:12,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1651 1637 [WARNING|trainer.py:803] 2025-04-26 19:31:13,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:31:13,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1729 [WARNING|trainer.py:803] 2025-04-26 19:31:15,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1652 1638 [WARNING|trainer.py:803] 2025-04-26 19:31:16,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1730 [WARNING|trainer.py:803] 2025-04-26 19:31:16,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:31:17,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1653 1731 [WARNING|trainer.py:803] 2025-04-26 19:31:19,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1639 [WARNING|trainer.py:803] 2025-04-26 19:31:19,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:20,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1654 1732 [WARNING|trainer.py:803] 2025-04-26 19:31:22,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:22,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1640 [WARNING|trainer.py:803] 2025-04-26 19:31:23,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1655 1733 [WARNING|trainer.py:803] 2025-04-26 19:31:24,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:24,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1641 1734 1656 [WARNING|trainer.py:803] 2025-04-26 19:31:27,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:31:27,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:27,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1735 1657 1642 [WARNING|trainer.py:803] 2025-04-26 19:31:29,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:30,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:31:30,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1736 1658 [WARNING|trainer.py:803] 2025-04-26 19:31:32,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1643 [WARNING|trainer.py:803] 2025-04-26 19:31:33,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:33,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1737 1659 [WARNING|trainer.py:803] 2025-04-26 19:31:35,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:31:35,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1644 1738 [WARNING|trainer.py:803] 2025-04-26 19:31:37,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:31:37,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1660 [WARNING|trainer.py:803] 2025-04-26 19:31:38,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1739 1645 [WARNING|trainer.py:803] 2025-04-26 19:31:40,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:31:40,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1661 [WARNING|trainer.py:803] 2025-04-26 19:31:41,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1740 1646 [WARNING|trainer.py:803] 2025-04-26 19:31:42,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1662 [WARNING|trainer.py:803] 2025-04-26 19:31:43,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:44,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1741 [WARNING|trainer.py:803] 2025-04-26 19:31:45,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1663 1647 [WARNING|trainer.py:803] 2025-04-26 19:31:46,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1742 [WARNING|trainer.py:803] 2025-04-26 19:31:47,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:47,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1664 1743 [WARNING|trainer.py:803] 2025-04-26 19:31:49,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1648 [WARNING|trainer.py:803] 2025-04-26 19:31:50,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:50,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1665 1744 [WARNING|trainer.py:803] 2025-04-26 19:31:52,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1649 [WARNING|trainer.py:803] 2025-04-26 19:31:53,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:31:53,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1666 1745 [WARNING|trainer.py:803] 2025-04-26 19:31:55,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:31:55,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1650 [WARNING|trainer.py:803] 2025-04-26 19:31:56,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1667 1746 [WARNING|trainer.py:803] 2025-04-26 19:31:57,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:31:58,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1651 [WARNING|trainer.py:803] 2025-04-26 19:31:59,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1668 1747 [WARNING|trainer.py:803] 2025-04-26 19:32:00,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:00,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1652 [WARNING|trainer.py:803] 2025-04-26 19:32:02,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1748 1669 [WARNING|trainer.py:803] 2025-04-26 19:32:03,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:03,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1653 1670 [WARNING|trainer.py:803] 2025-04-26 19:32:05,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1749 [WARNING|trainer.py:803] 2025-04-26 19:32:06,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:06,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1654 1671 [WARNING|trainer.py:803] 2025-04-26 19:32:07,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1750 [WARNING|trainer.py:803] 2025-04-26 19:32:08,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:09,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1655 1672 1751 [WARNING|trainer.py:803] 2025-04-26 19:32:10,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:11,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:11,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1656 1673 1752 [WARNING|trainer.py:803] 2025-04-26 19:32:13,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:13,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:14,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1674 1657 1753 [WARNING|trainer.py:803] 2025-04-26 19:32:16,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:32:16,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:32:16,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1754 1675 1658 [WARNING|trainer.py:803] 2025-04-26 19:32:18,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:18,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:19,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1755 1676 1659 [WARNING|trainer.py:803] 2025-04-26 19:32:21,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:21,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:21,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1756 1677 1660 [WARNING|trainer.py:803] 2025-04-26 19:32:24,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:32:24,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:32:24,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1757 1678 [WARNING|trainer.py:803] 2025-04-26 19:32:26,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1661 [WARNING|trainer.py:803] 2025-04-26 19:32:27,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:27,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1758 1662 1679 [WARNING|trainer.py:803] 2025-04-26 19:32:29,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:32:30,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:32:30,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1759 [WARNING|trainer.py:803] 2025-04-26 19:32:31,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1680 1663 [WARNING|trainer.py:803] 2025-04-26 19:32:32,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:32:33,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1760 [WARNING|trainer.py:803] 2025-04-26 19:32:34,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1681 1664 [WARNING|trainer.py:803] 2025-04-26 19:32:35,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:32:35,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1761 [WARNING|trainer.py:803] 2025-04-26 19:32:36,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1682 1665 1762 [WARNING|trainer.py:803] 2025-04-26 19:32:38,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:38,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:32:39,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1683 1666 1763 [WARNING|trainer.py:803] 2025-04-26 19:32:40,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:41,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:41,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1764 1684 1667 [WARNING|trainer.py:803] 2025-04-26 19:32:43,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:43,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:44,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1765 1668 1685 [WARNING|trainer.py:803] 2025-04-26 19:32:46,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:46,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:46,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1766 1686 1669 [WARNING|trainer.py:803] 2025-04-26 19:32:49,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:49,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:49,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1767 1687 1670 [WARNING|trainer.py:803] 2025-04-26 19:32:51,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:32:52,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:32:52,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1768 1671 [WARNING|trainer.py:803] 2025-04-26 19:32:54,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1688 [WARNING|trainer.py:803] 2025-04-26 19:32:54,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:32:55,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1769 [WARNING|trainer.py:803] 2025-04-26 19:32:56,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1672 1689 [WARNING|trainer.py:803] 2025-04-26 19:32:57,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:32:57,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1770 [WARNING|trainer.py:803] 2025-04-26 19:32:59,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1673 1690 [WARNING|trainer.py:803] 2025-04-26 19:33:00,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:00,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1771 1674 [WARNING|trainer.py:803] 2025-04-26 19:33:01,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1691 [WARNING|trainer.py:803] 2025-04-26 19:33:02,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:33:03,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1772 [WARNING|trainer.py:803] 2025-04-26 19:33:04,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1675 [WARNING|trainer.py:803] 2025-04-26 19:33:05,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1692 1773 [WARNING|trainer.py:803] 2025-04-26 19:33:06,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:06,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1676 [WARNING|trainer.py:803] 2025-04-26 19:33:07,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1693 1774 [WARNING|trainer.py:803] 2025-04-26 19:33:08,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:09,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1677 [WARNING|trainer.py:803] 2025-04-26 19:33:10,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1694 1775 [WARNING|trainer.py:803] 2025-04-26 19:33:11,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:33:12,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1678 [WARNING|trainer.py:803] 2025-04-26 19:33:13,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1695 1776 [WARNING|trainer.py:803] 2025-04-26 19:33:14,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:14,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1679 [WARNING|trainer.py:803] 2025-04-26 19:33:15,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1777 1696 [WARNING|trainer.py:803] 2025-04-26 19:33:17,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:17,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1680 [WARNING|trainer.py:803] 2025-04-26 19:33:18,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1778 1697 [WARNING|trainer.py:803] 2025-04-26 19:33:19,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:20,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1681 1779 [WARNING|trainer.py:803] 2025-04-26 19:33:21,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:21,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1698 [WARNING|trainer.py:803] 2025-04-26 19:33:23,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1682 1780 [WARNING|trainer.py:803] 2025-04-26 19:33:23,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:24,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1699 [WARNING|trainer.py:803] 2025-04-26 19:33:25,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1683 1781 [WARNING|trainer.py:803] 2025-04-26 19:33:26,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:26,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1700 [WARNING|trainer.py:803] 2025-04-26 19:33:28,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1782 1684 [WARNING|trainer.py:803] 2025-04-26 19:33:29,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:29,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1701 [WARNING|trainer.py:803] 2025-04-26 19:33:31,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1783 1685 [WARNING|trainer.py:803] 2025-04-26 19:33:32,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:32,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1702 1784 [WARNING|trainer.py:803] 2025-04-26 19:33:34,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1686 [WARNING|trainer.py:803] 2025-04-26 19:33:34,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:35,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1703 1785 [WARNING|trainer.py:803] 2025-04-26 19:33:36,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1687 [WARNING|trainer.py:803] 2025-04-26 19:33:37,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:37,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1704 1786 [WARNING|trainer.py:803] 2025-04-26 19:33:39,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1688 [WARNING|trainer.py:803] 2025-04-26 19:33:40,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:40,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1705 [WARNING|trainer.py:803] 2025-04-26 19:33:42,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1787 1689 [WARNING|trainer.py:803] 2025-04-26 19:33:43,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:43,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1706 1788 [WARNING|trainer.py:803] 2025-04-26 19:33:44,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1690 [WARNING|trainer.py:803] 2025-04-26 19:33:45,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:46,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1707 1789 [WARNING|trainer.py:803] 2025-04-26 19:33:47,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1691 [WARNING|trainer.py:803] 2025-04-26 19:33:48,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:33:48,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1708 1790 [WARNING|trainer.py:803] 2025-04-26 19:33:50,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:50,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1692 [WARNING|trainer.py:803] 2025-04-26 19:33:51,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1709 1791 [WARNING|trainer.py:803] 2025-04-26 19:33:52,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:33:53,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1693 1710 [WARNING|trainer.py:803] 2025-04-26 19:33:54,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1792 [WARNING|trainer.py:803] 2025-04-26 19:33:55,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:33:55,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1694 1793 1711 [WARNING|trainer.py:803] 2025-04-26 19:33:57,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:33:58,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:33:58,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1695 1794 1712 [WARNING|trainer.py:803] 2025-04-26 19:34:00,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:00,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:00,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1795 1696 1713 [WARNING|trainer.py:803] 2025-04-26 19:34:03,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:03,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:03,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1796 1697 1714 [WARNING|trainer.py:803] 2025-04-26 19:34:05,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:34:06,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:06,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1797 1715 1698 [WARNING|trainer.py:803] 2025-04-26 19:34:08,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:34:08,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:09,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1798 1716 1699 [WARNING|trainer.py:803] 2025-04-26 19:34:11,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:11,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:34:11,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1799 1717 [WARNING|trainer.py:803] 2025-04-26 19:34:13,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1700 [WARNING|trainer.py:803] 2025-04-26 19:34:14,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:34:14,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1800 1718 [WARNING|trainer.py:803] 2025-04-26 19:34:16,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1701 [WARNING|trainer.py:803] 2025-04-26 19:34:16,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:17,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1801 [WARNING|trainer.py:803] 2025-04-26 19:34:18,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1719 1702 1802 [WARNING|trainer.py:803] 2025-04-26 19:34:19,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:20,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:20,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1720 1803 [WARNING|trainer.py:803] 2025-04-26 19:34:22,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1703 [WARNING|trainer.py:803] 2025-04-26 19:34:22,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:22,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1721 1804 1704 [WARNING|trainer.py:803] 2025-04-26 19:34:24,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:24,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:25,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1805 [WARNING|trainer.py:803] 2025-04-26 19:34:26,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1722 1705 [WARNING|trainer.py:803] 2025-04-26 19:34:27,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1806 [WARNING|trainer.py:803] 2025-04-26 19:34:27,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:28,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1807 1723 1706 [WARNING|trainer.py:803] 2025-04-26 19:34:30,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:30,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:30,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1808 [WARNING|trainer.py:803] 2025-04-26 19:34:31,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1724 1707 [WARNING|trainer.py:803] 2025-04-26 19:34:32,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1809 [WARNING|trainer.py:803] 2025-04-26 19:34:33,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:34,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1725 1810 1708 [WARNING|trainer.py:803] 2025-04-26 19:34:35,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:34:35,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:36,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1811 1726 [WARNING|trainer.py:803] 2025-04-26 19:34:37,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1709 [WARNING|trainer.py:803] 2025-04-26 19:34:38,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1812 [WARNING|trainer.py:803] 2025-04-26 19:34:38,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:39,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1727 1813 1710 [WARNING|trainer.py:803] 2025-04-26 19:34:40,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:41,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:41,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1814 1728 [WARNING|trainer.py:803] 2025-04-26 19:34:43,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1711 [WARNING|trainer.py:803] 2025-04-26 19:34:43,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:34:43,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1815 [WARNING|trainer.py:803] 2025-04-26 19:34:44,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1729 1712 [WARNING|trainer.py:803] 2025-04-26 19:34:46,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1816 [WARNING|trainer.py:803] 2025-04-26 19:34:46,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:34:46,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1730 1817 1713 [WARNING|trainer.py:803] 2025-04-26 19:34:48,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:34:48,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:34:49,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1818 1731 [WARNING|trainer.py:803] 2025-04-26 19:34:50,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1714 [WARNING|trainer.py:803] 2025-04-26 19:34:51,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1819 [WARNING|trainer.py:803] 2025-04-26 19:34:52,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:34:52,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1732 1715 1820 [WARNING|trainer.py:803] 2025-04-26 19:34:53,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:54,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:34:54,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1821 1733 1716 [WARNING|trainer.py:803] 2025-04-26 19:34:56,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:34:56,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:34:57,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1822 1734 [WARNING|trainer.py:803] 2025-04-26 19:34:58,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1717 [WARNING|trainer.py:803] 2025-04-26 19:34:59,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:34:59,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1823 [WARNING|trainer.py:803] 2025-04-26 19:35:00,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1735 1718 [WARNING|trainer.py:803] 2025-04-26 19:35:01,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:01,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1824 [WARNING|trainer.py:803] 2025-04-26 19:35:03,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1736 1719 1825 [WARNING|trainer.py:803] 2025-04-26 19:35:04,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:04,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:04,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1826 1720 1737 [WARNING|trainer.py:803] 2025-04-26 19:35:06,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:07,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:35:07,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1827 [WARNING|trainer.py:803] 2025-04-26 19:35:08,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1721 1738 1828 [WARNING|trainer.py:803] 2025-04-26 19:35:10,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:10,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:10,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1829 1739 1722 [WARNING|trainer.py:803] 2025-04-26 19:35:12,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:12,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:12,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1830 1740 [WARNING|trainer.py:803] 2025-04-26 19:35:14,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1723 [WARNING|trainer.py:803] 2025-04-26 19:35:15,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:15,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1831 [WARNING|trainer.py:803] 2025-04-26 19:35:17,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1741 1724 1832 [WARNING|trainer.py:803] 2025-04-26 19:35:18,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:18,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:35:18,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1833 1742 1725 [WARNING|trainer.py:803] 2025-04-26 19:35:20,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:35:20,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:20,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1834 [WARNING|trainer.py:803] 2025-04-26 19:35:22,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1743 1726 [WARNING|trainer.py:803] 2025-04-26 19:35:23,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:23,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1835 [WARNING|trainer.py:803] 2025-04-26 19:35:24,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1727 1836 1744 [WARNING|trainer.py:803] 2025-04-26 19:35:26,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:26,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:35:26,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1837 1745 1728 [WARNING|trainer.py:803] 2025-04-26 19:35:28,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:28,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:28,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1838 [WARNING|trainer.py:803] 2025-04-26 19:35:30,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1729 1746 1839 [WARNING|trainer.py:803] 2025-04-26 19:35:31,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:31,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:31,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1840 1730 1747 [WARNING|trainer.py:803] 2025-04-26 19:35:33,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:33,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:34,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1841 1731 [WARNING|trainer.py:803] 2025-04-26 19:35:35,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1748 [WARNING|trainer.py:803] 2025-04-26 19:35:36,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:37,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1842 [WARNING|trainer.py:803] 2025-04-26 19:35:37,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1732 1843 1749 [WARNING|trainer.py:803] 2025-04-26 19:35:39,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:39,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:39,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1733 1844 [WARNING|trainer.py:803] 2025-04-26 19:35:41,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1750 [WARNING|trainer.py:803] 2025-04-26 19:35:42,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:35:42,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1845 1734 1751 [WARNING|trainer.py:803] 2025-04-26 19:35:44,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:44,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:35:45,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1846 1735 [WARNING|trainer.py:803] 2025-04-26 19:35:46,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1752 [WARNING|trainer.py:803] 2025-04-26 19:35:47,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1847 [WARNING|trainer.py:803] 2025-04-26 19:35:47,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:35:48,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1736 1848 1753 [WARNING|trainer.py:803] 2025-04-26 19:35:49,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:50,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:35:50,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1849 1737 [WARNING|trainer.py:803] 2025-04-26 19:35:52,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1754 [WARNING|trainer.py:803] 2025-04-26 19:35:52,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:53,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1850 [WARNING|trainer.py:803] 2025-04-26 19:35:54,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1738 1755 [WARNING|trainer.py:803] 2025-04-26 19:35:55,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1851 [WARNING|trainer.py:803] 2025-04-26 19:35:55,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:35:56,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1739 1756 1852 [WARNING|trainer.py:803] 2025-04-26 19:35:58,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:35:58,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:35:58,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1853 1740 1757 [WARNING|trainer.py:803] 2025-04-26 19:36:00,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:00,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:01,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1854 1741 [WARNING|trainer.py:803] 2025-04-26 19:36:02,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1758 [WARNING|trainer.py:803] 2025-04-26 19:36:03,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:04,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1855 [WARNING|trainer.py:803] 2025-04-26 19:36:05,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1742 1759 1856 [WARNING|trainer.py:803] 2025-04-26 19:36:06,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:06,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:36:06,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1857 1743 1760 [WARNING|trainer.py:803] 2025-04-26 19:36:08,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:36:08,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:09,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1858 1744 [WARNING|trainer.py:803] 2025-04-26 19:36:10,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1761 [WARNING|trainer.py:803] 2025-04-26 19:36:11,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:36:11,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1859 [WARNING|trainer.py:803] 2025-04-26 19:36:12,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1745 1762 1860 [WARNING|trainer.py:803] 2025-04-26 19:36:14,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:14,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:14,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1861 1763 1746 [WARNING|trainer.py:803] 2025-04-26 19:36:16,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:36:16,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:16,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1862 1764 [WARNING|trainer.py:803] 2025-04-26 19:36:18,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1747 [WARNING|trainer.py:803] 2025-04-26 19:36:19,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:19,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1863 [WARNING|trainer.py:803] 2025-04-26 19:36:20,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1765 1864 1748 [WARNING|trainer.py:803] 2025-04-26 19:36:22,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:22,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:36:22,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1865 1766 1749 [WARNING|trainer.py:803] 2025-04-26 19:36:24,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:24,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:25,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1866 [WARNING|trainer.py:803] 2025-04-26 19:36:26,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1767 1867 1750 [WARNING|trainer.py:803] 2025-04-26 19:36:27,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:28,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:36:28,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1768 1868 1751 [WARNING|trainer.py:803] 2025-04-26 19:36:30,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:30,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:30,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1869 1769 [WARNING|trainer.py:803] 2025-04-26 19:36:32,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1752 [WARNING|trainer.py:803] 2025-04-26 19:36:32,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1870 [WARNING|trainer.py:803] 2025-04-26 19:36:33,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:36:34,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1770 1871 1753 [WARNING|trainer.py:803] 2025-04-26 19:36:35,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:36,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:36:36,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1872 1771 1754 [WARNING|trainer.py:803] 2025-04-26 19:36:38,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:38,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:38,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1873 1772 [WARNING|trainer.py:803] 2025-04-26 19:36:40,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1755 [WARNING|trainer.py:803] 2025-04-26 19:36:40,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1874 [WARNING|trainer.py:803] 2025-04-26 19:36:41,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:36:41,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1773 1875 1756 [WARNING|trainer.py:803] 2025-04-26 19:36:43,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:36:43,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:36:43,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1876 1774 [WARNING|trainer.py:803] 2025-04-26 19:36:45,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1757 [WARNING|trainer.py:803] 2025-04-26 19:36:46,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:36:46,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1877 [WARNING|trainer.py:803] 2025-04-26 19:36:47,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1775 1878 1758 [WARNING|trainer.py:803] 2025-04-26 19:36:49,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:36:49,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:49,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1879 1759 [WARNING|trainer.py:803] 2025-04-26 19:36:51,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1776 [WARNING|trainer.py:803] 2025-04-26 19:36:52,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:36:52,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1880 [WARNING|trainer.py:803] 2025-04-26 19:36:53,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1760 1777 1881 [WARNING|trainer.py:803] 2025-04-26 19:36:54,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:54,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:55,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1882 1761 1778 [WARNING|trainer.py:803] 2025-04-26 19:36:57,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:36:57,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:36:57,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1883 1779 1762 [WARNING|trainer.py:803] 2025-04-26 19:36:59,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:36:59,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:00,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1884 1780 1763 [WARNING|trainer.py:803] 2025-04-26 19:37:01,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:02,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:02,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1885 [WARNING|trainer.py:803] 2025-04-26 19:37:04,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1781 1764 [WARNING|trainer.py:803] 2025-04-26 19:37:05,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:05,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1886 [WARNING|trainer.py:803] 2025-04-26 19:37:06,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1782 1765 1887 [WARNING|trainer.py:803] 2025-04-26 19:37:07,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:07,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:37:08,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1888 1766 1783 [WARNING|trainer.py:803] 2025-04-26 19:37:10,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:37:10,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:10,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1889 [WARNING|trainer.py:803] 2025-04-26 19:37:11,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1767 1784 1890 [WARNING|trainer.py:803] 2025-04-26 19:37:12,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:13,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:13,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1891 1768 1785 [WARNING|trainer.py:803] 2025-04-26 19:37:15,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:15,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:15,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1892 1769 1786 [WARNING|trainer.py:803] 2025-04-26 19:37:18,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:18,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:37:18,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1893 [WARNING|trainer.py:803] 2025-04-26 19:37:19,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1770 1787 [WARNING|trainer.py:803] 2025-04-26 19:37:20,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1894 [WARNING|trainer.py:803] 2025-04-26 19:37:21,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:21,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1771 1895 1788 [WARNING|trainer.py:803] 2025-04-26 19:37:23,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:23,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:24,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1896 1772 [WARNING|trainer.py:803] 2025-04-26 19:37:25,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1789 [WARNING|trainer.py:803] 2025-04-26 19:37:26,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1897 [WARNING|trainer.py:803] 2025-04-26 19:37:26,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:27,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1773 1790 [WARNING|trainer.py:803] 2025-04-26 19:37:28,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1898 [WARNING|trainer.py:803] 2025-04-26 19:37:29,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:30,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1774 1899 1791 [WARNING|trainer.py:803] 2025-04-26 19:37:31,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:32,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:32,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1900 1792 1775 [WARNING|trainer.py:803] 2025-04-26 19:37:34,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:34,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:34,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1901 [WARNING|trainer.py:803] 2025-04-26 19:37:36,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1793 1776 [WARNING|trainer.py:803] 2025-04-26 19:37:37,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1902 [WARNING|trainer.py:803] 2025-04-26 19:37:37,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:38,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1794 1903 1777 [WARNING|trainer.py:803] 2025-04-26 19:37:39,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:40,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:37:40,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1904 1795 1778 [WARNING|trainer.py:803] 2025-04-26 19:37:42,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:42,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:42,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1905 [WARNING|trainer.py:803] 2025-04-26 19:37:44,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1779 1796 [WARNING|trainer.py:803] 2025-04-26 19:37:45,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:45,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1906 [WARNING|trainer.py:803] 2025-04-26 19:37:46,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1780 1797 1907 [WARNING|trainer.py:803] 2025-04-26 19:37:47,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:47,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:37:48,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1908 1781 1798 [WARNING|trainer.py:803] 2025-04-26 19:37:50,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:37:50,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:50,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1909 [WARNING|trainer.py:803] 2025-04-26 19:37:51,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1799 1782 1910 [WARNING|trainer.py:803] 2025-04-26 19:37:53,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:53,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:37:53,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1911 1800 1783 [WARNING|trainer.py:803] 2025-04-26 19:37:55,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:55,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:37:55,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1912 1801 [WARNING|trainer.py:803] 2025-04-26 19:37:57,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1784 [WARNING|trainer.py:803] 2025-04-26 19:37:57,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:37:58,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1913 1802 [WARNING|trainer.py:803] 2025-04-26 19:37:59,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:00,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1785 1914 [WARNING|trainer.py:803] 2025-04-26 19:38:01,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:01,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1803 [WARNING|trainer.py:803] 2025-04-26 19:38:02,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1915 [WARNING|trainer.py:803] 2025-04-26 19:38:03,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1786 1804 [WARNING|trainer.py:803] 2025-04-26 19:38:04,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1916 [WARNING|trainer.py:803] 2025-04-26 19:38:05,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:05,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1805 1787 1917 [WARNING|trainer.py:803] 2025-04-26 19:38:07,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:07,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:07,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1806 1918 1788 [WARNING|trainer.py:803] 2025-04-26 19:38:09,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:09,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:38:09,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1807 1919 [WARNING|trainer.py:803] 2025-04-26 19:38:11,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1789 [WARNING|trainer.py:803] 2025-04-26 19:38:11,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:12,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1808 1920 [WARNING|trainer.py:803] 2025-04-26 19:38:13,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:13,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1790 1921 1809 [WARNING|trainer.py:803] 2025-04-26 19:38:15,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:15,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:16,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1791 1810 1922 [WARNING|trainer.py:803] 2025-04-26 19:38:17,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:18,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:18,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1923 1811 1792 [WARNING|trainer.py:803] 2025-04-26 19:38:20,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:20,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:38:20,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1812 1924 1793 [WARNING|trainer.py:803] 2025-04-26 19:38:22,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:22,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:23,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1813 1925 [WARNING|trainer.py:803] 2025-04-26 19:38:24,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:24,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1794 1814 [WARNING|trainer.py:803] 2025-04-26 19:38:25,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1926 [WARNING|trainer.py:803] 2025-04-26 19:38:26,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:26,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1795 1815 1927 [WARNING|trainer.py:803] 2025-04-26 19:38:28,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:28,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:28,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1928 1796 1816 [WARNING|trainer.py:803] 2025-04-26 19:38:30,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:30,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:38:30,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1929 1817 [WARNING|trainer.py:803] 2025-04-26 19:38:32,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1797 [WARNING|trainer.py:803] 2025-04-26 19:38:32,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:33,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1930 [WARNING|trainer.py:803] 2025-04-26 19:38:34,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1818 1798 [WARNING|trainer.py:803] 2025-04-26 19:38:35,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1931 [WARNING|trainer.py:803] 2025-04-26 19:38:35,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:36,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1819 1932 [WARNING|trainer.py:803] 2025-04-26 19:38:37,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1799 [WARNING|trainer.py:803] 2025-04-26 19:38:38,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:38,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1820 1933 [WARNING|trainer.py:803] 2025-04-26 19:38:39,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:40,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1800 1821 1934 [WARNING|trainer.py:803] 2025-04-26 19:38:41,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:41,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:41,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1801 1822 1935 [WARNING|trainer.py:803] 2025-04-26 19:38:43,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:43,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:38:44,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1802 1936 [WARNING|trainer.py:803] 2025-04-26 19:38:45,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1823 [WARNING|trainer.py:803] 2025-04-26 19:38:45,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:46,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1937 1803 [WARNING|trainer.py:803] 2025-04-26 19:38:47,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:47,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1824 1938 [WARNING|trainer.py:803] 2025-04-26 19:38:49,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:49,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1804 1825 [WARNING|trainer.py:803] 2025-04-26 19:38:50,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1939 [WARNING|trainer.py:803] 2025-04-26 19:38:51,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:51,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1805 1940 1826 [WARNING|trainer.py:803] 2025-04-26 19:38:52,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:38:53,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:53,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1806 1941 1827 [WARNING|trainer.py:803] 2025-04-26 19:38:54,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:55,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:38:55,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1807 1942 [WARNING|trainer.py:803] 2025-04-26 19:38:56,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1828 [WARNING|trainer.py:803] 2025-04-26 19:38:57,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:38:57,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1808 1943 1829 [WARNING|trainer.py:803] 2025-04-26 19:38:59,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:38:59,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:38:59,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1944 1809 [WARNING|trainer.py:803] 2025-04-26 19:39:01,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:01,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1830 [WARNING|trainer.py:803] 2025-04-26 19:39:02,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1810 1945 [WARNING|trainer.py:803] 2025-04-26 19:39:03,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:03,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1831 1946 1811 [WARNING|trainer.py:803] 2025-04-26 19:39:04,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:05,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:05,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1832 1947 1812 [WARNING|trainer.py:803] 2025-04-26 19:39:06,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:07,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:07,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1833 1948 1813 [WARNING|trainer.py:803] 2025-04-26 19:39:09,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:39:09,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:09,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1949 1834 1814 [WARNING|trainer.py:803] 2025-04-26 19:39:11,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:11,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:11,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1950 1835 1815 [WARNING|trainer.py:803] 2025-04-26 19:39:13,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:13,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:13,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1951 1836 [WARNING|trainer.py:803] 2025-04-26 19:39:15,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1816 [WARNING|trainer.py:803] 2025-04-26 19:39:15,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:16,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1952 1837 1817 [WARNING|trainer.py:803] 2025-04-26 19:39:17,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:17,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:18,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1953 1838 1818 [WARNING|trainer.py:803] 2025-04-26 19:39:19,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:39:20,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:20,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1954 1839 [WARNING|trainer.py:803] 2025-04-26 19:39:21,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1819 [WARNING|trainer.py:803] 2025-04-26 19:39:22,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:39:22,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1955 1840 [WARNING|trainer.py:803] 2025-04-26 19:39:23,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1820 [WARNING|trainer.py:803] 2025-04-26 19:39:24,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:24,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1956 1841 [WARNING|trainer.py:803] 2025-04-26 19:39:25,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1821 [WARNING|trainer.py:803] 2025-04-26 19:39:26,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1957 [WARNING|trainer.py:803] 2025-04-26 19:39:26,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:27,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1842 1822 [WARNING|trainer.py:803] 2025-04-26 19:39:28,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:29,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1958 1843 [WARNING|trainer.py:803] 2025-04-26 19:39:30,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1823 [WARNING|trainer.py:803] 2025-04-26 19:39:31,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1959 [WARNING|trainer.py:803] 2025-04-26 19:39:31,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:39:32,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1844 1960 1824 [WARNING|trainer.py:803] 2025-04-26 19:39:33,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:39:34,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:34,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1845 1961 1825 [WARNING|trainer.py:803] 2025-04-26 19:39:35,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:36,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:36,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1962 1846 1826 [WARNING|trainer.py:803] 2025-04-26 19:39:38,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:38,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:38,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1963 1847 1827 [WARNING|trainer.py:803] 2025-04-26 19:39:40,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:39:40,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:39:40,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1964 1848 1828 [WARNING|trainer.py:803] 2025-04-26 19:39:42,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:42,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:39:42,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1965 1849 1829 [WARNING|trainer.py:803] 2025-04-26 19:39:44,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:39:44,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:44,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1966 1850 [WARNING|trainer.py:803] 2025-04-26 19:39:46,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1830 [WARNING|trainer.py:803] 2025-04-26 19:39:46,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:47,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1967 [WARNING|trainer.py:803] 2025-04-26 19:39:48,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1851 1831 [WARNING|trainer.py:803] 2025-04-26 19:39:49,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1968 [WARNING|trainer.py:803] 2025-04-26 19:39:49,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:50,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1832 1852 [WARNING|trainer.py:803] 2025-04-26 19:39:51,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:51,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1969 1833 1853 [WARNING|trainer.py:803] 2025-04-26 19:39:53,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:53,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1970 [WARNING|trainer.py:803] 2025-04-26 19:39:54,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:54,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1834 1854 1971 [WARNING|trainer.py:803] 2025-04-26 19:39:56,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:56,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:39:56,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1972 1835 1855 [WARNING|trainer.py:803] 2025-04-26 19:39:58,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:58,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:39:58,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1973 1836 1856 [WARNING|trainer.py:803] 2025-04-26 19:40:00,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:00,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:00,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1974 1837 [WARNING|trainer.py:803] 2025-04-26 19:40:02,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1857 [WARNING|trainer.py:803] 2025-04-26 19:40:02,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1975 [WARNING|trainer.py:803] 2025-04-26 19:40:02,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:03,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1838 1858 [WARNING|trainer.py:803] 2025-04-26 19:40:04,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1976 [WARNING|trainer.py:803] 2025-04-26 19:40:05,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:05,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1839 1859 1977 [WARNING|trainer.py:803] 2025-04-26 19:40:06,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:40:07,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:07,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1840 1978 1860 [WARNING|trainer.py:803] 2025-04-26 19:40:08,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:09,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:09,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1841 1861 1979 [WARNING|trainer.py:803] 2025-04-26 19:40:10,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:11,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:40:11,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1842 1980 1862 [WARNING|trainer.py:803] 2025-04-26 19:40:13,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:13,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:13,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1843 1981 1863 [WARNING|trainer.py:803] 2025-04-26 19:40:15,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:15,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:15,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1982 1864 1844 [WARNING|trainer.py:803] 2025-04-26 19:40:17,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:18,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:40:18,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 1983 1865 1845 [WARNING|trainer.py:803] 2025-04-26 19:40:19,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:40:20,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:20,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1984 1866 [WARNING|trainer.py:803] 2025-04-26 19:40:21,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1846 [WARNING|trainer.py:803] 2025-04-26 19:40:22,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:23,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1985 1867 [WARNING|trainer.py:803] 2025-04-26 19:40:24,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1847 [WARNING|trainer.py:803] 2025-04-26 19:40:24,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:25,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1986 1868 [WARNING|trainer.py:803] 2025-04-26 19:40:26,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1848 [WARNING|trainer.py:803] 2025-04-26 19:40:26,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1987 [WARNING|trainer.py:803] 2025-04-26 19:40:27,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1869 [WARNING|trainer.py:803] 2025-04-26 19:40:28,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1849 [WARNING|trainer.py:803] 2025-04-26 19:40:28,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:29,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1988 1870 [WARNING|trainer.py:803] 2025-04-26 19:40:30,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1850 [WARNING|trainer.py:803] 2025-04-26 19:40:31,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:31,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1989 1871 [WARNING|trainer.py:803] 2025-04-26 19:40:32,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:33,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1851 1990 [WARNING|trainer.py:803] 2025-04-26 19:40:34,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:34,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1872 1991 [WARNING|trainer.py:803] 2025-04-26 19:40:35,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1852 [WARNING|trainer.py:803] 2025-04-26 19:40:36,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:40:36,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1873 [WARNING|trainer.py:803] 2025-04-26 19:40:37,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1853 1992 [WARNING|trainer.py:803] 2025-04-26 19:40:38,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:38,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1874 [WARNING|trainer.py:803] 2025-04-26 19:40:39,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1993 1854 [WARNING|trainer.py:803] 2025-04-26 19:40:40,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:40,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1875 1994 [WARNING|trainer.py:803] 2025-04-26 19:40:41,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1855 [WARNING|trainer.py:803] 2025-04-26 19:40:42,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1876 [WARNING|trainer.py:803] 2025-04-26 19:40:43,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:43,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1995 1856 [WARNING|trainer.py:803] 2025-04-26 19:40:44,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1877 [WARNING|trainer.py:803] 2025-04-26 19:40:45,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:46,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1996 1857 [WARNING|trainer.py:803] 2025-04-26 19:40:47,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1878 [WARNING|trainer.py:803] 2025-04-26 19:40:47,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:40:48,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1858 1997 1879 [WARNING|trainer.py:803] 2025-04-26 19:40:49,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:40:49,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:50,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1998 1859 1880 [WARNING|trainer.py:803] 2025-04-26 19:40:51,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:40:51,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:52,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1999 1860 1881 [WARNING|trainer.py:803] 2025-04-26 19:40:53,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:54,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:54,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2000 1861 [WARNING|trainer.py:803] 2025-04-26 19:40:55,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1882 [WARNING|trainer.py:803] 2025-04-26 19:40:56,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:40:56,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2001 1862 [WARNING|trainer.py:803] 2025-04-26 19:40:57,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:40:58,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1883 2002 [WARNING|trainer.py:803] 2025-04-26 19:40:59,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1863 [WARNING|trainer.py:803] 2025-04-26 19:40:59,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:00,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1884 2003 1864 [WARNING|trainer.py:803] 2025-04-26 19:41:01,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:01,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:02,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2004 1885 [WARNING|trainer.py:803] 2025-04-26 19:41:03,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1865 [WARNING|trainer.py:803] 2025-04-26 19:41:04,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:04,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2005 1886 [WARNING|trainer.py:803] 2025-04-26 19:41:05,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1866 [WARNING|trainer.py:803] 2025-04-26 19:41:06,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:06,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2006 [WARNING|trainer.py:803] 2025-04-26 19:41:07,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1867 1887 2007 [WARNING|trainer.py:803] 2025-04-26 19:41:08,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:09,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:09,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1868 1888 2008 [WARNING|trainer.py:803] 2025-04-26 19:41:11,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:11,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:11,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1889 1869 2009 [WARNING|trainer.py:803] 2025-04-26 19:41:13,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:13,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:13,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1890 2010 1870 [WARNING|trainer.py:803] 2025-04-26 19:41:15,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:15,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:15,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2011 1891 1871 [WARNING|trainer.py:803] 2025-04-26 19:41:17,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:41:17,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:41:17,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2012 [WARNING|trainer.py:803] 2025-04-26 19:41:19,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1872 1892 [WARNING|trainer.py:803] 2025-04-26 19:41:19,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2013 [WARNING|trainer.py:803] 2025-04-26 19:41:20,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:20,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1873 1893 [WARNING|trainer.py:803] 2025-04-26 19:41:22,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:22,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2014 [WARNING|trainer.py:803] 2025-04-26 19:41:23,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1874 1894 [WARNING|trainer.py:803] 2025-04-26 19:41:24,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2015 [WARNING|trainer.py:803] 2025-04-26 19:41:24,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:25,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1875 1895 2016 [WARNING|trainer.py:803] 2025-04-26 19:41:26,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:41:26,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:26,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1876 1896 2017 [WARNING|trainer.py:803] 2025-04-26 19:41:28,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:28,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:28,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1877 1897 2018 [WARNING|trainer.py:803] 2025-04-26 19:41:30,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:30,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:31,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2019 1878 [WARNING|trainer.py:803] 2025-04-26 19:41:32,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:32,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1898 [WARNING|trainer.py:803] 2025-04-26 19:41:33,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2020 1879 [WARNING|trainer.py:803] 2025-04-26 19:41:34,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:34,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1899 2021 [WARNING|trainer.py:803] 2025-04-26 19:41:36,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1880 [WARNING|trainer.py:803] 2025-04-26 19:41:36,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:41:37,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1900 2022 1881 [WARNING|trainer.py:803] 2025-04-26 19:41:38,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:38,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:39,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1901 2023 1882 [WARNING|trainer.py:803] 2025-04-26 19:41:40,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:41,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:41:41,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1902 2024 [WARNING|trainer.py:803] 2025-04-26 19:41:42,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:42,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1883 2025 [WARNING|trainer.py:803] 2025-04-26 19:41:43,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1903 [WARNING|trainer.py:803] 2025-04-26 19:41:44,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:45,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1884 2026 [WARNING|trainer.py:803] 2025-04-26 19:41:46,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:41:46,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1904 [WARNING|trainer.py:803] 2025-04-26 19:41:47,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2027 1885 1905 [WARNING|trainer.py:803] 2025-04-26 19:41:48,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:48,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:49,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2028 1886 [WARNING|trainer.py:803] 2025-04-26 19:41:50,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1906 [WARNING|trainer.py:803] 2025-04-26 19:41:51,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2029 [WARNING|trainer.py:803] 2025-04-26 19:41:51,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:52,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1887 1907 2030 [WARNING|trainer.py:803] 2025-04-26 19:41:53,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:41:53,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:41:54,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1888 1908 2031 [WARNING|trainer.py:803] 2025-04-26 19:41:55,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:56,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:56,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1889 2032 1909 [WARNING|trainer.py:803] 2025-04-26 19:41:57,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:41:57,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:41:58,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2033 1890 1910 [WARNING|trainer.py:803] 2025-04-26 19:41:59,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:41:59,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:00,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2034 1891 1911 [WARNING|trainer.py:803] 2025-04-26 19:42:01,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:02,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:42:02,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2035 [WARNING|trainer.py:803] 2025-04-26 19:42:03,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1912 1892 [WARNING|trainer.py:803] 2025-04-26 19:42:04,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2036 [WARNING|trainer.py:803] 2025-04-26 19:42:04,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:05,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1913 1893 2037 [WARNING|trainer.py:803] 2025-04-26 19:42:06,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:06,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:07,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1894 1914 2038 [WARNING|trainer.py:803] 2025-04-26 19:42:08,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:09,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:09,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1915 2039 1895 [WARNING|trainer.py:803] 2025-04-26 19:42:11,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:11,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 19:42:11,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes :Yes 2040 1896 1916 [WARNING|trainer.py:803] 2025-04-26 19:42:13,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:13,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:13,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2041 1897 1917 [WARNING|trainer.py:803] 2025-04-26 19:42:15,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:15,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:15,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2042 [WARNING|trainer.py:803] 2025-04-26 19:42:17,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1918 1898 2043 [WARNING|trainer.py:803] 2025-04-26 19:42:18,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:18,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:19,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1919 1899 [WARNING|trainer.py:803] 2025-04-26 19:42:20,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2044 [WARNING|trainer.py:803] 2025-04-26 19:42:20,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:21,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1920 1900 2045 [WARNING|trainer.py:803] 2025-04-26 19:42:22,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:22,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:23,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1921 1901 2046 [WARNING|trainer.py:803] 2025-04-26 19:42:25,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:25,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:25,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2047 1902 1922 [WARNING|trainer.py:803] 2025-04-26 19:42:27,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:27,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:27,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2048 1903 1923 [WARNING|trainer.py:803] 2025-04-26 19:42:28,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:29,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2049 [WARNING|trainer.py:803] 2025-04-26 19:42:29,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:30,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1904 1924 2050 [WARNING|trainer.py:803] 2025-04-26 19:42:32,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:32,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:42:32,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1905 2051 1925 [WARNING|trainer.py:803] 2025-04-26 19:42:34,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:34,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:34,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2052 1906 1926 [WARNING|trainer.py:803] 2025-04-26 19:42:36,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:36,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:36,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2053 1907 [WARNING|trainer.py:803] 2025-04-26 19:42:38,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1927 [WARNING|trainer.py:803] 2025-04-26 19:42:38,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:39,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2054 1908 [WARNING|trainer.py:803] 2025-04-26 19:42:40,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1928 [WARNING|trainer.py:803] 2025-04-26 19:42:40,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:41,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2055 1909 [WARNING|trainer.py:803] 2025-04-26 19:42:42,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:42:42,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1929 2056 [WARNING|trainer.py:803] 2025-04-26 19:42:43,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:44,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1910 1930 [WARNING|trainer.py:803] 2025-04-26 19:42:44,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2057 [WARNING|trainer.py:803] 2025-04-26 19:42:45,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:45,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1911 [WARNING|trainer.py:803] 2025-04-26 19:42:47,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1931 2058 [WARNING|trainer.py:803] 2025-04-26 19:42:47,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:47,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1912 2059 [WARNING|trainer.py:803] 2025-04-26 19:42:49,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 1932 [WARNING|trainer.py:803] 2025-04-26 19:42:49,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:50,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1913 2060 1933 [WARNING|trainer.py:803] 2025-04-26 19:42:51,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:51,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:42:52,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2061 1914 [WARNING|trainer.py:803] 2025-04-26 19:42:53,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1934 [WARNING|trainer.py:803] 2025-04-26 19:42:53,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2062 [WARNING|trainer.py:803] 2025-04-26 19:42:54,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:42:55,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1915 1935 [WARNING|trainer.py:803] 2025-04-26 19:42:55,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2063 [WARNING|trainer.py:803] 2025-04-26 19:42:56,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:42:57,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1916 1936 2064 [WARNING|trainer.py:803] 2025-04-26 19:42:58,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:42:58,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:42:58,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1917 1937 2065 [WARNING|trainer.py:803] 2025-04-26 19:43:00,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:00,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:00,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2066 1918 1938 [WARNING|trainer.py:803] 2025-04-26 19:43:02,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:03,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:03,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2067 1939 1919 [WARNING|trainer.py:803] 2025-04-26 19:43:04,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:05,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:05,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2068 [WARNING|trainer.py:803] 2025-04-26 19:43:06,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1940 1920 [WARNING|trainer.py:803] 2025-04-26 19:43:07,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2069 [WARNING|trainer.py:803] 2025-04-26 19:43:07,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:08,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1941 1921 2070 [WARNING|trainer.py:803] 2025-04-26 19:43:09,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:09,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:09,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2071 1942 1922 [WARNING|trainer.py:803] 2025-04-26 19:43:11,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:11,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:12,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2072 1943 1923 [WARNING|trainer.py:803] 2025-04-26 19:43:14,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:14,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:14,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2073 1944 1924 [WARNING|trainer.py:803] 2025-04-26 19:43:16,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:16,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:17,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2074 1945 [WARNING|trainer.py:803] 2025-04-26 19:43:18,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1925 [WARNING|trainer.py:803] 2025-04-26 19:43:18,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2075 [WARNING|trainer.py:803] 2025-04-26 19:43:19,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1946 [WARNING|trainer.py:803] 2025-04-26 19:43:20,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:20,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1926 2076 [WARNING|trainer.py:803] 2025-04-26 19:43:21,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1947 [WARNING|trainer.py:803] 2025-04-26 19:43:22,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:22,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1927 2077 [WARNING|trainer.py:803] 2025-04-26 19:43:24,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:24,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1948 2078 1928 [WARNING|trainer.py:803] 2025-04-26 19:43:25,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:25,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:26,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1949 2079 [WARNING|trainer.py:803] 2025-04-26 19:43:27,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1929 [WARNING|trainer.py:803] 2025-04-26 19:43:27,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:28,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2080 1950 1930 [WARNING|trainer.py:803] 2025-04-26 19:43:29,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:30,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:30,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2081 1951 1931 [WARNING|trainer.py:803] 2025-04-26 19:43:31,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:32,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:32,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2082 1952 [WARNING|trainer.py:803] 2025-04-26 19:43:33,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1932 [WARNING|trainer.py:803] 2025-04-26 19:43:34,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:34,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2083 [WARNING|trainer.py:803] 2025-04-26 19:43:36,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1933 1953 [WARNING|trainer.py:803] 2025-04-26 19:43:36,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:37,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2084 [WARNING|trainer.py:803] 2025-04-26 19:43:37,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1954 1934 [WARNING|trainer.py:803] 2025-04-26 19:43:39,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:39,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2085 [WARNING|trainer.py:803] 2025-04-26 19:43:40,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1955 1935 [WARNING|trainer.py:803] 2025-04-26 19:43:41,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:43:41,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2086 [WARNING|trainer.py:803] 2025-04-26 19:43:42,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1936 1956 [WARNING|trainer.py:803] 2025-04-26 19:43:43,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2087 [WARNING|trainer.py:803] 2025-04-26 19:43:43,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:44,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1937 1957 2088 [WARNING|trainer.py:803] 2025-04-26 19:43:45,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:43:45,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:43:46,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1938 2089 [WARNING|trainer.py:803] 2025-04-26 19:43:47,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:43:48,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1958 [WARNING|trainer.py:803] 2025-04-26 19:43:49,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1939 2090 [WARNING|trainer.py:803] 2025-04-26 19:43:50,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1959 [WARNING|trainer.py:803] 2025-04-26 19:43:50,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:43:50,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1940 2091 [WARNING|trainer.py:803] 2025-04-26 19:43:52,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1960 [WARNING|trainer.py:803] 2025-04-26 19:43:52,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:43:53,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1941 2092 [WARNING|trainer.py:803] 2025-04-26 19:43:54,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 1961 [WARNING|trainer.py:803] 2025-04-26 19:43:54,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:43:55,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2093 1942 [WARNING|trainer.py:803] 2025-04-26 19:43:56,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:43:56,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1962 2094 [WARNING|trainer.py:803] 2025-04-26 19:43:57,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:43:58,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1943 [WARNING|trainer.py:803] 2025-04-26 19:43:58,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2095 1963 [WARNING|trainer.py:803] 2025-04-26 19:43:59,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:44:00,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1944 2096 [WARNING|trainer.py:803] 2025-04-26 19:44:01,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1964 [WARNING|trainer.py:803] 2025-04-26 19:44:01,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:44:02,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1945 2097 [WARNING|trainer.py:803] 2025-04-26 19:44:03,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:03,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1965 1946 [WARNING|trainer.py:803] 2025-04-26 19:44:04,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2098 [WARNING|trainer.py:803] 2025-04-26 19:44:05,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:05,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1966 [WARNING|trainer.py:803] 2025-04-26 19:44:06,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1947 2099 [WARNING|trainer.py:803] 2025-04-26 19:44:07,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:07,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1967 2100 1948 [WARNING|trainer.py:803] 2025-04-26 19:44:09,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:09,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2101 [WARNING|trainer.py:803] 2025-04-26 19:44:10,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:10,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1968 2102 1949 [WARNING|trainer.py:803] 2025-04-26 19:44:11,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:12,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:12,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2103 [WARNING|trainer.py:803] 2025-04-26 19:44:13,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2104 1969 1950 [WARNING|trainer.py:803] 2025-04-26 19:44:14,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2105 [WARNING|trainer.py:803] 2025-04-26 19:44:14,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:14,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:15,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1970 2106 1951 [WARNING|trainer.py:803] 2025-04-26 19:44:16,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:16,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2107 [WARNING|trainer.py:803] 2025-04-26 19:44:16,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1971 [WARNING|trainer.py:803] 2025-04-26 19:44:17,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2108 1952 [WARNING|trainer.py:803] 2025-04-26 19:44:18,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:18,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2109 [WARNING|trainer.py:803] 2025-04-26 19:44:19,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1972 [WARNING|trainer.py:803] 2025-04-26 19:44:19,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2110 [WARNING|trainer.py:803] 2025-04-26 19:44:20,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1953 [WARNING|trainer.py:803] 2025-04-26 19:44:20,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2111 [WARNING|trainer.py:803] 2025-04-26 19:44:21,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1973 [WARNING|trainer.py:803] 2025-04-26 19:44:22,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2112 [WARNING|trainer.py:803] 2025-04-26 19:44:22,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1954 [WARNING|trainer.py:803] 2025-04-26 19:44:23,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2113 [WARNING|trainer.py:803] 2025-04-26 19:44:23,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1974 [WARNING|trainer.py:803] 2025-04-26 19:44:24,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2114 [WARNING|trainer.py:803] 2025-04-26 19:44:24,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 1955 [WARNING|trainer.py:803] 2025-04-26 19:44:25,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2115 1975 [WARNING|trainer.py:803] 2025-04-26 19:44:26,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:44:26,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2116 [WARNING|trainer.py:803] 2025-04-26 19:44:26,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:27,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1956 2117 1976 [WARNING|trainer.py:803] 2025-04-26 19:44:28,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:28,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2118 [WARNING|trainer.py:803] 2025-04-26 19:44:28,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:29,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1957 2119 1977 [WARNING|trainer.py:803] 2025-04-26 19:44:30,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:44:30,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:44:30,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2120 [WARNING|trainer.py:803] 2025-04-26 19:44:31,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2121 1978 1958 [WARNING|trainer.py:803] 2025-04-26 19:44:33,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:33,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2122 [WARNING|trainer.py:803] 2025-04-26 19:44:33,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:34,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2123 1959 1979 [WARNING|trainer.py:803] 2025-04-26 19:44:35,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2124 [WARNING|trainer.py:803] 2025-04-26 19:44:35,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:35,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:36,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2125 1980 1960 [WARNING|trainer.py:803] 2025-04-26 19:44:37,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2126 [WARNING|trainer.py:803] 2025-04-26 19:44:38,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:38,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:38,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2127 1961 1981 [WARNING|trainer.py:803] 2025-04-26 19:44:39,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2128 [WARNING|trainer.py:803] 2025-04-26 19:44:40,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:40,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:40,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2129 1962 1982 [WARNING|trainer.py:803] 2025-04-26 19:44:42,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2130 [WARNING|trainer.py:803] 2025-04-26 19:44:42,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:42,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:43,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2131 1963 1983 [WARNING|trainer.py:803] 2025-04-26 19:44:44,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2132 [WARNING|trainer.py:803] 2025-04-26 19:44:44,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:44:44,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:44:45,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2133 1984 1964 [WARNING|trainer.py:803] 2025-04-26 19:44:46,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2134 [WARNING|trainer.py:803] 2025-04-26 19:44:47,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:47,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:47,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2135 1965 [WARNING|trainer.py:803] 2025-04-26 19:44:48,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 1985 2136 [WARNING|trainer.py:803] 2025-04-26 19:44:49,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:49,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:49,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2137 1966 [WARNING|trainer.py:803] 2025-04-26 19:44:50,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1986 2138 [WARNING|trainer.py:803] 2025-04-26 19:44:51,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:51,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:51,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2139 [WARNING|trainer.py:803] 2025-04-26 19:44:53,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1967 1987 2140 [WARNING|trainer.py:803] 2025-04-26 19:44:54,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:54,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:54,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2141 [WARNING|trainer.py:803] 2025-04-26 19:44:55,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1968 2142 1988 [WARNING|trainer.py:803] 2025-04-26 19:44:56,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:44:56,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:44:56,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2143 [WARNING|trainer.py:803] 2025-04-26 19:44:57,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2144 1989 1969 [WARNING|trainer.py:803] 2025-04-26 19:44:58,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:44:59,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2145 [WARNING|trainer.py:803] 2025-04-26 19:44:59,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:44:59,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2146 1970 1990 [WARNING|trainer.py:803] 2025-04-26 19:45:00,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:45:01,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2147 [WARNING|trainer.py:803] 2025-04-26 19:45:01,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:02,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1971 2148 1991 [WARNING|trainer.py:803] 2025-04-26 19:45:03,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:03,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:45:03,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2149 1972 [WARNING|trainer.py:803] 2025-04-26 19:45:04,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2150 [WARNING|trainer.py:803] 2025-04-26 19:45:05,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1992 [WARNING|trainer.py:803] 2025-04-26 19:45:05,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2151 [WARNING|trainer.py:803] 2025-04-26 19:45:05,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1973 [WARNING|trainer.py:803] 2025-04-26 19:45:06,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2152 [WARNING|trainer.py:803] 2025-04-26 19:45:07,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1993 [WARNING|trainer.py:803] 2025-04-26 19:45:07,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2153 [WARNING|trainer.py:803] 2025-04-26 19:45:08,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1974 [WARNING|trainer.py:803] 2025-04-26 19:45:08,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2154 [WARNING|trainer.py:803] 2025-04-26 19:45:09,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:45:09,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1994 2155 1975 [WARNING|trainer.py:803] 2025-04-26 19:45:10,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:45:10,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2156 [WARNING|trainer.py:803] 2025-04-26 19:45:11,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:11,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1995 2157 1976 [WARNING|trainer.py:803] 2025-04-26 19:45:12,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:13,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2158 [WARNING|trainer.py:803] 2025-04-26 19:45:13,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:14,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2159 1996 1977 [WARNING|trainer.py:803] 2025-04-26 19:45:15,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:15,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2160 [WARNING|trainer.py:803] 2025-04-26 19:45:15,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:16,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2161 1978 1997 [WARNING|trainer.py:803] 2025-04-26 19:45:17,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2162 [WARNING|trainer.py:803] 2025-04-26 19:45:17,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:18,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:18,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2163 1979 1998 [WARNING|trainer.py:803] 2025-04-26 19:45:19,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2164 [WARNING|trainer.py:803] 2025-04-26 19:45:20,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:20,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:45:20,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2165 1999 1980 [WARNING|trainer.py:803] 2025-04-26 19:45:21,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2166 [WARNING|trainer.py:803] 2025-04-26 19:45:22,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:22,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:22,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2167 2000 [WARNING|trainer.py:803] 2025-04-26 19:45:24,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 1981 2168 [WARNING|trainer.py:803] 2025-04-26 19:45:24,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:25,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:45:25,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2169 1982 [WARNING|trainer.py:803] 2025-04-26 19:45:26,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2001 2170 [WARNING|trainer.py:803] 2025-04-26 19:45:27,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:27,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:27,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2171 [WARNING|trainer.py:803] 2025-04-26 19:45:28,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2002 1983 2172 [WARNING|trainer.py:803] 2025-04-26 19:45:29,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:29,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:45:29,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2173 [WARNING|trainer.py:803] 2025-04-26 19:45:30,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 1984 2003 2174 [WARNING|trainer.py:803] 2025-04-26 19:45:31,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:31,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:31,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2175 [WARNING|trainer.py:803] 2025-04-26 19:45:33,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2176 2004 1985 [WARNING|trainer.py:803] 2025-04-26 19:45:34,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:45:34,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:34,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2177 [WARNING|trainer.py:803] 2025-04-26 19:45:35,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2178 1986 2005 [WARNING|trainer.py:803] 2025-04-26 19:45:36,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:36,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:36,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2179 [WARNING|trainer.py:803] 2025-04-26 19:45:37,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2006 2180 1987 [WARNING|trainer.py:803] 2025-04-26 19:45:38,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:38,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:38,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2181 [WARNING|trainer.py:803] 2025-04-26 19:45:39,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2007 2182 1988 [WARNING|trainer.py:803] 2025-04-26 19:45:40,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:45:40,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2183 [WARNING|trainer.py:803] 2025-04-26 19:45:41,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:42,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2008 2184 1989 [WARNING|trainer.py:803] 2025-04-26 19:45:42,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:43,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2185 [WARNING|trainer.py:803] 2025-04-26 19:45:43,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2009 [WARNING|trainer.py:803] 2025-04-26 19:45:44,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2186 [WARNING|trainer.py:803] 2025-04-26 19:45:45,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1990 [WARNING|trainer.py:803] 2025-04-26 19:45:45,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2187 [WARNING|trainer.py:803] 2025-04-26 19:45:46,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2010 [WARNING|trainer.py:803] 2025-04-26 19:45:46,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2188 [WARNING|trainer.py:803] 2025-04-26 19:45:47,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 1991 [WARNING|trainer.py:803] 2025-04-26 19:45:47,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2189 [WARNING|trainer.py:803] 2025-04-26 19:45:48,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2011 [WARNING|trainer.py:803] 2025-04-26 19:45:48,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2190 [WARNING|trainer.py:803] 2025-04-26 19:45:49,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 1992 [WARNING|trainer.py:803] 2025-04-26 19:45:49,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2191 [WARNING|trainer.py:803] 2025-04-26 19:45:50,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2012 [WARNING|trainer.py:803] 2025-04-26 19:45:50,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2192 [WARNING|trainer.py:803] 2025-04-26 19:45:51,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:52,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 1993 2193 2013 [WARNING|trainer.py:803] 2025-04-26 19:45:52,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:53,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2194 [WARNING|trainer.py:803] 2025-04-26 19:45:53,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:45:54,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 1994 2195 2014 [WARNING|trainer.py:803] 2025-04-26 19:45:55,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:55,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2196 [WARNING|trainer.py:803] 2025-04-26 19:45:56,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:56,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2197 1995 2015 [WARNING|trainer.py:803] 2025-04-26 19:45:57,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:45:57,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2198 [WARNING|trainer.py:803] 2025-04-26 19:45:58,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:45:58,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2199 1996 2016 [WARNING|trainer.py:803] 2025-04-26 19:45:59,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2200 [WARNING|trainer.py:803] 2025-04-26 19:46:00,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:00,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:00,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2201 2017 1997 [WARNING|trainer.py:803] 2025-04-26 19:46:02,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 2202 [WARNING|trainer.py:803] 2025-04-26 19:46:02,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:02,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:03,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2203 2018 1998 [WARNING|trainer.py:803] 2025-04-26 19:46:04,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2204 [WARNING|trainer.py:803] 2025-04-26 19:46:05,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:05,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:46:05,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2205 2019 1999 [WARNING|trainer.py:803] 2025-04-26 19:46:06,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:07,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2206 [WARNING|trainer.py:803] 2025-04-26 19:46:07,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:07,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2207 2020 2000 [WARNING|trainer.py:803] 2025-04-26 19:46:09,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2208 [WARNING|trainer.py:803] 2025-04-26 19:46:09,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:09,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:10,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2209 2021 2001 [WARNING|trainer.py:803] 2025-04-26 19:46:11,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:11,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2210 [WARNING|trainer.py:803] 2025-04-26 19:46:11,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:12,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2211 2022 2002 [WARNING|trainer.py:803] 2025-04-26 19:46:13,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:13,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2212 [WARNING|trainer.py:803] 2025-04-26 19:46:14,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:14,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2213 2023 2003 [WARNING|trainer.py:803] 2025-04-26 19:46:15,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2214 [WARNING|trainer.py:803] 2025-04-26 19:46:16,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:16,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:16,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2215 2024 2004 [WARNING|trainer.py:803] 2025-04-26 19:46:18,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2216 [WARNING|trainer.py:803] 2025-04-26 19:46:18,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:18,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:19,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2217 2025 2005 [WARNING|trainer.py:803] 2025-04-26 19:46:20,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:20,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2218 [WARNING|trainer.py:803] 2025-04-26 19:46:21,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:21,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2026 2219 2006 [WARNING|trainer.py:803] 2025-04-26 19:46:22,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:22,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2220 [WARNING|trainer.py:803] 2025-04-26 19:46:23,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:23,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2027 2221 2007 [WARNING|trainer.py:803] 2025-04-26 19:46:24,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:25,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:25,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2222 2028 [WARNING|trainer.py:803] 2025-04-26 19:46:26,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2223 2008 [WARNING|trainer.py:803] 2025-04-26 19:46:27,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:27,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:27,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2224 2029 [WARNING|trainer.py:803] 2025-04-26 19:46:28,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2009 2225 [WARNING|trainer.py:803] 2025-04-26 19:46:29,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:29,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:29,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2226 2030 2010 [WARNING|trainer.py:803] 2025-04-26 19:46:30,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2227 [WARNING|trainer.py:803] 2025-04-26 19:46:31,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:31,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:31,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2228 2031 2011 [WARNING|trainer.py:803] 2025-04-26 19:46:33,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2229 [WARNING|trainer.py:803] 2025-04-26 19:46:33,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:33,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:34,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2230 2032 2012 [WARNING|trainer.py:803] 2025-04-26 19:46:35,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:46:35,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2231 [WARNING|trainer.py:803] 2025-04-26 19:46:35,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:36,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2033 2232 2013 [WARNING|trainer.py:803] 2025-04-26 19:46:37,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:37,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2233 [WARNING|trainer.py:803] 2025-04-26 19:46:37,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2034 [WARNING|trainer.py:803] 2025-04-26 19:46:38,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2234 [WARNING|trainer.py:803] 2025-04-26 19:46:39,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2014 [WARNING|trainer.py:803] 2025-04-26 19:46:39,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2235 [WARNING|trainer.py:803] 2025-04-26 19:46:40,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2035 [WARNING|trainer.py:803] 2025-04-26 19:46:41,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2236 [WARNING|trainer.py:803] 2025-04-26 19:46:41,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2015 [WARNING|trainer.py:803] 2025-04-26 19:46:42,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2237 [WARNING|trainer.py:803] 2025-04-26 19:46:42,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2036 [WARNING|trainer.py:803] 2025-04-26 19:46:43,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2238 2016 [WARNING|trainer.py:803] 2025-04-26 19:46:43,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:46:44,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:44,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2239 [WARNING|trainer.py:803] 2025-04-26 19:46:45,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2037 2240 2017 [WARNING|trainer.py:803] 2025-04-26 19:46:46,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:46,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2241 [WARNING|trainer.py:803] 2025-04-26 19:46:46,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:47,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2038 2242 2018 [WARNING|trainer.py:803] 2025-04-26 19:46:48,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:48,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2243 [WARNING|trainer.py:803] 2025-04-26 19:46:49,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2039 [WARNING|trainer.py:803] 2025-04-26 19:46:49,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2244 2019 [WARNING|trainer.py:803] 2025-04-26 19:46:50,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:50,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:46:51,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2245 2040 [WARNING|trainer.py:803] 2025-04-26 19:46:52,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2246 [WARNING|trainer.py:803] 2025-04-26 19:46:52,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2020 [WARNING|trainer.py:803] 2025-04-26 19:46:53,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2247 [WARNING|trainer.py:803] 2025-04-26 19:46:53,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2041 [WARNING|trainer.py:803] 2025-04-26 19:46:54,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2248 [WARNING|trainer.py:803] 2025-04-26 19:46:54,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2021 [WARNING|trainer.py:803] 2025-04-26 19:46:55,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2249 [WARNING|trainer.py:803] 2025-04-26 19:46:55,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2042 [WARNING|trainer.py:803] 2025-04-26 19:46:56,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2250 2022 [WARNING|trainer.py:803] 2025-04-26 19:46:57,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:46:57,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2251 [WARNING|trainer.py:803] 2025-04-26 19:46:57,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2043 [WARNING|trainer.py:803] 2025-04-26 19:46:58,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2252 [WARNING|trainer.py:803] 2025-04-26 19:46:59,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2023 [WARNING|trainer.py:803] 2025-04-26 19:46:59,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2253 [WARNING|trainer.py:803] 2025-04-26 19:47:00,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2044 [WARNING|trainer.py:803] 2025-04-26 19:47:00,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2254 [WARNING|trainer.py:803] 2025-04-26 19:47:01,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2024 [WARNING|trainer.py:803] 2025-04-26 19:47:01,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2255 [WARNING|trainer.py:803] 2025-04-26 19:47:02,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2045 [WARNING|trainer.py:803] 2025-04-26 19:47:03,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2256 [WARNING|trainer.py:803] 2025-04-26 19:47:03,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2025 [WARNING|trainer.py:803] 2025-04-26 19:47:04,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2257 [WARNING|trainer.py:803] 2025-04-26 19:47:04,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:05,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2046 2258 [WARNING|trainer.py:803] 2025-04-26 19:47:05,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2026 [WARNING|trainer.py:803] 2025-04-26 19:47:06,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2259 [WARNING|trainer.py:803] 2025-04-26 19:47:06,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:47:07,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2047 2260 [WARNING|trainer.py:803] 2025-04-26 19:47:08,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:08,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2027 2261 [WARNING|trainer.py:803] 2025-04-26 19:47:09,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:09,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2048 2262 [WARNING|trainer.py:803] 2025-04-26 19:47:10,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:10,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2028 2263 [WARNING|trainer.py:803] 2025-04-26 19:47:11,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2049 [WARNING|trainer.py:803] 2025-04-26 19:47:11,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2264 [WARNING|trainer.py:803] 2025-04-26 19:47:12,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:47:12,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2029 2265 [WARNING|trainer.py:803] 2025-04-26 19:47:13,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:13,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2050 2266 [WARNING|trainer.py:803] 2025-04-26 19:47:14,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2030 [WARNING|trainer.py:803] 2025-04-26 19:47:14,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2267 [WARNING|trainer.py:803] 2025-04-26 19:47:15,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2051 [WARNING|trainer.py:803] 2025-04-26 19:47:15,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2268 [WARNING|trainer.py:803] 2025-04-26 19:47:16,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2031 [WARNING|trainer.py:803] 2025-04-26 19:47:16,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2269 [WARNING|trainer.py:803] 2025-04-26 19:47:17,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2052 [WARNING|trainer.py:803] 2025-04-26 19:47:17,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2270 [WARNING|trainer.py:803] 2025-04-26 19:47:18,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2032 [WARNING|trainer.py:803] 2025-04-26 19:47:18,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2271 [WARNING|trainer.py:803] 2025-04-26 19:47:19,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2053 [WARNING|trainer.py:803] 2025-04-26 19:47:20,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2272 2033 [WARNING|trainer.py:803] 2025-04-26 19:47:20,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:21,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2273 [WARNING|trainer.py:803] 2025-04-26 19:47:21,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2054 [WARNING|trainer.py:803] 2025-04-26 19:47:22,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2274 2034 [WARNING|trainer.py:803] 2025-04-26 19:47:22,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:47:23,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2275 [WARNING|trainer.py:803] 2025-04-26 19:47:23,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2055 [WARNING|trainer.py:803] 2025-04-26 19:47:24,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2276 2035 [WARNING|trainer.py:803] 2025-04-26 19:47:25,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:47:25,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2277 [WARNING|trainer.py:803] 2025-04-26 19:47:25,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2056 [WARNING|trainer.py:803] 2025-04-26 19:47:26,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2278 2036 [WARNING|trainer.py:803] 2025-04-26 19:47:27,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:47:27,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2279 [WARNING|trainer.py:803] 2025-04-26 19:47:27,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2057 [WARNING|trainer.py:803] 2025-04-26 19:47:28,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2280 [WARNING|trainer.py:803] 2025-04-26 19:47:29,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2037 [WARNING|trainer.py:803] 2025-04-26 19:47:29,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2281 [WARNING|trainer.py:803] 2025-04-26 19:47:30,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:47:30,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2058 2282 2038 [WARNING|trainer.py:803] 2025-04-26 19:47:31,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:47:31,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2283 [WARNING|trainer.py:803] 2025-04-26 19:47:32,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:32,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2059 2284 2039 [WARNING|trainer.py:803] 2025-04-26 19:47:33,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:47:34,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2285 [WARNING|trainer.py:803] 2025-04-26 19:47:34,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2060 [WARNING|trainer.py:803] 2025-04-26 19:47:35,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2286 2040 [WARNING|trainer.py:803] 2025-04-26 19:47:35,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:47:36,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2287 [WARNING|trainer.py:803] 2025-04-26 19:47:36,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2061 [WARNING|trainer.py:803] 2025-04-26 19:47:37,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2288 2041 [WARNING|trainer.py:803] 2025-04-26 19:47:37,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:38,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2289 [WARNING|trainer.py:803] 2025-04-26 19:47:38,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2062 [WARNING|trainer.py:803] 2025-04-26 19:47:39,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2290 [WARNING|trainer.py:803] 2025-04-26 19:47:40,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2042 [WARNING|trainer.py:803] 2025-04-26 19:47:40,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2291 [WARNING|trainer.py:803] 2025-04-26 19:47:41,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:41,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2063 2292 2043 [WARNING|trainer.py:803] 2025-04-26 19:47:42,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:42,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2293 [WARNING|trainer.py:803] 2025-04-26 19:47:43,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2064 [WARNING|trainer.py:803] 2025-04-26 19:47:43,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2294 [WARNING|trainer.py:803] 2025-04-26 19:47:44,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:44,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2044 2295 [WARNING|trainer.py:803] 2025-04-26 19:47:45,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:47:45,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2065 2296 2045 [WARNING|trainer.py:803] 2025-04-26 19:47:46,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:47:46,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2297 [WARNING|trainer.py:803] 2025-04-26 19:47:47,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2066 [WARNING|trainer.py:803] 2025-04-26 19:47:48,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2298 [WARNING|trainer.py:803] 2025-04-26 19:47:48,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2046 [WARNING|trainer.py:803] 2025-04-26 19:47:49,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2299 [WARNING|trainer.py:803] 2025-04-26 19:47:49,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2067 [WARNING|trainer.py:803] 2025-04-26 19:47:50,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2300 [WARNING|trainer.py:803] 2025-04-26 19:47:50,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2047 [WARNING|trainer.py:803] 2025-04-26 19:47:51,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2301 [WARNING|trainer.py:803] 2025-04-26 19:47:52,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2068 [WARNING|trainer.py:803] 2025-04-26 19:47:52,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2302 [WARNING|trainer.py:803] 2025-04-26 19:47:52,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2048 [WARNING|trainer.py:803] 2025-04-26 19:47:53,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2303 [WARNING|trainer.py:803] 2025-04-26 19:47:54,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2069 [WARNING|trainer.py:803] 2025-04-26 19:47:54,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2304 [WARNING|trainer.py:803] 2025-04-26 19:47:55,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2049 [WARNING|trainer.py:803] 2025-04-26 19:47:55,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2305 [WARNING|trainer.py:803] 2025-04-26 19:47:56,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2070 [WARNING|trainer.py:803] 2025-04-26 19:47:56,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2306 [WARNING|trainer.py:803] 2025-04-26 19:47:57,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2050 [WARNING|trainer.py:803] 2025-04-26 19:47:57,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2307 [WARNING|trainer.py:803] 2025-04-26 19:47:58,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2071 [WARNING|trainer.py:803] 2025-04-26 19:47:58,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2308 [WARNING|trainer.py:803] 2025-04-26 19:47:59,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2051 [WARNING|trainer.py:803] 2025-04-26 19:47:59,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2309 [WARNING|trainer.py:803] 2025-04-26 19:48:00,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2072 [WARNING|trainer.py:803] 2025-04-26 19:48:00,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2310 2052 [WARNING|trainer.py:803] 2025-04-26 19:48:01,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:02,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2311 [WARNING|trainer.py:803] 2025-04-26 19:48:02,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2073 [WARNING|trainer.py:803] 2025-04-26 19:48:03,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2312 2053 [WARNING|trainer.py:803] 2025-04-26 19:48:03,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:04,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2313 [WARNING|trainer.py:803] 2025-04-26 19:48:04,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:05,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2074 2314 2054 [WARNING|trainer.py:803] 2025-04-26 19:48:06,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:06,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2315 [WARNING|trainer.py:803] 2025-04-26 19:48:06,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:07,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2075 2316 2055 [WARNING|trainer.py:803] 2025-04-26 19:48:08,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:08,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:08,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2317 [WARNING|trainer.py:803] 2025-04-26 19:48:09,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2318 2076 2056 [WARNING|trainer.py:803] 2025-04-26 19:48:10,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:10,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:10,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2319 [WARNING|trainer.py:803] 2025-04-26 19:48:11,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2077 2320 2057 [WARNING|trainer.py:803] 2025-04-26 19:48:12,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:12,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:12,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2321 2078 [WARNING|trainer.py:803] 2025-04-26 19:48:14,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2322 2058 [WARNING|trainer.py:803] 2025-04-26 19:48:14,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:15,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:15,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2323 2079 [WARNING|trainer.py:803] 2025-04-26 19:48:16,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2324 2059 [WARNING|trainer.py:803] 2025-04-26 19:48:16,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:17,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:17,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2325 2080 [WARNING|trainer.py:803] 2025-04-26 19:48:18,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2326 2060 [WARNING|trainer.py:803] 2025-04-26 19:48:19,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:19,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:19,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2327 [WARNING|trainer.py:803] 2025-04-26 19:48:20,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2081 2061 2328 [WARNING|trainer.py:803] 2025-04-26 19:48:21,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:21,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:21,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2329 [WARNING|trainer.py:803] 2025-04-26 19:48:22,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2062 2330 2082 [WARNING|trainer.py:803] 2025-04-26 19:48:23,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:23,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:23,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2331 [WARNING|trainer.py:803] 2025-04-26 19:48:24,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2063 2332 2083 [WARNING|trainer.py:803] 2025-04-26 19:48:25,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:25,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:26,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2333 2064 [WARNING|trainer.py:803] 2025-04-26 19:48:26,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2334 2084 [WARNING|trainer.py:803] 2025-04-26 19:48:27,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:28,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:28,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2335 [WARNING|trainer.py:803] 2025-04-26 19:48:29,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2336 2065 2085 [WARNING|trainer.py:803] 2025-04-26 19:48:30,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:30,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2337 [WARNING|trainer.py:803] 2025-04-26 19:48:30,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:31,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2066 2338 [WARNING|trainer.py:803] 2025-04-26 19:48:32,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:32,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2086 2339 [WARNING|trainer.py:803] 2025-04-26 19:48:33,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:33,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2067 2340 [WARNING|trainer.py:803] 2025-04-26 19:48:34,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:34,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2341 2087 2068 [WARNING|trainer.py:803] 2025-04-26 19:48:35,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:35,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2342 [WARNING|trainer.py:803] 2025-04-26 19:48:36,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:36,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2343 2088 2069 [WARNING|trainer.py:803] 2025-04-26 19:48:37,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:48:37,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2344 [WARNING|trainer.py:803] 2025-04-26 19:48:38,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:38,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2089 2345 2070 [WARNING|trainer.py:803] 2025-04-26 19:48:39,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:39,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2346 [WARNING|trainer.py:803] 2025-04-26 19:48:40,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:48:40,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2347 2090 2071 [WARNING|trainer.py:803] 2025-04-26 19:48:41,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2348 [WARNING|trainer.py:803] 2025-04-26 19:48:42,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:42,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:42,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2349 2091 [WARNING|trainer.py:803] 2025-04-26 19:48:43,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2350 2072 [WARNING|trainer.py:803] 2025-04-26 19:48:44,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:44,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:48:44,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2351 2092 [WARNING|trainer.py:803] 2025-04-26 19:48:45,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2352 2073 [WARNING|trainer.py:803] 2025-04-26 19:48:46,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:48:47,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2353 [WARNING|trainer.py:803] 2025-04-26 19:48:47,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2093 [WARNING|trainer.py:803] 2025-04-26 19:48:48,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2354 [WARNING|trainer.py:803] 2025-04-26 19:48:48,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2074 [WARNING|trainer.py:803] 2025-04-26 19:48:49,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2355 [WARNING|trainer.py:803] 2025-04-26 19:48:49,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2094 [WARNING|trainer.py:803] 2025-04-26 19:48:50,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2356 [WARNING|trainer.py:803] 2025-04-26 19:48:50,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2075 [WARNING|trainer.py:803] 2025-04-26 19:48:51,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2357 [WARNING|trainer.py:803] 2025-04-26 19:48:51,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2095 [WARNING|trainer.py:803] 2025-04-26 19:48:52,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2358 [WARNING|trainer.py:803] 2025-04-26 19:48:52,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2076 [WARNING|trainer.py:803] 2025-04-26 19:48:53,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2359 [WARNING|trainer.py:803] 2025-04-26 19:48:54,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2096 [WARNING|trainer.py:803] 2025-04-26 19:48:54,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2360 [WARNING|trainer.py:803] 2025-04-26 19:48:55,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2077 [WARNING|trainer.py:803] 2025-04-26 19:48:55,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2361 [WARNING|trainer.py:803] 2025-04-26 19:48:56,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2097 [WARNING|trainer.py:803] 2025-04-26 19:48:56,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2362 [WARNING|trainer.py:803] 2025-04-26 19:48:57,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2078 [WARNING|trainer.py:803] 2025-04-26 19:48:57,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2363 [WARNING|trainer.py:803] 2025-04-26 19:48:58,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2098 [WARNING|trainer.py:803] 2025-04-26 19:48:58,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2364 [WARNING|trainer.py:803] 2025-04-26 19:48:59,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2079 [WARNING|trainer.py:803] 2025-04-26 19:48:59,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2365 [WARNING|trainer.py:803] 2025-04-26 19:49:00,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:00,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2099 2366 2080 [WARNING|trainer.py:803] 2025-04-26 19:49:01,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:01,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2367 [WARNING|trainer.py:803] 2025-04-26 19:49:02,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:02,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2100 2368 2081 [WARNING|trainer.py:803] 2025-04-26 19:49:03,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:49:04,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2369 2101 [WARNING|trainer.py:803] 2025-04-26 19:49:04,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:05,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:05,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2370 2102 [WARNING|trainer.py:803] 2025-04-26 19:49:06,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2082 2371 [WARNING|trainer.py:803] 2025-04-26 19:49:06,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2103 [WARNING|trainer.py:803] 2025-04-26 19:49:07,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:07,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2372 [WARNING|trainer.py:803] 2025-04-26 19:49:07,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2104 [WARNING|trainer.py:803] 2025-04-26 19:49:08,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2373 2083 [WARNING|trainer.py:803] 2025-04-26 19:49:09,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:49:09,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:09,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2105 2374 [WARNING|trainer.py:803] 2025-04-26 19:49:10,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:10,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2375 2084 2106 [WARNING|trainer.py:803] 2025-04-26 19:49:11,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:11,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:11,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2376 2107 [WARNING|trainer.py:803] 2025-04-26 19:49:12,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2377 [WARNING|trainer.py:803] 2025-04-26 19:49:13,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2085 2108 [WARNING|trainer.py:803] 2025-04-26 19:49:13,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2378 [WARNING|trainer.py:803] 2025-04-26 19:49:13,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:14,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:14,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2109 2379 [WARNING|trainer.py:803] 2025-04-26 19:49:15,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2086 [WARNING|trainer.py:803] 2025-04-26 19:49:15,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2380 2110 [WARNING|trainer.py:803] 2025-04-26 19:49:16,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:16,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:16,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2381 2111 [WARNING|trainer.py:803] 2025-04-26 19:49:17,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2382 [WARNING|trainer.py:803] 2025-04-26 19:49:18,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2087 2112 [WARNING|trainer.py:803] 2025-04-26 19:49:18,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:19,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2383 [WARNING|trainer.py:803] 2025-04-26 19:49:19,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2113 [WARNING|trainer.py:803] 2025-04-26 19:49:19,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2384 2088 [WARNING|trainer.py:803] 2025-04-26 19:49:20,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:49:21,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:21,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2114 2385 [WARNING|trainer.py:803] 2025-04-26 19:49:21,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:22,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2089 2386 2115 [WARNING|trainer.py:803] 2025-04-26 19:49:23,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:49:23,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:23,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2387 2116 [WARNING|trainer.py:803] 2025-04-26 19:49:24,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2388 [WARNING|trainer.py:803] 2025-04-26 19:49:24,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2117 2090 [WARNING|trainer.py:803] 2025-04-26 19:49:25,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2389 [WARNING|trainer.py:803] 2025-04-26 19:49:25,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:25,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:26,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2118 2390 [WARNING|trainer.py:803] 2025-04-26 19:49:27,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2091 [WARNING|trainer.py:803] 2025-04-26 19:49:27,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2391 2119 [WARNING|trainer.py:803] 2025-04-26 19:49:28,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:28,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:28,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2392 2120 [WARNING|trainer.py:803] 2025-04-26 19:49:29,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2092 [WARNING|trainer.py:803] 2025-04-26 19:49:29,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2393 2121 [WARNING|trainer.py:803] 2025-04-26 19:49:30,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:30,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2394 [WARNING|trainer.py:803] 2025-04-26 19:49:30,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2122 2093 [WARNING|trainer.py:803] 2025-04-26 19:49:31,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2395 [WARNING|trainer.py:803] 2025-04-26 19:49:32,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:32,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:32,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2123 2396 2094 [WARNING|trainer.py:803] 2025-04-26 19:49:33,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:33,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2397 2124 [WARNING|trainer.py:803] 2025-04-26 19:49:34,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:34,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:34,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2398 2125 2095 [WARNING|trainer.py:803] 2025-04-26 19:49:36,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:36,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2399 [WARNING|trainer.py:803] 2025-04-26 19:49:36,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2126 [WARNING|trainer.py:803] 2025-04-26 19:49:37,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2400 [WARNING|trainer.py:803] 2025-04-26 19:49:37,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2096 2127 [WARNING|trainer.py:803] 2025-04-26 19:49:38,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:38,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:49:38,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2128 2401 2097 [WARNING|trainer.py:803] 2025-04-26 19:49:40,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:40,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2129 [WARNING|trainer.py:803] 2025-04-26 19:49:40,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:49:41,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2402 2130 2098 [WARNING|trainer.py:803] 2025-04-26 19:49:42,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:49:42,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:49:43,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2131 2403 [WARNING|trainer.py:803] 2025-04-26 19:49:44,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2132 [WARNING|trainer.py:803] 2025-04-26 19:49:44,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2099 [WARNING|trainer.py:803] 2025-04-26 19:49:45,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:45,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2404 2133 [WARNING|trainer.py:803] 2025-04-26 19:49:46,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:46,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2100 2134 [WARNING|trainer.py:803] 2025-04-26 19:49:47,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2405 [WARNING|trainer.py:803] 2025-04-26 19:49:48,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2101 [WARNING|trainer.py:803] 2025-04-26 19:49:48,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2135 [WARNING|trainer.py:803] 2025-04-26 19:49:49,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2102 [WARNING|trainer.py:803] 2025-04-26 19:49:49,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2406 2136 [WARNING|trainer.py:803] 2025-04-26 19:49:50,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:49:50,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:50,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2103 2137 [WARNING|trainer.py:803] 2025-04-26 19:49:51,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2104 [WARNING|trainer.py:803] 2025-04-26 19:49:52,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2138 [WARNING|trainer.py:803] 2025-04-26 19:49:53,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:49:53,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2105 2139 [WARNING|trainer.py:803] 2025-04-26 19:49:54,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:54,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2106 2140 2407 [WARNING|trainer.py:803] 2025-04-26 19:49:55,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:56,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2107 [WARNING|trainer.py:803] 2025-04-26 19:49:56,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2141 [WARNING|trainer.py:803] 2025-04-26 19:49:56,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2108 [WARNING|trainer.py:803] 2025-04-26 19:49:57,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2408 2142 [WARNING|trainer.py:803] 2025-04-26 19:49:58,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:58,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:49:58,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2109 2143 [WARNING|trainer.py:803] 2025-04-26 19:49:59,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2409 2110 [WARNING|trainer.py:803] 2025-04-26 19:50:00,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:50:00,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2144 [WARNING|trainer.py:803] 2025-04-26 19:50:00,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2111 [WARNING|trainer.py:803] 2025-04-26 19:50:01,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2410 2145 [WARNING|trainer.py:803] 2025-04-26 19:50:02,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:02,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2112 [WARNING|trainer.py:803] 2025-04-26 19:50:02,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2146 2411 [WARNING|trainer.py:803] 2025-04-26 19:50:03,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2113 [WARNING|trainer.py:803] 2025-04-26 19:50:03,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:04,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2147 [WARNING|trainer.py:803] 2025-04-26 19:50:04,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2114 [WARNING|trainer.py:803] 2025-04-26 19:50:05,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2412 2148 [WARNING|trainer.py:803] 2025-04-26 19:50:05,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2115 [WARNING|trainer.py:803] 2025-04-26 19:50:06,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:50:06,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2149 [WARNING|trainer.py:803] 2025-04-26 19:50:07,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2116 2413 [WARNING|trainer.py:803] 2025-04-26 19:50:07,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2150 [WARNING|trainer.py:803] 2025-04-26 19:50:08,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:08,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2117 [WARNING|trainer.py:803] 2025-04-26 19:50:09,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2151 [WARNING|trainer.py:803] 2025-04-26 19:50:09,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2414 2118 [WARNING|trainer.py:803] 2025-04-26 19:50:10,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:10,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2152 [WARNING|trainer.py:803] 2025-04-26 19:50:11,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2119 2415 [WARNING|trainer.py:803] 2025-04-26 19:50:11,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2153 [WARNING|trainer.py:803] 2025-04-26 19:50:12,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:50:12,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2120 [WARNING|trainer.py:803] 2025-04-26 19:50:13,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2154 [WARNING|trainer.py:803] 2025-04-26 19:50:13,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2121 2416 [WARNING|trainer.py:803] 2025-04-26 19:50:14,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2155 [WARNING|trainer.py:803] 2025-04-26 19:50:14,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:15,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2122 [WARNING|trainer.py:803] 2025-04-26 19:50:15,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2156 2417 [WARNING|trainer.py:803] 2025-04-26 19:50:16,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2123 [WARNING|trainer.py:803] 2025-04-26 19:50:16,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:17,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2157 [WARNING|trainer.py:803] 2025-04-26 19:50:17,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2418 2124 [WARNING|trainer.py:803] 2025-04-26 19:50:18,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2158 [WARNING|trainer.py:803] 2025-04-26 19:50:18,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:18,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2125 [WARNING|trainer.py:803] 2025-04-26 19:50:19,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2159 2419 [WARNING|trainer.py:803] 2025-04-26 19:50:20,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2126 [WARNING|trainer.py:803] 2025-04-26 19:50:20,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:21,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2160 [WARNING|trainer.py:803] 2025-04-26 19:50:21,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2127 [WARNING|trainer.py:803] 2025-04-26 19:50:22,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2420 2161 [WARNING|trainer.py:803] 2025-04-26 19:50:22,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:23,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:23,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2128 2162 2421 [WARNING|trainer.py:803] 2025-04-26 19:50:24,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2129 [WARNING|trainer.py:803] 2025-04-26 19:50:24,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:24,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2163 [WARNING|trainer.py:803] 2025-04-26 19:50:25,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2130 [WARNING|trainer.py:803] 2025-04-26 19:50:25,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2422 2164 [WARNING|trainer.py:803] 2025-04-26 19:50:26,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:27,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:50:27,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2131 2165 [WARNING|trainer.py:803] 2025-04-26 19:50:28,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2423 2132 [WARNING|trainer.py:803] 2025-04-26 19:50:28,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2166 [WARNING|trainer.py:803] 2025-04-26 19:50:29,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:29,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:29,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2133 2167 [WARNING|trainer.py:803] 2025-04-26 19:50:30,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:50:31,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2134 2424 2168 [WARNING|trainer.py:803] 2025-04-26 19:50:32,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:50:32,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:32,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2135 2169 2425 [WARNING|trainer.py:803] 2025-04-26 19:50:33,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:33,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2136 [WARNING|trainer.py:803] 2025-04-26 19:50:34,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2170 [WARNING|trainer.py:803] 2025-04-26 19:50:34,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:35,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2137 2171 2426 [WARNING|trainer.py:803] 2025-04-26 19:50:36,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:36,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 19:50:36,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2138 2172 [WARNING|trainer.py:803] 2025-04-26 19:50:37,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2427 [WARNING|trainer.py:803] 2025-04-26 19:50:37,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2139 2173 [WARNING|trainer.py:803] 2025-04-26 19:50:38,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:38,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:39,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2140 2174 [WARNING|trainer.py:803] 2025-04-26 19:50:39,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2428 [WARNING|trainer.py:803] 2025-04-26 19:50:40,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2141 2175 [WARNING|trainer.py:803] 2025-04-26 19:50:40,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:50:41,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:41,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2142 2429 2176 [WARNING|trainer.py:803] 2025-04-26 19:50:42,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:42,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:50:42,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2143 2177 [WARNING|trainer.py:803] 2025-04-26 19:50:43,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2430 [WARNING|trainer.py:803] 2025-04-26 19:50:44,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2144 2178 [WARNING|trainer.py:803] 2025-04-26 19:50:45,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:45,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:45,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2145 2179 2431 [WARNING|trainer.py:803] 2025-04-26 19:50:46,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:46,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2146 2180 [WARNING|trainer.py:803] 2025-04-26 19:50:47,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:47,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:48,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2147 2181 2432 [WARNING|trainer.py:803] 2025-04-26 19:50:49,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:49,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:50:49,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2148 2182 [WARNING|trainer.py:803] 2025-04-26 19:50:50,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2433 [WARNING|trainer.py:803] 2025-04-26 19:50:50,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2149 2183 [WARNING|trainer.py:803] 2025-04-26 19:50:51,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:50:51,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:52,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2150 2184 2434 [WARNING|trainer.py:803] 2025-04-26 19:50:53,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:53,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2151 [WARNING|trainer.py:803] 2025-04-26 19:50:53,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2185 [WARNING|trainer.py:803] 2025-04-26 19:50:54,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:54,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2435 2152 2186 [WARNING|trainer.py:803] 2025-04-26 19:50:55,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:50:55,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:50:56,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2153 2187 2436 [WARNING|trainer.py:803] 2025-04-26 19:50:57,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:57,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:57,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2154 2188 [WARNING|trainer.py:803] 2025-04-26 19:50:58,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2437 [WARNING|trainer.py:803] 2025-04-26 19:50:58,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2155 2189 [WARNING|trainer.py:803] 2025-04-26 19:50:59,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:50:59,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:00,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2156 2190 2438 [WARNING|trainer.py:803] 2025-04-26 19:51:01,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:01,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2157 [WARNING|trainer.py:803] 2025-04-26 19:51:01,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2191 [WARNING|trainer.py:803] 2025-04-26 19:51:02,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2439 [WARNING|trainer.py:803] 2025-04-26 19:51:02,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2158 2192 [WARNING|trainer.py:803] 2025-04-26 19:51:03,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:51:03,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:04,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2159 2193 2440 [WARNING|trainer.py:803] 2025-04-26 19:51:04,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:05,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2160 [WARNING|trainer.py:803] 2025-04-26 19:51:05,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2194 [WARNING|trainer.py:803] 2025-04-26 19:51:06,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2161 [WARNING|trainer.py:803] 2025-04-26 19:51:06,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2441 2195 [WARNING|trainer.py:803] 2025-04-26 19:51:07,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:07,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:08,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2162 2196 [WARNING|trainer.py:803] 2025-04-26 19:51:09,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:09,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2163 2197 [WARNING|trainer.py:803] 2025-04-26 19:51:10,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:10,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2164 2198 [WARNING|trainer.py:803] 2025-04-26 19:51:11,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:12,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2165 2199 [WARNING|trainer.py:803] 2025-04-26 19:51:12,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:13,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2442 2166 2200 [WARNING|trainer.py:803] 2025-04-26 19:51:14,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:51:14,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:14,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2167 2201 2443 [WARNING|trainer.py:803] 2025-04-26 19:51:15,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:15,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2168 [WARNING|trainer.py:803] 2025-04-26 19:51:16,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2202 [WARNING|trainer.py:803] 2025-04-26 19:51:16,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2169 [WARNING|trainer.py:803] 2025-04-26 19:51:17,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2444 2203 [WARNING|trainer.py:803] 2025-04-26 19:51:18,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2170 [WARNING|trainer.py:803] 2025-04-26 19:51:18,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 19:51:18,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2204 [WARNING|trainer.py:803] 2025-04-26 19:51:19,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2445 2171 [WARNING|trainer.py:803] 2025-04-26 19:51:20,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2205 [WARNING|trainer.py:803] 2025-04-26 19:51:20,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:20,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2172 [WARNING|trainer.py:803] 2025-04-26 19:51:21,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2206 [WARNING|trainer.py:803] 2025-04-26 19:51:22,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2446 2173 [WARNING|trainer.py:803] 2025-04-26 19:51:22,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:23,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2207 [WARNING|trainer.py:803] 2025-04-26 19:51:23,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2174 [WARNING|trainer.py:803] 2025-04-26 19:51:24,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2208 [WARNING|trainer.py:803] 2025-04-26 19:51:24,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2175 [WARNING|trainer.py:803] 2025-04-26 19:51:25,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2447 2209 [WARNING|trainer.py:803] 2025-04-26 19:51:26,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:26,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2176 [WARNING|trainer.py:803] 2025-04-26 19:51:26,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2210 [WARNING|trainer.py:803] 2025-04-26 19:51:27,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2448 2177 [WARNING|trainer.py:803] 2025-04-26 19:51:28,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:28,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2211 [WARNING|trainer.py:803] 2025-04-26 19:51:28,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2178 [WARNING|trainer.py:803] 2025-04-26 19:51:29,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2212 [WARNING|trainer.py:803] 2025-04-26 19:51:29,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2179 [WARNING|trainer.py:803] 2025-04-26 19:51:30,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2213 [WARNING|trainer.py:803] 2025-04-26 19:51:31,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2180 [WARNING|trainer.py:803] 2025-04-26 19:51:31,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2214 [WARNING|trainer.py:803] 2025-04-26 19:51:32,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2181 [WARNING|trainer.py:803] 2025-04-26 19:51:33,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2215 2449 [WARNING|trainer.py:803] 2025-04-26 19:51:33,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2182 [WARNING|trainer.py:803] 2025-04-26 19:51:34,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:34,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2216 [WARNING|trainer.py:803] 2025-04-26 19:51:35,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2183 [WARNING|trainer.py:803] 2025-04-26 19:51:35,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2217 2450 [WARNING|trainer.py:803] 2025-04-26 19:51:36,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2184 [WARNING|trainer.py:803] 2025-04-26 19:51:37,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:37,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2218 [WARNING|trainer.py:803] 2025-04-26 19:51:37,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2185 [WARNING|trainer.py:803] 2025-04-26 19:51:38,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2451 2219 [WARNING|trainer.py:803] 2025-04-26 19:51:39,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:39,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2186 [WARNING|trainer.py:803] 2025-04-26 19:51:39,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2220 [WARNING|trainer.py:803] 2025-04-26 19:51:40,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2452 2187 [WARNING|trainer.py:803] 2025-04-26 19:51:41,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2221 [WARNING|trainer.py:803] 2025-04-26 19:51:41,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:51:42,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2188 [WARNING|trainer.py:803] 2025-04-26 19:51:42,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2453 2222 [WARNING|trainer.py:803] 2025-04-26 19:51:43,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2189 [WARNING|trainer.py:803] 2025-04-26 19:51:43,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:44,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:51:44,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2223 2454 2190 [WARNING|trainer.py:803] 2025-04-26 19:51:45,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:51:45,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:45,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2224 2191 [WARNING|trainer.py:803] 2025-04-26 19:51:46,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2455 [WARNING|trainer.py:803] 2025-04-26 19:51:47,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2225 2192 [WARNING|trainer.py:803] 2025-04-26 19:51:47,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:48,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:48,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2226 2193 [WARNING|trainer.py:803] 2025-04-26 19:51:49,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:49,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2227 2194 [WARNING|trainer.py:803] 2025-04-26 19:51:50,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:51,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2228 2195 2456 [WARNING|trainer.py:803] 2025-04-26 19:51:52,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:52,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:52,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2229 2196 [WARNING|trainer.py:803] 2025-04-26 19:51:53,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2457 [WARNING|trainer.py:803] 2025-04-26 19:51:53,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2230 2197 [WARNING|trainer.py:803] 2025-04-26 19:51:54,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:51:54,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:51:55,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2231 2198 2458 [WARNING|trainer.py:803] 2025-04-26 19:51:56,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:56,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:56,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2232 2199 [WARNING|trainer.py:803] 2025-04-26 19:51:57,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2459 [WARNING|trainer.py:803] 2025-04-26 19:51:57,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2233 2200 [WARNING|trainer.py:803] 2025-04-26 19:51:58,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:51:58,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:51:59,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2234 2201 [WARNING|trainer.py:803] 2025-04-26 19:52:00,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2460 [WARNING|trainer.py:803] 2025-04-26 19:52:00,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2235 2202 [WARNING|trainer.py:803] 2025-04-26 19:52:01,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:01,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:01,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2461 2236 2203 [WARNING|trainer.py:803] 2025-04-26 19:52:02,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:02,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:03,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2237 2204 [WARNING|trainer.py:803] 2025-04-26 19:52:04,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2462 [WARNING|trainer.py:803] 2025-04-26 19:52:04,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2238 2205 [WARNING|trainer.py:803] 2025-04-26 19:52:05,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:05,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:05,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2239 2206 [WARNING|trainer.py:803] 2025-04-26 19:52:06,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:07,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2240 2207 [WARNING|trainer.py:803] 2025-04-26 19:52:08,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:08,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2241 2208 2463 [WARNING|trainer.py:803] 2025-04-26 19:52:09,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:09,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2242 [WARNING|trainer.py:803] 2025-04-26 19:52:10,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2209 [WARNING|trainer.py:803] 2025-04-26 19:52:10,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:11,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2243 2210 [WARNING|trainer.py:803] 2025-04-26 19:52:12,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2244 [WARNING|trainer.py:803] 2025-04-26 19:52:12,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2211 [WARNING|trainer.py:803] 2025-04-26 19:52:13,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:13,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2245 2212 2464 [WARNING|trainer.py:803] 2025-04-26 19:52:14,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2246 [WARNING|trainer.py:803] 2025-04-26 19:52:15,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:15,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2213 [WARNING|trainer.py:803] 2025-04-26 19:52:16,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2247 [WARNING|trainer.py:803] 2025-04-26 19:52:16,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2214 [WARNING|trainer.py:803] 2025-04-26 19:52:17,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:17,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2248 2215 [WARNING|trainer.py:803] 2025-04-26 19:52:18,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2249 [WARNING|trainer.py:803] 2025-04-26 19:52:19,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2216 [WARNING|trainer.py:803] 2025-04-26 19:52:19,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2465 2250 [WARNING|trainer.py:803] 2025-04-26 19:52:20,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2217 [WARNING|trainer.py:803] 2025-04-26 19:52:21,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:21,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2251 [WARNING|trainer.py:803] 2025-04-26 19:52:21,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2218 [WARNING|trainer.py:803] 2025-04-26 19:52:22,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2252 [WARNING|trainer.py:803] 2025-04-26 19:52:23,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2219 [WARNING|trainer.py:803] 2025-04-26 19:52:23,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2253 [WARNING|trainer.py:803] 2025-04-26 19:52:24,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:25,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2220 2254 2466 [WARNING|trainer.py:803] 2025-04-26 19:52:26,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:26,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:26,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2221 2255 [WARNING|trainer.py:803] 2025-04-26 19:52:27,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:27,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2222 2256 [WARNING|trainer.py:803] 2025-04-26 19:52:28,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:28,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2257 2223 [WARNING|trainer.py:803] 2025-04-26 19:52:30,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:52:30,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2258 2224 2467 [WARNING|trainer.py:803] 2025-04-26 19:52:31,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:31,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:31,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2259 2225 [WARNING|trainer.py:803] 2025-04-26 19:52:32,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:32,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2260 2226 [WARNING|trainer.py:803] 2025-04-26 19:52:33,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:34,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2261 2227 [WARNING|trainer.py:803] 2025-04-26 19:52:35,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:52:35,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2262 2228 [WARNING|trainer.py:803] 2025-04-26 19:52:36,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2468 [WARNING|trainer.py:803] 2025-04-26 19:52:36,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2263 2229 [WARNING|trainer.py:803] 2025-04-26 19:52:37,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:37,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:38,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2264 2230 [WARNING|trainer.py:803] 2025-04-26 19:52:39,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:39,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2265 2231 [WARNING|trainer.py:803] 2025-04-26 19:52:40,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:40,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2266 2232 [WARNING|trainer.py:803] 2025-04-26 19:52:41,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2469 2267 [WARNING|trainer.py:803] 2025-04-26 19:52:42,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:42,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2233 [WARNING|trainer.py:803] 2025-04-26 19:52:42,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2268 [WARNING|trainer.py:803] 2025-04-26 19:52:43,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2234 [WARNING|trainer.py:803] 2025-04-26 19:52:44,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2269 [WARNING|trainer.py:803] 2025-04-26 19:52:44,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2235 [WARNING|trainer.py:803] 2025-04-26 19:52:45,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2270 [WARNING|trainer.py:803] 2025-04-26 19:52:46,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2236 [WARNING|trainer.py:803] 2025-04-26 19:52:46,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2470 2271 [WARNING|trainer.py:803] 2025-04-26 19:52:47,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:47,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2237 [WARNING|trainer.py:803] 2025-04-26 19:52:48,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2272 [WARNING|trainer.py:803] 2025-04-26 19:52:48,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2238 [WARNING|trainer.py:803] 2025-04-26 19:52:49,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2273 [WARNING|trainer.py:803] 2025-04-26 19:52:50,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2239 [WARNING|trainer.py:803] 2025-04-26 19:52:50,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2274 [WARNING|trainer.py:803] 2025-04-26 19:52:51,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2240 [WARNING|trainer.py:803] 2025-04-26 19:52:51,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2275 2471 [WARNING|trainer.py:803] 2025-04-26 19:52:52,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:53,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2241 [WARNING|trainer.py:803] 2025-04-26 19:52:53,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2276 [WARNING|trainer.py:803] 2025-04-26 19:52:54,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:54,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2242 2472 2277 [WARNING|trainer.py:803] 2025-04-26 19:52:55,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:55,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:52:55,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2243 2278 2473 [WARNING|trainer.py:803] 2025-04-26 19:52:56,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:57,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:52:57,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2244 2279 [WARNING|trainer.py:803] 2025-04-26 19:52:58,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2474 [WARNING|trainer.py:803] 2025-04-26 19:52:58,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2245 2280 [WARNING|trainer.py:803] 2025-04-26 19:52:59,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:52:59,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:52:59,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2246 2281 [WARNING|trainer.py:803] 2025-04-26 19:53:00,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:00,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2247 2282 2475 [WARNING|trainer.py:803] 2025-04-26 19:53:02,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:02,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:02,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2283 2248 2476 [WARNING|trainer.py:803] 2025-04-26 19:53:03,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:03,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2284 2249 [WARNING|trainer.py:803] 2025-04-26 19:53:04,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:04,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:04,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2285 2477 2250 [WARNING|trainer.py:803] 2025-04-26 19:53:06,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:06,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:06,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2286 2251 2478 [WARNING|trainer.py:803] 2025-04-26 19:53:07,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:07,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2287 2252 [WARNING|trainer.py:803] 2025-04-26 19:53:08,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:08,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:08,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2288 2253 2479 [WARNING|trainer.py:803] 2025-04-26 19:53:09,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:09,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:10,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2289 2254 [WARNING|trainer.py:803] 2025-04-26 19:53:11,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:11,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2480 2290 2255 [WARNING|trainer.py:803] 2025-04-26 19:53:12,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:53:12,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:12,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2291 2256 2481 [WARNING|trainer.py:803] 2025-04-26 19:53:13,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:13,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:53:13,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2292 2257 2482 [WARNING|trainer.py:803] 2025-04-26 19:53:14,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:15,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2293 2258 [WARNING|trainer.py:803] 2025-04-26 19:53:15,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:53:16,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:16,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2294 2259 [WARNING|trainer.py:803] 2025-04-26 19:53:17,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:17,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2295 2260 2483 [WARNING|trainer.py:803] 2025-04-26 19:53:18,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:18,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:19,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2296 2261 [WARNING|trainer.py:803] 2025-04-26 19:53:19,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:20,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2297 2262 [WARNING|trainer.py:803] 2025-04-26 19:53:21,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2484 [WARNING|trainer.py:803] 2025-04-26 19:53:21,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2298 2263 [WARNING|trainer.py:803] 2025-04-26 19:53:22,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:22,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:22,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2299 2485 2264 [WARNING|trainer.py:803] 2025-04-26 19:53:23,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2300 [WARNING|trainer.py:803] 2025-04-26 19:53:24,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:24,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2265 [WARNING|trainer.py:803] 2025-04-26 19:53:24,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2301 [WARNING|trainer.py:803] 2025-04-26 19:53:25,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2486 2266 [WARNING|trainer.py:803] 2025-04-26 19:53:26,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:26,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2302 [WARNING|trainer.py:803] 2025-04-26 19:53:26,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2267 [WARNING|trainer.py:803] 2025-04-26 19:53:27,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2487 2303 [WARNING|trainer.py:803] 2025-04-26 19:53:28,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2268 [WARNING|trainer.py:803] 2025-04-26 19:53:28,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:28,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2304 [WARNING|trainer.py:803] 2025-04-26 19:53:29,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2488 2269 [WARNING|trainer.py:803] 2025-04-26 19:53:30,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:30,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2305 [WARNING|trainer.py:803] 2025-04-26 19:53:30,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2270 [WARNING|trainer.py:803] 2025-04-26 19:53:31,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2489 2306 [WARNING|trainer.py:803] 2025-04-26 19:53:31,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:53:32,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2271 [WARNING|trainer.py:803] 2025-04-26 19:53:32,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2307 [WARNING|trainer.py:803] 2025-04-26 19:53:33,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2490 2272 [WARNING|trainer.py:803] 2025-04-26 19:53:33,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:34,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2308 [WARNING|trainer.py:803] 2025-04-26 19:53:34,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2273 [WARNING|trainer.py:803] 2025-04-26 19:53:35,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2491 2309 [WARNING|trainer.py:803] 2025-04-26 19:53:35,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2274 [WARNING|trainer.py:803] 2025-04-26 19:53:36,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:36,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2310 [WARNING|trainer.py:803] 2025-04-26 19:53:36,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2275 [WARNING|trainer.py:803] 2025-04-26 19:53:37,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2492 2311 [WARNING|trainer.py:803] 2025-04-26 19:53:38,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2276 [WARNING|trainer.py:803] 2025-04-26 19:53:38,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:38,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2312 [WARNING|trainer.py:803] 2025-04-26 19:53:39,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2493 2277 [WARNING|trainer.py:803] 2025-04-26 19:53:40,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2313 [WARNING|trainer.py:803] 2025-04-26 19:53:40,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:40,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2278 [WARNING|trainer.py:803] 2025-04-26 19:53:41,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2314 2494 [WARNING|trainer.py:803] 2025-04-26 19:53:41,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2279 [WARNING|trainer.py:803] 2025-04-26 19:53:42,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:42,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2315 [WARNING|trainer.py:803] 2025-04-26 19:53:43,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2280 [WARNING|trainer.py:803] 2025-04-26 19:53:43,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2316 2495 [WARNING|trainer.py:803] 2025-04-26 19:53:44,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2281 [WARNING|trainer.py:803] 2025-04-26 19:53:45,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:45,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2317 [WARNING|trainer.py:803] 2025-04-26 19:53:45,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2282 2496 [WARNING|trainer.py:803] 2025-04-26 19:53:46,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2318 [WARNING|trainer.py:803] 2025-04-26 19:53:47,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:47,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2283 [WARNING|trainer.py:803] 2025-04-26 19:53:47,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2319 2497 [WARNING|trainer.py:803] 2025-04-26 19:53:48,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2284 [WARNING|trainer.py:803] 2025-04-26 19:53:48,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:53:49,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2320 [WARNING|trainer.py:803] 2025-04-26 19:53:49,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2285 [WARNING|trainer.py:803] 2025-04-26 19:53:50,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2498 2321 [WARNING|trainer.py:803] 2025-04-26 19:53:50,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2286 [WARNING|trainer.py:803] 2025-04-26 19:53:51,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:53:51,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2322 [WARNING|trainer.py:803] 2025-04-26 19:53:52,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2499 2287 [WARNING|trainer.py:803] 2025-04-26 19:53:52,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2323 [WARNING|trainer.py:803] 2025-04-26 19:53:53,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:53:53,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2288 [WARNING|trainer.py:803] 2025-04-26 19:53:53,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2324 [WARNING|trainer.py:803] 2025-04-26 19:53:54,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2289 [WARNING|trainer.py:803] 2025-04-26 19:53:55,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2325 [WARNING|trainer.py:803] 2025-04-26 19:53:56,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2290 [WARNING|trainer.py:803] 2025-04-26 19:53:56,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2326 [WARNING|trainer.py:803] 2025-04-26 19:53:57,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2500 2291 [WARNING|trainer.py:803] 2025-04-26 19:53:57,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2327 [WARNING|trainer.py:803] 2025-04-26 19:53:58,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:53:58,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:53:58,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2292 2328 [WARNING|trainer.py:803] 2025-04-26 19:53:59,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2501 [WARNING|trainer.py:803] 2025-04-26 19:54:00,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2293 2329 [WARNING|trainer.py:803] 2025-04-26 19:54:00,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:54:01,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:01,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2294 2330 2502 [WARNING|trainer.py:803] 2025-04-26 19:54:02,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:02,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2295 [WARNING|trainer.py:803] 2025-04-26 19:54:03,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2331 [WARNING|trainer.py:803] 2025-04-26 19:54:03,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:03,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2296 2503 2332 [WARNING|trainer.py:803] 2025-04-26 19:54:04,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:04,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:54:05,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2297 2333 [WARNING|trainer.py:803] 2025-04-26 19:54:06,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:06,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2298 2504 2334 [WARNING|trainer.py:803] 2025-04-26 19:54:07,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:07,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:07,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2299 2335 [WARNING|trainer.py:803] 2025-04-26 19:54:08,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2505 [WARNING|trainer.py:803] 2025-04-26 19:54:09,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2300 2336 [WARNING|trainer.py:803] 2025-04-26 19:54:09,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:10,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:10,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2301 2506 2337 [WARNING|trainer.py:803] 2025-04-26 19:54:11,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:11,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:11,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2302 2338 [WARNING|trainer.py:803] 2025-04-26 19:54:12,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:12,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2303 2507 2339 [WARNING|trainer.py:803] 2025-04-26 19:54:13,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:14,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:54:14,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2304 2340 2508 [WARNING|trainer.py:803] 2025-04-26 19:54:15,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2305 [WARNING|trainer.py:803] 2025-04-26 19:54:15,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:15,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2341 [WARNING|trainer.py:803] 2025-04-26 19:54:16,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2306 [WARNING|trainer.py:803] 2025-04-26 19:54:16,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2509 2342 [WARNING|trainer.py:803] 2025-04-26 19:54:17,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:17,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2307 [WARNING|trainer.py:803] 2025-04-26 19:54:18,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2343 [WARNING|trainer.py:803] 2025-04-26 19:54:18,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2510 2308 [WARNING|trainer.py:803] 2025-04-26 19:54:19,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2344 [WARNING|trainer.py:803] 2025-04-26 19:54:19,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:20,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2309 [WARNING|trainer.py:803] 2025-04-26 19:54:20,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2345 2511 [WARNING|trainer.py:803] 2025-04-26 19:54:21,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2310 [WARNING|trainer.py:803] 2025-04-26 19:54:21,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:22,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2346 [WARNING|trainer.py:803] 2025-04-26 19:54:22,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2512 2311 [WARNING|trainer.py:803] 2025-04-26 19:54:23,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2347 [WARNING|trainer.py:803] 2025-04-26 19:54:23,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:23,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:24,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2312 2348 2513 [WARNING|trainer.py:803] 2025-04-26 19:54:25,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2313 [WARNING|trainer.py:803] 2025-04-26 19:54:25,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:26,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2349 [WARNING|trainer.py:803] 2025-04-26 19:54:26,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2314 [WARNING|trainer.py:803] 2025-04-26 19:54:26,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2350 2514 [WARNING|trainer.py:803] 2025-04-26 19:54:27,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2315 [WARNING|trainer.py:803] 2025-04-26 19:54:28,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:54:28,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2351 [WARNING|trainer.py:803] 2025-04-26 19:54:29,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2316 2515 [WARNING|trainer.py:803] 2025-04-26 19:54:29,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2352 [WARNING|trainer.py:803] 2025-04-26 19:54:30,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:30,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2317 [WARNING|trainer.py:803] 2025-04-26 19:54:30,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2353 2516 [WARNING|trainer.py:803] 2025-04-26 19:54:31,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2318 [WARNING|trainer.py:803] 2025-04-26 19:54:32,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:32,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2354 [WARNING|trainer.py:803] 2025-04-26 19:54:32,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2319 [WARNING|trainer.py:803] 2025-04-26 19:54:33,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2355 2517 [WARNING|trainer.py:803] 2025-04-26 19:54:34,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2320 [WARNING|trainer.py:803] 2025-04-26 19:54:34,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2356 [WARNING|trainer.py:803] 2025-04-26 19:54:35,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:35,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2321 [WARNING|trainer.py:803] 2025-04-26 19:54:35,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2357 [WARNING|trainer.py:803] 2025-04-26 19:54:36,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2322 [WARNING|trainer.py:803] 2025-04-26 19:54:37,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2518 2358 [WARNING|trainer.py:803] 2025-04-26 19:54:37,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:38,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:38,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2323 2359 [WARNING|trainer.py:803] 2025-04-26 19:54:39,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2519 [WARNING|trainer.py:803] 2025-04-26 19:54:39,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2324 2360 [WARNING|trainer.py:803] 2025-04-26 19:54:40,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:40,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:40,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2325 2361 [WARNING|trainer.py:803] 2025-04-26 19:54:41,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2520 [WARNING|trainer.py:803] 2025-04-26 19:54:42,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2326 2362 [WARNING|trainer.py:803] 2025-04-26 19:54:42,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:43,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2327 2363 2521 [WARNING|trainer.py:803] 2025-04-26 19:54:44,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:44,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2328 2364 [WARNING|trainer.py:803] 2025-04-26 19:54:45,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:45,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:45,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2329 2365 [WARNING|trainer.py:803] 2025-04-26 19:54:46,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:47,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2330 2366 [WARNING|trainer.py:803] 2025-04-26 19:54:48,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:48,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2331 2367 2522 [WARNING|trainer.py:803] 2025-04-26 19:54:49,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:49,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2332 2368 [WARNING|trainer.py:803] 2025-04-26 19:54:50,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:54:50,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:50,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2333 2369 [WARNING|trainer.py:803] 2025-04-26 19:54:52,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:52,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2334 2370 [WARNING|trainer.py:803] 2025-04-26 19:54:53,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:53,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2335 2371 [WARNING|trainer.py:803] 2025-04-26 19:54:54,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:54,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2336 2523 2372 [WARNING|trainer.py:803] 2025-04-26 19:54:55,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:55,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:54:56,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2337 2373 [WARNING|trainer.py:803] 2025-04-26 19:54:57,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:57,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2338 2374 [WARNING|trainer.py:803] 2025-04-26 19:54:58,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:58,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2339 2375 [WARNING|trainer.py:803] 2025-04-26 19:54:59,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:54:59,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2340 2376 [WARNING|trainer.py:803] 2025-04-26 19:55:00,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2524 [WARNING|trainer.py:803] 2025-04-26 19:55:01,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2341 2377 [WARNING|trainer.py:803] 2025-04-26 19:55:01,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:02,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:02,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2342 2378 [WARNING|trainer.py:803] 2025-04-26 19:55:03,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:03,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2343 2379 [WARNING|trainer.py:803] 2025-04-26 19:55:04,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:05,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2344 2380 [WARNING|trainer.py:803] 2025-04-26 19:55:06,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:06,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2345 2381 2525 [WARNING|trainer.py:803] 2025-04-26 19:55:07,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:07,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2346 2382 [WARNING|trainer.py:803] 2025-04-26 19:55:08,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:08,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:08,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2347 2383 [WARNING|trainer.py:803] 2025-04-26 19:55:10,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:10,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2348 2384 [WARNING|trainer.py:803] 2025-04-26 19:55:11,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:11,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2349 2385 [WARNING|trainer.py:803] 2025-04-26 19:55:12,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:12,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2350 2386 2526 [WARNING|trainer.py:803] 2025-04-26 19:55:13,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:13,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:14,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2351 2387 [WARNING|trainer.py:803] 2025-04-26 19:55:15,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:15,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2388 2352 [WARNING|trainer.py:803] 2025-04-26 19:55:16,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:16,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2389 2353 [WARNING|trainer.py:803] 2025-04-26 19:55:17,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:17,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2390 2354 [WARNING|trainer.py:803] 2025-04-26 19:55:19,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:19,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2391 2355 [WARNING|trainer.py:803] 2025-04-26 19:55:20,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:20,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2392 2356 [WARNING|trainer.py:803] 2025-04-26 19:55:21,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:21,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2393 2357 [WARNING|trainer.py:803] 2025-04-26 19:55:22,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:22,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2527 2394 2358 [WARNING|trainer.py:803] 2025-04-26 19:55:23,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:55:24,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:24,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2395 2359 [WARNING|trainer.py:803] 2025-04-26 19:55:25,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:25,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2396 2360 [WARNING|trainer.py:803] 2025-04-26 19:55:26,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:26,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2397 2361 2528 [WARNING|trainer.py:803] 2025-04-26 19:55:28,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:28,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2362 2398 [WARNING|trainer.py:803] 2025-04-26 19:55:28,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:55:29,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:29,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2363 2399 [WARNING|trainer.py:803] 2025-04-26 19:55:30,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:30,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2364 2400 [WARNING|trainer.py:803] 2025-04-26 19:55:32,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:32,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2365 2529 [WARNING|trainer.py:803] 2025-04-26 19:55:33,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:55:33,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2401 2366 [WARNING|trainer.py:803] 2025-04-26 19:55:34,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:55:34,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2367 [WARNING|trainer.py:803] 2025-04-26 19:55:35,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2402 2368 [WARNING|trainer.py:803] 2025-04-26 19:55:37,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:37,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2530 2369 [WARNING|trainer.py:803] 2025-04-26 19:55:38,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:38,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2403 2370 [WARNING|trainer.py:803] 2025-04-26 19:55:39,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:39,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2371 2404 [WARNING|trainer.py:803] 2025-04-26 19:55:41,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2372 [WARNING|trainer.py:803] 2025-04-26 19:55:41,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2531 [WARNING|trainer.py:803] 2025-04-26 19:55:42,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:42,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2373 2405 [WARNING|trainer.py:803] 2025-04-26 19:55:43,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:43,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2532 2374 [WARNING|trainer.py:803] 2025-04-26 19:55:44,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:55:45,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2406 2375 [WARNING|trainer.py:803] 2025-04-26 19:55:46,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:46,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2533 2376 [WARNING|trainer.py:803] 2025-04-26 19:55:47,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:47,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2377 2534 [WARNING|trainer.py:803] 2025-04-26 19:55:48,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:49,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2378 [WARNING|trainer.py:803] 2025-04-26 19:55:50,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2379 2407 2535 [WARNING|trainer.py:803] 2025-04-26 19:55:51,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:51,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:51,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2380 [WARNING|trainer.py:803] 2025-04-26 19:55:52,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2408 2536 2381 [WARNING|trainer.py:803] 2025-04-26 19:55:53,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:53,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:54,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2382 2537 [WARNING|trainer.py:803] 2025-04-26 19:55:55,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2409 2383 [WARNING|trainer.py:803] 2025-04-26 19:55:56,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:56,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:55:56,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2538 2384 2410 [WARNING|trainer.py:803] 2025-04-26 19:55:57,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:55:57,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2385 [WARNING|trainer.py:803] 2025-04-26 19:55:58,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:55:59,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2386 2539 2411 [WARNING|trainer.py:803] 2025-04-26 19:56:00,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:00,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:56:00,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2387 2540 [WARNING|trainer.py:803] 2025-04-26 19:56:01,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2388 2412 [WARNING|trainer.py:803] 2025-04-26 19:56:02,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:03,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:03,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2389 2541 [WARNING|trainer.py:803] 2025-04-26 19:56:04,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:04,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2413 2390 [WARNING|trainer.py:803] 2025-04-26 19:56:05,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:05,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2542 2391 [WARNING|trainer.py:803] 2025-04-26 19:56:06,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:07,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2414 2392 [WARNING|trainer.py:803] 2025-04-26 19:56:07,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:56:08,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2543 2393 2415 [WARNING|trainer.py:803] 2025-04-26 19:56:09,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:56:09,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:10,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2394 2544 [WARNING|trainer.py:803] 2025-04-26 19:56:10,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2395 [WARNING|trainer.py:803] 2025-04-26 19:56:11,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2416 [WARNING|trainer.py:803] 2025-04-26 19:56:12,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2396 [WARNING|trainer.py:803] 2025-04-26 19:56:12,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:13,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2397 2417 2545 [WARNING|trainer.py:803] 2025-04-26 19:56:14,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:56:15,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:15,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2398 2546 [WARNING|trainer.py:803] 2025-04-26 19:56:16,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2418 2399 [WARNING|trainer.py:803] 2025-04-26 19:56:17,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:17,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:17,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2400 2547 2419 [WARNING|trainer.py:803] 2025-04-26 19:56:19,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:19,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:56:19,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2401 2548 2420 [WARNING|trainer.py:803] 2025-04-26 19:56:21,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:56:21,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:21,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2549 2421 2402 [WARNING|trainer.py:803] 2025-04-26 19:56:23,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 19:56:23,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:56:24,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2550 2422 2403 [WARNING|trainer.py:803] 2025-04-26 19:56:25,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:26,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:56:26,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2551 [WARNING|trainer.py:803] 2025-04-26 19:56:27,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2423 2404 [WARNING|trainer.py:803] 2025-04-26 19:56:28,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:28,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2552 2405 [WARNING|trainer.py:803] 2025-04-26 19:56:30,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:56:31,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2424 2553 [WARNING|trainer.py:803] 2025-04-26 19:56:32,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2406 [WARNING|trainer.py:803] 2025-04-26 19:56:32,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:33,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2425 2554 [WARNING|trainer.py:803] 2025-04-26 19:56:34,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:56:35,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2426 2555 [WARNING|trainer.py:803] 2025-04-26 19:56:37,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 19:56:37,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2427 2407 2556 [WARNING|trainer.py:803] 2025-04-26 19:56:39,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:39,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:56:39,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2557 2408 [WARNING|trainer.py:803] 2025-04-26 19:56:41,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2428 [WARNING|trainer.py:803] 2025-04-26 19:56:41,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:41,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2409 2429 [WARNING|trainer.py:803] 2025-04-26 19:56:43,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:56:44,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2558 2410 [WARNING|trainer.py:803] 2025-04-26 19:56:45,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2430 [WARNING|trainer.py:803] 2025-04-26 19:56:45,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2559 [WARNING|trainer.py:803] 2025-04-26 19:56:46,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2411 [WARNING|trainer.py:803] 2025-04-26 19:56:47,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:48,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2431 2560 [WARNING|trainer.py:803] 2025-04-26 19:56:49,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2412 [WARNING|trainer.py:803] 2025-04-26 19:56:49,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:56:50,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2432 2561 [WARNING|trainer.py:803] 2025-04-26 19:56:51,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:52,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2413 2433 [WARNING|trainer.py:803] 2025-04-26 19:56:53,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2562 [WARNING|trainer.py:803] 2025-04-26 19:56:53,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:56:53,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2414 2563 [WARNING|trainer.py:803] 2025-04-26 19:56:55,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2434 [WARNING|trainer.py:803] 2025-04-26 19:56:56,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:56:56,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2415 [WARNING|trainer.py:803] 2025-04-26 19:56:57,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2435 [WARNING|trainer.py:803] 2025-04-26 19:56:58,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2564 2416 [WARNING|trainer.py:803] 2025-04-26 19:56:59,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2436 [WARNING|trainer.py:803] 2025-04-26 19:57:00,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:00,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2565 [WARNING|trainer.py:803] 2025-04-26 19:57:01,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2417 2437 [WARNING|trainer.py:803] 2025-04-26 19:57:02,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:02,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2566 2418 [WARNING|trainer.py:803] 2025-04-26 19:57:04,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2438 [WARNING|trainer.py:803] 2025-04-26 19:57:04,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:05,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2419 2439 [WARNING|trainer.py:803] 2025-04-26 19:57:06,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2567 [WARNING|trainer.py:803] 2025-04-26 19:57:07,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:57:08,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2420 2440 [WARNING|trainer.py:803] 2025-04-26 19:57:09,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:09,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2568 2421 [WARNING|trainer.py:803] 2025-04-26 19:57:11,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2441 [WARNING|trainer.py:803] 2025-04-26 19:57:11,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:12,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2569 2422 [WARNING|trainer.py:803] 2025-04-26 19:57:13,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:13,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2570 [WARNING|trainer.py:803] 2025-04-26 19:57:14,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2423 2571 [WARNING|trainer.py:803] 2025-04-26 19:57:16,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:16,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2442 2572 [WARNING|trainer.py:803] 2025-04-26 19:57:18,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:18,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2424 2573 [WARNING|trainer.py:803] 2025-04-26 19:57:19,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2443 [WARNING|trainer.py:803] 2025-04-26 19:57:20,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:21,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2425 2574 [WARNING|trainer.py:803] 2025-04-26 19:57:21,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:22,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2444 2575 2426 [WARNING|trainer.py:803] 2025-04-26 19:57:23,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 19:57:24,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:24,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2445 2427 [WARNING|trainer.py:803] 2025-04-26 19:57:25,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2576 [WARNING|trainer.py:803] 2025-04-26 19:57:26,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:27,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2446 2577 2428 [WARNING|trainer.py:803] 2025-04-26 19:57:28,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:29,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:29,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2429 2578 2447 [WARNING|trainer.py:803] 2025-04-26 19:57:31,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:31,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:32,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2430 2579 2448 [WARNING|trainer.py:803] 2025-04-26 19:57:34,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:34,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:34,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2431 2580 [WARNING|trainer.py:803] 2025-04-26 19:57:36,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:37,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2581 2432 [WARNING|trainer.py:803] 2025-04-26 19:57:39,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:39,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2449 2582 2433 [WARNING|trainer.py:803] 2025-04-26 19:57:40,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:41,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:41,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2450 2434 2583 [WARNING|trainer.py:803] 2025-04-26 19:57:43,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:43,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:44,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2435 2451 2584 [WARNING|trainer.py:803] 2025-04-26 19:57:46,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:57:46,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:46,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2436 2452 2585 [WARNING|trainer.py:803] 2025-04-26 19:57:48,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:48,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:48,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2437 2586 2453 [WARNING|trainer.py:803] 2025-04-26 19:57:50,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:50,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:57:50,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2454 2587 2438 [WARNING|trainer.py:803] 2025-04-26 19:57:52,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:53,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:53,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2439 2455 2588 [WARNING|trainer.py:803] 2025-04-26 19:57:55,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:57:55,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:57:55,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2440 2589 [WARNING|trainer.py:803] 2025-04-26 19:57:57,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:57:58,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2441 2456 2590 [WARNING|trainer.py:803] 2025-04-26 19:58:00,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:00,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:00,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2457 2591 [WARNING|trainer.py:803] 2025-04-26 19:58:02,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:02,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2458 [WARNING|trainer.py:803] 2025-04-26 19:58:04,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2592 [WARNING|trainer.py:803] 2025-04-26 19:58:05,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2442 2459 [WARNING|trainer.py:803] 2025-04-26 19:58:06,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:58:06,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2593 [WARNING|trainer.py:803] 2025-04-26 19:58:08,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2443 2460 2594 [WARNING|trainer.py:803] 2025-04-26 19:58:09,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:58:09,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:10,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2461 2444 2595 [WARNING|trainer.py:803] 2025-04-26 19:58:11,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:58:11,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:11,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 2445 2462 2596 [WARNING|trainer.py:803] 2025-04-26 19:58:14,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:14,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:14,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2446 2597 [WARNING|trainer.py:803] 2025-04-26 19:58:16,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:16,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2598 2463 [WARNING|trainer.py:803] 2025-04-26 19:58:18,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:58:19,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2599 2447 [WARNING|trainer.py:803] 2025-04-26 19:58:20,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:20,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2600 2448 [WARNING|trainer.py:803] 2025-04-26 19:58:22,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:22,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2464 [WARNING|trainer.py:803] 2025-04-26 19:58:24,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2601 [WARNING|trainer.py:803] 2025-04-26 19:58:25,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2602 [WARNING|trainer.py:803] 2025-04-26 19:58:27,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2449 2603 [WARNING|trainer.py:803] 2025-04-26 19:58:29,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2465 [WARNING|trainer.py:803] 2025-04-26 19:58:29,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:30,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2604 2450 [WARNING|trainer.py:803] 2025-04-26 19:58:32,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:58:32,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2605 2451 [WARNING|trainer.py:803] 2025-04-26 19:58:34,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2466 [WARNING|trainer.py:803] 2025-04-26 19:58:34,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:35,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2606 2452 [WARNING|trainer.py:803] 2025-04-26 19:58:36,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:37,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2607 [WARNING|trainer.py:803] 2025-04-26 19:58:38,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2453 [WARNING|trainer.py:803] 2025-04-26 19:58:39,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2608 2467 [WARNING|trainer.py:803] 2025-04-26 19:58:40,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2454 [WARNING|trainer.py:803] 2025-04-26 19:58:40,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2609 [WARNING|trainer.py:803] 2025-04-26 19:58:41,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:58:42,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2455 [WARNING|trainer.py:803] 2025-04-26 19:58:43,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2610 2468 [WARNING|trainer.py:803] 2025-04-26 19:58:46,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:46,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2456 [WARNING|trainer.py:803] 2025-04-26 19:58:48,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2457 [WARNING|trainer.py:803] 2025-04-26 19:58:51,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2611 2469 [WARNING|trainer.py:803] 2025-04-26 19:58:52,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:58:52,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2458 [WARNING|trainer.py:803] 2025-04-26 19:58:53,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2612 [WARNING|trainer.py:803] 2025-04-26 19:58:54,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2459 [WARNING|trainer.py:803] 2025-04-26 19:58:55,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2613 [WARNING|trainer.py:803] 2025-04-26 19:58:56,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2470 2460 [WARNING|trainer.py:803] 2025-04-26 19:58:57,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2614 [WARNING|trainer.py:803] 2025-04-26 19:58:58,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:58:58,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2461 2615 [WARNING|trainer.py:803] 2025-04-26 19:59:00,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:59:00,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2616 2462 [WARNING|trainer.py:803] 2025-04-26 19:59:02,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2471 [WARNING|trainer.py:803] 2025-04-26 19:59:02,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:03,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2617 [WARNING|trainer.py:803] 2025-04-26 19:59:04,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2472 [WARNING|trainer.py:803] 2025-04-26 19:59:05,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2618 2473 [WARNING|trainer.py:803] 2025-04-26 19:59:06,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2463 [WARNING|trainer.py:803] 2025-04-26 19:59:07,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2619 [WARNING|trainer.py:803] 2025-04-26 19:59:07,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:59:08,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2474 [WARNING|trainer.py:803] 2025-04-26 19:59:09,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2620 [WARNING|trainer.py:803] 2025-04-26 19:59:10,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2621 2464 2475 [WARNING|trainer.py:803] 2025-04-26 19:59:12,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:59:12,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:13,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2622 2476 [WARNING|trainer.py:803] 2025-04-26 19:59:14,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:59:15,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2477 2623 [WARNING|trainer.py:803] 2025-04-26 19:59:17,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:17,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2465 2478 [WARNING|trainer.py:803] 2025-04-26 19:59:18,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:19,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2624 [WARNING|trainer.py:803] 2025-04-26 19:59:20,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2479 2625 [WARNING|trainer.py:803] 2025-04-26 19:59:21,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 19:59:22,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2466 2480 [WARNING|trainer.py:803] 2025-04-26 19:59:24,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:24,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2626 [WARNING|trainer.py:803] 2025-04-26 19:59:25,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2481 [WARNING|trainer.py:803] 2025-04-26 19:59:26,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2627 [WARNING|trainer.py:803] 2025-04-26 19:59:27,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2482 2628 [WARNING|trainer.py:803] 2025-04-26 19:59:28,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2467 [WARNING|trainer.py:803] 2025-04-26 19:59:29,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:29,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2629 2483 [WARNING|trainer.py:803] 2025-04-26 19:59:31,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:59:32,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2630 [WARNING|trainer.py:803] 2025-04-26 19:59:33,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2484 2468 [WARNING|trainer.py:803] 2025-04-26 19:59:35,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:35,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2631 [WARNING|trainer.py:803] 2025-04-26 19:59:36,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2485 2632 [WARNING|trainer.py:803] 2025-04-26 19:59:37,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:38,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2633 2486 2469 [WARNING|trainer.py:803] 2025-04-26 19:59:40,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:40,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:40,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2634 2487 [WARNING|trainer.py:803] 2025-04-26 19:59:42,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:43,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2635 2488 [WARNING|trainer.py:803] 2025-04-26 19:59:44,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 19:59:45,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2470 2636 [WARNING|trainer.py:803] 2025-04-26 19:59:46,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2489 [WARNING|trainer.py:803] 2025-04-26 19:59:46,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:47,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2637 2490 [WARNING|trainer.py:803] 2025-04-26 19:59:48,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:49,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2638 [WARNING|trainer.py:803] 2025-04-26 19:59:50,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2471 2491 2639 [WARNING|trainer.py:803] 2025-04-26 19:59:51,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 19:59:51,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:59:52,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2472 2640 2492 [WARNING|trainer.py:803] 2025-04-26 19:59:53,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:54,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:54,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2473 2641 2493 [WARNING|trainer.py:803] 2025-04-26 19:59:55,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 19:59:56,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:56,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2474 2642 2494 [WARNING|trainer.py:803] 2025-04-26 19:59:58,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 19:59:58,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 19:59:58,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2495 2475 2643 [WARNING|trainer.py:803] 2025-04-26 20:00:01,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:01,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:02,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2476 2496 [WARNING|trainer.py:803] 2025-04-26 20:00:03,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:03,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2497 2477 [WARNING|trainer.py:803] 2025-04-26 20:00:05,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:05,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2644 2478 [WARNING|trainer.py:803] 2025-04-26 20:00:07,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2498 [WARNING|trainer.py:803] 2025-04-26 20:00:08,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:08,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2479 2499 [WARNING|trainer.py:803] 2025-04-26 20:00:10,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:00:10,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2645 2480 [WARNING|trainer.py:803] 2025-04-26 20:00:12,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:00:12,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2481 [WARNING|trainer.py:803] 2025-04-26 20:00:15,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2500 [WARNING|trainer.py:803] 2025-04-26 20:00:16,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2482 2646 [WARNING|trainer.py:803] 2025-04-26 20:00:17,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:00:17,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2501 [WARNING|trainer.py:803] 2025-04-26 20:00:18,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2483 2502 2647 [WARNING|trainer.py:803] 2025-04-26 20:00:20,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:00:21,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:00:21,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2503 2484 [WARNING|trainer.py:803] 2025-04-26 20:00:23,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:00:24,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2648 [WARNING|trainer.py:803] 2025-04-26 20:00:25,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2485 2504 [WARNING|trainer.py:803] 2025-04-26 20:00:26,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:26,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2505 2486 [WARNING|trainer.py:803] 2025-04-26 20:00:28,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2649 [WARNING|trainer.py:803] 2025-04-26 20:00:29,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:29,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2506 2487 [WARNING|trainer.py:803] 2025-04-26 20:00:30,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:31,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2507 2488 [WARNING|trainer.py:803] 2025-04-26 20:00:33,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:00:33,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2650 2489 2508 [WARNING|trainer.py:803] 2025-04-26 20:00:35,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:35,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:35,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2651 2490 [WARNING|trainer.py:803] 2025-04-26 20:00:36,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2509 [WARNING|trainer.py:803] 2025-04-26 20:00:37,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:00:37,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2652 [WARNING|trainer.py:803] 2025-04-26 20:00:38,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2491 2510 2653 [WARNING|trainer.py:803] 2025-04-26 20:00:40,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:00:40,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:40,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2511 2654 2492 [WARNING|trainer.py:803] 2025-04-26 20:00:42,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:42,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:42,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2655 2512 2493 [WARNING|trainer.py:803] 2025-04-26 20:00:44,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:44,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:45,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2656 2513 [WARNING|trainer.py:803] 2025-04-26 20:00:46,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2494 [WARNING|trainer.py:803] 2025-04-26 20:00:47,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2657 [WARNING|trainer.py:803] 2025-04-26 20:00:47,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:48,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2514 [WARNING|trainer.py:803] 2025-04-26 20:00:49,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2495 2658 [WARNING|trainer.py:803] 2025-04-26 20:00:50,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:50,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2515 2659 2496 [WARNING|trainer.py:803] 2025-04-26 20:00:51,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:00:52,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:00:52,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2660 2516 2497 [WARNING|trainer.py:803] 2025-04-26 20:00:54,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:00:54,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:00:54,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2661 [WARNING|trainer.py:803] 2025-04-26 20:00:56,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2517 2498 [WARNING|trainer.py:803] 2025-04-26 20:00:57,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2662 [WARNING|trainer.py:803] 2025-04-26 20:00:57,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:00:58,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2499 [WARNING|trainer.py:803] 2025-04-26 20:00:59,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2518 2663 [WARNING|trainer.py:803] 2025-04-26 20:01:00,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:00,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2519 2664 [WARNING|trainer.py:803] 2025-04-26 20:01:02,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:01:03,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2500 2665 [WARNING|trainer.py:803] 2025-04-26 20:01:04,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2520 [WARNING|trainer.py:803] 2025-04-26 20:01:05,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:05,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2501 [WARNING|trainer.py:803] 2025-04-26 20:01:07,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2521 [WARNING|trainer.py:803] 2025-04-26 20:01:08,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2502 2666 [WARNING|trainer.py:803] 2025-04-26 20:01:10,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:01:10,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2503 2667 [WARNING|trainer.py:803] 2025-04-26 20:01:12,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:01:12,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2522 2668 [WARNING|trainer.py:803] 2025-04-26 20:01:13,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2504 [WARNING|trainer.py:803] 2025-04-26 20:01:14,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:15,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2669 [WARNING|trainer.py:803] 2025-04-26 20:01:16,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2505 [WARNING|trainer.py:803] 2025-04-26 20:01:17,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2670 [WARNING|trainer.py:803] 2025-04-26 20:01:18,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2523 2506 [WARNING|trainer.py:803] 2025-04-26 20:01:19,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2671 [WARNING|trainer.py:803] 2025-04-26 20:01:19,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:20,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2507 2672 [WARNING|trainer.py:803] 2025-04-26 20:01:22,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:01:22,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2508 2673 [WARNING|trainer.py:803] 2025-04-26 20:01:24,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2524 [WARNING|trainer.py:803] 2025-04-26 20:01:24,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:01:25,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2509 [WARNING|trainer.py:803] 2025-04-26 20:01:26,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2674 [WARNING|trainer.py:803] 2025-04-26 20:01:28,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2510 [WARNING|trainer.py:803] 2025-04-26 20:01:29,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2511 2525 [WARNING|trainer.py:803] 2025-04-26 20:01:31,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:32,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2512 [WARNING|trainer.py:803] 2025-04-26 20:01:33,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2513 [WARNING|trainer.py:803] 2025-04-26 20:01:35,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2675 2526 2514 [WARNING|trainer.py:803] 2025-04-26 20:01:37,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:01:38,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:38,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2676 2515 [WARNING|trainer.py:803] 2025-04-26 20:01:40,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:01:40,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2677 [WARNING|trainer.py:803] 2025-04-26 20:01:42,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2516 2678 [WARNING|trainer.py:803] 2025-04-26 20:01:43,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:01:43,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2679 2517 [WARNING|trainer.py:803] 2025-04-26 20:01:45,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:01:45,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2680 [WARNING|trainer.py:803] 2025-04-26 20:01:47,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2527 [WARNING|trainer.py:803] 2025-04-26 20:01:48,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2518 2681 [WARNING|trainer.py:803] 2025-04-26 20:01:49,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:49,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2682 2519 [WARNING|trainer.py:803] 2025-04-26 20:01:51,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:51,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2528 2683 [WARNING|trainer.py:803] 2025-04-26 20:01:53,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:53,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2520 2684 [WARNING|trainer.py:803] 2025-04-26 20:01:54,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:01:55,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2521 2685 2529 [WARNING|trainer.py:803] 2025-04-26 20:01:57,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:01:57,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:01:57,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2686 [WARNING|trainer.py:803] 2025-04-26 20:02:00,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2687 2522 [WARNING|trainer.py:803] 2025-04-26 20:02:02,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2530 [WARNING|trainer.py:803] 2025-04-26 20:02:02,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2688 [WARNING|trainer.py:803] 2025-04-26 20:02:03,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:04,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2689 [WARNING|trainer.py:803] 2025-04-26 20:02:05,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2531 2690 [WARNING|trainer.py:803] 2025-04-26 20:02:07,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2523 [WARNING|trainer.py:803] 2025-04-26 20:02:08,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:02:08,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2532 2691 [WARNING|trainer.py:803] 2025-04-26 20:02:09,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:02:10,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2692 [WARNING|trainer.py:803] 2025-04-26 20:02:12,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2533 [WARNING|trainer.py:803] 2025-04-26 20:02:12,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2693 [WARNING|trainer.py:803] 2025-04-26 20:02:13,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2524 2534 [WARNING|trainer.py:803] 2025-04-26 20:02:14,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2694 [WARNING|trainer.py:803] 2025-04-26 20:02:15,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:02:15,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2695 2535 [WARNING|trainer.py:803] 2025-04-26 20:02:18,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:02:18,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2696 2536 [WARNING|trainer.py:803] 2025-04-26 20:02:20,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:02:20,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2525 2697 [WARNING|trainer.py:803] 2025-04-26 20:02:21,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2537 [WARNING|trainer.py:803] 2025-04-26 20:02:22,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:02:22,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2698 2538 [WARNING|trainer.py:803] 2025-04-26 20:02:24,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:02:24,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2699 [WARNING|trainer.py:803] 2025-04-26 20:02:26,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2526 2539 2700 [WARNING|trainer.py:803] 2025-04-26 20:02:27,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:27,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:02:28,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2540 2701 [WARNING|trainer.py:803] 2025-04-26 20:02:30,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:30,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2541 2702 [WARNING|trainer.py:803] 2025-04-26 20:02:32,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:32,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2542 2703 [WARNING|trainer.py:803] 2025-04-26 20:02:34,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:35,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2527 2704 2543 [WARNING|trainer.py:803] 2025-04-26 20:02:37,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:37,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:02:37,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2705 2544 [WARNING|trainer.py:803] 2025-04-26 20:02:40,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:02:40,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2528 2706 [WARNING|trainer.py:803] 2025-04-26 20:02:42,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:42,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2545 2707 [WARNING|trainer.py:803] 2025-04-26 20:02:44,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:02:45,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2546 [WARNING|trainer.py:803] 2025-04-26 20:02:46,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2529 2708 [WARNING|trainer.py:803] 2025-04-26 20:02:47,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:47,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2547 [WARNING|trainer.py:803] 2025-04-26 20:02:48,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2709 [WARNING|trainer.py:803] 2025-04-26 20:02:49,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2548 [WARNING|trainer.py:803] 2025-04-26 20:02:51,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2710 2530 [WARNING|trainer.py:803] 2025-04-26 20:02:52,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:52,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2549 [WARNING|trainer.py:803] 2025-04-26 20:02:53,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2711 [WARNING|trainer.py:803] 2025-04-26 20:02:54,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2550 2531 [WARNING|trainer.py:803] 2025-04-26 20:02:56,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2712 [WARNING|trainer.py:803] 2025-04-26 20:02:56,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:02:57,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2551 2532 [WARNING|trainer.py:803] 2025-04-26 20:02:58,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2713 [WARNING|trainer.py:803] 2025-04-26 20:02:59,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:02:59,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2552 2533 [WARNING|trainer.py:803] 2025-04-26 20:03:01,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2714 [WARNING|trainer.py:803] 2025-04-26 20:03:01,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:03:02,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2553 2534 2715 [WARNING|trainer.py:803] 2025-04-26 20:03:03,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:03:04,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:03:04,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2716 2554 2535 [WARNING|trainer.py:803] 2025-04-26 20:03:06,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:06,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:07,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2555 2717 2536 [WARNING|trainer.py:803] 2025-04-26 20:03:08,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:09,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:09,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2556 2718 2537 [WARNING|trainer.py:803] 2025-04-26 20:03:11,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:11,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:11,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2557 2538 2719 [WARNING|trainer.py:803] 2025-04-26 20:03:13,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:03:13,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:03:14,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2720 2539 [WARNING|trainer.py:803] 2025-04-26 20:03:16,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:03:16,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2558 [WARNING|trainer.py:803] 2025-04-26 20:03:18,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2721 2540 [WARNING|trainer.py:803] 2025-04-26 20:03:18,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:03:19,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2559 [WARNING|trainer.py:803] 2025-04-26 20:03:20,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2722 2541 [WARNING|trainer.py:803] 2025-04-26 20:03:21,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:03:21,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2560 2542 2723 [WARNING|trainer.py:803] 2025-04-26 20:03:23,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:03:23,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:23,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2561 2724 [WARNING|trainer.py:803] 2025-04-26 20:03:25,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2543 [WARNING|trainer.py:803] 2025-04-26 20:03:26,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:26,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2562 2725 [WARNING|trainer.py:803] 2025-04-26 20:03:27,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2544 [WARNING|trainer.py:803] 2025-04-26 20:03:28,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:29,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2563 2726 [WARNING|trainer.py:803] 2025-04-26 20:03:30,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:31,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2545 2727 [WARNING|trainer.py:803] 2025-04-26 20:03:32,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2564 [WARNING|trainer.py:803] 2025-04-26 20:03:33,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:34,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2546 [WARNING|trainer.py:803] 2025-04-26 20:03:35,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2728 2565 [WARNING|trainer.py:803] 2025-04-26 20:03:36,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:03:36,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2547 2729 [WARNING|trainer.py:803] 2025-04-26 20:03:37,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:03:38,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2566 2548 [WARNING|trainer.py:803] 2025-04-26 20:03:39,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2730 [WARNING|trainer.py:803] 2025-04-26 20:03:40,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:03:40,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2549 [WARNING|trainer.py:803] 2025-04-26 20:03:42,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2731 2567 [WARNING|trainer.py:803] 2025-04-26 20:03:43,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:43,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2550 2732 [WARNING|trainer.py:803] 2025-04-26 20:03:44,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:45,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2568 2551 [WARNING|trainer.py:803] 2025-04-26 20:03:46,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:47,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2733 [WARNING|trainer.py:803] 2025-04-26 20:03:47,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2569 2552 [WARNING|trainer.py:803] 2025-04-26 20:03:48,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2734 [WARNING|trainer.py:803] 2025-04-26 20:03:49,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2570 [WARNING|trainer.py:803] 2025-04-26 20:03:50,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:50,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2553 2735 2571 [WARNING|trainer.py:803] 2025-04-26 20:03:52,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:03:52,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:03:53,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2554 2572 2736 [WARNING|trainer.py:803] 2025-04-26 20:03:55,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:03:55,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:03:55,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2573 2555 2737 [WARNING|trainer.py:803] 2025-04-26 20:03:57,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:03:57,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:03:57,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2574 2556 2738 [WARNING|trainer.py:803] 2025-04-26 20:03:59,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:03:59,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:00,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2575 2557 2739 [WARNING|trainer.py:803] 2025-04-26 20:04:01,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:01,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:04:02,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2740 2576 [WARNING|trainer.py:803] 2025-04-26 20:04:04,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:04,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2558 2741 2577 [WARNING|trainer.py:803] 2025-04-26 20:04:06,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:07,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:04:07,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2559 2742 [WARNING|trainer.py:803] 2025-04-26 20:04:08,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2578 [WARNING|trainer.py:803] 2025-04-26 20:04:09,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:09,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2560 2743 [WARNING|trainer.py:803] 2025-04-26 20:04:11,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2579 [WARNING|trainer.py:803] 2025-04-26 20:04:11,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:12,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2561 2744 [WARNING|trainer.py:803] 2025-04-26 20:04:14,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:14,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2580 2562 [WARNING|trainer.py:803] 2025-04-26 20:04:15,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2745 [WARNING|trainer.py:803] 2025-04-26 20:04:16,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:04:16,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2581 2563 [WARNING|trainer.py:803] 2025-04-26 20:04:18,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2746 [WARNING|trainer.py:803] 2025-04-26 20:04:18,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:19,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2582 [WARNING|trainer.py:803] 2025-04-26 20:04:20,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2747 [WARNING|trainer.py:803] 2025-04-26 20:04:21,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2564 [WARNING|trainer.py:803] 2025-04-26 20:04:22,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2583 2748 [WARNING|trainer.py:803] 2025-04-26 20:04:23,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2565 [WARNING|trainer.py:803] 2025-04-26 20:04:24,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 20:04:24,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2584 2749 [WARNING|trainer.py:803] 2025-04-26 20:04:26,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:26,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2566 2585 [WARNING|trainer.py:803] 2025-04-26 20:04:27,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2750 [WARNING|trainer.py:803] 2025-04-26 20:04:28,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:28,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2586 2751 [WARNING|trainer.py:803] 2025-04-26 20:04:30,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2567 [WARNING|trainer.py:803] 2025-04-26 20:04:31,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:04:31,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2587 2752 [WARNING|trainer.py:803] 2025-04-26 20:04:33,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:04:33,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2568 [WARNING|trainer.py:803] 2025-04-26 20:04:34,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2753 2588 [WARNING|trainer.py:803] 2025-04-26 20:04:36,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:36,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2569 [WARNING|trainer.py:803] 2025-04-26 20:04:37,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2754 2589 [WARNING|trainer.py:803] 2025-04-26 20:04:38,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2570 [WARNING|trainer.py:803] 2025-04-26 20:04:39,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:04:39,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2755 2571 [WARNING|trainer.py:803] 2025-04-26 20:04:40,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2590 [WARNING|trainer.py:803] 2025-04-26 20:04:41,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:04:42,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2756 2572 [WARNING|trainer.py:803] 2025-04-26 20:04:43,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2591 [WARNING|trainer.py:803] 2025-04-26 20:04:43,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:04:44,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2757 2573 [WARNING|trainer.py:803] 2025-04-26 20:04:45,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:45,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2592 2574 [WARNING|trainer.py:803] 2025-04-26 20:04:47,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2758 [WARNING|trainer.py:803] 2025-04-26 20:04:47,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:04:48,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2593 2575 2759 [WARNING|trainer.py:803] 2025-04-26 20:04:50,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:04:50,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:50,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2594 2760 [WARNING|trainer.py:803] 2025-04-26 20:04:52,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2576 [WARNING|trainer.py:803] 2025-04-26 20:04:52,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:04:53,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2595 2761 [WARNING|trainer.py:803] 2025-04-26 20:04:54,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2577 [WARNING|trainer.py:803] 2025-04-26 20:04:55,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:55,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2596 2762 [WARNING|trainer.py:803] 2025-04-26 20:04:57,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:04:57,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2578 [WARNING|trainer.py:803] 2025-04-26 20:04:58,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2763 2597 [WARNING|trainer.py:803] 2025-04-26 20:04:59,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2579 [WARNING|trainer.py:803] 2025-04-26 20:05:00,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:00,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2598 2764 [WARNING|trainer.py:803] 2025-04-26 20:05:02,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:05:02,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2580 2599 [WARNING|trainer.py:803] 2025-04-26 20:05:03,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2765 [WARNING|trainer.py:803] 2025-04-26 20:05:04,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:04,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2581 2600 2766 [WARNING|trainer.py:803] 2025-04-26 20:05:06,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:05:06,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:06,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2582 2767 [WARNING|trainer.py:803] 2025-04-26 20:05:08,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2601 [WARNING|trainer.py:803] 2025-04-26 20:05:09,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:09,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2583 2768 2602 [WARNING|trainer.py:803] 2025-04-26 20:05:11,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:05:11,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:11,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2769 2584 2603 [WARNING|trainer.py:803] 2025-04-26 20:05:14,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:14,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:05:14,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2770 2604 2585 [WARNING|trainer.py:803] 2025-04-26 20:05:16,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:16,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:16,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2771 2586 2605 [WARNING|trainer.py:803] 2025-04-26 20:05:19,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:05:19,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:05:19,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2772 2606 2587 [WARNING|trainer.py:803] 2025-04-26 20:05:21,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:21,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:21,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2607 2773 2588 [WARNING|trainer.py:803] 2025-04-26 20:05:23,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:23,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:05:24,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2608 2774 [WARNING|trainer.py:803] 2025-04-26 20:05:26,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:26,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2589 [WARNING|trainer.py:803] 2025-04-26 20:05:27,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2609 2775 [WARNING|trainer.py:803] 2025-04-26 20:05:28,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:28,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2590 [WARNING|trainer.py:803] 2025-04-26 20:05:30,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2776 [WARNING|trainer.py:803] 2025-04-26 20:05:31,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2610 2591 [WARNING|trainer.py:803] 2025-04-26 20:05:32,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:32,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2777 [WARNING|trainer.py:803] 2025-04-26 20:05:33,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2592 2778 [WARNING|trainer.py:803] 2025-04-26 20:05:35,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:36,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2611 2593 2779 [WARNING|trainer.py:803] 2025-04-26 20:05:38,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:38,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:38,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2612 2594 2780 [WARNING|trainer.py:803] 2025-04-26 20:05:40,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:05:40,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:41,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2613 2595 2781 [WARNING|trainer.py:803] 2025-04-26 20:05:42,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:42,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:43,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2614 2596 2782 [WARNING|trainer.py:803] 2025-04-26 20:05:44,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:45,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:45,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2615 [WARNING|trainer.py:803] 2025-04-26 20:05:46,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2783 2597 [WARNING|trainer.py:803] 2025-04-26 20:05:48,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2616 [WARNING|trainer.py:803] 2025-04-26 20:05:48,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:49,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2784 2598 [WARNING|trainer.py:803] 2025-04-26 20:05:50,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2617 [WARNING|trainer.py:803] 2025-04-26 20:05:50,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:05:51,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2599 2785 [WARNING|trainer.py:803] 2025-04-26 20:05:52,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:52,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2618 [WARNING|trainer.py:803] 2025-04-26 20:05:54,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2786 2600 [WARNING|trainer.py:803] 2025-04-26 20:05:55,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:05:55,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2619 [WARNING|trainer.py:803] 2025-04-26 20:05:56,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2787 2601 [WARNING|trainer.py:803] 2025-04-26 20:05:57,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:05:58,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2620 [WARNING|trainer.py:803] 2025-04-26 20:05:58,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2788 2602 [WARNING|trainer.py:803] 2025-04-26 20:06:00,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2621 [WARNING|trainer.py:803] 2025-04-26 20:06:00,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:06:00,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2789 2603 [WARNING|trainer.py:803] 2025-04-26 20:06:02,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2622 [WARNING|trainer.py:803] 2025-04-26 20:06:03,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:03,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2790 2604 [WARNING|trainer.py:803] 2025-04-26 20:06:04,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:05,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2623 2791 [WARNING|trainer.py:803] 2025-04-26 20:06:06,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:06:06,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2605 [WARNING|trainer.py:803] 2025-04-26 20:06:08,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2792 2624 [WARNING|trainer.py:803] 2025-04-26 20:06:09,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2606 [WARNING|trainer.py:803] 2025-04-26 20:06:09,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:06:10,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2793 2625 2607 [WARNING|trainer.py:803] 2025-04-26 20:06:11,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:12,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:12,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2794 2608 2626 [WARNING|trainer.py:803] 2025-04-26 20:06:14,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:14,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:06:15,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2795 2609 2627 [WARNING|trainer.py:803] 2025-04-26 20:06:16,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:17,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:17,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2796 2628 [WARNING|trainer.py:803] 2025-04-26 20:06:19,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:19,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2610 2629 2797 [WARNING|trainer.py:803] 2025-04-26 20:06:21,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:21,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:06:21,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2630 2798 [WARNING|trainer.py:803] 2025-04-26 20:06:24,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:06:24,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2799 2611 2631 [WARNING|trainer.py:803] 2025-04-26 20:06:26,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:27,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:06:27,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2800 2632 2612 [WARNING|trainer.py:803] 2025-04-26 20:06:29,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:06:29,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:29,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2613 2633 2801 [WARNING|trainer.py:803] 2025-04-26 20:06:31,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:31,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:06:31,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2614 2634 2802 [WARNING|trainer.py:803] 2025-04-26 20:06:34,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:06:34,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:34,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2615 2635 2803 [WARNING|trainer.py:803] 2025-04-26 20:06:36,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:06:36,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:06:36,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2616 2636 2804 [WARNING|trainer.py:803] 2025-04-26 20:06:38,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:38,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:06:39,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2617 2637 2805 [WARNING|trainer.py:803] 2025-04-26 20:06:40,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:41,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:06:41,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2638 2618 2806 [WARNING|trainer.py:803] 2025-04-26 20:06:43,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:43,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:06:43,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2639 2619 [WARNING|trainer.py:803] 2025-04-26 20:06:45,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2807 [WARNING|trainer.py:803] 2025-04-26 20:06:45,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:46,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2640 [WARNING|trainer.py:803] 2025-04-26 20:06:47,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2620 2808 [WARNING|trainer.py:803] 2025-04-26 20:06:48,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:06:48,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2641 [WARNING|trainer.py:803] 2025-04-26 20:06:49,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2621 2809 [WARNING|trainer.py:803] 2025-04-26 20:06:50,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:06:50,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2642 2622 [WARNING|trainer.py:803] 2025-04-26 20:06:51,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2810 [WARNING|trainer.py:803] 2025-04-26 20:06:52,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:06:53,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2623 2811 [WARNING|trainer.py:803] 2025-04-26 20:06:55,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2643 [WARNING|trainer.py:803] 2025-04-26 20:06:55,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:06:56,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2812 [WARNING|trainer.py:803] 2025-04-26 20:06:58,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2624 [WARNING|trainer.py:803] 2025-04-26 20:06:59,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2813 2644 2625 [WARNING|trainer.py:803] 2025-04-26 20:07:00,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:07:01,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:01,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2814 [WARNING|trainer.py:803] 2025-04-26 20:07:03,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2626 [WARNING|trainer.py:803] 2025-04-26 20:07:04,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2815 2645 [WARNING|trainer.py:803] 2025-04-26 20:07:05,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2627 [WARNING|trainer.py:803] 2025-04-26 20:07:06,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:06,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2816 2628 [WARNING|trainer.py:803] 2025-04-26 20:07:07,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:07:08,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2817 2646 [WARNING|trainer.py:803] 2025-04-26 20:07:10,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2629 [WARNING|trainer.py:803] 2025-04-26 20:07:10,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:07:11,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2818 2630 [WARNING|trainer.py:803] 2025-04-26 20:07:12,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:07:13,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2647 2819 [WARNING|trainer.py:803] 2025-04-26 20:07:15,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:07:15,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2631 [WARNING|trainer.py:803] 2025-04-26 20:07:16,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2820 [WARNING|trainer.py:803] 2025-04-26 20:07:17,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2632 2648 [WARNING|trainer.py:803] 2025-04-26 20:07:18,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:07:19,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2821 2633 [WARNING|trainer.py:803] 2025-04-26 20:07:20,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:07:21,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2822 2634 [WARNING|trainer.py:803] 2025-04-26 20:07:22,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2649 [WARNING|trainer.py:803] 2025-04-26 20:07:23,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:07:23,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2823 2635 [WARNING|trainer.py:803] 2025-04-26 20:07:25,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:25,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2824 2636 [WARNING|trainer.py:803] 2025-04-26 20:07:27,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:07:28,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2650 [WARNING|trainer.py:803] 2025-04-26 20:07:29,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2825 2637 [WARNING|trainer.py:803] 2025-04-26 20:07:30,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:07:30,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2651 [WARNING|trainer.py:803] 2025-04-26 20:07:31,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2638 2826 [WARNING|trainer.py:803] 2025-04-26 20:07:32,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2652 [WARNING|trainer.py:803] 2025-04-26 20:07:32,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:07:33,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2639 2827 [WARNING|trainer.py:803] 2025-04-26 20:07:34,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2653 [WARNING|trainer.py:803] 2025-04-26 20:07:35,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:07:35,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2640 [WARNING|trainer.py:803] 2025-04-26 20:07:36,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2828 2654 [WARNING|trainer.py:803] 2025-04-26 20:07:37,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:07:37,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2641 [WARNING|trainer.py:803] 2025-04-26 20:07:38,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2829 2655 [WARNING|trainer.py:803] 2025-04-26 20:07:39,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:07:40,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2642 2656 [WARNING|trainer.py:803] 2025-04-26 20:07:41,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2830 [WARNING|trainer.py:803] 2025-04-26 20:07:42,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:07:42,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2657 2831 [WARNING|trainer.py:803] 2025-04-26 20:07:44,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:07:44,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2643 [WARNING|trainer.py:803] 2025-04-26 20:07:45,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2658 2832 [WARNING|trainer.py:803] 2025-04-26 20:07:47,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:07:47,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2659 2833 [WARNING|trainer.py:803] 2025-04-26 20:07:49,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2644 [WARNING|trainer.py:803] 2025-04-26 20:07:49,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2660 [WARNING|trainer.py:803] 2025-04-26 20:07:50,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:51,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2834 [WARNING|trainer.py:803] 2025-04-26 20:07:52,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2661 [WARNING|trainer.py:803] 2025-04-26 20:07:53,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2835 2645 [WARNING|trainer.py:803] 2025-04-26 20:07:54,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2662 [WARNING|trainer.py:803] 2025-04-26 20:07:55,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:55,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2836 [WARNING|trainer.py:803] 2025-04-26 20:07:57,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2663 [WARNING|trainer.py:803] 2025-04-26 20:07:58,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2837 2646 [WARNING|trainer.py:803] 2025-04-26 20:07:59,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:07:59,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2664 2838 [WARNING|trainer.py:803] 2025-04-26 20:08:01,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:02,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2665 2647 2839 [WARNING|trainer.py:803] 2025-04-26 20:08:03,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:08:04,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:08:04,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2840 [WARNING|trainer.py:803] 2025-04-26 20:08:07,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2648 [WARNING|trainer.py:803] 2025-04-26 20:08:08,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2666 2841 [WARNING|trainer.py:803] 2025-04-26 20:08:09,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:08:09,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2667 2842 [WARNING|trainer.py:803] 2025-04-26 20:08:11,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:11,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2649 [WARNING|trainer.py:803] 2025-04-26 20:08:12,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2668 2843 [WARNING|trainer.py:803] 2025-04-26 20:08:13,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:08:14,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2669 [WARNING|trainer.py:803] 2025-04-26 20:08:15,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2844 [WARNING|trainer.py:803] 2025-04-26 20:08:16,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2670 2650 [WARNING|trainer.py:803] 2025-04-26 20:08:17,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:08:18,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2845 [WARNING|trainer.py:803] 2025-04-26 20:08:19,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2671 2651 [WARNING|trainer.py:803] 2025-04-26 20:08:20,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:20,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2846 2652 [WARNING|trainer.py:803] 2025-04-26 20:08:21,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2672 [WARNING|trainer.py:803] 2025-04-26 20:08:22,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:08:22,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2847 2653 [WARNING|trainer.py:803] 2025-04-26 20:08:24,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:08:24,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2673 [WARNING|trainer.py:803] 2025-04-26 20:08:25,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2848 2654 [WARNING|trainer.py:803] 2025-04-26 20:08:26,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:26,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2655 2674 2849 [WARNING|trainer.py:803] 2025-04-26 20:08:28,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:08:28,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:08:28,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2656 2850 [WARNING|trainer.py:803] 2025-04-26 20:08:31,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:08:31,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2657 2851 [WARNING|trainer.py:803] 2025-04-26 20:08:33,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:08:33,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2658 2852 [WARNING|trainer.py:803] 2025-04-26 20:08:36,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:08:36,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2659 2675 2853 [WARNING|trainer.py:803] 2025-04-26 20:08:38,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:08:38,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:08:38,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2660 2676 [WARNING|trainer.py:803] 2025-04-26 20:08:40,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2854 [WARNING|trainer.py:803] 2025-04-26 20:08:40,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:41,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2661 2677 [WARNING|trainer.py:803] 2025-04-26 20:08:42,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2855 [WARNING|trainer.py:803] 2025-04-26 20:08:42,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:08:43,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2662 2678 [WARNING|trainer.py:803] 2025-04-26 20:08:44,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:44,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2856 2679 [WARNING|trainer.py:803] 2025-04-26 20:08:46,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2663 [WARNING|trainer.py:803] 2025-04-26 20:08:46,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:08:47,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2857 2680 [WARNING|trainer.py:803] 2025-04-26 20:08:48,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:08:48,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2664 2858 [WARNING|trainer.py:803] 2025-04-26 20:08:50,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2681 [WARNING|trainer.py:803] 2025-04-26 20:08:51,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:08:51,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2665 [WARNING|trainer.py:803] 2025-04-26 20:08:52,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2859 2682 [WARNING|trainer.py:803] 2025-04-26 20:08:53,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:08:54,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2683 2860 [WARNING|trainer.py:803] 2025-04-26 20:08:56,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:56,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2666 2684 2861 [WARNING|trainer.py:803] 2025-04-26 20:08:58,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:08:58,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:08:58,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2667 2685 [WARNING|trainer.py:803] 2025-04-26 20:09:00,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2862 [WARNING|trainer.py:803] 2025-04-26 20:09:00,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:09:01,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2668 [WARNING|trainer.py:803] 2025-04-26 20:09:02,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2686 2863 [WARNING|trainer.py:803] 2025-04-26 20:09:03,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:03,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2669 [WARNING|trainer.py:803] 2025-04-26 20:09:04,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2687 2864 [WARNING|trainer.py:803] 2025-04-26 20:09:05,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2670 [WARNING|trainer.py:803] 2025-04-26 20:09:06,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:06,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2688 2865 [WARNING|trainer.py:803] 2025-04-26 20:09:07,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2671 [WARNING|trainer.py:803] 2025-04-26 20:09:08,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2689 [WARNING|trainer.py:803] 2025-04-26 20:09:09,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:09:09,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2866 2672 [WARNING|trainer.py:803] 2025-04-26 20:09:10,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2690 [WARNING|trainer.py:803] 2025-04-26 20:09:11,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:12,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2867 2673 [WARNING|trainer.py:803] 2025-04-26 20:09:13,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2691 [WARNING|trainer.py:803] 2025-04-26 20:09:13,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:14,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2868 [WARNING|trainer.py:803] 2025-04-26 20:09:15,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2692 2674 [WARNING|trainer.py:803] 2025-04-26 20:09:16,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:17,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2869 2693 [WARNING|trainer.py:803] 2025-04-26 20:09:18,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:09:18,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2870 2694 [WARNING|trainer.py:803] 2025-04-26 20:09:20,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:21,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2871 2695 [WARNING|trainer.py:803] 2025-04-26 20:09:23,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:23,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2872 2696 [WARNING|trainer.py:803] 2025-04-26 20:09:25,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:26,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2675 2873 [WARNING|trainer.py:803] 2025-04-26 20:09:27,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2697 [WARNING|trainer.py:803] 2025-04-26 20:09:28,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:28,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2676 [WARNING|trainer.py:803] 2025-04-26 20:09:29,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2874 2698 [WARNING|trainer.py:803] 2025-04-26 20:09:30,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:09:30,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2677 [WARNING|trainer.py:803] 2025-04-26 20:09:31,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2699 2875 2678 [WARNING|trainer.py:803] 2025-04-26 20:09:32,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:33,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:33,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2700 2876 2679 [WARNING|trainer.py:803] 2025-04-26 20:09:35,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:09:35,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:09:35,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2680 2701 2877 [WARNING|trainer.py:803] 2025-04-26 20:09:37,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:37,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:37,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2878 2681 2702 [WARNING|trainer.py:803] 2025-04-26 20:09:40,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:40,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:09:40,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2879 2682 2703 [WARNING|trainer.py:803] 2025-04-26 20:09:42,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:09:42,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:43,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2683 2880 2704 [WARNING|trainer.py:803] 2025-04-26 20:09:44,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:09:45,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:45,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2684 [WARNING|trainer.py:803] 2025-04-26 20:09:46,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2881 2705 [WARNING|trainer.py:803] 2025-04-26 20:09:47,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:48,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2685 2882 [WARNING|trainer.py:803] 2025-04-26 20:09:49,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2706 [WARNING|trainer.py:803] 2025-04-26 20:09:50,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:50,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2686 2883 [WARNING|trainer.py:803] 2025-04-26 20:09:52,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2707 [WARNING|trainer.py:803] 2025-04-26 20:09:52,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:53,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2687 2884 [WARNING|trainer.py:803] 2025-04-26 20:09:54,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2708 [WARNING|trainer.py:803] 2025-04-26 20:09:55,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:09:55,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2688 [WARNING|trainer.py:803] 2025-04-26 20:09:56,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2885 2709 2689 [WARNING|trainer.py:803] 2025-04-26 20:09:57,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:09:58,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:09:58,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2886 2710 [WARNING|trainer.py:803] 2025-04-26 20:10:00,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2690 [WARNING|trainer.py:803] 2025-04-26 20:10:00,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:01,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2887 2691 [WARNING|trainer.py:803] 2025-04-26 20:10:02,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2711 [WARNING|trainer.py:803] 2025-04-26 20:10:03,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:03,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2888 2692 [WARNING|trainer.py:803] 2025-04-26 20:10:05,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2712 [WARNING|trainer.py:803] 2025-04-26 20:10:05,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:06,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2889 2693 [WARNING|trainer.py:803] 2025-04-26 20:10:07,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2713 [WARNING|trainer.py:803] 2025-04-26 20:10:07,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:08,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2694 2890 [WARNING|trainer.py:803] 2025-04-26 20:10:10,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:10,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2714 [WARNING|trainer.py:803] 2025-04-26 20:10:11,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2695 2891 [WARNING|trainer.py:803] 2025-04-26 20:10:12,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:12,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2715 [WARNING|trainer.py:803] 2025-04-26 20:10:13,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2696 2892 [WARNING|trainer.py:803] 2025-04-26 20:10:15,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:10:15,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2716 [WARNING|trainer.py:803] 2025-04-26 20:10:16,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2697 2893 [WARNING|trainer.py:803] 2025-04-26 20:10:17,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:10:17,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2717 [WARNING|trainer.py:803] 2025-04-26 20:10:18,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2698 2894 [WARNING|trainer.py:803] 2025-04-26 20:10:19,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:20,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2718 2699 [WARNING|trainer.py:803] 2025-04-26 20:10:21,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2895 [WARNING|trainer.py:803] 2025-04-26 20:10:21,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:22,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2719 2700 [WARNING|trainer.py:803] 2025-04-26 20:10:24,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2896 [WARNING|trainer.py:803] 2025-04-26 20:10:24,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:10:25,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2720 2701 2897 [WARNING|trainer.py:803] 2025-04-26 20:10:26,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:10:26,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:27,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2721 2702 2898 [WARNING|trainer.py:803] 2025-04-26 20:10:29,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:10:29,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:29,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2722 2703 2899 [WARNING|trainer.py:803] 2025-04-26 20:10:31,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:10:31,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:32,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2723 2704 2900 [WARNING|trainer.py:803] 2025-04-26 20:10:34,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:34,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:34,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2705 2901 2724 [WARNING|trainer.py:803] 2025-04-26 20:10:37,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:37,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:37,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2902 2725 2706 [WARNING|trainer.py:803] 2025-04-26 20:10:39,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:39,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:10:39,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2903 2726 2707 [WARNING|trainer.py:803] 2025-04-26 20:10:42,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:42,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:42,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2904 2727 2708 [WARNING|trainer.py:803] 2025-04-26 20:10:44,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 20:10:44,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:45,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2905 2728 2709 [WARNING|trainer.py:803] 2025-04-26 20:10:47,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:10:47,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:47,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2906 2710 2729 [WARNING|trainer.py:803] 2025-04-26 20:10:49,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:50,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:50,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2907 2711 2730 [WARNING|trainer.py:803] 2025-04-26 20:10:52,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:10:52,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:52,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2908 2712 2731 [WARNING|trainer.py:803] 2025-04-26 20:10:54,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:10:55,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:55,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2909 2713 2732 [WARNING|trainer.py:803] 2025-04-26 20:10:57,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:10:57,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:10:57,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2910 2714 [WARNING|trainer.py:803] 2025-04-26 20:10:59,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2733 [WARNING|trainer.py:803] 2025-04-26 20:11:00,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:00,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2911 [WARNING|trainer.py:803] 2025-04-26 20:11:01,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2715 2734 [WARNING|trainer.py:803] 2025-04-26 20:11:02,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:11:02,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2912 [WARNING|trainer.py:803] 2025-04-26 20:11:04,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2716 2735 [WARNING|trainer.py:803] 2025-04-26 20:11:05,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:05,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2913 [WARNING|trainer.py:803] 2025-04-26 20:11:06,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2717 2736 [WARNING|trainer.py:803] 2025-04-26 20:11:07,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:08,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2914 [WARNING|trainer.py:803] 2025-04-26 20:11:09,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2718 2737 [WARNING|trainer.py:803] 2025-04-26 20:11:10,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:10,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2915 [WARNING|trainer.py:803] 2025-04-26 20:11:11,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2719 2738 [WARNING|trainer.py:803] 2025-04-26 20:11:12,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:11:13,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2916 [WARNING|trainer.py:803] 2025-04-26 20:11:13,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2720 2739 2917 [WARNING|trainer.py:803] 2025-04-26 20:11:15,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:11:15,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:16,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2740 2721 2918 [WARNING|trainer.py:803] 2025-04-26 20:11:18,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:18,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:11:18,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2741 2722 2919 [WARNING|trainer.py:803] 2025-04-26 20:11:20,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:11:20,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:11:21,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2742 2723 2920 [WARNING|trainer.py:803] 2025-04-26 20:11:23,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:23,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:23,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2743 2724 2921 [WARNING|trainer.py:803] 2025-04-26 20:11:25,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:25,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:26,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2744 2725 2922 [WARNING|trainer.py:803] 2025-04-26 20:11:28,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:28,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:28,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2745 2923 2726 [WARNING|trainer.py:803] 2025-04-26 20:11:30,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:30,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:31,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2924 2746 2727 [WARNING|trainer.py:803] 2025-04-26 20:11:33,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:33,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:11:33,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2925 2747 2728 [WARNING|trainer.py:803] 2025-04-26 20:11:35,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:36,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:36,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2926 2748 2729 [WARNING|trainer.py:803] 2025-04-26 20:11:38,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:38,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 20:11:39,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2927 2749 [WARNING|trainer.py:803] 2025-04-26 20:11:40,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2730 [WARNING|trainer.py:803] 2025-04-26 20:11:41,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:41,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2928 [WARNING|trainer.py:803] 2025-04-26 20:11:42,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2750 2731 [WARNING|trainer.py:803] 2025-04-26 20:11:43,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:44,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2929 [WARNING|trainer.py:803] 2025-04-26 20:11:45,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2751 2732 [WARNING|trainer.py:803] 2025-04-26 20:11:46,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:46,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2930 [WARNING|trainer.py:803] 2025-04-26 20:11:47,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2752 2733 [WARNING|trainer.py:803] 2025-04-26 20:11:49,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:49,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2931 [WARNING|trainer.py:803] 2025-04-26 20:11:50,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2753 2734 [WARNING|trainer.py:803] 2025-04-26 20:11:51,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:11:51,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2932 [WARNING|trainer.py:803] 2025-04-26 20:11:52,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2754 2735 [WARNING|trainer.py:803] 2025-04-26 20:11:54,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2933 [WARNING|trainer.py:803] 2025-04-26 20:11:54,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:11:55,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2755 2736 2934 [WARNING|trainer.py:803] 2025-04-26 20:11:56,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:11:56,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:11:57,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2756 2737 2935 [WARNING|trainer.py:803] 2025-04-26 20:11:59,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:11:59,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:00,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2757 2738 2936 [WARNING|trainer.py:803] 2025-04-26 20:12:01,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:01,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:02,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2758 2739 2937 [WARNING|trainer.py:803] 2025-04-26 20:12:04,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:04,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:05,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2759 2740 2938 [WARNING|trainer.py:803] 2025-04-26 20:12:06,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:07,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:12:07,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2760 2741 2939 [WARNING|trainer.py:803] 2025-04-26 20:12:09,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:09,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:09,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2761 2742 2940 [WARNING|trainer.py:803] 2025-04-26 20:12:11,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:12,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:12,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2762 2941 2743 [WARNING|trainer.py:803] 2025-04-26 20:12:14,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:14,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:14,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2763 2942 2744 [WARNING|trainer.py:803] 2025-04-26 20:12:16,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:12:17,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:17,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2764 2943 2745 [WARNING|trainer.py:803] 2025-04-26 20:12:19,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:19,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:19,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2944 2765 2746 [WARNING|trainer.py:803] 2025-04-26 20:12:22,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:22,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:22,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2945 2766 2747 [WARNING|trainer.py:803] 2025-04-26 20:12:24,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:24,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:24,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2946 2767 2748 [WARNING|trainer.py:803] 2025-04-26 20:12:27,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:27,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:12:27,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 2947 2768 2749 [WARNING|trainer.py:803] 2025-04-26 20:12:29,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 20:12:29,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:30,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2948 2769 2750 [WARNING|trainer.py:803] 2025-04-26 20:12:32,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:32,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:32,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2949 2770 2751 [WARNING|trainer.py:803] 2025-04-26 20:12:34,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:35,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:35,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2950 2771 2752 [WARNING|trainer.py:803] 2025-04-26 20:12:37,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:37,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:12:37,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2951 2772 [WARNING|trainer.py:803] 2025-04-26 20:12:39,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2753 [WARNING|trainer.py:803] 2025-04-26 20:12:40,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:40,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2952 2773 [WARNING|trainer.py:803] 2025-04-26 20:12:41,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2754 [WARNING|trainer.py:803] 2025-04-26 20:12:42,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:12:42,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2953 [WARNING|trainer.py:803] 2025-04-26 20:12:44,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2774 2755 [WARNING|trainer.py:803] 2025-04-26 20:12:45,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:45,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2954 [WARNING|trainer.py:803] 2025-04-26 20:12:46,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2775 2756 [WARNING|trainer.py:803] 2025-04-26 20:12:47,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:48,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2955 [WARNING|trainer.py:803] 2025-04-26 20:12:49,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2776 2757 [WARNING|trainer.py:803] 2025-04-26 20:12:50,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:50,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2956 [WARNING|trainer.py:803] 2025-04-26 20:12:51,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2758 2777 2957 [WARNING|trainer.py:803] 2025-04-26 20:12:53,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:53,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:53,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2778 2759 2958 [WARNING|trainer.py:803] 2025-04-26 20:12:55,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:55,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:56,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2779 2760 2959 [WARNING|trainer.py:803] 2025-04-26 20:12:58,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:12:58,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:12:58,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2761 2780 2960 [WARNING|trainer.py:803] 2025-04-26 20:13:00,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:00,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:01,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 2762 2781 2961 [WARNING|trainer.py:803] 2025-04-26 20:13:03,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:03,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:03,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2763 2782 2962 [WARNING|trainer.py:803] 2025-04-26 20:13:05,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:05,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:13:05,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2963 2764 2783 [WARNING|trainer.py:803] 2025-04-26 20:13:08,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:13:08,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:13:08,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2964 2784 2765 [WARNING|trainer.py:803] 2025-04-26 20:13:10,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:10,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:11,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2965 2785 2766 [WARNING|trainer.py:803] 2025-04-26 20:13:13,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:13:13,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:13,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2966 2767 2786 [WARNING|trainer.py:803] 2025-04-26 20:13:15,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:13:16,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:16,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2967 2787 2768 [WARNING|trainer.py:803] 2025-04-26 20:13:18,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:18,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:13:18,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2968 2788 2769 [WARNING|trainer.py:803] 2025-04-26 20:13:20,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:21,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:13:21,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2969 2789 2770 [WARNING|trainer.py:803] 2025-04-26 20:13:23,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:23,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:13:23,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2970 2790 2771 [WARNING|trainer.py:803] 2025-04-26 20:13:25,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:26,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:26,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2971 2791 2772 [WARNING|trainer.py:803] 2025-04-26 20:13:28,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:28,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:13:28,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2972 2773 2792 [WARNING|trainer.py:803] 2025-04-26 20:13:30,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:31,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:13:31,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2973 2774 2793 [WARNING|trainer.py:803] 2025-04-26 20:13:33,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:34,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:34,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2974 [WARNING|trainer.py:803] 2025-04-26 20:13:35,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2775 2794 [WARNING|trainer.py:803] 2025-04-26 20:13:36,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:36,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2975 [WARNING|trainer.py:803] 2025-04-26 20:13:38,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2795 2776 [WARNING|trainer.py:803] 2025-04-26 20:13:39,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:39,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2976 [WARNING|trainer.py:803] 2025-04-26 20:13:40,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2796 2777 [WARNING|trainer.py:803] 2025-04-26 20:13:41,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:41,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2977 [WARNING|trainer.py:803] 2025-04-26 20:13:43,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2778 2797 [WARNING|trainer.py:803] 2025-04-26 20:13:44,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:44,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2978 [WARNING|trainer.py:803] 2025-04-26 20:13:45,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2779 2798 [WARNING|trainer.py:803] 2025-04-26 20:13:46,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:13:47,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2979 [WARNING|trainer.py:803] 2025-04-26 20:13:47,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2780 2799 [WARNING|trainer.py:803] 2025-04-26 20:13:49,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2980 [WARNING|trainer.py:803] 2025-04-26 20:13:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:50,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 2781 2800 [WARNING|trainer.py:803] 2025-04-26 20:13:51,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2981 [WARNING|trainer.py:803] 2025-04-26 20:13:52,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:13:52,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2782 2801 2982 [WARNING|trainer.py:803] 2025-04-26 20:13:54,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:13:54,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:55,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2783 2802 2983 [WARNING|trainer.py:803] 2025-04-26 20:13:57,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:13:57,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:13:57,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2784 2984 2803 [WARNING|trainer.py:803] 2025-04-26 20:13:59,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:00,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:00,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2785 2985 2804 [WARNING|trainer.py:803] 2025-04-26 20:14:02,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:02,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:02,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2986 2786 2805 [WARNING|trainer.py:803] 2025-04-26 20:14:04,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:14:04,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:05,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2987 2787 2806 [WARNING|trainer.py:803] 2025-04-26 20:14:07,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:14:07,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:14:07,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2988 2788 2807 [WARNING|trainer.py:803] 2025-04-26 20:14:09,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:10,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:14:10,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2989 2789 2808 [WARNING|trainer.py:803] 2025-04-26 20:14:12,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:12,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:14:12,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2990 2790 2809 [WARNING|trainer.py:803] 2025-04-26 20:14:14,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:15,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:15,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2991 2791 [WARNING|trainer.py:803] 2025-04-26 20:14:17,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2810 [WARNING|trainer.py:803] 2025-04-26 20:14:17,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:14:17,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2992 2792 [WARNING|trainer.py:803] 2025-04-26 20:14:19,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2811 [WARNING|trainer.py:803] 2025-04-26 20:14:20,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:20,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2993 [WARNING|trainer.py:803] 2025-04-26 20:14:21,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2793 2812 [WARNING|trainer.py:803] 2025-04-26 20:14:22,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:23,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2994 [WARNING|trainer.py:803] 2025-04-26 20:14:24,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2794 2813 [WARNING|trainer.py:803] 2025-04-26 20:14:25,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:25,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2995 [WARNING|trainer.py:803] 2025-04-26 20:14:26,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2795 2814 [WARNING|trainer.py:803] 2025-04-26 20:14:28,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:28,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2996 [WARNING|trainer.py:803] 2025-04-26 20:14:29,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 2796 2815 [WARNING|trainer.py:803] 2025-04-26 20:14:30,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2997 [WARNING|trainer.py:803] 2025-04-26 20:14:30,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:31,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2797 2816 2998 [WARNING|trainer.py:803] 2025-04-26 20:14:33,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:33,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:14:34,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2798 2817 2999 [WARNING|trainer.py:803] 2025-04-26 20:14:35,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:36,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:14:36,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2799 2818 3000 [WARNING|trainer.py:803] 2025-04-26 20:14:38,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:38,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:39,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3001 2800 2819 [WARNING|trainer.py:803] 2025-04-26 20:14:40,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:14:41,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:41,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3002 [WARNING|trainer.py:803] 2025-04-26 20:14:42,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2801 2820 [WARNING|trainer.py:803] 2025-04-26 20:14:43,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3003 [WARNING|trainer.py:803] 2025-04-26 20:14:44,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:44,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3004 2802 2821 [WARNING|trainer.py:803] 2025-04-26 20:14:46,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:46,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:46,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3005 2803 [WARNING|trainer.py:803] 2025-04-26 20:14:48,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2822 [WARNING|trainer.py:803] 2025-04-26 20:14:49,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3006 [WARNING|trainer.py:803] 2025-04-26 20:14:49,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:50,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2804 2823 3007 [WARNING|trainer.py:803] 2025-04-26 20:14:51,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:52,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:14:52,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3008 2805 2824 [WARNING|trainer.py:803] 2025-04-26 20:14:54,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:14:54,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:54,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3009 [WARNING|trainer.py:803] 2025-04-26 20:14:55,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2806 2825 3010 [WARNING|trainer.py:803] 2025-04-26 20:14:57,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:57,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:14:57,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2807 3011 2826 [WARNING|trainer.py:803] 2025-04-26 20:14:59,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:14:59,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:14:59,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3012 2808 [WARNING|trainer.py:803] 2025-04-26 20:15:01,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2827 [WARNING|trainer.py:803] 2025-04-26 20:15:02,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3013 [WARNING|trainer.py:803] 2025-04-26 20:15:02,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:03,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2809 2828 3014 [WARNING|trainer.py:803] 2025-04-26 20:15:04,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:04,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:05,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3015 2810 2829 [WARNING|trainer.py:803] 2025-04-26 20:15:06,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:07,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:07,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3016 [WARNING|trainer.py:803] 2025-04-26 20:15:08,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2811 2830 3017 [WARNING|trainer.py:803] 2025-04-26 20:15:09,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:10,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:15:10,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3018 2812 2831 [WARNING|trainer.py:803] 2025-04-26 20:15:12,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:12,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:15:12,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3019 2813 [WARNING|trainer.py:803] 2025-04-26 20:15:14,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2832 [WARNING|trainer.py:803] 2025-04-26 20:15:15,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:15,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3020 [WARNING|trainer.py:803] 2025-04-26 20:15:16,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2814 2833 3021 [WARNING|trainer.py:803] 2025-04-26 20:15:17,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:17,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:18,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3022 2815 2834 [WARNING|trainer.py:803] 2025-04-26 20:15:19,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:20,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:15:20,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3023 [WARNING|trainer.py:803] 2025-04-26 20:15:21,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2816 2835 3024 [WARNING|trainer.py:803] 2025-04-26 20:15:22,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:23,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:15:23,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3025 2817 2836 [WARNING|trainer.py:803] 2025-04-26 20:15:25,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:25,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:15:25,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3026 [WARNING|trainer.py:803] 2025-04-26 20:15:27,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2818 2837 [WARNING|trainer.py:803] 2025-04-26 20:15:28,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3027 [WARNING|trainer.py:803] 2025-04-26 20:15:28,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:15:29,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2819 2838 3028 [WARNING|trainer.py:803] 2025-04-26 20:15:30,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:30,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:31,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3029 2820 2839 [WARNING|trainer.py:803] 2025-04-26 20:15:32,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:33,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:33,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3030 [WARNING|trainer.py:803] 2025-04-26 20:15:34,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2821 2840 3031 [WARNING|trainer.py:803] 2025-04-26 20:15:35,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:36,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:36,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3032 2822 2841 [WARNING|trainer.py:803] 2025-04-26 20:15:38,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:38,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:38,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3033 2823 2842 [WARNING|trainer.py:803] 2025-04-26 20:15:40,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:15:41,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:15:41,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3034 [WARNING|trainer.py:803] 2025-04-26 20:15:42,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2824 2843 3035 [WARNING|trainer.py:803] 2025-04-26 20:15:43,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:43,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:44,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3036 2844 2825 [WARNING|trainer.py:803] 2025-04-26 20:15:46,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:46,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:46,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3037 [WARNING|trainer.py:803] 2025-04-26 20:15:48,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2826 2845 3038 [WARNING|trainer.py:803] 2025-04-26 20:15:49,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:49,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:49,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3039 2846 2827 [WARNING|trainer.py:803] 2025-04-26 20:15:51,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:15:51,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:51,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3040 2828 [WARNING|trainer.py:803] 2025-04-26 20:15:53,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2847 [WARNING|trainer.py:803] 2025-04-26 20:15:54,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:15:54,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3041 [WARNING|trainer.py:803] 2025-04-26 20:15:55,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2829 2848 3042 [WARNING|trainer.py:803] 2025-04-26 20:15:56,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:57,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:57,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3043 2849 2830 [WARNING|trainer.py:803] 2025-04-26 20:15:59,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:15:59,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:15:59,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3044 [WARNING|trainer.py:803] 2025-04-26 20:16:01,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2831 2850 3045 [WARNING|trainer.py:803] 2025-04-26 20:16:02,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:02,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:02,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3046 2851 2832 [WARNING|trainer.py:803] 2025-04-26 20:16:04,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:04,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:04,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3047 2833 2852 [WARNING|trainer.py:803] 2025-04-26 20:16:06,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:07,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:07,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3048 [WARNING|trainer.py:803] 2025-04-26 20:16:08,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2834 2853 3049 [WARNING|trainer.py:803] 2025-04-26 20:16:10,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:10,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:10,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3050 2835 2854 [WARNING|trainer.py:803] 2025-04-26 20:16:12,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:12,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:12,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3051 [WARNING|trainer.py:803] 2025-04-26 20:16:14,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2836 2855 3052 [WARNING|trainer.py:803] 2025-04-26 20:16:15,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:15,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:15,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3053 2837 2856 [WARNING|trainer.py:803] 2025-04-26 20:16:17,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:17,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:16:18,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo 3054 [WARNING|trainer.py:803] 2025-04-26 20:16:19,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2838 2857 [WARNING|trainer.py:803] 2025-04-26 20:16:20,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3055 [WARNING|trainer.py:803] 2025-04-26 20:16:20,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:16:21,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2839 3056 2858 [WARNING|trainer.py:803] 2025-04-26 20:16:22,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:16:23,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:16:23,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3057 2840 [WARNING|trainer.py:803] 2025-04-26 20:16:25,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2859 [WARNING|trainer.py:803] 2025-04-26 20:16:25,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:26,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3058 [WARNING|trainer.py:803] 2025-04-26 20:16:27,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2841 2860 3059 [WARNING|trainer.py:803] 2025-04-26 20:16:28,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:28,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:28,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3060 2842 2861 [WARNING|trainer.py:803] 2025-04-26 20:16:30,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:16:30,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:31,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3061 [WARNING|trainer.py:803] 2025-04-26 20:16:32,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2843 2862 [WARNING|trainer.py:803] 2025-04-26 20:16:33,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3062 [WARNING|trainer.py:803] 2025-04-26 20:16:33,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:34,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3063 2844 2863 [WARNING|trainer.py:803] 2025-04-26 20:16:36,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:36,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:36,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3064 [WARNING|trainer.py:803] 2025-04-26 20:16:37,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2845 2864 [WARNING|trainer.py:803] 2025-04-26 20:16:38,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3065 [WARNING|trainer.py:803] 2025-04-26 20:16:39,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:39,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2846 3066 2865 [WARNING|trainer.py:803] 2025-04-26 20:16:41,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:41,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:41,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3067 2847 2866 [WARNING|trainer.py:803] 2025-04-26 20:16:43,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:43,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:44,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3068 [WARNING|trainer.py:803] 2025-04-26 20:16:45,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2848 2867 3069 [WARNING|trainer.py:803] 2025-04-26 20:16:46,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:46,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:47,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3070 2849 2868 [WARNING|trainer.py:803] 2025-04-26 20:16:48,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:49,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:49,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3071 [WARNING|trainer.py:803] 2025-04-26 20:16:50,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2850 2869 [WARNING|trainer.py:803] 2025-04-26 20:16:51,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3072 [WARNING|trainer.py:803] 2025-04-26 20:16:52,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:16:52,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2851 3073 2870 [WARNING|trainer.py:803] 2025-04-26 20:16:54,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:16:54,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:54,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3074 2852 [WARNING|trainer.py:803] 2025-04-26 20:16:56,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2871 [WARNING|trainer.py:803] 2025-04-26 20:16:57,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3075 [WARNING|trainer.py:803] 2025-04-26 20:16:57,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:16:58,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2853 2872 3076 [WARNING|trainer.py:803] 2025-04-26 20:16:59,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:00,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:00,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3077 2854 2873 [WARNING|trainer.py:803] 2025-04-26 20:17:01,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:17:02,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:02,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3078 [WARNING|trainer.py:803] 2025-04-26 20:17:03,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2855 2874 3079 [WARNING|trainer.py:803] 2025-04-26 20:17:04,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:05,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:17:05,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2856 3080 2875 [WARNING|trainer.py:803] 2025-04-26 20:17:07,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 20:17:07,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:07,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3081 2857 [WARNING|trainer.py:803] 2025-04-26 20:17:09,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2876 [WARNING|trainer.py:803] 2025-04-26 20:17:10,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:17:10,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3082 [WARNING|trainer.py:803] 2025-04-26 20:17:11,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2858 2877 3083 [WARNING|trainer.py:803] 2025-04-26 20:17:12,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:13,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:13,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3084 2859 2878 [WARNING|trainer.py:803] 2025-04-26 20:17:15,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:17:15,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:17:15,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3085 [WARNING|trainer.py:803] 2025-04-26 20:17:16,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2879 2860 3086 [WARNING|trainer.py:803] 2025-04-26 20:17:18,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:18,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:18,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3087 2880 2861 [WARNING|trainer.py:803] 2025-04-26 20:17:20,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:17:20,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:20,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3088 [WARNING|trainer.py:803] 2025-04-26 20:17:22,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2881 2862 3089 [WARNING|trainer.py:803] 2025-04-26 20:17:23,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:23,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:24,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3090 2882 2863 [WARNING|trainer.py:803] 2025-04-26 20:17:25,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:26,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:26,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3091 2864 2883 [WARNING|trainer.py:803] 2025-04-26 20:17:27,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:28,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:28,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3092 [WARNING|trainer.py:803] 2025-04-26 20:17:29,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2865 2884 3093 [WARNING|trainer.py:803] 2025-04-26 20:17:31,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:31,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:31,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3094 2866 2885 [WARNING|trainer.py:803] 2025-04-26 20:17:33,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:33,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:34,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3095 [WARNING|trainer.py:803] 2025-04-26 20:17:35,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2867 2886 3096 [WARNING|trainer.py:803] 2025-04-26 20:17:36,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:36,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:37,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3097 2887 2868 [WARNING|trainer.py:803] 2025-04-26 20:17:38,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:39,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:39,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3098 [WARNING|trainer.py:803] 2025-04-26 20:17:40,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2888 2869 [WARNING|trainer.py:803] 2025-04-26 20:17:41,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3099 [WARNING|trainer.py:803] 2025-04-26 20:17:41,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:42,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2889 2870 3100 [WARNING|trainer.py:803] 2025-04-26 20:17:44,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:44,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:44,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2871 2890 3101 [WARNING|trainer.py:803] 2025-04-26 20:17:47,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:47,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:47,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2891 2872 3102 [WARNING|trainer.py:803] 2025-04-26 20:17:49,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:49,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:49,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 2892 2873 3103 [WARNING|trainer.py:803] 2025-04-26 20:17:52,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:17:52,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:17:52,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3104 2893 2874 [WARNING|trainer.py:803] 2025-04-26 20:17:54,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:17:54,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:17:54,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3105 2894 2875 [WARNING|trainer.py:803] 2025-04-26 20:17:56,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:57,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:17:57,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3106 2895 [WARNING|trainer.py:803] 2025-04-26 20:17:59,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2876 [WARNING|trainer.py:803] 2025-04-26 20:18:00,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:00,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3107 [WARNING|trainer.py:803] 2025-04-26 20:18:01,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2896 2877 [WARNING|trainer.py:803] 2025-04-26 20:18:02,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:02,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3108 [WARNING|trainer.py:803] 2025-04-26 20:18:03,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2897 2878 3109 [WARNING|trainer.py:803] 2025-04-26 20:18:05,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:05,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:05,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2879 2898 3110 [WARNING|trainer.py:803] 2025-04-26 20:18:07,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:07,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:18:08,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3111 2899 2880 [WARNING|trainer.py:803] 2025-04-26 20:18:10,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:10,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:10,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3112 2881 2900 [WARNING|trainer.py:803] 2025-04-26 20:18:12,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:18:13,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:13,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3113 [WARNING|trainer.py:803] 2025-04-26 20:18:14,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2882 2901 [WARNING|trainer.py:803] 2025-04-26 20:18:15,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:15,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3114 [WARNING|trainer.py:803] 2025-04-26 20:18:16,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2902 2883 3115 [WARNING|trainer.py:803] 2025-04-26 20:18:18,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:18,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:19,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2903 2884 3116 [WARNING|trainer.py:803] 2025-04-26 20:18:21,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:21,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:21,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2904 2885 3117 [WARNING|trainer.py:803] 2025-04-26 20:18:23,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesNo [WARNING|trainer.py:803] 2025-04-26 20:18:23,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:18:23,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2905 2886 3118 [WARNING|trainer.py:803] 2025-04-26 20:18:26,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:26,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:26,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2906 2887 3119 [WARNING|trainer.py:803] 2025-04-26 20:18:28,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:28,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:28,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3120 2907 2888 [WARNING|trainer.py:803] 2025-04-26 20:18:31,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 20:18:31,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:31,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3121 2908 2889 [WARNING|trainer.py:803] 2025-04-26 20:18:33,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:34,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:18:34,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2909 2890 3122 [WARNING|trainer.py:803] 2025-04-26 20:18:36,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:36,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:36,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3123 2910 2891 [WARNING|trainer.py:803] 2025-04-26 20:18:38,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:39,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:39,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3124 2911 2892 [WARNING|trainer.py:803] 2025-04-26 20:18:41,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:41,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:42,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3125 2912 2893 [WARNING|trainer.py:803] 2025-04-26 20:18:44,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:44,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:44,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3126 2913 [WARNING|trainer.py:803] 2025-04-26 20:18:46,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2894 [WARNING|trainer.py:803] 2025-04-26 20:18:46,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:18:47,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3127 2914 [WARNING|trainer.py:803] 2025-04-26 20:18:49,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2895 [WARNING|trainer.py:803] 2025-04-26 20:18:49,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:50,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3128 2915 [WARNING|trainer.py:803] 2025-04-26 20:18:51,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2896 [WARNING|trainer.py:803] 2025-04-26 20:18:52,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:18:52,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3129 [WARNING|trainer.py:803] 2025-04-26 20:18:53,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2916 2897 [WARNING|trainer.py:803] 2025-04-26 20:18:54,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3130 [WARNING|trainer.py:803] 2025-04-26 20:18:55,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:18:56,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 2917 2898 [WARNING|trainer.py:803] 2025-04-26 20:18:57,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3131 [WARNING|trainer.py:803] 2025-04-26 20:18:57,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:18:58,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2918 2899 [WARNING|trainer.py:803] 2025-04-26 20:18:59,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3132 [WARNING|trainer.py:803] 2025-04-26 20:19:00,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:00,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2919 3133 2900 [WARNING|trainer.py:803] 2025-04-26 20:19:02,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:19:02,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:19:03,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3134 2920 [WARNING|trainer.py:803] 2025-04-26 20:19:04,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2901 [WARNING|trainer.py:803] 2025-04-26 20:19:05,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:05,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2921 3135 2902 [WARNING|trainer.py:803] 2025-04-26 20:19:07,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:19:07,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:19:08,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3136 2922 [WARNING|trainer.py:803] 2025-04-26 20:19:09,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2903 [WARNING|trainer.py:803] 2025-04-26 20:19:10,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:10,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3137 2923 [WARNING|trainer.py:803] 2025-04-26 20:19:12,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2904 [WARNING|trainer.py:803] 2025-04-26 20:19:12,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3138 [WARNING|trainer.py:803] 2025-04-26 20:19:13,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesNo [WARNING|trainer.py:803] 2025-04-26 20:19:14,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2924 2905 [WARNING|trainer.py:803] 2025-04-26 20:19:15,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3139 [WARNING|trainer.py:803] 2025-04-26 20:19:15,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:16,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2925 2906 [WARNING|trainer.py:803] 2025-04-26 20:19:17,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3140 [WARNING|trainer.py:803] 2025-04-26 20:19:18,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:19,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2926 2907 [WARNING|trainer.py:803] 2025-04-26 20:19:20,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3141 [WARNING|trainer.py:803] 2025-04-26 20:19:21,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:21,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2927 3142 2908 [WARNING|trainer.py:803] 2025-04-26 20:19:22,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:23,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:23,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2928 3143 2909 [WARNING|trainer.py:803] 2025-04-26 20:19:25,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:25,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:26,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3144 2929 2910 [WARNING|trainer.py:803] 2025-04-26 20:19:28,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:28,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:19:28,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2930 3145 2911 [WARNING|trainer.py:803] 2025-04-26 20:19:30,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:31,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:31,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3146 2931 2912 [WARNING|trainer.py:803] 2025-04-26 20:19:33,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:19:33,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:33,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3147 2932 [WARNING|trainer.py:803] 2025-04-26 20:19:35,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2913 [WARNING|trainer.py:803] 2025-04-26 20:19:36,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:36,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3148 2933 [WARNING|trainer.py:803] 2025-04-26 20:19:37,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2914 [WARNING|trainer.py:803] 2025-04-26 20:19:38,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3149 [WARNING|trainer.py:803] 2025-04-26 20:19:39,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:19:39,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2934 3150 2915 [WARNING|trainer.py:803] 2025-04-26 20:19:41,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:41,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:19:41,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3151 2935 2916 [WARNING|trainer.py:803] 2025-04-26 20:19:43,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:43,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:19:44,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3152 [WARNING|trainer.py:803] 2025-04-26 20:19:45,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2936 2917 3153 [WARNING|trainer.py:803] 2025-04-26 20:19:46,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:19:46,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:47,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3154 2918 2937 [WARNING|trainer.py:803] 2025-04-26 20:19:48,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:19:49,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:19:49,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3155 [WARNING|trainer.py:803] 2025-04-26 20:19:50,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2919 2938 3156 [WARNING|trainer.py:803] 2025-04-26 20:19:51,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:19:51,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:52,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3157 2939 2920 [WARNING|trainer.py:803] 2025-04-26 20:19:54,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:19:54,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:54,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3158 [WARNING|trainer.py:803] 2025-04-26 20:19:55,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2940 2921 [WARNING|trainer.py:803] 2025-04-26 20:19:56,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3159 [WARNING|trainer.py:803] 2025-04-26 20:19:56,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:19:57,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2922 2941 3160 [WARNING|trainer.py:803] 2025-04-26 20:19:59,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:19:59,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:19:59,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3161 2942 2923 [WARNING|trainer.py:803] 2025-04-26 20:20:01,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:02,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:02,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3162 [WARNING|trainer.py:803] 2025-04-26 20:20:03,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2924 2943 3163 [WARNING|trainer.py:803] 2025-04-26 20:20:04,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:04,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:05,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3164 2925 2944 [WARNING|trainer.py:803] 2025-04-26 20:20:07,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:07,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:07,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3165 [WARNING|trainer.py:803] 2025-04-26 20:20:08,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 2926 2945 3166 [WARNING|trainer.py:803] 2025-04-26 20:20:09,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:09,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:10,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3167 2927 2946 [WARNING|trainer.py:803] 2025-04-26 20:20:12,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:20:12,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:12,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3168 [WARNING|trainer.py:803] 2025-04-26 20:20:14,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2928 2947 [WARNING|trainer.py:803] 2025-04-26 20:20:15,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:15,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3169 [WARNING|trainer.py:803] 2025-04-26 20:20:15,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2948 2929 3170 [WARNING|trainer.py:803] 2025-04-26 20:20:17,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:17,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:17,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3171 [WARNING|trainer.py:803] 2025-04-26 20:20:19,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2949 2930 [WARNING|trainer.py:803] 2025-04-26 20:20:20,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:20,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3172 [WARNING|trainer.py:803] 2025-04-26 20:20:21,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2931 2950 3173 [WARNING|trainer.py:803] 2025-04-26 20:20:22,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:22,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:23,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3174 2932 2951 [WARNING|trainer.py:803] 2025-04-26 20:20:25,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:25,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:25,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3175 [WARNING|trainer.py:803] 2025-04-26 20:20:27,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2933 2952 3176 [WARNING|trainer.py:803] 2025-04-26 20:20:28,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:28,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:28,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2953 2934 3177 [WARNING|trainer.py:803] 2025-04-26 20:20:30,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:30,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:30,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3178 2954 2935 [WARNING|trainer.py:803] 2025-04-26 20:20:32,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:33,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:33,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3179 [WARNING|trainer.py:803] 2025-04-26 20:20:34,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2955 2936 3180 [WARNING|trainer.py:803] 2025-04-26 20:20:35,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:20:35,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:36,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3181 2956 2937 [WARNING|trainer.py:803] 2025-04-26 20:20:38,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:20:38,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:20:38,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3182 [WARNING|trainer.py:803] 2025-04-26 20:20:39,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2957 2938 3183 [WARNING|trainer.py:803] 2025-04-26 20:20:40,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:20:41,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:41,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3184 2958 2939 [WARNING|trainer.py:803] 2025-04-26 20:20:43,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:43,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:20:43,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3185 [WARNING|trainer.py:803] 2025-04-26 20:20:45,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2940 2959 3186 [WARNING|trainer.py:803] 2025-04-26 20:20:46,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:20:47,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2960 2941 3187 [WARNING|trainer.py:803] 2025-04-26 20:20:48,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 20:20:48,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:48,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3188 2961 2942 [WARNING|trainer.py:803] 2025-04-26 20:20:50,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:51,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:51,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3189 [WARNING|trainer.py:803] 2025-04-26 20:20:52,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2962 2943 3190 [WARNING|trainer.py:803] 2025-04-26 20:20:53,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:54,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:54,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3191 2963 2944 [WARNING|trainer.py:803] 2025-04-26 20:20:56,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:20:56,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:20:56,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3192 [WARNING|trainer.py:803] 2025-04-26 20:20:58,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 2964 2945 3193 [WARNING|trainer.py:803] 2025-04-26 20:20:59,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:59,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:20:59,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3194 2946 2965 [WARNING|trainer.py:803] 2025-04-26 20:21:01,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:01,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:01,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3195 2947 [WARNING|trainer.py:803] 2025-04-26 20:21:03,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2966 [WARNING|trainer.py:803] 2025-04-26 20:21:04,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:21:04,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3196 [WARNING|trainer.py:803] 2025-04-26 20:21:05,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2967 2948 3197 [WARNING|trainer.py:803] 2025-04-26 20:21:06,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:06,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:07,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3198 2968 2949 [WARNING|trainer.py:803] 2025-04-26 20:21:09,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:21:09,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:09,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3199 [WARNING|trainer.py:803] 2025-04-26 20:21:10,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2969 2950 3200 [WARNING|trainer.py:803] 2025-04-26 20:21:12,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:12,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:12,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3201 2970 2951 [WARNING|trainer.py:803] 2025-04-26 20:21:14,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:14,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:14,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3202 [WARNING|trainer.py:803] 2025-04-26 20:21:16,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2952 2971 3203 [WARNING|trainer.py:803] 2025-04-26 20:21:17,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:17,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:21:17,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3204 2953 2972 [WARNING|trainer.py:803] 2025-04-26 20:21:19,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:19,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:19,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3205 [WARNING|trainer.py:803] 2025-04-26 20:21:21,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2954 2973 [WARNING|trainer.py:803] 2025-04-26 20:21:22,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3206 [WARNING|trainer.py:803] 2025-04-26 20:21:22,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:23,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2955 3207 2974 [WARNING|trainer.py:803] 2025-04-26 20:21:24,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:25,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:25,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3208 2956 2975 [WARNING|trainer.py:803] 2025-04-26 20:21:26,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:27,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:21:27,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3209 [WARNING|trainer.py:803] 2025-04-26 20:21:28,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2957 2976 3210 [WARNING|trainer.py:803] 2025-04-26 20:21:30,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:21:30,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:30,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3211 2958 2977 [WARNING|trainer.py:803] 2025-04-26 20:21:32,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:32,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:21:32,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3212 [WARNING|trainer.py:803] 2025-04-26 20:21:34,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2959 2978 3213 [WARNING|trainer.py:803] 2025-04-26 20:21:35,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:21:35,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:35,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3214 2960 2979 [WARNING|trainer.py:803] 2025-04-26 20:21:37,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:37,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:21:37,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3215 [WARNING|trainer.py:803] 2025-04-26 20:21:39,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2980 2961 3216 [WARNING|trainer.py:803] 2025-04-26 20:21:40,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:21:40,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:40,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3217 2981 2962 [WARNING|trainer.py:803] 2025-04-26 20:21:42,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:21:43,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:43,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3218 [WARNING|trainer.py:803] 2025-04-26 20:21:44,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 2982 2963 3219 [WARNING|trainer.py:803] 2025-04-26 20:21:45,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:45,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:21:46,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3220 2983 2964 [WARNING|trainer.py:803] 2025-04-26 20:21:47,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:48,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:48,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3221 [WARNING|trainer.py:803] 2025-04-26 20:21:49,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2984 2965 3222 [WARNING|trainer.py:803] 2025-04-26 20:21:50,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:51,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:21:51,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3223 2985 2966 [WARNING|trainer.py:803] 2025-04-26 20:21:53,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:53,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:21:53,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3224 [WARNING|trainer.py:803] 2025-04-26 20:21:54,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2986 2967 3225 [WARNING|trainer.py:803] 2025-04-26 20:21:55,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:56,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:21:56,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3226 2987 2968 [WARNING|trainer.py:803] 2025-04-26 20:21:58,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:58,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:21:58,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3227 [WARNING|trainer.py:803] 2025-04-26 20:22:00,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2988 2969 3228 [WARNING|trainer.py:803] 2025-04-26 20:22:01,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:01,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:02,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3229 2989 2970 [WARNING|trainer.py:803] 2025-04-26 20:22:03,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:03,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:03,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3230 [WARNING|trainer.py:803] 2025-04-26 20:22:05,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2990 2971 [WARNING|trainer.py:803] 2025-04-26 20:22:06,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3231 [WARNING|trainer.py:803] 2025-04-26 20:22:06,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:07,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2991 3232 2972 [WARNING|trainer.py:803] 2025-04-26 20:22:08,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:08,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:09,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3233 2992 [WARNING|trainer.py:803] 2025-04-26 20:22:10,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2973 [WARNING|trainer.py:803] 2025-04-26 20:22:11,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3234 [WARNING|trainer.py:803] 2025-04-26 20:22:11,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:22:12,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3235 2993 2974 [WARNING|trainer.py:803] 2025-04-26 20:22:14,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:14,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:14,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3236 [WARNING|trainer.py:803] 2025-04-26 20:22:15,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 2994 2975 [WARNING|trainer.py:803] 2025-04-26 20:22:16,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3237 [WARNING|trainer.py:803] 2025-04-26 20:22:17,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:17,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2995 3238 2976 [WARNING|trainer.py:803] 2025-04-26 20:22:19,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:19,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:19,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3239 [WARNING|trainer.py:803] 2025-04-26 20:22:21,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2996 2977 [WARNING|trainer.py:803] 2025-04-26 20:22:21,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3240 [WARNING|trainer.py:803] 2025-04-26 20:22:22,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:22:22,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2997 3241 2978 [WARNING|trainer.py:803] 2025-04-26 20:22:24,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:22:24,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:24,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3242 2998 [WARNING|trainer.py:803] 2025-04-26 20:22:26,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2979 [WARNING|trainer.py:803] 2025-04-26 20:22:26,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3243 [WARNING|trainer.py:803] 2025-04-26 20:22:27,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:22:27,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2999 3244 2980 [WARNING|trainer.py:803] 2025-04-26 20:22:29,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:29,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:29,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3245 3000 2981 [WARNING|trainer.py:803] 2025-04-26 20:22:31,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:32,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:32,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3246 [WARNING|trainer.py:803] 2025-04-26 20:22:33,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3001 2982 3247 [WARNING|trainer.py:803] 2025-04-26 20:22:34,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:34,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:22:35,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3002 3248 [WARNING|trainer.py:803] 2025-04-26 20:22:36,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2983 [WARNING|trainer.py:803] 2025-04-26 20:22:36,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:37,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3249 3003 [WARNING|trainer.py:803] 2025-04-26 20:22:38,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:38,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2984 3250 [WARNING|trainer.py:803] 2025-04-26 20:22:40,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:40,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3004 3251 [WARNING|trainer.py:803] 2025-04-26 20:22:41,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 2985 [WARNING|trainer.py:803] 2025-04-26 20:22:41,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3005 3252 [WARNING|trainer.py:803] 2025-04-26 20:22:42,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:22:43,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:43,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3253 2986 3006 [WARNING|trainer.py:803] 2025-04-26 20:22:45,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:45,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:45,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3254 3007 [WARNING|trainer.py:803] 2025-04-26 20:22:46,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2987 [WARNING|trainer.py:803] 2025-04-26 20:22:47,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3255 [WARNING|trainer.py:803] 2025-04-26 20:22:47,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:48,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3008 3256 [WARNING|trainer.py:803] 2025-04-26 20:22:49,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 2988 [WARNING|trainer.py:803] 2025-04-26 20:22:50,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:22:50,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3009 3257 [WARNING|trainer.py:803] 2025-04-26 20:22:51,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:22:51,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 2989 3258 [WARNING|trainer.py:803] 2025-04-26 20:22:52,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3010 [WARNING|trainer.py:803] 2025-04-26 20:22:53,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:22:53,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3259 2990 3011 [WARNING|trainer.py:803] 2025-04-26 20:22:55,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:22:55,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:22:55,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3260 [WARNING|trainer.py:803] 2025-04-26 20:22:56,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3012 2991 3261 [WARNING|trainer.py:803] 2025-04-26 20:22:57,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:58,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:22:58,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3013 3262 2992 [WARNING|trainer.py:803] 2025-04-26 20:23:00,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:00,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:00,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3263 3014 [WARNING|trainer.py:803] 2025-04-26 20:23:02,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:02,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2993 3264 [WARNING|trainer.py:803] 2025-04-26 20:23:03,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3015 [WARNING|trainer.py:803] 2025-04-26 20:23:03,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:04,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 2994 3265 3016 [WARNING|trainer.py:803] 2025-04-26 20:23:05,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:05,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:06,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3266 2995 [WARNING|trainer.py:803] 2025-04-26 20:23:07,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3017 [WARNING|trainer.py:803] 2025-04-26 20:23:08,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:08,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3267 [WARNING|trainer.py:803] 2025-04-26 20:23:09,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3018 2996 3268 [WARNING|trainer.py:803] 2025-04-26 20:23:10,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:11,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:11,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3019 3269 2997 [WARNING|trainer.py:803] 2025-04-26 20:23:12,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:23:13,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:13,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3270 3020 [WARNING|trainer.py:803] 2025-04-26 20:23:14,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:15,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 2998 3271 [WARNING|trainer.py:803] 2025-04-26 20:23:16,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3021 [WARNING|trainer.py:803] 2025-04-26 20:23:16,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:17,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3272 2999 [WARNING|trainer.py:803] 2025-04-26 20:23:18,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3022 [WARNING|trainer.py:803] 2025-04-26 20:23:18,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3273 [WARNING|trainer.py:803] 2025-04-26 20:23:19,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:20,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3000 3023 3274 [WARNING|trainer.py:803] 2025-04-26 20:23:21,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:21,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:21,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3001 3024 3275 [WARNING|trainer.py:803] 2025-04-26 20:23:23,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:23,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:23,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3276 3025 3002 [WARNING|trainer.py:803] 2025-04-26 20:23:25,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:25,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:25,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3277 3026 3003 [WARNING|trainer.py:803] 2025-04-26 20:23:27,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:27,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:27,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3278 [WARNING|trainer.py:803] 2025-04-26 20:23:28,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3027 3004 3279 [WARNING|trainer.py:803] 2025-04-26 20:23:29,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:30,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:30,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3028 3005 3280 [WARNING|trainer.py:803] 2025-04-26 20:23:32,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:32,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:32,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3281 3029 3006 [WARNING|trainer.py:803] 2025-04-26 20:23:34,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:34,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:34,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3282 3030 3007 [WARNING|trainer.py:803] 2025-04-26 20:23:36,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:36,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:36,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3283 3031 3008 [WARNING|trainer.py:803] 2025-04-26 20:23:37,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:38,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:38,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3284 [WARNING|trainer.py:803] 2025-04-26 20:23:39,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3032 3009 3285 [WARNING|trainer.py:803] 2025-04-26 20:23:40,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:40,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:41,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3010 3033 3286 [WARNING|trainer.py:803] 2025-04-26 20:23:42,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:42,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:23:43,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3011 3034 3287 [WARNING|trainer.py:803] 2025-04-26 20:23:44,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:44,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:45,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3288 3012 3035 [WARNING|trainer.py:803] 2025-04-26 20:23:46,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:46,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:47,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3289 3013 [WARNING|trainer.py:803] 2025-04-26 20:23:48,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3036 [WARNING|trainer.py:803] 2025-04-26 20:23:48,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:49,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3290 3014 [WARNING|trainer.py:803] 2025-04-26 20:23:50,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3037 [WARNING|trainer.py:803] 2025-04-26 20:23:51,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3291 [WARNING|trainer.py:803] 2025-04-26 20:23:51,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:23:52,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3015 3038 3292 [WARNING|trainer.py:803] 2025-04-26 20:23:53,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:53,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:53,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3016 3039 3293 [WARNING|trainer.py:803] 2025-04-26 20:23:55,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:23:55,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:23:55,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3294 3017 3040 [WARNING|trainer.py:803] 2025-04-26 20:23:57,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:57,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:23:57,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3295 3018 3041 [WARNING|trainer.py:803] 2025-04-26 20:23:59,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:23:59,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:23:59,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3296 [WARNING|trainer.py:803] 2025-04-26 20:24:01,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3019 3042 3297 [WARNING|trainer.py:803] 2025-04-26 20:24:01,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:24:02,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:02,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3020 3043 3298 [WARNING|trainer.py:803] 2025-04-26 20:24:04,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:04,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:04,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3021 3299 3044 [WARNING|trainer.py:803] 2025-04-26 20:24:06,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:06,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:06,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3300 3022 3045 [WARNING|trainer.py:803] 2025-04-26 20:24:08,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:24:08,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3301 [WARNING|trainer.py:803] 2025-04-26 20:24:08,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:09,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3302 3023 3046 [WARNING|trainer.py:803] 2025-04-26 20:24:10,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:10,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3303 [WARNING|trainer.py:803] 2025-04-26 20:24:10,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:11,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3304 3024 3047 [WARNING|trainer.py:803] 2025-04-26 20:24:12,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:12,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3305 [WARNING|trainer.py:803] 2025-04-26 20:24:12,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:13,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3306 3025 3048 [WARNING|trainer.py:803] 2025-04-26 20:24:14,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:14,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:14,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3307 [WARNING|trainer.py:803] 2025-04-26 20:24:15,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3026 3308 3049 [WARNING|trainer.py:803] 2025-04-26 20:24:16,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:16,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:17,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3309 [WARNING|trainer.py:803] 2025-04-26 20:24:18,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3027 3050 3310 [WARNING|trainer.py:803] 2025-04-26 20:24:18,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:24:19,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:19,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3311 [WARNING|trainer.py:803] 2025-04-26 20:24:20,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3028 3051 3312 [WARNING|trainer.py:803] 2025-04-26 20:24:21,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:21,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:21,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3313 [WARNING|trainer.py:803] 2025-04-26 20:24:22,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3029 3052 3314 [WARNING|trainer.py:803] 2025-04-26 20:24:23,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:23,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:23,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3315 [WARNING|trainer.py:803] 2025-04-26 20:24:24,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3053 3030 3316 [WARNING|trainer.py:803] 2025-04-26 20:24:25,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:25,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:24:25,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3317 3054 [WARNING|trainer.py:803] 2025-04-26 20:24:26,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3031 3318 [WARNING|trainer.py:803] 2025-04-26 20:24:27,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:27,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:27,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3319 3055 3032 [WARNING|trainer.py:803] 2025-04-26 20:24:29,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3320 [WARNING|trainer.py:803] 2025-04-26 20:24:29,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:29,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:30,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3321 3056 3033 [WARNING|trainer.py:803] 2025-04-26 20:24:31,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3322 [WARNING|trainer.py:803] 2025-04-26 20:24:31,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:32,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:24:32,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3323 3057 3034 [WARNING|trainer.py:803] 2025-04-26 20:24:33,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3324 [WARNING|trainer.py:803] 2025-04-26 20:24:34,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:34,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:34,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3325 3058 3035 [WARNING|trainer.py:803] 2025-04-26 20:24:35,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3326 [WARNING|trainer.py:803] 2025-04-26 20:24:36,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:36,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:36,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3327 3059 3036 [WARNING|trainer.py:803] 2025-04-26 20:24:37,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3328 [WARNING|trainer.py:803] 2025-04-26 20:24:38,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:38,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:38,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3329 3060 3037 [WARNING|trainer.py:803] 2025-04-26 20:24:39,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3330 [WARNING|trainer.py:803] 2025-04-26 20:24:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:40,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:40,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3331 3061 3038 [WARNING|trainer.py:803] 2025-04-26 20:24:42,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3332 [WARNING|trainer.py:803] 2025-04-26 20:24:42,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:42,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:43,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3333 3062 3039 [WARNING|trainer.py:803] 2025-04-26 20:24:44,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3334 [WARNING|trainer.py:803] 2025-04-26 20:24:44,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:44,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:24:45,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3335 3063 3040 [WARNING|trainer.py:803] 2025-04-26 20:24:46,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:46,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3336 [WARNING|trainer.py:803] 2025-04-26 20:24:46,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:47,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3337 3064 3041 [WARNING|trainer.py:803] 2025-04-26 20:24:48,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:48,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3338 [WARNING|trainer.py:803] 2025-04-26 20:24:49,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:49,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3339 3065 3042 [WARNING|trainer.py:803] 2025-04-26 20:24:50,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:24:50,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3340 [WARNING|trainer.py:803] 2025-04-26 20:24:51,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:51,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3341 3066 3043 [WARNING|trainer.py:803] 2025-04-26 20:24:52,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:52,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3342 [WARNING|trainer.py:803] 2025-04-26 20:24:53,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:54,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3067 3343 3044 [WARNING|trainer.py:803] 2025-04-26 20:24:55,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:55,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3344 [WARNING|trainer.py:803] 2025-04-26 20:24:55,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:56,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3068 3345 3045 [WARNING|trainer.py:803] 2025-04-26 20:24:57,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:24:57,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:57,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3346 [WARNING|trainer.py:803] 2025-04-26 20:24:58,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3069 3347 3046 [WARNING|trainer.py:803] 2025-04-26 20:24:59,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:24:59,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:24:59,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3348 [WARNING|trainer.py:803] 2025-04-26 20:25:00,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3070 3349 3047 [WARNING|trainer.py:803] 2025-04-26 20:25:01,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:01,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:01,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3350 3071 [WARNING|trainer.py:803] 2025-04-26 20:25:02,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3048 3351 [WARNING|trainer.py:803] 2025-04-26 20:25:03,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:25:03,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:04,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3352 3072 [WARNING|trainer.py:803] 2025-04-26 20:25:05,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3049 3353 [WARNING|trainer.py:803] 2025-04-26 20:25:05,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:06,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:06,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3354 3073 3050 [WARNING|trainer.py:803] 2025-04-26 20:25:07,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3355 [WARNING|trainer.py:803] 2025-04-26 20:25:07,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:08,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:08,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3356 3074 3051 [WARNING|trainer.py:803] 2025-04-26 20:25:09,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3357 [WARNING|trainer.py:803] 2025-04-26 20:25:10,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:10,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:10,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3358 3075 3052 [WARNING|trainer.py:803] 2025-04-26 20:25:11,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3359 [WARNING|trainer.py:803] 2025-04-26 20:25:12,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:12,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:12,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3360 3076 3053 [WARNING|trainer.py:803] 2025-04-26 20:25:13,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3361 [WARNING|trainer.py:803] 2025-04-26 20:25:14,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:14,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:14,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3362 3077 3054 [WARNING|trainer.py:803] 2025-04-26 20:25:16,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3363 [WARNING|trainer.py:803] 2025-04-26 20:25:16,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:16,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:17,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3364 3078 3055 [WARNING|trainer.py:803] 2025-04-26 20:25:18,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3365 [WARNING|trainer.py:803] 2025-04-26 20:25:18,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:18,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:19,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3366 3079 3056 [WARNING|trainer.py:803] 2025-04-26 20:25:20,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:20,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3367 [WARNING|trainer.py:803] 2025-04-26 20:25:21,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:21,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3368 3080 3057 [WARNING|trainer.py:803] 2025-04-26 20:25:22,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:22,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3369 [WARNING|trainer.py:803] 2025-04-26 20:25:23,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:23,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3370 3081 3058 [WARNING|trainer.py:803] 2025-04-26 20:25:24,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:24,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3371 [WARNING|trainer.py:803] 2025-04-26 20:25:25,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:25,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3372 3082 3059 [WARNING|trainer.py:803] 2025-04-26 20:25:26,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:27,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3373 [WARNING|trainer.py:803] 2025-04-26 20:25:27,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:28,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3374 3083 3060 [WARNING|trainer.py:803] 2025-04-26 20:25:29,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:29,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3375 [WARNING|trainer.py:803] 2025-04-26 20:25:29,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:30,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3376 3084 3061 [WARNING|trainer.py:803] 2025-04-26 20:25:31,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:31,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3377 [WARNING|trainer.py:803] 2025-04-26 20:25:31,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:32,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3378 3085 3062 [WARNING|trainer.py:803] 2025-04-26 20:25:33,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:33,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3379 [WARNING|trainer.py:803] 2025-04-26 20:25:33,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:34,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3380 3086 3063 [WARNING|trainer.py:803] 2025-04-26 20:25:35,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:35,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:25:35,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3381 [WARNING|trainer.py:803] 2025-04-26 20:25:36,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3087 3382 3064 [WARNING|trainer.py:803] 2025-04-26 20:25:37,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:37,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:25:38,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3383 3088 [WARNING|trainer.py:803] 2025-04-26 20:25:39,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3384 3065 [WARNING|trainer.py:803] 2025-04-26 20:25:39,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:25:40,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:40,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3385 3089 [WARNING|trainer.py:803] 2025-04-26 20:25:41,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3066 3386 [WARNING|trainer.py:803] 2025-04-26 20:25:41,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:42,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:42,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3387 3090 [WARNING|trainer.py:803] 2025-04-26 20:25:43,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3067 3388 [WARNING|trainer.py:803] 2025-04-26 20:25:44,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:44,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:44,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3389 3091 3068 [WARNING|trainer.py:803] 2025-04-26 20:25:45,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3390 [WARNING|trainer.py:803] 2025-04-26 20:25:46,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:46,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:46,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3391 3092 3069 [WARNING|trainer.py:803] 2025-04-26 20:25:47,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3392 [WARNING|trainer.py:803] 2025-04-26 20:25:48,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:48,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:49,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3393 3093 [WARNING|trainer.py:803] 2025-04-26 20:25:50,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3070 3394 [WARNING|trainer.py:803] 2025-04-26 20:25:50,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:50,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:51,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3395 3094 3071 [WARNING|trainer.py:803] 2025-04-26 20:25:52,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:52,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3396 [WARNING|trainer.py:803] 2025-04-26 20:25:53,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:25:53,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3397 3095 3072 [WARNING|trainer.py:803] 2025-04-26 20:25:54,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:25:54,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3398 [WARNING|trainer.py:803] 2025-04-26 20:25:55,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:55,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3399 3096 3073 [WARNING|trainer.py:803] 2025-04-26 20:25:56,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:25:56,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3400 [WARNING|trainer.py:803] 2025-04-26 20:25:57,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:25:57,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3401 3097 3074 [WARNING|trainer.py:803] 2025-04-26 20:25:58,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:25:58,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3402 [WARNING|trainer.py:803] 2025-04-26 20:25:59,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:25:59,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3403 3098 3075 [WARNING|trainer.py:803] 2025-04-26 20:26:00,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:26:01,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3404 [WARNING|trainer.py:803] 2025-04-26 20:26:01,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:02,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3099 3405 3076 [WARNING|trainer.py:803] 2025-04-26 20:26:03,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:03,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3406 [WARNING|trainer.py:803] 2025-04-26 20:26:03,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:04,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3100 3407 3077 [WARNING|trainer.py:803] 2025-04-26 20:26:05,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:05,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3408 [WARNING|trainer.py:803] 2025-04-26 20:26:05,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:06,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3409 3078 [WARNING|trainer.py:803] 2025-04-26 20:26:07,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3101 3410 [WARNING|trainer.py:803] 2025-04-26 20:26:07,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:08,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:08,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3411 3079 [WARNING|trainer.py:803] 2025-04-26 20:26:09,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3412 [WARNING|trainer.py:803] 2025-04-26 20:26:10,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3102 [WARNING|trainer.py:803] 2025-04-26 20:26:10,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3413 [WARNING|trainer.py:803] 2025-04-26 20:26:11,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 3080 [WARNING|trainer.py:803] 2025-04-26 20:26:11,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:12,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3414 [WARNING|trainer.py:803] 2025-04-26 20:26:13,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3103 3415 3081 [WARNING|trainer.py:803] 2025-04-26 20:26:13,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:26:14,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:26:14,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3416 [WARNING|trainer.py:803] 2025-04-26 20:26:15,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3417 3082 3104 [WARNING|trainer.py:803] 2025-04-26 20:26:16,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:26:16,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:26:16,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3418 [WARNING|trainer.py:803] 2025-04-26 20:26:17,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3083 3419 3105 [WARNING|trainer.py:803] 2025-04-26 20:26:18,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:18,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:26:18,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3420 [WARNING|trainer.py:803] 2025-04-26 20:26:19,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3084 3421 [WARNING|trainer.py:803] 2025-04-26 20:26:20,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3106 [WARNING|trainer.py:803] 2025-04-26 20:26:20,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3422 [WARNING|trainer.py:803] 2025-04-26 20:26:21,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:22,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3085 3423 [WARNING|trainer.py:803] 2025-04-26 20:26:22,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:23,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3424 3107 3086 [WARNING|trainer.py:803] 2025-04-26 20:26:24,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:24,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3425 [WARNING|trainer.py:803] 2025-04-26 20:26:25,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:25,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3426 3108 3087 [WARNING|trainer.py:803] 2025-04-26 20:26:26,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3427 [WARNING|trainer.py:803] 2025-04-26 20:26:26,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:27,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:27,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3428 3088 3109 [WARNING|trainer.py:803] 2025-04-26 20:26:28,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3429 [WARNING|trainer.py:803] 2025-04-26 20:26:29,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:26:29,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:29,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3430 3089 [WARNING|trainer.py:803] 2025-04-26 20:26:30,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3110 3431 [WARNING|trainer.py:803] 2025-04-26 20:26:31,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:26:31,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:31,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3432 3090 [WARNING|trainer.py:803] 2025-04-26 20:26:33,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3433 [WARNING|trainer.py:803] 2025-04-26 20:26:33,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3111 [WARNING|trainer.py:803] 2025-04-26 20:26:34,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:34,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3434 3091 [WARNING|trainer.py:803] 2025-04-26 20:26:35,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:35,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3435 3112 [WARNING|trainer.py:803] 2025-04-26 20:26:36,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:26:36,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3436 3092 [WARNING|trainer.py:803] 2025-04-26 20:26:37,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:26:37,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3437 3113 [WARNING|trainer.py:803] 2025-04-26 20:26:38,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3438 [WARNING|trainer.py:803] 2025-04-26 20:26:38,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3093 [WARNING|trainer.py:803] 2025-04-26 20:26:39,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:39,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3439 3114 [WARNING|trainer.py:803] 2025-04-26 20:26:40,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3094 3440 [WARNING|trainer.py:803] 2025-04-26 20:26:41,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:41,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:41,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3441 [WARNING|trainer.py:803] 2025-04-26 20:26:42,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3095 3442 3115 [WARNING|trainer.py:803] 2025-04-26 20:26:43,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:44,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:44,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3443 [WARNING|trainer.py:803] 2025-04-26 20:26:45,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3096 3444 3116 [WARNING|trainer.py:803] 2025-04-26 20:26:46,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:46,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3445 [WARNING|trainer.py:803] 2025-04-26 20:26:46,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:47,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3097 3446 [WARNING|trainer.py:803] 2025-04-26 20:26:48,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:26:48,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3117 3447 [WARNING|trainer.py:803] 2025-04-26 20:26:49,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:49,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3098 3448 [WARNING|trainer.py:803] 2025-04-26 20:26:50,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:50,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3449 3118 [WARNING|trainer.py:803] 2025-04-26 20:26:51,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3099 3450 [WARNING|trainer.py:803] 2025-04-26 20:26:52,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:52,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:52,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3451 3100 [WARNING|trainer.py:803] 2025-04-26 20:26:53,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3452 3119 [WARNING|trainer.py:803] 2025-04-26 20:26:54,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:55,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:26:55,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3453 [WARNING|trainer.py:803] 2025-04-26 20:26:56,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3454 3120 3101 [WARNING|trainer.py:803] 2025-04-26 20:26:57,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3455 [WARNING|trainer.py:803] 2025-04-26 20:26:57,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:26:57,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:26:58,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3456 [WARNING|trainer.py:803] 2025-04-26 20:26:59,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3457 3102 3121 [WARNING|trainer.py:803] 2025-04-26 20:27:00,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:27:00,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:27:00,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3458 [WARNING|trainer.py:803] 2025-04-26 20:27:01,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3459 3103 [WARNING|trainer.py:803] 2025-04-26 20:27:02,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3122 3460 [WARNING|trainer.py:803] 2025-04-26 20:27:03,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:27:03,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:27:03,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3461 [WARNING|trainer.py:803] 2025-04-26 20:27:05,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3104 3462 3123 [WARNING|trainer.py:803] 2025-04-26 20:27:06,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:06,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:27:06,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3463 [WARNING|trainer.py:803] 2025-04-26 20:27:07,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3105 3464 3124 [WARNING|trainer.py:803] 2025-04-26 20:27:08,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:08,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3465 [WARNING|trainer.py:803] 2025-04-26 20:27:09,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:09,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3466 3106 [WARNING|trainer.py:803] 2025-04-26 20:27:10,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3467 3125 [WARNING|trainer.py:803] 2025-04-26 20:27:11,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:11,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:27:11,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3468 [WARNING|trainer.py:803] 2025-04-26 20:27:12,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3469 3107 3126 [WARNING|trainer.py:803] 2025-04-26 20:27:13,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:13,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3470 [WARNING|trainer.py:803] 2025-04-26 20:27:14,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:14,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3471 3108 [WARNING|trainer.py:803] 2025-04-26 20:27:15,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3472 [WARNING|trainer.py:803] 2025-04-26 20:27:16,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3127 [WARNING|trainer.py:803] 2025-04-26 20:27:17,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3473 [WARNING|trainer.py:803] 2025-04-26 20:27:17,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3109 [WARNING|trainer.py:803] 2025-04-26 20:27:18,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3474 [WARNING|trainer.py:803] 2025-04-26 20:27:18,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3128 [WARNING|trainer.py:803] 2025-04-26 20:27:19,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 3475 3110 [WARNING|trainer.py:803] 2025-04-26 20:27:20,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:20,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3476 [WARNING|trainer.py:803] 2025-04-26 20:27:21,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:21,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3477 3129 [WARNING|trainer.py:803] 2025-04-26 20:27:22,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3111 3478 [WARNING|trainer.py:803] 2025-04-26 20:27:22,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:23,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:23,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3479 3130 [WARNING|trainer.py:803] 2025-04-26 20:27:24,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3480 3112 [WARNING|trainer.py:803] 2025-04-26 20:27:25,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:25,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:25,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3481 [WARNING|trainer.py:803] 2025-04-26 20:27:26,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3131 3482 3113 [WARNING|trainer.py:803] 2025-04-26 20:27:27,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:28,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:27:28,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3483 [WARNING|trainer.py:803] 2025-04-26 20:27:29,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3484 3132 3114 [WARNING|trainer.py:803] 2025-04-26 20:27:30,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3485 [WARNING|trainer.py:803] 2025-04-26 20:27:30,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:30,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:31,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3486 3133 [WARNING|trainer.py:803] 2025-04-26 20:27:32,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3487 3115 [WARNING|trainer.py:803] 2025-04-26 20:27:32,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:33,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:33,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3488 3134 [WARNING|trainer.py:803] 2025-04-26 20:27:34,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3489 [WARNING|trainer.py:803] 2025-04-26 20:27:35,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3116 [WARNING|trainer.py:803] 2025-04-26 20:27:35,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3490 [WARNING|trainer.py:803] 2025-04-26 20:27:36,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:36,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3491 3135 3117 [WARNING|trainer.py:803] 2025-04-26 20:27:37,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3492 [WARNING|trainer.py:803] 2025-04-26 20:27:38,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:27:38,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:38,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3493 3136 [WARNING|trainer.py:803] 2025-04-26 20:27:40,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:40,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3494 3118 [WARNING|trainer.py:803] 2025-04-26 20:27:41,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3495 [WARNING|trainer.py:803] 2025-04-26 20:27:41,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3137 [WARNING|trainer.py:803] 2025-04-26 20:27:42,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3496 [WARNING|trainer.py:803] 2025-04-26 20:27:43,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:43,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3119 3497 [WARNING|trainer.py:803] 2025-04-26 20:27:44,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:44,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3138 3498 [WARNING|trainer.py:803] 2025-04-26 20:27:45,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:45,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3499 3120 [WARNING|trainer.py:803] 2025-04-26 20:27:46,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3500 [WARNING|trainer.py:803] 2025-04-26 20:27:47,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3139 [WARNING|trainer.py:803] 2025-04-26 20:27:47,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3501 [WARNING|trainer.py:803] 2025-04-26 20:27:48,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:27:48,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3121 3502 [WARNING|trainer.py:803] 2025-04-26 20:27:49,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:50,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3140 3503 [WARNING|trainer.py:803] 2025-04-26 20:27:51,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:51,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3504 [WARNING|trainer.py:803] 2025-04-26 20:27:52,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3122 3505 3141 [WARNING|trainer.py:803] 2025-04-26 20:27:53,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:53,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3506 [WARNING|trainer.py:803] 2025-04-26 20:27:53,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:54,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3507 3123 3142 [WARNING|trainer.py:803] 2025-04-26 20:27:55,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:27:55,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3508 [WARNING|trainer.py:803] 2025-04-26 20:27:56,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:27:56,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3509 3124 3143 [WARNING|trainer.py:803] 2025-04-26 20:27:57,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3510 [WARNING|trainer.py:803] 2025-04-26 20:27:58,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:58,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:27:58,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3511 [WARNING|trainer.py:803] 2025-04-26 20:27:59,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3512 3144 3125 [WARNING|trainer.py:803] 2025-04-26 20:28:01,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:01,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:01,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3513 [WARNING|trainer.py:803] 2025-04-26 20:28:02,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3514 3126 [WARNING|trainer.py:803] 2025-04-26 20:28:03,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3515 3145 [WARNING|trainer.py:803] 2025-04-26 20:28:03,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:04,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:04,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3516 [WARNING|trainer.py:803] 2025-04-26 20:28:05,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3517 3146 3127 [WARNING|trainer.py:803] 2025-04-26 20:28:06,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:06,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3518 [WARNING|trainer.py:803] 2025-04-26 20:28:07,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:07,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3519 3147 [WARNING|trainer.py:803] 2025-04-26 20:28:08,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3128 3520 [WARNING|trainer.py:803] 2025-04-26 20:28:09,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:09,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:09,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3521 [WARNING|trainer.py:803] 2025-04-26 20:28:10,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3148 3522 3129 [WARNING|trainer.py:803] 2025-04-26 20:28:11,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:11,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3523 [WARNING|trainer.py:803] 2025-04-26 20:28:12,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3149 [WARNING|trainer.py:803] 2025-04-26 20:28:12,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3524 [WARNING|trainer.py:803] 2025-04-26 20:28:13,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3130 [WARNING|trainer.py:803] 2025-04-26 20:28:13,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3525 [WARNING|trainer.py:803] 2025-04-26 20:28:14,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3150 [WARNING|trainer.py:803] 2025-04-26 20:28:15,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3526 [WARNING|trainer.py:803] 2025-04-26 20:28:15,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:28:16,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3131 3527 [WARNING|trainer.py:803] 2025-04-26 20:28:17,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3151 [WARNING|trainer.py:803] 2025-04-26 20:28:17,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3528 [WARNING|trainer.py:803] 2025-04-26 20:28:18,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:18,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3529 3132 3152 [WARNING|trainer.py:803] 2025-04-26 20:28:19,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3530 [WARNING|trainer.py:803] 2025-04-26 20:28:19,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:20,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:20,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x67d94b40] moov atom not found
[20:28:20] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4...
sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
3531 3133 3153 [WARNING|trainer.py:803] 2025-04-26 20:28:21,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3532 [WARNING|trainer.py:803] 2025-04-26 20:28:22,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:22,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:22,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3533 3154 3134 [WARNING|trainer.py:803] 2025-04-26 20:28:23,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3534 [WARNING|trainer.py:803] 2025-04-26 20:28:24,267 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:24,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:24,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3535 3155 [WARNING|trainer.py:803] 2025-04-26 20:28:25,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3536 [WARNING|trainer.py:803] 2025-04-26 20:28:26,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3135 [WARNING|trainer.py:803] 2025-04-26 20:28:26,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3537 [WARNING|trainer.py:803] 2025-04-26 20:28:27,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3156 [WARNING|trainer.py:803] 2025-04-26 20:28:27,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3538 [WARNING|trainer.py:803] 2025-04-26 20:28:28,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3136 [WARNING|trainer.py:803] 2025-04-26 20:28:29,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3539 3157 [WARNING|trainer.py:803] 2025-04-26 20:28:29,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:30,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3540 [WARNING|trainer.py:803] 2025-04-26 20:28:30,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:28:31,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
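Note on the `moov atom not found` error logged above for `Latte/10231.mp4`: the FFmpeg demuxer usually reports this when an MP4 file is truncated or was never finalized, so decord cannot open it. A minimal sketch of a pre-training check follows, assuming the dataset videos are plain MP4 files. It walks the top-level boxes of a file and reports whether a `moov` box exists. The helper name `has_moov_atom` is hypothetical and is not part of `train_qa.py` or decord.

```python
import struct
from pathlib import Path


def has_moov_atom(path):
    """Return True if a top-level 'moov' box is found in the MP4 file at `path`.

    Hypothetical helper: it only walks top-level box headers and does not
    validate the contents of any box.
    """
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + 8 <= file_size:
            f.seek(offset)
            header = f.read(8)
            if len(header) < 8:
                return False
            # Each box starts with a 4-byte big-endian size and a 4-byte type.
            box_size, box_type = struct.unpack(">I4s", header)
            if box_type == b"moov":
                return True
            header_len = 8
            if box_size == 1:
                # A size of 1 means a 64-bit size follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    return False
                box_size = struct.unpack(">Q", ext)[0]
                header_len = 16
            elif box_size == 0:
                # A size of 0 means the box runs to end of file: nothing follows.
                return False
            if box_size < header_len:
                # Corrupt size field; stop rather than loop forever.
                return False
            offset += box_size
    return False
```

Running such a check over the video directory before launching training would list files like `10231.mp4` up front, so they can be regenerated or dropped from the annotation file instead of surfacing as a `RuntimeError` inside `__getitem__`.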
YesYes 3541 3137 3158 [WARNING|trainer.py:803] 2025-04-26 20:28:32,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:32,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:32,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3542 [WARNING|trainer.py:803] 2025-04-26 20:28:33,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3543 3159 3138 [WARNING|trainer.py:803] 2025-04-26 20:28:34,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:34,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3544 [WARNING|trainer.py:803] 2025-04-26 20:28:34,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:35,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3545 3160 [WARNING|trainer.py:803] 2025-04-26 20:28:36,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3139 [WARNING|trainer.py:803] 2025-04-26 20:28:36,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3546 [WARNING|trainer.py:803] 2025-04-26 20:28:37,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:37,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3161 3547 [WARNING|trainer.py:803] 2025-04-26 20:28:38,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:28:38,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3548 3140 [WARNING|trainer.py:803] 2025-04-26 20:28:39,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3162 3549 [WARNING|trainer.py:803] 2025-04-26 20:28:40,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:40,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:28:41,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3550 3141 [WARNING|trainer.py:803] 2025-04-26 20:28:42,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3163 3551 [WARNING|trainer.py:803] 2025-04-26 20:28:42,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:43,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:43,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3552 [WARNING|trainer.py:803] 2025-04-26 20:28:44,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3164 3142 3553 [WARNING|trainer.py:803] 2025-04-26 20:28:45,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:28:45,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:45,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3554 3165 [WARNING|trainer.py:803] 2025-04-26 20:28:46,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3555 3143 [WARNING|trainer.py:803] 2025-04-26 20:28:47,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:47,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3556 [WARNING|trainer.py:803] 2025-04-26 20:28:47,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3166 [WARNING|trainer.py:803] 2025-04-26 20:28:48,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3557 [WARNING|trainer.py:803] 2025-04-26 20:28:49,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3144 [WARNING|trainer.py:803] 2025-04-26 20:28:49,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3558 [WARNING|trainer.py:803] 2025-04-26 20:28:50,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3167 [WARNING|trainer.py:803] 2025-04-26 20:28:50,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3559 [WARNING|trainer.py:803] 2025-04-26 20:28:51,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:28:51,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3560 3168 3145 [WARNING|trainer.py:803] 2025-04-26 20:28:52,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3561 [WARNING|trainer.py:803] 2025-04-26 20:28:53,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:28:53,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:53,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3562 3169 3146 [WARNING|trainer.py:803] 2025-04-26 20:28:55,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3563 [WARNING|trainer.py:803] 2025-04-26 20:28:55,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:55,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:28:56,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3564 3170 [WARNING|trainer.py:803] 2025-04-26 20:28:57,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:28:57,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3565 3147 [WARNING|trainer.py:803] 2025-04-26 20:28:58,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:28:58,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3171 3566 [WARNING|trainer.py:803] 2025-04-26 20:28:59,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:28:59,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3567 3148 3172 [WARNING|trainer.py:803] 2025-04-26 20:29:00,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3568 [WARNING|trainer.py:803] 2025-04-26 20:29:00,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:01,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:01,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3569 3149 3173 [WARNING|trainer.py:803] 2025-04-26 20:29:02,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3570 [WARNING|trainer.py:803] 2025-04-26 20:29:02,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:03,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:03,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3571 3150 3174 [WARNING|trainer.py:803] 2025-04-26 20:29:04,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3572 [WARNING|trainer.py:803] 2025-04-26 20:29:05,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:05,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:05,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3573 3151 3175 [WARNING|trainer.py:803] 2025-04-26 20:29:06,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3574 [WARNING|trainer.py:803] 2025-04-26 20:29:07,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:07,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:07,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3575 3152 3176 [WARNING|trainer.py:803] 2025-04-26 20:29:08,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3576 [WARNING|trainer.py:803] 2025-04-26 20:29:09,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:09,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:09,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3577 3153 3177 [WARNING|trainer.py:803] 2025-04-26 20:29:11,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3578 [WARNING|trainer.py:803] 2025-04-26 20:29:11,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:11,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:12,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3579 3154 3178 [WARNING|trainer.py:803] 2025-04-26 20:29:13,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3580 [WARNING|trainer.py:803] 2025-04-26 20:29:13,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:13,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:14,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3581 3155 3179 [WARNING|trainer.py:803] 2025-04-26 20:29:15,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3582 [WARNING|trainer.py:803] 2025-04-26 20:29:15,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:15,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:16,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3583 3156 3180 [WARNING|trainer.py:803] 2025-04-26 20:29:17,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:17,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3584 [WARNING|trainer.py:803] 2025-04-26 20:29:17,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:18,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3585 3157 3181 [WARNING|trainer.py:803] 2025-04-26 20:29:19,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:19,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3586 [WARNING|trainer.py:803] 2025-04-26 20:29:20,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:20,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3158 3587 3182 [WARNING|trainer.py:803] 2025-04-26 20:29:21,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:21,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:29:22,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3588 [WARNING|trainer.py:803] 2025-04-26 20:29:23,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3159 3589 3183 [WARNING|trainer.py:803] 2025-04-26 20:29:23,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:24,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:29:24,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3590 3160 [WARNING|trainer.py:803] 2025-04-26 20:29:25,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3184 3591 [WARNING|trainer.py:803] 2025-04-26 20:29:26,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:26,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:26,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3592 3161 [WARNING|trainer.py:803] 2025-04-26 20:29:27,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3185 3593 [WARNING|trainer.py:803] 2025-04-26 20:29:28,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:29:28,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:28,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3594 3162 3186 [WARNING|trainer.py:803] 2025-04-26 20:29:29,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3595 [WARNING|trainer.py:803] 2025-04-26 20:29:30,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:30,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:30,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3596 3163 3187 [WARNING|trainer.py:803] 2025-04-26 20:29:31,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3597 [WARNING|trainer.py:803] 2025-04-26 20:29:32,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:32,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:32,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3598 3164 3188 [WARNING|trainer.py:803] 2025-04-26 20:29:34,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3599 [WARNING|trainer.py:803] 2025-04-26 20:29:34,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:34,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:35,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3600 3189 3165 [WARNING|trainer.py:803] 2025-04-26 20:29:36,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3601 [WARNING|trainer.py:803] 2025-04-26 20:29:36,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:29:36,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:37,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3166 3602 3190 [WARNING|trainer.py:803] 2025-04-26 20:29:38,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:38,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:38,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3603 3167 [WARNING|trainer.py:803] 2025-04-26 20:29:39,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3191 3604 [WARNING|trainer.py:803] 2025-04-26 20:29:40,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:40,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:29:40,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3605 3168 [WARNING|trainer.py:803] 2025-04-26 20:29:41,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3192 3606 [WARNING|trainer.py:803] 2025-04-26 20:29:42,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:42,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:43,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3607 3169 [WARNING|trainer.py:803] 2025-04-26 20:29:44,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3193 3608 [WARNING|trainer.py:803] 2025-04-26 20:29:44,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:44,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:45,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3609 3170 3194 [WARNING|trainer.py:803] 2025-04-26 20:29:46,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3610 [WARNING|trainer.py:803] 2025-04-26 20:29:46,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:47,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:47,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3611 3171 3195 [WARNING|trainer.py:803] 2025-04-26 20:29:48,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:48,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3612 [WARNING|trainer.py:803] 2025-04-26 20:29:49,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:49,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3172 3613 3196 [WARNING|trainer.py:803] 2025-04-26 20:29:50,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:51,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:51,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3614 3173 [WARNING|trainer.py:803] 2025-04-26 20:29:52,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3615 3197 [WARNING|trainer.py:803] 2025-04-26 20:29:52,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:53,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:29:53,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3616 3174 [WARNING|trainer.py:803] 2025-04-26 20:29:54,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3198 3617 [WARNING|trainer.py:803] 2025-04-26 20:29:55,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:55,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:29:55,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3618 3175 3199 [WARNING|trainer.py:803] 2025-04-26 20:29:56,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3619 [WARNING|trainer.py:803] 2025-04-26 20:29:57,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:29:57,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:29:57,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3620 3176 3200 [WARNING|trainer.py:803] 2025-04-26 20:29:59,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:29:59,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3621 [WARNING|trainer.py:803] 2025-04-26 20:29:59,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:00,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3622 3177 3201 [WARNING|trainer.py:803] 2025-04-26 20:30:01,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:01,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:01,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3623 [WARNING|trainer.py:803] 2025-04-26 20:30:02,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3178 3202 3624 [WARNING|trainer.py:803] 2025-04-26 20:30:03,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:03,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:03,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3625 3203 [WARNING|trainer.py:803] 2025-04-26 20:30:04,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3179 3626 [WARNING|trainer.py:803] 2025-04-26 20:30:05,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:05,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:06,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3627 3204 3180 [WARNING|trainer.py:803] 2025-04-26 20:30:07,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:07,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3628 [WARNING|trainer.py:803] 2025-04-26 20:30:07,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:08,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3629 3205 3181 [WARNING|trainer.py:803] 2025-04-26 20:30:09,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:09,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3630 [WARNING|trainer.py:803] 2025-04-26 20:30:09,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:10,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3206 3631 3182 [WARNING|trainer.py:803] 2025-04-26 20:30:11,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:11,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:12,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3632 3207 [WARNING|trainer.py:803] 2025-04-26 20:30:13,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3183 3633 [WARNING|trainer.py:803] 2025-04-26 20:30:13,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:14,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:14,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3634 3208 [WARNING|trainer.py:803] 2025-04-26 20:30:15,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3184 3635 [WARNING|trainer.py:803] 2025-04-26 20:30:15,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:16,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:16,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3636 3209 3185 [WARNING|trainer.py:803] 2025-04-26 20:30:17,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3637 [WARNING|trainer.py:803] 2025-04-26 20:30:18,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:18,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:18,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3638 3210 3186 [WARNING|trainer.py:803] 2025-04-26 20:30:20,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:20,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:20,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3639 [WARNING|trainer.py:803] 2025-04-26 20:30:21,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3211 3640 3187 [WARNING|trainer.py:803] 2025-04-26 20:30:22,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:22,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:22,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3641 3212 [WARNING|trainer.py:803] 2025-04-26 20:30:23,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3188 3642 [WARNING|trainer.py:803] 2025-04-26 20:30:24,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:24,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:24,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3643 3213 3189 [WARNING|trainer.py:803] 2025-04-26 20:30:25,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:26,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3644 [WARNING|trainer.py:803] 2025-04-26 20:30:26,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:27,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3214 3645 3190 [WARNING|trainer.py:803] 2025-04-26 20:30:28,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:28,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3646 [WARNING|trainer.py:803] 2025-04-26 20:30:28,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3215 [WARNING|trainer.py:803] 2025-04-26 20:30:29,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3647 3191 [WARNING|trainer.py:803] 2025-04-26 20:30:30,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:30,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:30,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3648 3216 [WARNING|trainer.py:803] 2025-04-26 20:30:31,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3192 [WARNING|trainer.py:803] 2025-04-26 20:30:32,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3649 [WARNING|trainer.py:803] 2025-04-26 20:30:33,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:30:33,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3650 3217 [WARNING|trainer.py:803] 2025-04-26 20:30:34,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:34,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3193 3651 [WARNING|trainer.py:803] 2025-04-26 20:30:35,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3218 [WARNING|trainer.py:803] 2025-04-26 20:30:35,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3652 [WARNING|trainer.py:803] 2025-04-26 20:30:36,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3194 [WARNING|trainer.py:803] 2025-04-26 20:30:36,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3653 [WARNING|trainer.py:803] 2025-04-26 20:30:37,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3219 [WARNING|trainer.py:803] 2025-04-26 20:30:38,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:38,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3654 3195 [WARNING|trainer.py:803] 2025-04-26 20:30:39,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:39,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3220 3655 [WARNING|trainer.py:803] 2025-04-26 20:30:40,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:40,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3196 3656 [WARNING|trainer.py:803] 2025-04-26 20:30:41,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3221 [WARNING|trainer.py:803] 2025-04-26 20:30:41,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3657 [WARNING|trainer.py:803] 2025-04-26 20:30:42,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:42,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3197 3658 3222 [WARNING|trainer.py:803] 2025-04-26 20:30:43,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:43,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3659 [WARNING|trainer.py:803] 2025-04-26 20:30:44,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3198 [WARNING|trainer.py:803] 2025-04-26 20:30:44,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3660 [WARNING|trainer.py:803] 2025-04-26 20:30:45,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 3223 Yes [WARNING|trainer.py:803] 2025-04-26 20:30:46,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:30:46,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3661 3199 [WARNING|trainer.py:803] 2025-04-26 20:30:47,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3224 [WARNING|trainer.py:803] 2025-04-26 20:30:47,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3662 [WARNING|trainer.py:803] 2025-04-26 20:30:48,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:48,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3663 3200 3225 [WARNING|trainer.py:803] 2025-04-26 20:30:49,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:30:49,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3664 [WARNING|trainer.py:803] 2025-04-26 20:30:50,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:50,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3201 3665 3226 [WARNING|trainer.py:803] 2025-04-26 20:30:51,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:51,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3666 [WARNING|trainer.py:803] 2025-04-26 20:30:52,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3202 [WARNING|trainer.py:803] 2025-04-26 20:30:53,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3227 3667 [WARNING|trainer.py:803] 2025-04-26 20:30:53,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:30:54,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:54,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3668 3203 3228 [WARNING|trainer.py:803] 2025-04-26 20:30:55,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:55,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3669 [WARNING|trainer.py:803] 2025-04-26 20:30:56,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:56,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3204 3670 3229 [WARNING|trainer.py:803] 2025-04-26 20:30:57,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:30:57,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:30:58,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3671 [WARNING|trainer.py:803] 2025-04-26 20:30:59,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3205 3230 3672 [WARNING|trainer.py:803] 2025-04-26 20:31:00,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:00,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:00,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3673 3206 3231 [WARNING|trainer.py:803] 2025-04-26 20:31:01,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3674 [WARNING|trainer.py:803] 2025-04-26 20:31:02,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:02,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:02,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3675 3232 3207 [WARNING|trainer.py:803] 2025-04-26 20:31:03,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3676 [WARNING|trainer.py:803] 2025-04-26 20:31:04,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:04,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:04,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3677 3233 3208 [WARNING|trainer.py:803] 2025-04-26 20:31:05,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:06,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3678 [WARNING|trainer.py:803] 2025-04-26 20:31:06,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:07,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3234 3679 3209 [WARNING|trainer.py:803] 2025-04-26 20:31:08,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:08,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:08,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3680 3235 [WARNING|trainer.py:803] 2025-04-26 20:31:09,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3210 3681 [WARNING|trainer.py:803] 2025-04-26 20:31:10,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:10,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:10,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3682 3236 3211 [WARNING|trainer.py:803] 2025-04-26 20:31:11,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:31:12,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3683 [WARNING|trainer.py:803] 2025-04-26 20:31:12,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:13,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3237 3684 3212 [WARNING|trainer.py:803] 2025-04-26 20:31:14,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:14,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:14,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3685 3238 [WARNING|trainer.py:803] 2025-04-26 20:31:15,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3213 3686 [WARNING|trainer.py:803] 2025-04-26 20:31:16,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:16,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:16,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3687 3239 3214 [WARNING|trainer.py:803] 2025-04-26 20:31:17,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3688 [WARNING|trainer.py:803] 2025-04-26 20:31:18,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:18,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:19,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3689 3240 3215 [WARNING|trainer.py:803] 2025-04-26 20:31:20,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:20,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3690 [WARNING|trainer.py:803] 2025-04-26 20:31:20,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:21,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3241 3691 3216 [WARNING|trainer.py:803] 2025-04-26 20:31:22,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:22,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:22,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3692 3242 [WARNING|trainer.py:803] 2025-04-26 20:31:23,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3217 3693 [WARNING|trainer.py:803] 2025-04-26 20:31:24,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:24,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:31:24,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3694 3243 [WARNING|trainer.py:803] 2025-04-26 20:31:25,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3218 [WARNING|trainer.py:803] 2025-04-26 20:31:26,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3695 [WARNING|trainer.py:803] 2025-04-26 20:31:26,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:31:27,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3244 3696 3219 [WARNING|trainer.py:803] 2025-04-26 20:31:28,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:28,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3697 [WARNING|trainer.py:803] 2025-04-26 20:31:28,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3245 [WARNING|trainer.py:803] 2025-04-26 20:31:29,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3698 3220 [WARNING|trainer.py:803] 2025-04-26 20:31:30,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:30,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:30,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3699 3246 [WARNING|trainer.py:803] 2025-04-26 20:31:31,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3221 3700 [WARNING|trainer.py:803] 2025-04-26 20:31:32,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:32,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:33,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3701 3247 3222 [WARNING|trainer.py:803] 2025-04-26 20:31:34,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:34,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3702 [WARNING|trainer.py:803] 2025-04-26 20:31:34,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3248 [WARNING|trainer.py:803] 2025-04-26 20:31:35,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3703 3223 [WARNING|trainer.py:803] 2025-04-26 20:31:36,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:36,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:37,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3704 3249 [WARNING|trainer.py:803] 2025-04-26 20:31:38,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3224 [WARNING|trainer.py:803] 2025-04-26 20:31:38,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3705 [WARNING|trainer.py:803] 2025-04-26 20:31:38,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:39,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3250 3706 3225 [WARNING|trainer.py:803] 2025-04-26 20:31:40,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:40,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3707 [WARNING|trainer.py:803] 2025-04-26 20:31:40,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3251 [WARNING|trainer.py:803] 2025-04-26 20:31:41,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3708 [WARNING|trainer.py:803] 2025-04-26 20:31:42,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3226 [WARNING|trainer.py:803] 2025-04-26 20:31:42,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:42,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3252 3709 [WARNING|trainer.py:803] 2025-04-26 20:31:43,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3227 [WARNING|trainer.py:803] 2025-04-26 20:31:43,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3710 [WARNING|trainer.py:803] 2025-04-26 20:31:44,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3253 [WARNING|trainer.py:803] 2025-04-26 20:31:45,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3711 [WARNING|trainer.py:803] 2025-04-26 20:31:45,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3228 [WARNING|trainer.py:803] 2025-04-26 20:31:46,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3254 [WARNING|trainer.py:803] 2025-04-26 20:31:46,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3712 [WARNING|trainer.py:803] 2025-04-26 20:31:47,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:47,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3713 3229 3255 [WARNING|trainer.py:803] 2025-04-26 20:31:48,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:48,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3714 [WARNING|trainer.py:803] 2025-04-26 20:31:49,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3230 [WARNING|trainer.py:803] 2025-04-26 20:31:50,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3715 3256 [WARNING|trainer.py:803] 2025-04-26 20:31:50,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:31:51,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:51,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3716 3231 [WARNING|trainer.py:803] 2025-04-26 20:31:52,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3257 3717 [WARNING|trainer.py:803] 2025-04-26 20:31:52,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:31:53,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:31:53,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3718 3232 3258 [WARNING|trainer.py:803] 2025-04-26 20:31:54,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:31:54,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3719 [WARNING|trainer.py:803] 2025-04-26 20:31:55,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3233 [WARNING|trainer.py:803] 2025-04-26 20:31:56,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3259 3720 [WARNING|trainer.py:803] 2025-04-26 20:31:56,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:31:57,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:31:57,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3721 3234 3260 [WARNING|trainer.py:803] 2025-04-26 20:31:58,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:31:58,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3722 [WARNING|trainer.py:803] 2025-04-26 20:31:59,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:00,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3235 3723 3261 [WARNING|trainer.py:803] 2025-04-26 20:32:00,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:01,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:01,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3724 3236 3262 [WARNING|trainer.py:803] 2025-04-26 20:32:02,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3725 [WARNING|trainer.py:803] 2025-04-26 20:32:02,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:03,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:03,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3726 3237 3263 [WARNING|trainer.py:803] 2025-04-26 20:32:04,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:05,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:05,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3727 [WARNING|trainer.py:803] 2025-04-26 20:32:06,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3238 3264 3728 [WARNING|trainer.py:803] 2025-04-26 20:32:07,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:07,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:07,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3729 3239 [WARNING|trainer.py:803] 2025-04-26 20:32:08,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3265 3730 [WARNING|trainer.py:803] 2025-04-26 20:32:09,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:09,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:09,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3731 3240 3266 [WARNING|trainer.py:803] 2025-04-26 20:32:10,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3732 [WARNING|trainer.py:803] 2025-04-26 20:32:11,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:11,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:11,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3733 3241 3267 [WARNING|trainer.py:803] 2025-04-26 20:32:13,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:13,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3734 [WARNING|trainer.py:803] 2025-04-26 20:32:13,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:14,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3242 3735 3268 [WARNING|trainer.py:803] 2025-04-26 20:32:15,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:15,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:15,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3736 3243 [WARNING|trainer.py:803] 2025-04-26 20:32:16,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3269 3737 [WARNING|trainer.py:803] 2025-04-26 20:32:17,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:17,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:17,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3738 3244 3270 [WARNING|trainer.py:803] 2025-04-26 20:32:19,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:19,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3739 [WARNING|trainer.py:803] 2025-04-26 20:32:19,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:20,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3245 3740 3271 [WARNING|trainer.py:803] 2025-04-26 20:32:21,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:21,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:21,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3741 3246 [WARNING|trainer.py:803] 2025-04-26 20:32:22,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3272 3742 [WARNING|trainer.py:803] 2025-04-26 20:32:23,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:23,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:23,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3743 3247 3273 [WARNING|trainer.py:803] 2025-04-26 20:32:25,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:25,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3744 [WARNING|trainer.py:803] 2025-04-26 20:32:25,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:26,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3248 3745 3274 [WARNING|trainer.py:803] 2025-04-26 20:32:27,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:27,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:27,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3746 3249 [WARNING|trainer.py:803] 2025-04-26 20:32:28,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3275 3747 [WARNING|trainer.py:803] 2025-04-26 20:32:29,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:29,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:29,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3748 3250 3276 [WARNING|trainer.py:803] 2025-04-26 20:32:31,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:31,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3749 [WARNING|trainer.py:803] 2025-04-26 20:32:31,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:32,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3251 3750 3277 [WARNING|trainer.py:803] 2025-04-26 20:32:33,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:33,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3751 [WARNING|trainer.py:803] 2025-04-26 20:32:33,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3252 [WARNING|trainer.py:803] 2025-04-26 20:32:34,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3278 3752 [WARNING|trainer.py:803] 2025-04-26 20:32:35,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:35,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:35,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3753 3253 3279 [WARNING|trainer.py:803] 2025-04-26 20:32:37,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:37,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3754 [WARNING|trainer.py:803] 2025-04-26 20:32:37,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:38,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3254 3755 3280 [WARNING|trainer.py:803] 2025-04-26 20:32:39,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:39,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3756 [WARNING|trainer.py:803] 2025-04-26 20:32:39,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3255 [WARNING|trainer.py:803] 2025-04-26 20:32:40,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3757 [WARNING|trainer.py:803] 2025-04-26 20:32:41,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3281 [WARNING|trainer.py:803] 2025-04-26 20:32:41,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:32:41,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3758 3256 [WARNING|trainer.py:803] 2025-04-26 20:32:43,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:32:43,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3282 3759 [WARNING|trainer.py:803] 2025-04-26 20:32:44,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3257 [WARNING|trainer.py:803] 2025-04-26 20:32:44,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3760 [WARNING|trainer.py:803] 2025-04-26 20:32:44,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3283 [WARNING|trainer.py:803] 2025-04-26 20:32:45,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3761 [WARNING|trainer.py:803] 2025-04-26 20:32:46,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3258 [WARNING|trainer.py:803] 2025-04-26 20:32:46,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:32:47,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3762 3284 [WARNING|trainer.py:803] 2025-04-26 20:32:47,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3259 [WARNING|trainer.py:803] 2025-04-26 20:32:48,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3763 [WARNING|trainer.py:803] 2025-04-26 20:32:48,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:32:49,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3285 3764 3260 [WARNING|trainer.py:803] 2025-04-26 20:32:50,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:50,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3765 [WARNING|trainer.py:803] 2025-04-26 20:32:50,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:51,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3286 3766 3261 [WARNING|trainer.py:803] 2025-04-26 20:32:52,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:52,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:52,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3767 3287 [WARNING|trainer.py:803] 2025-04-26 20:32:53,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3262 3768 [WARNING|trainer.py:803] 2025-04-26 20:32:54,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:54,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:32:55,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3769 3288 3263 [WARNING|trainer.py:803] 2025-04-26 20:32:56,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:56,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3770 [WARNING|trainer.py:803] 2025-04-26 20:32:56,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:32:57,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3771 3289 3264 [WARNING|trainer.py:803] 2025-04-26 20:32:58,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:32:58,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3772 [WARNING|trainer.py:803] 2025-04-26 20:32:58,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:32:59,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3290 3773 3265 [WARNING|trainer.py:803] 2025-04-26 20:33:00,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:00,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3774 [WARNING|trainer.py:803] 2025-04-26 20:33:01,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3291 [WARNING|trainer.py:803] 2025-04-26 20:33:01,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3775 3266 [WARNING|trainer.py:803] 2025-04-26 20:33:02,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:03,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:03,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3776 3292 [WARNING|trainer.py:803] 2025-04-26 20:33:04,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3267 3777 [WARNING|trainer.py:803] 2025-04-26 20:33:04,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:05,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:05,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3778 3293 3268 [WARNING|trainer.py:803] 2025-04-26 20:33:06,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:06,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3779 [WARNING|trainer.py:803] 2025-04-26 20:33:07,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:07,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3294 3780 3269 [WARNING|trainer.py:803] 2025-04-26 20:33:08,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:09,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:09,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3781 3295 [WARNING|trainer.py:803] 2025-04-26 20:33:10,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3270 3782 [WARNING|trainer.py:803] 2025-04-26 20:33:10,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:11,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:11,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3783 3296 3271 [WARNING|trainer.py:803] 2025-04-26 20:33:12,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:12,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3784 [WARNING|trainer.py:803] 2025-04-26 20:33:13,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:13,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3297 3785 3272 [WARNING|trainer.py:803] 2025-04-26 20:33:14,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:15,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:33:15,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3786 3298 [WARNING|trainer.py:803] 2025-04-26 20:33:16,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3273 3787 [WARNING|trainer.py:803] 2025-04-26 20:33:16,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:17,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:17,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3788 3299 3274 [WARNING|trainer.py:803] 2025-04-26 20:33:18,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:19,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3789 [WARNING|trainer.py:803] 2025-04-26 20:33:19,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:20,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3300 3790 3275 [WARNING|trainer.py:803] 2025-04-26 20:33:21,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:33:21,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:21,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3791 3301 [WARNING|trainer.py:803] 2025-04-26 20:33:22,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 20:33:22,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3276 3792 3302 [WARNING|trainer.py:803] 2025-04-26 20:33:23,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:23,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:23,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3793 3303 [WARNING|trainer.py:803] 2025-04-26 20:33:24,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3277 [WARNING|trainer.py:803] 2025-04-26 20:33:25,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3794 3304 [WARNING|trainer.py:803] 2025-04-26 20:33:25,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:25,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3795 [WARNING|trainer.py:803] 2025-04-26 20:33:26,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3278 3305 [WARNING|trainer.py:803] 2025-04-26 20:33:27,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3796 [WARNING|trainer.py:803] 2025-04-26 20:33:27,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:27,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3306 [WARNING|trainer.py:803] 2025-04-26 20:33:28,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3797 3279 [WARNING|trainer.py:803] 2025-04-26 20:33:28,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3307 [WARNING|trainer.py:803] 2025-04-26 20:33:29,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:29,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3798 [WARNING|trainer.py:803] 2025-04-26 20:33:30,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3308 [WARNING|trainer.py:803] 2025-04-26 20:33:30,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3280 3799 [WARNING|trainer.py:803] 2025-04-26 20:33:31,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:31,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:31,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3309 3800 [WARNING|trainer.py:803] 2025-04-26 20:33:32,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:32,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3281 3801 3310 [WARNING|trainer.py:803] 2025-04-26 20:33:33,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:34,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:34,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3802 3311 3282 [WARNING|trainer.py:803] 2025-04-26 20:33:35,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:35,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3803 [WARNING|trainer.py:803] 2025-04-26 20:33:35,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3312 [WARNING|trainer.py:803] 2025-04-26 20:33:36,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:36,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3804 3283 3313 [WARNING|trainer.py:803] 2025-04-26 20:33:37,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:37,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3805 [WARNING|trainer.py:803] 2025-04-26 20:33:38,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3314 [WARNING|trainer.py:803] 2025-04-26 20:33:38,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3806 3284 [WARNING|trainer.py:803] 2025-04-26 20:33:39,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3315 [WARNING|trainer.py:803] 2025-04-26 20:33:39,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:39,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3807 [WARNING|trainer.py:803] 2025-04-26 20:33:40,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:40,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3316 3285 3808 [WARNING|trainer.py:803] 2025-04-26 20:33:41,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:41,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:42,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3317 3809 [WARNING|trainer.py:803] 2025-04-26 20:33:43,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:43,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3286 3810 3318 [WARNING|trainer.py:803] 2025-04-26 20:33:44,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:44,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:44,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3811 3319 3287 [WARNING|trainer.py:803] 2025-04-26 20:33:45,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3812 [WARNING|trainer.py:803] 2025-04-26 20:33:45,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:46,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3320 [WARNING|trainer.py:803] 2025-04-26 20:33:46,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3813 [WARNING|trainer.py:803] 2025-04-26 20:33:46,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3288 3321 [WARNING|trainer.py:803] 2025-04-26 20:33:47,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3814 [WARNING|trainer.py:803] 2025-04-26 20:33:48,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:48,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:48,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3322 3815 3289 [WARNING|trainer.py:803] 2025-04-26 20:33:49,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:33:49,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3323 3816 [WARNING|trainer.py:803] 2025-04-26 20:33:50,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:50,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:50,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3817 3324 3290 [WARNING|trainer.py:803] 2025-04-26 20:33:51,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:52,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:33:52,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3818 3325 [WARNING|trainer.py:803] 2025-04-26 20:33:53,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3291 3819 [WARNING|trainer.py:803] 2025-04-26 20:33:53,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3326 [WARNING|trainer.py:803] 2025-04-26 20:33:54,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:54,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3820 [WARNING|trainer.py:803] 2025-04-26 20:33:54,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3327 [WARNING|trainer.py:803] 2025-04-26 20:33:55,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3292 3821 [WARNING|trainer.py:803] 2025-04-26 20:33:56,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:33:56,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:56,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3328 3822 [WARNING|trainer.py:803] 2025-04-26 20:33:57,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3293 [WARNING|trainer.py:803] 2025-04-26 20:33:57,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3329 3823 [WARNING|trainer.py:803] 2025-04-26 20:33:58,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:58,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:33:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3824 3330 3294 [WARNING|trainer.py:803] 2025-04-26 20:33:59,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:33:59,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3825 [WARNING|trainer.py:803] 2025-04-26 20:34:00,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3331 [WARNING|trainer.py:803] 2025-04-26 20:34:00,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3826 [WARNING|trainer.py:803] 2025-04-26 20:34:01,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3295 3332 [WARNING|trainer.py:803] 2025-04-26 20:34:02,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3827 [WARNING|trainer.py:803] 2025-04-26 20:34:02,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:02,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3333 [WARNING|trainer.py:803] 2025-04-26 20:34:03,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3828 3296 [WARNING|trainer.py:803] 2025-04-26 20:34:03,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:04,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3334 3829 [WARNING|trainer.py:803] 2025-04-26 20:34:04,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:34:05,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:05,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3830 3335 3297 [WARNING|trainer.py:803] 2025-04-26 20:34:06,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:06,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:06,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3831 3336 [WARNING|trainer.py:803] 2025-04-26 20:34:07,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3832 3298 [WARNING|trainer.py:803] 2025-04-26 20:34:07,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3337 [WARNING|trainer.py:803] 2025-04-26 20:34:08,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:08,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3833 [WARNING|trainer.py:803] 2025-04-26 20:34:09,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3338 [WARNING|trainer.py:803] 2025-04-26 20:34:09,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3834 3299 [WARNING|trainer.py:803] 2025-04-26 20:34:10,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:10,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:10,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3339 3835 [WARNING|trainer.py:803] 2025-04-26 20:34:11,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:11,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3300 3836 3340 [WARNING|trainer.py:803] 2025-04-26 20:34:12,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:34:12,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:12,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3837 3301 3341 [WARNING|trainer.py:803] 2025-04-26 20:34:13,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:13,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3838 [WARNING|trainer.py:803] 2025-04-26 20:34:14,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3302 3342 [WARNING|trainer.py:803] 2025-04-26 20:34:15,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:15,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3839 [WARNING|trainer.py:803] 2025-04-26 20:34:15,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3303 3343 [WARNING|trainer.py:803] 2025-04-26 20:34:16,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3840 [WARNING|trainer.py:803] 2025-04-26 20:34:16,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:16,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3304 [WARNING|trainer.py:803] 2025-04-26 20:34:17,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3344 3841 [WARNING|trainer.py:803] 2025-04-26 20:34:17,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:18,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3305 [WARNING|trainer.py:803] 2025-04-26 20:34:18,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3842 3345 [WARNING|trainer.py:803] 2025-04-26 20:34:19,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:19,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:19,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3306 3843 3346 [WARNING|trainer.py:803] 2025-04-26 20:34:20,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:20,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:20,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3844 3307 3347 [WARNING|trainer.py:803] 2025-04-26 20:34:21,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:21,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3845 [WARNING|trainer.py:803] 2025-04-26 20:34:22,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3308 3348 [WARNING|trainer.py:803] 2025-04-26 20:34:22,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3846 [WARNING|trainer.py:803] 2025-04-26 20:34:23,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:23,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3309 [WARNING|trainer.py:803] 2025-04-26 20:34:23,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3349 3847 [WARNING|trainer.py:803] 2025-04-26 20:34:24,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:24,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:24,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3310 3848 3350 [WARNING|trainer.py:803] 2025-04-26 20:34:25,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:25,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:25,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3849 3311 3351 [WARNING|trainer.py:803] 2025-04-26 20:34:26,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:26,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3850 [WARNING|trainer.py:803] 2025-04-26 20:34:27,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3312 3352 [WARNING|trainer.py:803] 2025-04-26 20:34:27,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3851 [WARNING|trainer.py:803] 2025-04-26 20:34:28,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:28,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3313 3353 [WARNING|trainer.py:803] 2025-04-26 20:34:29,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3852 [WARNING|trainer.py:803] 2025-04-26 20:34:29,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:29,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3314 [WARNING|trainer.py:803] 2025-04-26 20:34:30,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3853 3354 [WARNING|trainer.py:803] 2025-04-26 20:34:30,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:31,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:31,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3315 3854 3355 [WARNING|trainer.py:803] 2025-04-26 20:34:32,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:32,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3855 [WARNING|trainer.py:803] 2025-04-26 20:34:32,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3316 3356 [WARNING|trainer.py:803] 2025-04-26 20:34:33,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:33,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3856 [WARNING|trainer.py:803] 2025-04-26 20:34:33,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3317 3357 [WARNING|trainer.py:803] 2025-04-26 20:34:34,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3857 [WARNING|trainer.py:803] 2025-04-26 20:34:34,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:35,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3318 [WARNING|trainer.py:803] 2025-04-26 20:34:35,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3358 3858 [WARNING|trainer.py:803] 2025-04-26 20:34:36,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:36,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:36,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3319 3859 3359 [WARNING|trainer.py:803] 2025-04-26 20:34:37,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:34:37,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:34:37,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3860 3320 3360 [WARNING|trainer.py:803] 2025-04-26 20:34:38,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:38,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:38,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3861 3321 3361 [WARNING|trainer.py:803] 2025-04-26 20:34:39,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:40,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3862 [WARNING|trainer.py:803] 2025-04-26 20:34:40,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3322 3362 [WARNING|trainer.py:803] 2025-04-26 20:34:40,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3863 [WARNING|trainer.py:803] 2025-04-26 20:34:41,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:41,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3323 [WARNING|trainer.py:803] 2025-04-26 20:34:41,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3363 3864 [WARNING|trainer.py:803] 2025-04-26 20:34:42,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:42,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:43,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3324 3364 3865 [WARNING|trainer.py:803] 2025-04-26 20:34:43,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:44,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:44,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3866 3325 3365 [WARNING|trainer.py:803] 2025-04-26 20:34:45,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:45,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:45,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3867 3326 3366 [WARNING|trainer.py:803] 2025-04-26 20:34:46,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3868 [WARNING|trainer.py:803] 2025-04-26 20:34:46,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:46,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3327 3367 [WARNING|trainer.py:803] 2025-04-26 20:34:47,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3869 [WARNING|trainer.py:803] 2025-04-26 20:34:47,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:34:48,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3328 [WARNING|trainer.py:803] 2025-04-26 20:34:48,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3368 3870 [WARNING|trainer.py:803] 2025-04-26 20:34:49,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:49,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:49,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3329 3369 3871 [WARNING|trainer.py:803] 2025-04-26 20:34:50,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:50,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:50,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3330 3872 3370 [WARNING|trainer.py:803] 2025-04-26 20:34:51,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:51,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:51,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3331 3873 3371 [WARNING|trainer.py:803] 2025-04-26 20:34:53,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:53,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:53,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3874 3332 3372 [WARNING|trainer.py:803] 2025-04-26 20:34:54,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:54,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:54,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3875 3333 3373 [WARNING|trainer.py:803] 2025-04-26 20:34:55,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:34:55,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3876 [WARNING|trainer.py:803] 2025-04-26 20:34:55,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3334 3374 [WARNING|trainer.py:803] 2025-04-26 20:34:56,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3877 [WARNING|trainer.py:803] 2025-04-26 20:34:57,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:34:57,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3335 3375 [WARNING|trainer.py:803] 2025-04-26 20:34:57,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3878 [WARNING|trainer.py:803] 2025-04-26 20:34:58,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:34:58,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3336 [WARNING|trainer.py:803] 2025-04-26 20:34:58,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3376 3879 [WARNING|trainer.py:803] 2025-04-26 20:34:59,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:34:59,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:00,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3337 3377 3880 [WARNING|trainer.py:803] 2025-04-26 20:35:01,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:01,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:01,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3881 3338 3378 [WARNING|trainer.py:803] 2025-04-26 20:35:02,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:02,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:02,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3882 3339 3379 [WARNING|trainer.py:803] 2025-04-26 20:35:03,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:03,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:35:03,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3883 3340 3380 [WARNING|trainer.py:803] 2025-04-26 20:35:04,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3884 [WARNING|trainer.py:803] 2025-04-26 20:35:04,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:05,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3341 3381 [WARNING|trainer.py:803] 2025-04-26 20:35:05,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3885 [WARNING|trainer.py:803] 2025-04-26 20:35:06,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:06,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3342 [WARNING|trainer.py:803] 2025-04-26 20:35:06,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3382 3886 [WARNING|trainer.py:803] 2025-04-26 20:35:07,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:07,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:35:07,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3343 3383 3887 [WARNING|trainer.py:803] 2025-04-26 20:35:08,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:08,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:09,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3344 3888 3384 [WARNING|trainer.py:803] 2025-04-26 20:35:10,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:10,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:10,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3889 3345 3385 [WARNING|trainer.py:803] 2025-04-26 20:35:11,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:11,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:11,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3890 3346 3386 [WARNING|trainer.py:803] 2025-04-26 20:35:12,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:12,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:12,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3891 3347 3387 [WARNING|trainer.py:803] 2025-04-26 20:35:13,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3892 [WARNING|trainer.py:803] 2025-04-26 20:35:14,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:14,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3348 3388 [WARNING|trainer.py:803] 2025-04-26 20:35:14,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3893 [WARNING|trainer.py:803] 2025-04-26 20:35:15,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:15,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3349 3389 [WARNING|trainer.py:803] 2025-04-26 20:35:16,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3894 [WARNING|trainer.py:803] 2025-04-26 20:35:16,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:16,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:17,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3390 3350 3895 [WARNING|trainer.py:803] 2025-04-26 20:35:18,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:18,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:18,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3391 3351 3896 [WARNING|trainer.py:803] 2025-04-26 20:35:19,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:19,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:19,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3897 3392 3352 [WARNING|trainer.py:803] 2025-04-26 20:35:20,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:20,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:20,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3898 3393 3353 [WARNING|trainer.py:803] 2025-04-26 20:35:21,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:21,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:22,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3899 3394 3354 [WARNING|trainer.py:803] 2025-04-26 20:35:22,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3900 [WARNING|trainer.py:803] 2025-04-26 20:35:23,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:23,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3395 3355 [WARNING|trainer.py:803] 2025-04-26 20:35:24,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3901 [WARNING|trainer.py:803] 2025-04-26 20:35:24,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:24,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3396 3356 [WARNING|trainer.py:803] 2025-04-26 20:35:25,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3902 [WARNING|trainer.py:803] 2025-04-26 20:35:25,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 20:35:25,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3397 3357 [WARNING|trainer.py:803] 2025-04-26 20:35:26,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:27,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3903 [WARNING|trainer.py:803] 2025-04-26 20:35:27,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3398 3358 [WARNING|trainer.py:803] 2025-04-26 20:35:28,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3904 [WARNING|trainer.py:803] 2025-04-26 20:35:28,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:28,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3399 3359 [WARNING|trainer.py:803] 2025-04-26 20:35:29,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:29,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3905 [WARNING|trainer.py:803] 2025-04-26 20:35:29,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3400 3360 [WARNING|trainer.py:803] 2025-04-26 20:35:30,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:31,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3906 [WARNING|trainer.py:803] 2025-04-26 20:35:31,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3401 3361 [WARNING|trainer.py:803] 2025-04-26 20:35:31,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3907 [WARNING|trainer.py:803] 2025-04-26 20:35:32,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:35:32,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3402 3362 [WARNING|trainer.py:803] 2025-04-26 20:35:33,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3908 [WARNING|trainer.py:803] 2025-04-26 20:35:33,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:35:33,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3403 3363 [WARNING|trainer.py:803] 2025-04-26 20:35:34,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:34,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:35:35,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3909 3404 3364 [WARNING|trainer.py:803] 2025-04-26 20:35:35,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:36,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3910 [WARNING|trainer.py:803] 2025-04-26 20:35:36,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3405 3365 [WARNING|trainer.py:803] 2025-04-26 20:35:37,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:37,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3911 [WARNING|trainer.py:803] 2025-04-26 20:35:37,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3406 3366 [WARNING|trainer.py:803] 2025-04-26 20:35:38,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:38,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:35:38,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3912 3407 3367 [WARNING|trainer.py:803] 2025-04-26 20:35:39,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:40,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:40,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3913 3408 3368 [WARNING|trainer.py:803] 2025-04-26 20:35:41,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:41,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:41,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3914 3409 3369 [WARNING|trainer.py:803] 2025-04-26 20:35:42,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:42,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:35:42,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3915 3410 3370 [WARNING|trainer.py:803] 2025-04-26 20:35:43,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:44,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:44,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3916 3411 3371 [WARNING|trainer.py:803] 2025-04-26 20:35:45,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:45,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:35:45,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3917 3412 3372 [WARNING|trainer.py:803] 2025-04-26 20:35:46,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:46,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:46,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3918 3413 3373 [WARNING|trainer.py:803] 2025-04-26 20:35:47,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:48,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:48,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3919 3414 3374 [WARNING|trainer.py:803] 2025-04-26 20:35:49,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:49,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:49,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3920 3415 3375 [WARNING|trainer.py:803] 2025-04-26 20:35:50,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:50,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:50,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3921 3416 3376 [WARNING|trainer.py:803] 2025-04-26 20:35:51,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:52,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:52,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3922 3417 3377 [WARNING|trainer.py:803] 2025-04-26 20:35:53,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:53,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:35:53,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3923 3378 3418 [WARNING|trainer.py:803] 2025-04-26 20:35:54,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:54,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:35:54,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3924 3379 3419 [WARNING|trainer.py:803] 2025-04-26 20:35:56,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 20:35:56,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes NoYes [WARNING|trainer.py:803] 2025-04-26 20:35:56,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3925 3380 3420 [WARNING|trainer.py:803] 2025-04-26 20:35:57,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:57,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:57,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3381 3926 3421 [WARNING|trainer.py:803] 2025-04-26 20:35:58,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:35:58,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:35:58,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3927 3422 3382 [WARNING|trainer.py:803] 2025-04-26 20:36:00,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:00,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:36:00,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3928 3423 3383 [WARNING|trainer.py:803] 2025-04-26 20:36:01,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:01,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:01,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3929 3424 3384 [WARNING|trainer.py:803] 2025-04-26 20:36:02,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:02,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:02,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3930 3425 3385 [WARNING|trainer.py:803] 2025-04-26 20:36:03,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:04,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:04,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3931 3426 3386 [WARNING|trainer.py:803] 2025-04-26 20:36:05,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:05,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:05,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3932 3427 3387 [WARNING|trainer.py:803] 2025-04-26 20:36:06,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:06,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:06,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3933 3428 3388 [WARNING|trainer.py:803] 2025-04-26 20:36:07,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:07,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:08,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3429 3934 3389 [WARNING|trainer.py:803] 2025-04-26 20:36:09,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:09,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:09,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3430 3935 3390 [WARNING|trainer.py:803] 2025-04-26 20:36:10,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:36:10,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:10,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3431 3936 3391 [WARNING|trainer.py:803] 2025-04-26 20:36:11,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:11,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:12,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3937 3432 3392 [WARNING|trainer.py:803] 2025-04-26 20:36:13,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:13,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:36:13,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3938 3433 3393 [WARNING|trainer.py:803] 2025-04-26 20:36:14,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:14,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:14,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3939 3434 3394 [WARNING|trainer.py:803] 2025-04-26 20:36:15,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:15,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:16,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3940 3435 3395 [WARNING|trainer.py:803] 2025-04-26 20:36:17,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:17,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:17,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3941 3436 3396 [WARNING|trainer.py:803] 2025-04-26 20:36:18,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:18,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:18,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3942 3437 3397 [WARNING|trainer.py:803] 2025-04-26 20:36:19,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:19,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:20,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3438 3943 3398 [WARNING|trainer.py:803] 2025-04-26 20:36:21,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:21,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:21,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3439 3944 3399 [WARNING|trainer.py:803] 2025-04-26 20:36:22,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:22,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:22,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3440 3945 3400 [WARNING|trainer.py:803] 2025-04-26 20:36:23,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:23,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:24,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3441 3946 3401 [WARNING|trainer.py:803] 2025-04-26 20:36:25,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:25,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:25,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3442 3947 3402 [WARNING|trainer.py:803] 2025-04-26 20:36:26,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:26,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:26,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3443 3948 3403 [WARNING|trainer.py:803] 2025-04-26 20:36:27,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:27,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:28,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 3444 3949 3404 [WARNING|trainer.py:803] 2025-04-26 20:36:29,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:29,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:29,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3445 3950 3405 [WARNING|trainer.py:803] 2025-04-26 20:36:30,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:30,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:30,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3446 3951 3406 [WARNING|trainer.py:803] 2025-04-26 20:36:31,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:31,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:31,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3447 3952 3407 [WARNING|trainer.py:803] 2025-04-26 20:36:33,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:33,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:33,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3448 3408 3953 [WARNING|trainer.py:803] 2025-04-26 20:36:34,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:34,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:34,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3449 3409 3954 [WARNING|trainer.py:803] 2025-04-26 20:36:35,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:35,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:36,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3450 3410 3955 [WARNING|trainer.py:803] 2025-04-26 20:36:37,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:37,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:37,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3451 3411 3956 [WARNING|trainer.py:803] 2025-04-26 20:36:38,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:38,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:36:38,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3452 3412 3957 [WARNING|trainer.py:803] 2025-04-26 20:36:39,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:39,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:40,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3453 3413 3958 [WARNING|trainer.py:803] 2025-04-26 20:36:41,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:41,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:41,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3454 3414 3959 [WARNING|trainer.py:803] 2025-04-26 20:36:42,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:42,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:42,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3455 3415 3960 [WARNING|trainer.py:803] 2025-04-26 20:36:43,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:43,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:43,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3456 3416 3961 [WARNING|trainer.py:803] 2025-04-26 20:36:45,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:45,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:45,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3457 3417 3962 [WARNING|trainer.py:803] 2025-04-26 20:36:46,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:36:46,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:36:46,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3458 3963 3418 [WARNING|trainer.py:803] 2025-04-26 20:36:47,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:47,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:47,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3459 3964 3419 [WARNING|trainer.py:803] 2025-04-26 20:36:48,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:49,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:49,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3460 3420 3965 [WARNING|trainer.py:803] 2025-04-26 20:36:50,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:50,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:50,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3461 3966 3421 [WARNING|trainer.py:803] 2025-04-26 20:36:51,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:51,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:51,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3462 3967 3422 [WARNING|trainer.py:803] 2025-04-26 20:36:52,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:53,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:53,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3463 3423 3968 [WARNING|trainer.py:803] 2025-04-26 20:36:54,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:54,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:54,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3464 3424 3969 [WARNING|trainer.py:803] 2025-04-26 20:36:55,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:55,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:55,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3465 3425 3970 [WARNING|trainer.py:803] 2025-04-26 20:36:56,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:57,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:36:57,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3466 3426 3971 [WARNING|trainer.py:803] 2025-04-26 20:36:58,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:36:58,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:36:58,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3467 3427 3972 [WARNING|trainer.py:803] 2025-04-26 20:36:59,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:59,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:36:59,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 3468 NoYes 3428 3973 [WARNING|trainer.py:803] 2025-04-26 20:37:00,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:01,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3469 [WARNING|trainer.py:803] 2025-04-26 20:37:01,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3429 3974 [WARNING|trainer.py:803] 2025-04-26 20:37:02,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:02,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:02,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3470 3430 3975 [WARNING|trainer.py:803] 2025-04-26 20:37:03,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:03,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3471 [WARNING|trainer.py:803] 2025-04-26 20:37:03,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3431 3976 [WARNING|trainer.py:803] 2025-04-26 20:37:04,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:05,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3472 [WARNING|trainer.py:803] 2025-04-26 20:37:05,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3432 3977 [WARNING|trainer.py:803] 2025-04-26 20:37:05,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3473 [WARNING|trainer.py:803] 2025-04-26 20:37:06,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:37:06,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3433 3978 [WARNING|trainer.py:803] 2025-04-26 20:37:07,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3474 [WARNING|trainer.py:803] 2025-04-26 20:37:07,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:07,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3434 3979 [WARNING|trainer.py:803] 2025-04-26 20:37:08,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3475 [WARNING|trainer.py:803] 2025-04-26 20:37:08,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:09,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3435 3980 [WARNING|trainer.py:803] 2025-04-26 20:37:09,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3476 [WARNING|trainer.py:803] 2025-04-26 20:37:10,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:10,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3436 3981 [WARNING|trainer.py:803] 2025-04-26 20:37:11,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3477 [WARNING|trainer.py:803] 2025-04-26 20:37:11,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:11,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3437 3982 [WARNING|trainer.py:803] 2025-04-26 20:37:12,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3478 [WARNING|trainer.py:803] 2025-04-26 20:37:12,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:12,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3438 3983 [WARNING|trainer.py:803] 2025-04-26 20:37:13,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3479 [WARNING|trainer.py:803] 2025-04-26 20:37:14,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:14,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3439 3984 [WARNING|trainer.py:803] 2025-04-26 20:37:14,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3480 [WARNING|trainer.py:803] 2025-04-26 20:37:15,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:15,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3440 3985 [WARNING|trainer.py:803] 2025-04-26 20:37:16,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3481 [WARNING|trainer.py:803] 2025-04-26 20:37:16,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:17,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3441 3986 [WARNING|trainer.py:803] 2025-04-26 20:37:17,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3482 [WARNING|trainer.py:803] 2025-04-26 20:37:18,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:18,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3442 3987 [WARNING|trainer.py:803] 2025-04-26 20:37:18,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3483 [WARNING|trainer.py:803] 2025-04-26 20:37:19,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:19,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3443 3988 [WARNING|trainer.py:803] 2025-04-26 20:37:20,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3484 [WARNING|trainer.py:803] 2025-04-26 20:37:20,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:20,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3989 3444 [WARNING|trainer.py:803] 2025-04-26 20:37:21,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3485 [WARNING|trainer.py:803] 2025-04-26 20:37:22,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:22,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3990 3445 [WARNING|trainer.py:803] 2025-04-26 20:37:22,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3486 [WARNING|trainer.py:803] 2025-04-26 20:37:23,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:23,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3991 3446 [WARNING|trainer.py:803] 2025-04-26 20:37:24,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3487 [WARNING|trainer.py:803] 2025-04-26 20:37:24,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:24,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3447 3992 [WARNING|trainer.py:803] 2025-04-26 20:37:25,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3488 [WARNING|trainer.py:803] 2025-04-26 20:37:26,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:26,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3993 3448 [WARNING|trainer.py:803] 2025-04-26 20:37:26,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3489 [WARNING|trainer.py:803] 2025-04-26 20:37:27,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:27,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3994 3449 [WARNING|trainer.py:803] 2025-04-26 20:37:27,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3490 [WARNING|trainer.py:803] 2025-04-26 20:37:28,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:28,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3995 3450 [WARNING|trainer.py:803] 2025-04-26 20:37:29,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3491 [WARNING|trainer.py:803] 2025-04-26 20:37:29,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:29,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3996 [WARNING|trainer.py:803] 2025-04-26 20:37:30,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3451 3492 [WARNING|trainer.py:803] 2025-04-26 20:37:31,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:31,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3997 [WARNING|trainer.py:803] 2025-04-26 20:37:31,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3452 3493 [WARNING|trainer.py:803] 2025-04-26 20:37:32,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:32,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3998 [WARNING|trainer.py:803] 2025-04-26 20:37:33,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3453 3494 [WARNING|trainer.py:803] 2025-04-26 20:37:33,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:33,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3999 [WARNING|trainer.py:803] 2025-04-26 20:37:34,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3454 3495 [WARNING|trainer.py:803] 2025-04-26 20:37:35,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:35,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4000 [WARNING|trainer.py:803] 2025-04-26 20:37:35,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3455 3496 [WARNING|trainer.py:803] 2025-04-26 20:37:36,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:36,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4001 [WARNING|trainer.py:803] 2025-04-26 20:37:37,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3456 3497 [WARNING|trainer.py:803] 2025-04-26 20:37:37,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:37,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:38,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4002 3457 3498 [WARNING|trainer.py:803] 2025-04-26 20:37:39,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:39,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:37:39,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4003 3458 3499 [WARNING|trainer.py:803] 2025-04-26 20:37:40,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:40,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:40,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4004 3459 3500 [WARNING|trainer.py:803] 2025-04-26 20:37:41,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:41,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:42,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4005 3460 3501 [WARNING|trainer.py:803] 2025-04-26 20:37:43,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:37:43,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:43,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4006 3461 3502 [WARNING|trainer.py:803] 2025-04-26 20:37:44,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:44,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:44,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4007 3462 3503 [WARNING|trainer.py:803] 2025-04-26 20:37:45,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:45,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:46,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4008 3463 3504 [WARNING|trainer.py:803] 2025-04-26 20:37:47,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:47,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:47,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4009 3464 3505 [WARNING|trainer.py:803] 2025-04-26 20:37:48,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:48,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:48,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4010 3465 3506 [WARNING|trainer.py:803] 2025-04-26 20:37:49,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:49,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:49,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4011 3466 3507 [WARNING|trainer.py:803] 2025-04-26 20:37:50,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:51,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:51,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4012 3467 3508 [WARNING|trainer.py:803] 2025-04-26 20:37:52,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:52,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:52,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4013 3468 3509 [WARNING|trainer.py:803] 2025-04-26 20:37:53,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:37:53,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4014[WARNING|trainer.py:803] 2025-04-26 20:37:53,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3469 3510 [WARNING|trainer.py:803] 2025-04-26 20:37:54,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:55,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:37:55,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4015 3470 3511 [WARNING|trainer.py:803] 2025-04-26 20:37:56,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:56,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:56,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4016 3471 3512 [WARNING|trainer.py:803] 2025-04-26 20:37:57,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:57,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:57,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4017 3472 3513 [WARNING|trainer.py:803] 2025-04-26 20:37:58,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:37:59,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:37:59,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4018 3473 3514 [WARNING|trainer.py:803] 2025-04-26 20:38:00,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:00,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:00,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3474 3515 4019 [WARNING|trainer.py:803] 2025-04-26 20:38:01,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:38:01,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:01,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3475 3516 4020 [WARNING|trainer.py:803] 2025-04-26 20:38:02,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:03,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:38:03,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3476 3517 4021 [WARNING|trainer.py:803] 2025-04-26 20:38:04,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:04,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:04,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3477 3518 4022 [WARNING|trainer.py:803] 2025-04-26 20:38:05,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:05,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3478 [WARNING|trainer.py:803] 2025-04-26 20:38:06,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3519 4023 [WARNING|trainer.py:803] 2025-04-26 20:38:06,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:06,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3479 3520 [WARNING|trainer.py:803] 2025-04-26 20:38:07,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4024 [WARNING|trainer.py:803] 2025-04-26 20:38:08,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:08,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3480 3521 [WARNING|trainer.py:803] 2025-04-26 20:38:08,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4025 [WARNING|trainer.py:803] 2025-04-26 20:38:09,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:09,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3481 3522 [WARNING|trainer.py:803] 2025-04-26 20:38:10,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:10,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4026 [WARNING|trainer.py:803] 2025-04-26 20:38:10,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3482 3523 [WARNING|trainer.py:803] 2025-04-26 20:38:11,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:12,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:12,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4027 3483 3524 [WARNING|trainer.py:803] 2025-04-26 20:38:13,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:13,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:13,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4028 3484 3525 [WARNING|trainer.py:803] 2025-04-26 20:38:14,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:14,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:14,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4029 3485 3526 [WARNING|trainer.py:803] 2025-04-26 20:38:15,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:15,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:16,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4030 3486 3527 [WARNING|trainer.py:803] 2025-04-26 20:38:17,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:17,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:17,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4031 3487 3528 [WARNING|trainer.py:803] 2025-04-26 20:38:18,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:18,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:18,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4032 3488 3529 [WARNING|trainer.py:803] 2025-04-26 20:38:19,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:19,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:19,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4033 3489 3530 [WARNING|trainer.py:803] 2025-04-26 20:38:20,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:21,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:21,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [mov,mp4,m4a,3gp,3g2,mj2 @ 0x8975f900] moov atom not found [20:38:21] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... 
sharegpt4v_instruct_gpt4-vision_cap100k Traceback (most recent call last): File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__ ret = self.video_get_item(data_item) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item sampled_video = self.load_video_fast(video_path) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast video_reader = VideoReader(video_path) File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__ raise RuntimeError("Error reading " + uri + "...") RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k 4034 3531 3490 [WARNING|trainer.py:803] 2025-04-26 20:38:22,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:22,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:22,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4035 3532 3491 [WARNING|trainer.py:803] 2025-04-26 20:38:23,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:23,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:23,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4036 3533 3492 [WARNING|trainer.py:803] 2025-04-26 20:38:24,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:25,068 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:25,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4037 3534 3493 [WARNING|trainer.py:803] 2025-04-26 20:38:26,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:26,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:26,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4038 3535 3494 [WARNING|trainer.py:803] 2025-04-26 20:38:27,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:27,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:27,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4039 3536 3495 [WARNING|trainer.py:803] 2025-04-26 20:38:28,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:28,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:29,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4040 3537 3496 [WARNING|trainer.py:803] 2025-04-26 20:38:30,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:38:30,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:30,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4041 3538 3497 [WARNING|trainer.py:803] 2025-04-26 20:38:31,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:31,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4042 [WARNING|trainer.py:803] 2025-04-26 20:38:31,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3539 3498 [WARNING|trainer.py:803] 2025-04-26 20:38:32,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:32,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:38:33,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4043 3540 3499 [WARNING|trainer.py:803] 2025-04-26 20:38:34,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:34,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:34,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4044 3541 3500 [WARNING|trainer.py:803] 2025-04-26 20:38:35,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:35,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:35,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4045 3542 3501 [WARNING|trainer.py:803] 2025-04-26 20:38:36,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:36,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:37,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4046 3543 3502 [WARNING|trainer.py:803] 2025-04-26 20:38:37,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:37,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:38,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4047 3544 3503 [WARNING|trainer.py:803] 2025-04-26 20:38:39,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:39,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:39,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3545 4048 3504 [WARNING|trainer.py:803] 2025-04-26 20:38:40,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:40,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:40,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3546 4049 3505 [WARNING|trainer.py:803] 2025-04-26 20:38:41,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:42,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:42,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3547 4050 3506 [WARNING|trainer.py:803] 2025-04-26 20:38:43,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:43,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:38:43,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3548 4051 3507 [WARNING|trainer.py:803] 2025-04-26 20:38:44,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:38:44,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:44,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3549 4052 3508 [WARNING|trainer.py:803] 2025-04-26 20:38:45,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:45,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:46,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3550 4053 3509 [WARNING|trainer.py:803] 2025-04-26 20:38:47,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:47,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:47,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3551 4054 3510 [WARNING|trainer.py:803] 2025-04-26 20:38:48,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:48,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:38:48,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3552 4055 3511 [WARNING|trainer.py:803] 2025-04-26 20:38:49,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:49,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:50,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3553 3512 4056 [WARNING|trainer.py:803] 2025-04-26 20:38:50,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:51,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:51,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3554 3513 4057 [WARNING|trainer.py:803] 2025-04-26 20:38:52,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:52,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3555 [WARNING|trainer.py:803] 2025-04-26 20:38:52,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3514 4058 [WARNING|trainer.py:803] 2025-04-26 20:38:53,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:53,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3556 [WARNING|trainer.py:803] 2025-04-26 20:38:54,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3515 4059 [WARNING|trainer.py:803] 2025-04-26 20:38:54,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:55,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3557 [WARNING|trainer.py:803] 2025-04-26 20:38:55,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3516 4060 [WARNING|trainer.py:803] 2025-04-26 20:38:56,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:56,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3558 [WARNING|trainer.py:803] 2025-04-26 20:38:56,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3517 4061 [WARNING|trainer.py:803] 2025-04-26 20:38:57,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:57,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3559 [WARNING|trainer.py:803] 2025-04-26 20:38:57,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3518 4062 [WARNING|trainer.py:803] 2025-04-26 20:38:58,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:38:59,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3560 [WARNING|trainer.py:803] 2025-04-26 20:38:59,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3519 4063 [WARNING|trainer.py:803] 2025-04-26 20:38:59,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:00,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3561 [WARNING|trainer.py:803] 2025-04-26 20:39:00,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3520 [WARNING|trainer.py:803] 2025-04-26 20:39:01,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4064 [WARNING|trainer.py:803] 2025-04-26 20:39:01,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3562 3521 [WARNING|trainer.py:803] 2025-04-26 20:39:02,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:02,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4065 [WARNING|trainer.py:803] 2025-04-26 20:39:02,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3563 3522 [WARNING|trainer.py:803] 2025-04-26 20:39:03,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:03,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4066 [WARNING|trainer.py:803] 2025-04-26 20:39:04,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3564 3523 [WARNING|trainer.py:803] 2025-04-26 20:39:04,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:05,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4067 [WARNING|trainer.py:803] 2025-04-26 20:39:05,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3565 3524 [WARNING|trainer.py:803] 2025-04-26 20:39:06,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:06,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4068 [WARNING|trainer.py:803] 2025-04-26 20:39:06,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3566 3525 [WARNING|trainer.py:803] 2025-04-26 20:39:07,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:07,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4069 [WARNING|trainer.py:803] 2025-04-26 20:39:08,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3567 3526 [WARNING|trainer.py:803] 2025-04-26 20:39:08,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:09,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4070 [WARNING|trainer.py:803] 2025-04-26 20:39:09,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3568 3527 [WARNING|trainer.py:803] 2025-04-26 20:39:10,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:10,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4071 [WARNING|trainer.py:803] 2025-04-26 20:39:10,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3569 3528 [WARNING|trainer.py:803] 2025-04-26 20:39:11,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:11,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4072 [WARNING|trainer.py:803] 2025-04-26 20:39:11,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3570 3529 [WARNING|trainer.py:803] 2025-04-26 20:39:12,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:39:12,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4073 [WARNING|trainer.py:803] 2025-04-26 20:39:13,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3571 3530 [WARNING|trainer.py:803] 2025-04-26 20:39:13,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:14,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4074 [WARNING|trainer.py:803] 2025-04-26 20:39:14,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [mov,mp4,m4a,3gp,3g2,mj2 @ 0x3b860a80] moov atom not found [20:39:14] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, Invalid data found when processing input Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... 
sharegpt4v_instruct_gpt4-vision_cap100k Traceback (most recent call last): File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__ ret = self.video_get_item(data_item) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item sampled_video = self.load_video_fast(video_path) File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast video_reader = VideoReader(video_path) File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__ raise RuntimeError("Error reading " + uri + "...") RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4... Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Latte/10231.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k 3572 3531 [WARNING|trainer.py:803] 2025-04-26 20:39:15,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4075 [WARNING|trainer.py:803] 2025-04-26 20:39:15,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:15,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3573 3532 [WARNING|trainer.py:803] 2025-04-26 20:39:16,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:16,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4076 [WARNING|trainer.py:803] 2025-04-26 20:39:17,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3574 3533 [WARNING|trainer.py:803] 2025-04-26 20:39:17,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:18,184 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:18,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4077 3575 3534 [WARNING|trainer.py:803] 2025-04-26 20:39:19,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:19,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:19,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4078 3576 3535 [WARNING|trainer.py:803] 2025-04-26 20:39:20,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:20,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:20,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4079 3577 3536 [WARNING|trainer.py:803] 2025-04-26 20:39:21,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:22,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:22,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4080 3578 3537 [WARNING|trainer.py:803] 2025-04-26 20:39:23,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:23,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:23,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4081 3579 3538 [WARNING|trainer.py:803] 2025-04-26 20:39:24,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:24,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:24,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4082 3580 3539 [WARNING|trainer.py:803] 2025-04-26 20:39:25,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:25,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:26,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4083 3581 3540 [WARNING|trainer.py:803] 2025-04-26 20:39:27,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:27,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:27,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4084 3582 3541 [WARNING|trainer.py:803] 2025-04-26 20:39:28,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:28,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:28,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4085 3583 3542 [WARNING|trainer.py:803] 2025-04-26 20:39:29,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:29,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:29,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4086 3584 3543 [WARNING|trainer.py:803] 2025-04-26 20:39:31,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:31,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:31,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3585 3544 4087 [WARNING|trainer.py:803] 2025-04-26 20:39:32,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:32,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:32,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3586 3545 4088 [WARNING|trainer.py:803] 2025-04-26 20:39:33,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:33,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:34,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3587 3546 4089 [WARNING|trainer.py:803] 2025-04-26 20:39:35,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:35,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:35,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3588 3547 4090 [WARNING|trainer.py:803] 2025-04-26 20:39:36,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:36,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:36,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3589 3548 4091 [WARNING|trainer.py:803] 2025-04-26 20:39:37,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:37,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:38,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3549 3590 4092 [WARNING|trainer.py:803] 2025-04-26 20:39:39,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:39,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:39,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3550 3591 4093 [WARNING|trainer.py:803] 2025-04-26 20:39:40,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:40,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:40,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3551 3592 4094 [WARNING|trainer.py:803] 2025-04-26 20:39:41,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:41,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:42,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3552 3593 4095 [WARNING|trainer.py:803] 2025-04-26 20:39:42,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:42,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3553 3594 [WARNING|trainer.py:803] 2025-04-26 20:39:43,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4096 [WARNING|trainer.py:803] 2025-04-26 20:39:44,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:44,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3554 3595 [WARNING|trainer.py:803] 2025-04-26 20:39:44,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4097 [WARNING|trainer.py:803] 2025-04-26 20:39:45,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:45,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3555 3596 [WARNING|trainer.py:803] 2025-04-26 20:39:46,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:39:46,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4098 [WARNING|trainer.py:803] 2025-04-26 20:39:46,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3556 3597 [WARNING|trainer.py:803] 2025-04-26 20:39:47,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:39:48,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4099 [WARNING|trainer.py:803] 2025-04-26 20:39:48,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3557 3598 [WARNING|trainer.py:803] 2025-04-26 20:39:48,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:49,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:49,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4100 3558 3599 [WARNING|trainer.py:803] 2025-04-26 20:39:50,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:50,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:50,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4101 3559 3600 [WARNING|trainer.py:803] 2025-04-26 20:39:51,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:51,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:52,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4102 3560 3601 [WARNING|trainer.py:803] 2025-04-26 20:39:52,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:53,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:53,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4103 3561 3602 [WARNING|trainer.py:803] 2025-04-26 20:39:54,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:39:54,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:39:54,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4104 3562 3603 [WARNING|trainer.py:803] 2025-04-26 20:39:55,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:55,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4105 [WARNING|trainer.py:803] 2025-04-26 20:39:56,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3563 3604 [WARNING|trainer.py:803] 2025-04-26 20:39:56,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:57,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4106 [WARNING|trainer.py:803] 2025-04-26 20:39:57,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3564 3605 [WARNING|trainer.py:803] 2025-04-26 20:39:58,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:58,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4107 [WARNING|trainer.py:803] 2025-04-26 20:39:58,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3565 3606 [WARNING|trainer.py:803] 2025-04-26 20:39:59,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:39:59,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4108 [WARNING|trainer.py:803] 2025-04-26 20:40:00,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3566 3607 [WARNING|trainer.py:803] 2025-04-26 20:40:00,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:01,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4109 [WARNING|trainer.py:803] 2025-04-26 20:40:01,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3567 3608 [WARNING|trainer.py:803] 2025-04-26 20:40:02,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:02,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4110 [WARNING|trainer.py:803] 2025-04-26 20:40:02,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3568 3609 [WARNING|trainer.py:803] 2025-04-26 20:40:03,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:03,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4111 [WARNING|trainer.py:803] 2025-04-26 20:40:03,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3569 3610 [WARNING|trainer.py:803] 2025-04-26 20:40:04,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:04,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4112 [WARNING|trainer.py:803] 2025-04-26 20:40:05,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3570 3611 [WARNING|trainer.py:803] 2025-04-26 20:40:05,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:06,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4113 3571 [WARNING|trainer.py:803] 2025-04-26 20:40:06,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3612 [WARNING|trainer.py:803] 2025-04-26 20:40:07,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:07,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4114 3572 [WARNING|trainer.py:803] 2025-04-26 20:40:07,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3613 [WARNING|trainer.py:803] 2025-04-26 20:40:08,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:08,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4115 3573 [WARNING|trainer.py:803] 2025-04-26 20:40:09,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3614 [WARNING|trainer.py:803] 2025-04-26 20:40:09,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:10,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4116 3574 [WARNING|trainer.py:803] 2025-04-26 20:40:10,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3615 [WARNING|trainer.py:803] 2025-04-26 20:40:11,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:11,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4117 3575 [WARNING|trainer.py:803] 2025-04-26 20:40:11,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:12,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3616 [WARNING|trainer.py:803] 2025-04-26 20:40:12,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4118 3576 [WARNING|trainer.py:803] 2025-04-26 20:40:13,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:13,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3617 [WARNING|trainer.py:803] 2025-04-26 20:40:13,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4119 3577 [WARNING|trainer.py:803] 2025-04-26 20:40:14,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:15,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3618 [WARNING|trainer.py:803] 2025-04-26 20:40:15,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3578 4120 [WARNING|trainer.py:803] 2025-04-26 20:40:15,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3619 [WARNING|trainer.py:803] 2025-04-26 20:40:16,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:16,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4121 3579 [WARNING|trainer.py:803] 2025-04-26 20:40:17,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:17,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3620 [WARNING|trainer.py:803] 2025-04-26 20:40:17,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4122 3580 [WARNING|trainer.py:803] 2025-04-26 20:40:18,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3621 [WARNING|trainer.py:803] 2025-04-26 20:40:19,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:19,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3581 4123 [WARNING|trainer.py:803] 2025-04-26 20:40:19,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3622 [WARNING|trainer.py:803] 2025-04-26 20:40:20,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:20,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3582 4124 [WARNING|trainer.py:803] 2025-04-26 20:40:21,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3623 [WARNING|trainer.py:803] 2025-04-26 20:40:21,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:21,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4125 3583 [WARNING|trainer.py:803] 2025-04-26 20:40:22,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:22,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:23,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3624 4126 3584 [WARNING|trainer.py:803] 2025-04-26 20:40:23,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:24,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:24,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3625 4127 3585 [WARNING|trainer.py:803] 2025-04-26 20:40:25,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:25,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:25,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3626 4128 3586 [WARNING|trainer.py:803] 2025-04-26 20:40:26,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:40:26,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:26,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3627 4129 3587 [WARNING|trainer.py:803] 2025-04-26 20:40:27,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:28,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:28,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3628 4130 3588 [WARNING|trainer.py:803] 2025-04-26 20:40:29,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:29,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:29,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4131 3629 3589 [WARNING|trainer.py:803] 2025-04-26 20:40:30,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:30,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:30,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4132 3630 3590 [WARNING|trainer.py:803] 2025-04-26 20:40:31,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:32,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4133 [WARNING|trainer.py:803] 2025-04-26 20:40:32,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3631 3591 [WARNING|trainer.py:803] 2025-04-26 20:40:33,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:33,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4134 [WARNING|trainer.py:803] 2025-04-26 20:40:33,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3632 3592 [WARNING|trainer.py:803] 2025-04-26 20:40:34,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:34,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4135 [WARNING|trainer.py:803] 2025-04-26 20:40:34,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3633 3593 [WARNING|trainer.py:803] 2025-04-26 20:40:35,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:36,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4136 [WARNING|trainer.py:803] 2025-04-26 20:40:36,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3634 3594 [WARNING|trainer.py:803] 2025-04-26 20:40:36,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:37,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4137 [WARNING|trainer.py:803] 2025-04-26 20:40:37,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3635 3595 [WARNING|trainer.py:803] 2025-04-26 20:40:38,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:38,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4138 [WARNING|trainer.py:803] 2025-04-26 20:40:38,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3636 3596 [WARNING|trainer.py:803] 2025-04-26 20:40:39,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4139 [WARNING|trainer.py:803] 2025-04-26 20:40:39,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:40,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3597 3637 [WARNING|trainer.py:803] 2025-04-26 20:40:40,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4140 [WARNING|trainer.py:803] 2025-04-26 20:40:41,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:41,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3598 3638 [WARNING|trainer.py:803] 2025-04-26 20:40:42,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4141 [WARNING|trainer.py:803] 2025-04-26 20:40:42,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:42,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3599 [WARNING|trainer.py:803] 2025-04-26 20:40:43,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3639 4142 [WARNING|trainer.py:803] 2025-04-26 20:40:44,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:44,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3600 [WARNING|trainer.py:803] 2025-04-26 20:40:44,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3640 4143 [WARNING|trainer.py:803] 2025-04-26 20:40:45,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:45,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3601 [WARNING|trainer.py:803] 2025-04-26 20:40:45,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3641 4144 [WARNING|trainer.py:803] 2025-04-26 20:40:46,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:46,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3602 [WARNING|trainer.py:803] 2025-04-26 20:40:47,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3642 4145 [WARNING|trainer.py:803] 2025-04-26 20:40:47,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:40:48,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:48,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3603 3643 4146 [WARNING|trainer.py:803] 2025-04-26 20:40:49,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:49,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3604 [WARNING|trainer.py:803] 2025-04-26 20:40:49,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3644 4147 [WARNING|trainer.py:803] 2025-04-26 20:40:50,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:50,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3605 [WARNING|trainer.py:803] 2025-04-26 20:40:51,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3645 4148 [WARNING|trainer.py:803] 2025-04-26 20:40:51,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:52,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3606 [WARNING|trainer.py:803] 2025-04-26 20:40:52,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3646 4149 [WARNING|trainer.py:803] 2025-04-26 20:40:53,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3607 [WARNING|trainer.py:803] 2025-04-26 20:40:53,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:53,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3647 4150 [WARNING|trainer.py:803] 2025-04-26 20:40:54,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3608 [WARNING|trainer.py:803] 2025-04-26 20:40:55,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:55,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4151 3648 [WARNING|trainer.py:803] 2025-04-26 20:40:55,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:40:56,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3609 [WARNING|trainer.py:803] 2025-04-26 20:40:56,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4152 3649 [WARNING|trainer.py:803] 2025-04-26 20:40:57,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:57,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3610 [WARNING|trainer.py:803] 2025-04-26 20:40:57,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4153 3650 [WARNING|trainer.py:803] 2025-04-26 20:40:58,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:40:58,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3611 [WARNING|trainer.py:803] 2025-04-26 20:40:59,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4154 3651 [WARNING|trainer.py:803] 2025-04-26 20:40:59,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:40:59,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3612 4155 [WARNING|trainer.py:803] 2025-04-26 20:41:00,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3652 [WARNING|trainer.py:803] 2025-04-26 20:41:01,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:01,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4156 3613 [WARNING|trainer.py:803] 2025-04-26 20:41:01,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3653 [WARNING|trainer.py:803] 2025-04-26 20:41:02,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:02,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4157 3614 [WARNING|trainer.py:803] 2025-04-26 20:41:03,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3654 [WARNING|trainer.py:803] 2025-04-26 20:41:03,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:03,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4158 3615 [WARNING|trainer.py:803] 2025-04-26 20:41:04,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3655 [WARNING|trainer.py:803] 2025-04-26 20:41:05,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:05,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4159 3616 [WARNING|trainer.py:803] 2025-04-26 20:41:05,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3656 [WARNING|trainer.py:803] 2025-04-26 20:41:06,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:06,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4160 3617 [WARNING|trainer.py:803] 2025-04-26 20:41:07,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:07,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3657 4161 [WARNING|trainer.py:803] 2025-04-26 20:41:08,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:08,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3618 [WARNING|trainer.py:803] 2025-04-26 20:41:08,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3658 4162 [WARNING|trainer.py:803] 2025-04-26 20:41:09,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:09,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3619 [WARNING|trainer.py:803] 2025-04-26 20:41:10,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3659 [WARNING|trainer.py:803] 2025-04-26 20:41:10,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4163 [WARNING|trainer.py:803] 2025-04-26 20:41:11,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3620 [WARNING|trainer.py:803] 2025-04-26 20:41:11,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3660 4164 [WARNING|trainer.py:803] 2025-04-26 20:41:12,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:12,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3621 [WARNING|trainer.py:803] 2025-04-26 20:41:12,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3661 4165 [WARNING|trainer.py:803] 2025-04-26 20:41:13,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:13,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3622 [WARNING|trainer.py:803] 2025-04-26 20:41:14,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3662 4166 [WARNING|trainer.py:803] 2025-04-26 20:41:14,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3623 [WARNING|trainer.py:803] 2025-04-26 20:41:15,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:15,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3663 4167 [WARNING|trainer.py:803] 2025-04-26 20:41:16,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:16,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3624 [WARNING|trainer.py:803] 2025-04-26 20:41:16,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3664 4168 [WARNING|trainer.py:803] 2025-04-26 20:41:17,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:17,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3625 [WARNING|trainer.py:803] 2025-04-26 20:41:18,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3665 4169 [WARNING|trainer.py:803] 2025-04-26 20:41:18,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:19,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:19,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3626 3666 4170 [WARNING|trainer.py:803] 2025-04-26 20:41:20,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:41:20,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:20,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3627 3667 4171 [WARNING|trainer.py:803] 2025-04-26 20:41:21,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:41:22,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:22,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3628 4172 3668 [WARNING|trainer.py:803] 2025-04-26 20:41:23,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:23,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:23,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3629 4173 3669 [WARNING|trainer.py:803] 2025-04-26 20:41:24,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:41:24,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3630 4174 3670 [WARNING|trainer.py:803] 2025-04-26 20:41:25,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:41:26,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:26,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3631 4175 3671 [WARNING|trainer.py:803] 2025-04-26 20:41:27,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:27,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:27,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3632 4176 3672 [WARNING|trainer.py:803] 2025-04-26 20:41:28,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:28,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:29,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4177 3633 3673 [WARNING|trainer.py:803] 2025-04-26 20:41:29,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:29,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4178 [WARNING|trainer.py:803] 2025-04-26 20:41:30,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3634 3674 [WARNING|trainer.py:803] 2025-04-26 20:41:31,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:31,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4179 [WARNING|trainer.py:803] 2025-04-26 20:41:31,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3635 3675 [WARNING|trainer.py:803] 2025-04-26 20:41:32,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:32,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4180 [WARNING|trainer.py:803] 2025-04-26 20:41:33,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3636 [WARNING|trainer.py:803] 2025-04-26 20:41:33,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3676 [WARNING|trainer.py:803] 2025-04-26 20:41:33,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4181 [WARNING|trainer.py:803] 2025-04-26 20:41:34,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3637 3677 [WARNING|trainer.py:803] 2025-04-26 20:41:34,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:35,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4182 [WARNING|trainer.py:803] 2025-04-26 20:41:35,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3638 3678 [WARNING|trainer.py:803] 2025-04-26 20:41:36,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4183 [WARNING|trainer.py:803] 2025-04-26 20:41:36,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:36,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3639 [WARNING|trainer.py:803] 2025-04-26 20:41:37,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3679 4184 [WARNING|trainer.py:803] 2025-04-26 20:41:38,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:41:38,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3640 [WARNING|trainer.py:803] 2025-04-26 20:41:38,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3680 4185 [WARNING|trainer.py:803] 2025-04-26 20:41:39,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:39,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3641 [WARNING|trainer.py:803] 2025-04-26 20:41:40,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3681 4186 [WARNING|trainer.py:803] 2025-04-26 20:41:40,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:41,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3642 [WARNING|trainer.py:803] 2025-04-26 20:41:41,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3682 4187 [WARNING|trainer.py:803] 2025-04-26 20:41:42,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:42,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3643 [WARNING|trainer.py:803] 2025-04-26 20:41:42,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3683 4188 [WARNING|trainer.py:803] 2025-04-26 20:41:43,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:43,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:44,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3644 4189 3684 [WARNING|trainer.py:803] 2025-04-26 20:41:45,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:45,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:45,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3645 3685 4190 [WARNING|trainer.py:803] 2025-04-26 20:41:46,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:46,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3646 [WARNING|trainer.py:803] 2025-04-26 20:41:46,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4191 3686 [WARNING|trainer.py:803] 2025-04-26 20:41:47,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:48,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:48,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3647 4192 3687 [WARNING|trainer.py:803] 2025-04-26 20:41:49,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:49,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:49,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3648 4193 3688 [WARNING|trainer.py:803] 2025-04-26 20:41:50,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:50,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:50,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3649 4194 3689 [WARNING|trainer.py:803] 2025-04-26 20:41:51,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:52,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:52,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4195 3650 3690 [WARNING|trainer.py:803] 2025-04-26 20:41:53,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:53,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:41:53,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4196 3651 3691 [WARNING|trainer.py:803] 2025-04-26 20:41:54,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:54,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:54,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4197 3652 3692 [WARNING|trainer.py:803] 2025-04-26 20:41:55,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:56,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:56,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4198 3653 3693 [WARNING|trainer.py:803] 2025-04-26 20:41:57,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:57,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:57,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4199 3654 3694 [WARNING|trainer.py:803] 2025-04-26 20:41:58,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:41:58,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:41:58,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4200 3655 3695 [WARNING|trainer.py:803] 2025-04-26 20:41:59,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4201 [WARNING|trainer.py:803] 2025-04-26 20:42:00,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:00,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3656 3696 [WARNING|trainer.py:803] 2025-04-26 20:42:00,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4202 [WARNING|trainer.py:803] 2025-04-26 20:42:01,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:01,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:02,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3657 3697 4203 [WARNING|trainer.py:803] 2025-04-26 20:42:02,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:03,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:03,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4204 3658 3698 [WARNING|trainer.py:803] 2025-04-26 20:42:04,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:04,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4205 [WARNING|trainer.py:803] 2025-04-26 20:42:04,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3659 3699 [WARNING|trainer.py:803] 2025-04-26 20:42:05,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4206 [WARNING|trainer.py:803] 2025-04-26 20:42:05,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:05,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3660 3700 [WARNING|trainer.py:803] 2025-04-26 20:42:06,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4207 [WARNING|trainer.py:803] 2025-04-26 20:42:06,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:07,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:07,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4208 3661 3701 [WARNING|trainer.py:803] 2025-04-26 20:42:08,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:08,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:42:08,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4209 3662 3702 [WARNING|trainer.py:803] 2025-04-26 20:42:09,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4210 [WARNING|trainer.py:803] 2025-04-26 20:42:09,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:09,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3663 3703 [WARNING|trainer.py:803] 2025-04-26 20:42:10,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4211 [WARNING|trainer.py:803] 2025-04-26 20:42:11,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:11,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:11,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3664 4212 3704 [WARNING|trainer.py:803] 2025-04-26 20:42:12,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:12,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:12,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4213 3665 3705 [WARNING|trainer.py:803] 2025-04-26 20:42:13,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:13,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4214 [WARNING|trainer.py:803] 2025-04-26 20:42:14,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3666 [WARNING|trainer.py:803] 2025-04-26 20:42:14,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3706 4215 [WARNING|trainer.py:803] 2025-04-26 20:42:15,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:15,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3667 [WARNING|trainer.py:803] 2025-04-26 20:42:15,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4216 3707 [WARNING|trainer.py:803] 2025-04-26 20:42:16,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:16,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:17,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4217 3668 3708 [WARNING|trainer.py:803] 2025-04-26 20:42:18,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:18,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4218 [WARNING|trainer.py:803] 2025-04-26 20:42:18,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3669 3709 [WARNING|trainer.py:803] 2025-04-26 20:42:19,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4219 [WARNING|trainer.py:803] 2025-04-26 20:42:19,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:19,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3670 [WARNING|trainer.py:803] 2025-04-26 20:42:20,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4220 3710 [WARNING|trainer.py:803] 2025-04-26 20:42:20,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:21,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3671 4221 [WARNING|trainer.py:803] 2025-04-26 20:42:21,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3711 [WARNING|trainer.py:803] 2025-04-26 20:42:22,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:42:22,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4222 3672 [WARNING|trainer.py:803] 2025-04-26 20:42:22,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:23,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3712 4223 [WARNING|trainer.py:803] 2025-04-26 20:42:23,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3673 [WARNING|trainer.py:803] 2025-04-26 20:42:24,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:42:24,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4224 3713 [WARNING|trainer.py:803] 2025-04-26 20:42:24,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3674 [WARNING|trainer.py:803] 2025-04-26 20:42:25,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:25,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4225 [WARNING|trainer.py:803] 2025-04-26 20:42:26,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3714 [WARNING|trainer.py:803] 2025-04-26 20:42:26,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4226 3675 [WARNING|trainer.py:803] 2025-04-26 20:42:26,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:27,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3715 [WARNING|trainer.py:803] 2025-04-26 20:42:27,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4227 3676 [WARNING|trainer.py:803] 2025-04-26 20:42:28,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:28,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4228 3716 [WARNING|trainer.py:803] 2025-04-26 20:42:28,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3677 [WARNING|trainer.py:803] 2025-04-26 20:42:29,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:29,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4229 [WARNING|trainer.py:803] 2025-04-26 20:42:30,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3717 [WARNING|trainer.py:803] 2025-04-26 20:42:30,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3678 4230 [WARNING|trainer.py:803] 2025-04-26 20:42:31,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:42:31,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3718 [WARNING|trainer.py:803] 2025-04-26 20:42:31,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4231 3679 [WARNING|trainer.py:803] 2025-04-26 20:42:32,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:42:32,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:32,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4232 3719 3680 [WARNING|trainer.py:803] 2025-04-26 20:42:34,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:34,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4233 [WARNING|trainer.py:803] 2025-04-26 20:42:34,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3720 3681 [WARNING|trainer.py:803] 2025-04-26 20:42:35,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4234 [WARNING|trainer.py:803] 2025-04-26 20:42:35,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:35,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:36,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3721 3682 4235 [WARNING|trainer.py:803] 2025-04-26 20:42:37,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:37,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 20:42:37,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4236 3722 3683 [WARNING|trainer.py:803] 2025-04-26 20:42:38,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:38,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4237 [WARNING|trainer.py:803] 2025-04-26 20:42:38,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3723 3684 [WARNING|trainer.py:803] 2025-04-26 20:42:39,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4238 [WARNING|trainer.py:803] 2025-04-26 20:42:39,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:40,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:40,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3724 4239 3685 [WARNING|trainer.py:803] 2025-04-26 20:42:41,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:41,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:41,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4240 3725 3686 [WARNING|trainer.py:803] 2025-04-26 20:42:42,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:42,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4241 [WARNING|trainer.py:803] 2025-04-26 20:42:42,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3726 3687 [WARNING|trainer.py:803] 2025-04-26 20:42:43,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4242 [WARNING|trainer.py:803] 2025-04-26 20:42:44,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:44,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3727 [WARNING|trainer.py:803] 2025-04-26 20:42:44,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3688 4243 [WARNING|trainer.py:803] 2025-04-26 20:42:45,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:45,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:42:45,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4244 3728 3689 [WARNING|trainer.py:803] 2025-04-26 20:42:46,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:46,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:46,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4245 3729 3690 [WARNING|trainer.py:803] 2025-04-26 20:42:47,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4246 [WARNING|trainer.py:803] 2025-04-26 20:42:48,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:42:48,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3730 3691 [WARNING|trainer.py:803] 2025-04-26 20:42:48,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4247 [WARNING|trainer.py:803] 2025-04-26 20:42:49,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:42:49,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:49,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3731 4248 3692 [WARNING|trainer.py:803] 2025-04-26 20:42:50,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:42:51,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:51,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4249 3732 3693 [WARNING|trainer.py:803] 2025-04-26 20:42:52,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:52,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4250 [WARNING|trainer.py:803] 2025-04-26 20:42:52,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3733 3694 [WARNING|trainer.py:803] 2025-04-26 20:42:53,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4251 [WARNING|trainer.py:803] 2025-04-26 20:42:53,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:53,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3734 [WARNING|trainer.py:803] 2025-04-26 20:42:54,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4252 3695 [WARNING|trainer.py:803] 2025-04-26 20:42:54,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:55,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:55,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4253 3735 3696 [WARNING|trainer.py:803] 2025-04-26 20:42:56,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:42:56,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4254 [WARNING|trainer.py:803] 2025-04-26 20:42:56,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3736 3697 [WARNING|trainer.py:803] 2025-04-26 20:42:57,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4255 [WARNING|trainer.py:803] 2025-04-26 20:42:57,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:58,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3737 [WARNING|trainer.py:803] 2025-04-26 20:42:58,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3698 4256 [WARNING|trainer.py:803] 2025-04-26 20:42:59,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:42:59,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:42:59,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3738 4257 3699 [WARNING|trainer.py:803] 2025-04-26 20:43:00,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:00,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:00,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4258 3739 3700 [WARNING|trainer.py:803] 2025-04-26 20:43:01,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4259 [WARNING|trainer.py:803] 2025-04-26 20:43:01,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:02,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3740 3701 [WARNING|trainer.py:803] 2025-04-26 20:43:02,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4260 [WARNING|trainer.py:803] 2025-04-26 20:43:03,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:03,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:43:03,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3741 4261 3702 [WARNING|trainer.py:803] 2025-04-26 20:43:04,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:43:04,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:04,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4262 3742 3703 [WARNING|trainer.py:803] 2025-04-26 20:43:05,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:06,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4263 [WARNING|trainer.py:803] 2025-04-26 20:43:06,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3743 3704 [WARNING|trainer.py:803] 2025-04-26 20:43:07,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4264 [WARNING|trainer.py:803] 2025-04-26 20:43:07,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:07,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:08,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3744 3705 4265 [WARNING|trainer.py:803] 2025-04-26 20:43:09,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:09,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:09,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4266 3745 3706 [WARNING|trainer.py:803] 2025-04-26 20:43:10,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:10,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4267 [WARNING|trainer.py:803] 2025-04-26 20:43:10,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3746 3707 [WARNING|trainer.py:803] 2025-04-26 20:43:11,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4268 [WARNING|trainer.py:803] 2025-04-26 20:43:11,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:12,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3747 [WARNING|trainer.py:803] 2025-04-26 20:43:12,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4269 3708 [WARNING|trainer.py:803] 2025-04-26 20:43:13,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:13,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:13,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4270 3748 3709 [WARNING|trainer.py:803] 2025-04-26 20:43:14,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:14,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4271 [WARNING|trainer.py:803] 2025-04-26 20:43:14,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3749 [WARNING|trainer.py:803] 2025-04-26 20:43:15,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3710 4272 [WARNING|trainer.py:803] 2025-04-26 20:43:15,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:16,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3750 [WARNING|trainer.py:803] 2025-04-26 20:43:16,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4273 3711 [WARNING|trainer.py:803] 2025-04-26 20:43:17,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:17,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3751 [WARNING|trainer.py:803] 2025-04-26 20:43:17,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4274 3712 [WARNING|trainer.py:803] 2025-04-26 20:43:18,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:18,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4275 [WARNING|trainer.py:803] 2025-04-26 20:43:19,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3752 3713 [WARNING|trainer.py:803] 2025-04-26 20:43:19,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4276 [WARNING|trainer.py:803] 2025-04-26 20:43:20,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:20,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3753 [WARNING|trainer.py:803] 2025-04-26 20:43:20,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3714 4277 [WARNING|trainer.py:803] 2025-04-26 20:43:21,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:21,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:21,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3754 4278 3715 [WARNING|trainer.py:803] 2025-04-26 20:43:22,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:23,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4279 3755 [WARNING|trainer.py:803] 2025-04-26 20:43:23,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3716 [WARNING|trainer.py:803] 2025-04-26 20:43:24,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:24,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4280 3756 [WARNING|trainer.py:803] 2025-04-26 20:43:24,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:25,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3717 4281 [WARNING|trainer.py:803] 2025-04-26 20:43:25,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3757 [WARNING|trainer.py:803] 2025-04-26 20:43:26,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:26,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 4282 [WARNING|trainer.py:803] 2025-04-26 20:43:26,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3718 [WARNING|trainer.py:803] 2025-04-26 20:43:27,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3758 4283 [WARNING|trainer.py:803] 2025-04-26 20:43:27,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:43:28,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3719 [WARNING|trainer.py:803] 2025-04-26 20:43:28,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4284 3759 [WARNING|trainer.py:803] 2025-04-26 20:43:29,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:29,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:29,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4285 3720 3760 [WARNING|trainer.py:803] 2025-04-26 20:43:30,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:30,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4286 [WARNING|trainer.py:803] 2025-04-26 20:43:31,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3721 [WARNING|trainer.py:803] 2025-04-26 20:43:31,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3761 4287 [WARNING|trainer.py:803] 2025-04-26 20:43:32,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:32,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:32,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3722 4288 3762 [WARNING|trainer.py:803] 2025-04-26 20:43:33,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:33,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:33,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4289 3723 3763 [WARNING|trainer.py:803] 2025-04-26 20:43:34,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4290 [WARNING|trainer.py:803] 2025-04-26 20:43:34,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:35,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3724 [WARNING|trainer.py:803] 2025-04-26 20:43:35,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3764 4291 [WARNING|trainer.py:803] 2025-04-26 20:43:36,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:36,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:36,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3725 4292 3765 [WARNING|trainer.py:803] 2025-04-26 20:43:37,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:37,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:38,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4293 3726 3766 [WARNING|trainer.py:803] 2025-04-26 20:43:38,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:39,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4294 [WARNING|trainer.py:803] 2025-04-26 20:43:39,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3727 3767 [WARNING|trainer.py:803] 2025-04-26 20:43:40,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4295 [WARNING|trainer.py:803] 2025-04-26 20:43:40,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:40,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3728 [WARNING|trainer.py:803] 2025-04-26 20:43:41,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3768 4296 [WARNING|trainer.py:803] 2025-04-26 20:43:41,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:42,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:42,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4297 3729 3769 [WARNING|trainer.py:803] 2025-04-26 20:43:43,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:43,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:43,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4298 3730 3770 [WARNING|trainer.py:803] 2025-04-26 20:43:44,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4299 [WARNING|trainer.py:803] 2025-04-26 20:43:44,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:44,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 3731 3771 [WARNING|trainer.py:803] 2025-04-26 20:43:45,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4300 [WARNING|trainer.py:803] 2025-04-26 20:43:46,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:46,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:46,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3732 3772 4301 [WARNING|trainer.py:803] 2025-04-26 20:43:47,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:47,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:43:47,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4302 3773 3733 [WARNING|trainer.py:803] 2025-04-26 20:43:48,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:48,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:48,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4303 3734 3774 [WARNING|trainer.py:803] 2025-04-26 20:43:49,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4304 [WARNING|trainer.py:803] 2025-04-26 20:43:50,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:50,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3775 3735 [WARNING|trainer.py:803] 2025-04-26 20:43:50,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4305 [WARNING|trainer.py:803] 2025-04-26 20:43:51,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:43:51,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:52,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3736 3776 4306 [WARNING|trainer.py:803] 2025-04-26 20:43:53,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:53,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:53,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4307 3777 3737 [WARNING|trainer.py:803] 2025-04-26 20:43:54,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:54,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4308 [WARNING|trainer.py:803] 2025-04-26 20:43:54,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3778 3738 [WARNING|trainer.py:803] 2025-04-26 20:43:55,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4309 [WARNING|trainer.py:803] 2025-04-26 20:43:55,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:55,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3779 [WARNING|trainer.py:803] 2025-04-26 20:43:56,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3739 4310 [WARNING|trainer.py:803] 2025-04-26 20:43:57,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:43:57,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:57,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4311 3780 3740 [WARNING|trainer.py:803] 2025-04-26 20:43:58,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:43:58,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:43:58,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4312 3781 3741 [WARNING|trainer.py:803] 2025-04-26 20:43:59,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4313 [WARNING|trainer.py:803] 2025-04-26 20:43:59,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:00,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3782 [WARNING|trainer.py:803] 2025-04-26 20:44:00,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3742 4314 [WARNING|trainer.py:803] 2025-04-26 20:44:01,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:01,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:01,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4315 3783 3743 [WARNING|trainer.py:803] 2025-04-26 20:44:02,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:02,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:02,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4316 3784 3744 [WARNING|trainer.py:803] 2025-04-26 20:44:03,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4317 [WARNING|trainer.py:803] 2025-04-26 20:44:04,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:04,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3785 [WARNING|trainer.py:803] 2025-04-26 20:44:04,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3745 4318 [WARNING|trainer.py:803] 2025-04-26 20:44:05,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:05,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:06,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3786 4319 3746 [WARNING|trainer.py:803] 2025-04-26 20:44:07,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:07,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:07,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4320 3787 3747 [WARNING|trainer.py:803] 2025-04-26 20:44:08,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:08,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4321 [WARNING|trainer.py:803] 2025-04-26 20:44:08,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3748 3788 [WARNING|trainer.py:803] 2025-04-26 20:44:09,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4322 [WARNING|trainer.py:803] 2025-04-26 20:44:09,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:09,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:10,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3749 3789 4323 [WARNING|trainer.py:803] 2025-04-26 20:44:11,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:11,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:11,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4324 3790 3750 [WARNING|trainer.py:803] 2025-04-26 20:44:12,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:12,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:12,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4325 3791 3751 [WARNING|trainer.py:803] 2025-04-26 20:44:13,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4326 [WARNING|trainer.py:803] 2025-04-26 20:44:13,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:44:14,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3792 [WARNING|trainer.py:803] 2025-04-26 20:44:14,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3752 4327 [WARNING|trainer.py:803] 2025-04-26 20:44:15,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:15,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:15,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3793 4328 3753 [WARNING|trainer.py:803] 2025-04-26 20:44:16,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:16,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:16,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4329 3794 3754 [WARNING|trainer.py:803] 2025-04-26 20:44:17,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:18,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4330 [WARNING|trainer.py:803] 2025-04-26 20:44:18,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3795 3755 [WARNING|trainer.py:803] 2025-04-26 20:44:18,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4331 [WARNING|trainer.py:803] 2025-04-26 20:44:19,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:19,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3756 3796 [WARNING|trainer.py:803] 2025-04-26 20:44:20,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4332 [WARNING|trainer.py:803] 2025-04-26 20:44:20,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:20,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:21,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 3757 4333 3797 [WARNING|trainer.py:803] 2025-04-26 20:44:22,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:44:22,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:22,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4334 3798 3758 [WARNING|trainer.py:803] 2025-04-26 20:44:23,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4335 [WARNING|trainer.py:803] 2025-04-26 20:44:23,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:44:23,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3799 3759 [WARNING|trainer.py:803] 2025-04-26 20:44:24,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4336 [WARNING|trainer.py:803] 2025-04-26 20:44:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:25,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:25,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3800 4337 3760 [WARNING|trainer.py:803] 2025-04-26 20:44:26,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:26,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:26,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4338 3801 3761 [WARNING|trainer.py:803] 2025-04-26 20:44:27,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:27,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4339 [WARNING|trainer.py:803] 2025-04-26 20:44:27,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3802 3762 [WARNING|trainer.py:803] 2025-04-26 20:44:28,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4340 [WARNING|trainer.py:803] 2025-04-26 20:44:29,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:29,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3803 [WARNING|trainer.py:803] 2025-04-26 20:44:29,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4341 3763 [WARNING|trainer.py:803] 2025-04-26 20:44:30,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:30,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:30,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3804 4342 3764 [WARNING|trainer.py:803] 2025-04-26 20:44:31,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:31,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4343 3805 [WARNING|trainer.py:803] 2025-04-26 20:44:32,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3765 [WARNING|trainer.py:803] 2025-04-26 20:44:32,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:33,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4344 3806 [WARNING|trainer.py:803] 2025-04-26 20:44:33,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:33,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3766 4345 [WARNING|trainer.py:803] 2025-04-26 20:44:34,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3807 [WARNING|trainer.py:803] 2025-04-26 20:44:34,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:35,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4346 3767 [WARNING|trainer.py:803] 2025-04-26 20:44:35,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:36,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3808 4347 [WARNING|trainer.py:803] 2025-04-26 20:44:36,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3768 [WARNING|trainer.py:803] 2025-04-26 20:44:36,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:37,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4348 3809 [WARNING|trainer.py:803] 2025-04-26 20:44:37,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:44:38,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3769 [WARNING|trainer.py:803] 2025-04-26 20:44:38,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4349 3810 [WARNING|trainer.py:803] 2025-04-26 20:44:39,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:39,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4350 3770 [WARNING|trainer.py:803] 2025-04-26 20:44:39,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3811 [WARNING|trainer.py:803] 2025-04-26 20:44:40,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:40,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4351 [WARNING|trainer.py:803] 2025-04-26 20:44:40,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3771 [WARNING|trainer.py:803] 2025-04-26 20:44:41,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3812 4352 [WARNING|trainer.py:803] 2025-04-26 20:44:41,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:42,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3772 [WARNING|trainer.py:803] 2025-04-26 20:44:42,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4353 3813 [WARNING|trainer.py:803] 2025-04-26 20:44:43,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:43,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:43,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4354 3773 3814 [WARNING|trainer.py:803] 2025-04-26 20:44:44,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:44,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4355 [WARNING|trainer.py:803] 2025-04-26 20:44:44,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3774 3815 [WARNING|trainer.py:803] 2025-04-26 20:44:45,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4356 [WARNING|trainer.py:803] 2025-04-26 20:44:45,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:44:46,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:46,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3775 4357 3816 [WARNING|trainer.py:803] 2025-04-26 20:44:47,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:47,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:44:47,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4358 3817 3776 [WARNING|trainer.py:803] 2025-04-26 20:44:48,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4359 [WARNING|trainer.py:803] 2025-04-26 20:44:48,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:48,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3818 3777 [WARNING|trainer.py:803] 2025-04-26 20:44:49,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4360 [WARNING|trainer.py:803] 2025-04-26 20:44:50,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:50,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:50,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3819 4361 3778 [WARNING|trainer.py:803] 2025-04-26 20:44:51,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:51,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:44:51,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4362 3820 3779 [WARNING|trainer.py:803] 2025-04-26 20:44:52,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4363 [WARNING|trainer.py:803] 2025-04-26 20:44:52,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:52,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3821 [WARNING|trainer.py:803] 2025-04-26 20:44:53,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3780 4364 [WARNING|trainer.py:803] 2025-04-26 20:44:54,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:54,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:54,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3822 4365 3781 [WARNING|trainer.py:803] 2025-04-26 20:44:55,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:55,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4366 [WARNING|trainer.py:803] 2025-04-26 20:44:55,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3823 3782 [WARNING|trainer.py:803] 2025-04-26 20:44:56,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4367 [WARNING|trainer.py:803] 2025-04-26 20:44:56,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3824 [WARNING|trainer.py:803] 2025-04-26 20:44:57,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:57,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4368 3783 [WARNING|trainer.py:803] 2025-04-26 20:44:58,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:44:58,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3825 [WARNING|trainer.py:803] 2025-04-26 20:44:58,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4369 3784 [WARNING|trainer.py:803] 2025-04-26 20:44:59,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:44:59,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4370 3826 [WARNING|trainer.py:803] 2025-04-26 20:45:00,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3785 [WARNING|trainer.py:803] 2025-04-26 20:45:00,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:00,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4371 3827 [WARNING|trainer.py:803] 2025-04-26 20:45:01,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:45:01,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4372 3786 [WARNING|trainer.py:803] 2025-04-26 20:45:01,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3828 [WARNING|trainer.py:803] 2025-04-26 20:45:02,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4373 [WARNING|trainer.py:803] 2025-04-26 20:45:02,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:03,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3787 [WARNING|trainer.py:803] 2025-04-26 20:45:03,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3829 4374 [WARNING|trainer.py:803] 2025-04-26 20:45:04,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:45:04,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:04,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4375 3788 3830 [WARNING|trainer.py:803] 2025-04-26 20:45:05,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:05,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4376 [WARNING|trainer.py:803] 2025-04-26 20:45:05,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3831 3789 [WARNING|trainer.py:803] 2025-04-26 20:45:06,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4377 [WARNING|trainer.py:803] 2025-04-26 20:45:07,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:07,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3832 3790 [WARNING|trainer.py:803] 2025-04-26 20:45:07,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4378 [WARNING|trainer.py:803] 2025-04-26 20:45:08,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:08,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:08,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4379 3833 3791 [WARNING|trainer.py:803] 2025-04-26 20:45:09,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:09,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:09,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4380 3834 3792 [WARNING|trainer.py:803] 2025-04-26 20:45:10,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4381 [WARNING|trainer.py:803] 2025-04-26 20:45:10,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:11,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3835 3793 [WARNING|trainer.py:803] 2025-04-26 20:45:11,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4382 [WARNING|trainer.py:803] 2025-04-26 20:45:12,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:12,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3836 [WARNING|trainer.py:803] 2025-04-26 20:45:12,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4383 3794 [WARNING|trainer.py:803] 2025-04-26 20:45:13,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:13,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4384 [WARNING|trainer.py:803] 2025-04-26 20:45:13,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3837 3795 [WARNING|trainer.py:803] 2025-04-26 20:45:14,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:14,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4385 [WARNING|trainer.py:803] 2025-04-26 20:45:15,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3838 [WARNING|trainer.py:803] 2025-04-26 20:45:15,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3796 4386 [WARNING|trainer.py:803] 2025-04-26 20:45:16,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3839 [WARNING|trainer.py:803] 2025-04-26 20:45:16,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:16,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4387 3797 [WARNING|trainer.py:803] 2025-04-26 20:45:17,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:45:17,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3840 4388 [WARNING|trainer.py:803] 2025-04-26 20:45:18,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:18,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3798 [WARNING|trainer.py:803] 2025-04-26 20:45:18,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4389 3841 [WARNING|trainer.py:803] 2025-04-26 20:45:19,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:45:19,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3799 [WARNING|trainer.py:803] 2025-04-26 20:45:19,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4390 3842 [WARNING|trainer.py:803] 2025-04-26 20:45:20,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:20,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4391 [WARNING|trainer.py:803] 2025-04-26 20:45:21,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3800 3843 [WARNING|trainer.py:803] 2025-04-26 20:45:21,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4392 [WARNING|trainer.py:803] 2025-04-26 20:45:22,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:22,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3801 [WARNING|trainer.py:803] 2025-04-26 20:45:22,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4393 3844 [WARNING|trainer.py:803] 2025-04-26 20:45:23,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:23,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:23,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4394 3802 3845 [WARNING|trainer.py:803] 2025-04-26 20:45:24,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:25,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4395 [WARNING|trainer.py:803] 2025-04-26 20:45:25,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3803 3846 [WARNING|trainer.py:803] 2025-04-26 20:45:25,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4396 [WARNING|trainer.py:803] 2025-04-26 20:45:26,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:26,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3804 3847 [WARNING|trainer.py:803] 2025-04-26 20:45:26,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4397 [WARNING|trainer.py:803] 2025-04-26 20:45:27,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:27,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:27,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3805 4398 3848 [WARNING|trainer.py:803] 2025-04-26 20:45:28,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:28,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:45:29,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4399 3806 3849 [WARNING|trainer.py:803] 2025-04-26 20:45:29,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:45:30,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4400 [WARNING|trainer.py:803] 2025-04-26 20:45:30,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3807 3850 [WARNING|trainer.py:803] 2025-04-26 20:45:30,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4401 [WARNING|trainer.py:803] 2025-04-26 20:45:31,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:31,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3808 [WARNING|trainer.py:803] 2025-04-26 20:45:31,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3851 4402 [WARNING|trainer.py:803] 2025-04-26 20:45:32,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:32,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:32,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4403 3809 3852 [WARNING|trainer.py:803] 2025-04-26 20:45:34,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:45:34,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:45:34,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4404 3810 3853 [WARNING|trainer.py:803] 2025-04-26 20:45:35,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4405 [WARNING|trainer.py:803] 2025-04-26 20:45:35,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes NoYes 3811 3854 NoNo 4406 NoNo NoNo NoYes 3855 3812 4407 NoNo NoNo NoNo 4408 3813 3856 NoNo NoNo NoNo 4409 3857 3814 NoNo 4410
NoNo NoNo 3858 3815 NoYes 4411 Yes NoNo NoNo 3859 3816 4412 Yes :Yes NoNo 4413 3860 3817 NoNo 4414 NoYes NoNo 3861 3818 NoNo 4415
YesYes NoNo NoNo 3862 3819 4416 YesYes NoNo NoNo 4417 3863 3820 NoNo YesYes 4418 NoNo 3864 3821 NoNo 4419 :Yes NoNo 3865 3822 NoNo 4420
:Yes NoNo NoNo 3866 4421 3823 YesYes NoNo 4422 NoNo 3867 3824 NoNo 4423 :Yes NoNo 3868 3825 NoNo 4424 NoYes YesYes NoNo 3869 4425 3826
:Yes NoNo :Yes 4426 3870 3827 NoNo YesYes 4427 NoYes 3871 3828 NoNo 4428 :Yes NoNo 3872 NoNo 3829 4429 YesYes NoNo NoNo 4430 3830 3873
NoYes 4431 YesYes NoYes 3831 3874 NoYes 4432 NoYes NoNo 3832 NoYes 3875 4433 YesYes :Yes NoNo 4434 3833 3876 NoYes NoNo 4435 NoNo 3834 3877
NoNo 4436 NoYes NoNo 3835 NoNo 3878 4437 NoYes NoNo 3836 NoNo 4438 3879 NoNo NoYes NoNo 4439 3837 3880 NoYes NoNo 4440 :Yes 3838
NoYes 3881 4441 :Yes 3839 YesYes NoNo 4442 3882 :Yes NoNo 3840 4443 YesYes 3883 :Yes NoYes 4444 3841 YesYes NoNo 3884 4445 NoYes 3842 NoYes
NoNo 4446 3885 NoNo NoYes 3843 4447 YesYes 3886 YesYes NoYes 4448 3844 NoNo NoNo 3887 4449 NoNo 3845 NoNo NoYes 4450 3888 NoYes 3846 NoYes 4451
NoNo 3889 YesYes NoNo 3847 4452 NoNo NoYes NoNo 3890 4453 3848 NoYes NoNo 4454 NoYes 3891 3849 NoNo 4455 NoNo NoYes 3892 NoNo 3850 4456
NoNo :Yes NoYes 3893 4457 3851 NoNo NoNo NoNo 4458 3894 3852 No 4459 NoNo NoYes 3895 3853 NoYes 4460 NoNo NoYes NoNo 3896 3854 4461
NoNo NoNo NoNo 4462 3855 3897 NoNo NoNo 4463 NoYes 3856 3898 NoNo 4464 NoNo NoNo 3857 NoYes 3899 4465 NoNo NoNo NoNo 3858 4466 3900
Yes NoNo 4467 NoNo 3859 3901 NoNo 4468 Yes 3860 NoNo NoNo 4469 3902 NoYes NoNo 3861 YesYes 4470 YesYes 3903 NoYes 4471 3862 NoNo
NoYes YesYes 4472 3904 3863 YesYes NoNo 4473 YesYes 3905 3864 NoNo 4474 NoNo :Yes NoNo 3865 4475 3906 :Yes NoNo NoNo 4476 3866 3907 NoNo 4477
YesYes NoNo 3867 NoNo 4478 3908 :Yes 3868 NoNo NoNo 4479 NoYes 3909 NoNo 4480 3869 NoNo NoYes :Yes 4481 3910 3870 NoYes 4482 NoYes
YesYes 3871 3911 NoYes 4483 :Yes NoYes NoNo 4484 3872 3912 NoYes YesYes 4485 NoNo 3873 3913 NoYes 4486 YesYes NoYes NoNo 3874 4487 3914 NoNo
NoNo 4488 YesYes 3875 3915 NoNo 4489 :Yes YesYes 3876 NoNo 4490 3916 NoNo NoNo 3877 4491 NoYes 3917 NoNo NoNo 4492 3878 YesYes NoNo
NoNo 4493 3918 3879 [WARNING|trainer.py:803] 2025-04-26 20:47:05,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:05,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4494 [WARNING|trainer.py:803] 2025-04-26 20:47:05,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3919 3880 [WARNING|trainer.py:803] 2025-04-26 20:47:06,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4495 [WARNING|trainer.py:803] 2025-04-26 20:47:07,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:07,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:47:07,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4496 3881 3920 [WARNING|trainer.py:803] 2025-04-26 20:47:08,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:08,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:08,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4497 3882 3921 [WARNING|trainer.py:803] 2025-04-26 20:47:09,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4498 [WARNING|trainer.py:803] 2025-04-26 20:47:10,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:10,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 3883 [WARNING|trainer.py:803] 2025-04-26 20:47:10,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3922 4499 [WARNING|trainer.py:803] 2025-04-26 20:47:11,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:11,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:11,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3884 4500 3923 [WARNING|trainer.py:803] 2025-04-26 20:47:12,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:12,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4501 3885 [WARNING|trainer.py:803] 2025-04-26 20:47:13,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:13,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4502 [WARNING|trainer.py:803] 2025-04-26 20:47:14,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3924 3886 [WARNING|trainer.py:803] 2025-04-26 20:47:14,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4503 [WARNING|trainer.py:803] 2025-04-26 20:47:15,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:15,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3925 [WARNING|trainer.py:803] 2025-04-26 20:47:15,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4504 3887 [WARNING|trainer.py:803] 2025-04-26 20:47:16,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:16,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:16,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4505 3888 3926 [WARNING|trainer.py:803] 2025-04-26 20:47:17,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4506 [WARNING|trainer.py:803] 2025-04-26 20:47:18,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:18,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3889 [WARNING|trainer.py:803] 2025-04-26 20:47:18,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3927 4507 [WARNING|trainer.py:803] 2025-04-26 20:47:19,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:19,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:19,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3890 4508 3928 [WARNING|trainer.py:803] 2025-04-26 20:47:21,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:21,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4509 [WARNING|trainer.py:803] 2025-04-26 20:47:21,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3891 [WARNING|trainer.py:803] 2025-04-26 20:47:22,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3929 4510 [WARNING|trainer.py:803] 2025-04-26 20:47:22,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3892 [WARNING|trainer.py:803] 2025-04-26 20:47:22,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:23,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4511 [WARNING|trainer.py:803] 2025-04-26 20:47:23,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3930 [WARNING|trainer.py:803] 2025-04-26 20:47:24,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3893 4512 [WARNING|trainer.py:803] 2025-04-26 20:47:24,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:47:24,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:25,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3931 4513 3894 [WARNING|trainer.py:803] 2025-04-26 20:47:26,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:26,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:26,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4514 3895 3932 [WARNING|trainer.py:803] 2025-04-26 20:47:27,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4515 [WARNING|trainer.py:803] 2025-04-26 20:47:27,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:27,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3896 [WARNING|trainer.py:803] 2025-04-26 20:47:28,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3933 4516 [WARNING|trainer.py:803] 2025-04-26 20:47:28,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:29,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:29,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4517 3897 3934 [WARNING|trainer.py:803] 2025-04-26 20:47:30,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:30,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4518 3898 [WARNING|trainer.py:803] 2025-04-26 20:47:30,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:31,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4519 [WARNING|trainer.py:803] 2025-04-26 20:47:31,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3935 3899 [WARNING|trainer.py:803] 2025-04-26 20:47:32,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:32,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4520 [WARNING|trainer.py:803] 2025-04-26 20:47:32,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3936 [WARNING|trainer.py:803] 2025-04-26 20:47:33,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3900 4521 [WARNING|trainer.py:803] 2025-04-26 20:47:33,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:34,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:34,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3937 4522 3901 [WARNING|trainer.py:803] 2025-04-26 20:47:35,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:35,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4523 [WARNING|trainer.py:803] 2025-04-26 20:47:35,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3938 3902 [WARNING|trainer.py:803] 2025-04-26 20:47:36,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4524 [WARNING|trainer.py:803] 2025-04-26 20:47:37,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:37,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:37,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3939 4525 3903 [WARNING|trainer.py:803] 2025-04-26 20:47:38,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:38,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4526 [WARNING|trainer.py:803] 2025-04-26 20:47:38,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3940 3904 [WARNING|trainer.py:803] 2025-04-26 20:47:39,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4527 [WARNING|trainer.py:803] 2025-04-26 20:47:40,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:40,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:40,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3941 4528 3905 [WARNING|trainer.py:803] 2025-04-26 20:47:41,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:41,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:41,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4529 3942 3906 [WARNING|trainer.py:803] 2025-04-26 20:47:42,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4530 [WARNING|trainer.py:803] 2025-04-26 20:47:43,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:43,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:43,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3907 4531 3943 [WARNING|trainer.py:803] 2025-04-26 20:47:44,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:44,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:44,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4532 3908 3944 [WARNING|trainer.py:803] 2025-04-26 20:47:46,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4533 [WARNING|trainer.py:803] 2025-04-26 20:47:46,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:46,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:47,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4534 3909 3945 [WARNING|trainer.py:803] 2025-04-26 20:47:48,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:48,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:48,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4535 3910 3946 [WARNING|trainer.py:803] 2025-04-26 20:47:49,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4536 [WARNING|trainer.py:803] 2025-04-26 20:47:49,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:49,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:50,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3911 4537 3947 [WARNING|trainer.py:803] 2025-04-26 20:47:51,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:51,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:51,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4538 3912 3948 [WARNING|trainer.py:803] 2025-04-26 20:47:52,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4539 [WARNING|trainer.py:803] 2025-04-26 20:47:52,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:52,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:53,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3913 4540 3949 [WARNING|trainer.py:803] 2025-04-26 20:47:54,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:54,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:54,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4541 3914 3950 [WARNING|trainer.py:803] 2025-04-26 20:47:55,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4542 [WARNING|trainer.py:803] 2025-04-26 20:47:55,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:55,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:56,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3915 3951 4543 [WARNING|trainer.py:803] 2025-04-26 20:47:57,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:57,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:47:57,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4544 3916 3952 [WARNING|trainer.py:803] 2025-04-26 20:47:58,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4545 [WARNING|trainer.py:803] 2025-04-26 20:47:59,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:47:59,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:47:59,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3917 3953 4546 [WARNING|trainer.py:803] 2025-04-26 20:48:00,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:00,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:00,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4547 3954 3918 [WARNING|trainer.py:803] 2025-04-26 20:48:01,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4548 [WARNING|trainer.py:803] 2025-04-26 20:48:02,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:02,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:02,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3955 3919 4549 [WARNING|trainer.py:803] 2025-04-26 20:48:03,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:03,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:03,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4550 3956 3920 [WARNING|trainer.py:803] 2025-04-26 20:48:04,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4551 [WARNING|trainer.py:803] 2025-04-26 20:48:05,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:05,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3957 [WARNING|trainer.py:803] 2025-04-26 20:48:05,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3921 4552 [WARNING|trainer.py:803] 2025-04-26 20:48:06,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:06,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:06,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4553 3958 3922 [WARNING|trainer.py:803] 2025-04-26 20:48:07,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4554 [WARNING|trainer.py:803] 2025-04-26 20:48:08,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:48:08,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3959 [WARNING|trainer.py:803] 2025-04-26 20:48:08,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3923 4555 [WARNING|trainer.py:803] 2025-04-26 20:48:09,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:09,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:09,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4556 3960 3924 [WARNING|trainer.py:803] 2025-04-26 20:48:11,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4557 [WARNING|trainer.py:803] 2025-04-26 20:48:11,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:48:11,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3961 [WARNING|trainer.py:803] 2025-04-26 20:48:12,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4558 3925 [WARNING|trainer.py:803] 2025-04-26 20:48:12,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:13,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:48:13,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4559 3962 3926 [WARNING|trainer.py:803] 2025-04-26 20:48:14,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:48:14,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4560 [WARNING|trainer.py:803] 2025-04-26 20:48:14,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3963 [WARNING|trainer.py:803] 2025-04-26 20:48:15,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4561 3927 [WARNING|trainer.py:803] 2025-04-26 20:48:15,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:16,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:16,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3964 4562 3928 [WARNING|trainer.py:803] 2025-04-26 20:48:17,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:17,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4563 [WARNING|trainer.py:803] 2025-04-26 20:48:17,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3965 [WARNING|trainer.py:803] 2025-04-26 20:48:18,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3929 4564 [WARNING|trainer.py:803] 2025-04-26 20:48:18,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:19,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:19,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3966 4565 3930 [WARNING|trainer.py:803] 2025-04-26 20:48:20,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:20,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4566 [WARNING|trainer.py:803] 2025-04-26 20:48:20,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3967 [WARNING|trainer.py:803] 2025-04-26 20:48:21,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3931 4567 [WARNING|trainer.py:803] 2025-04-26 20:48:21,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:22,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:22,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3968 4568 3932 [WARNING|trainer.py:803] 2025-04-26 20:48:23,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:23,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4569 [WARNING|trainer.py:803] 2025-04-26 20:48:23,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3969 3933 [WARNING|trainer.py:803] 2025-04-26 20:48:24,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4570 [WARNING|trainer.py:803] 2025-04-26 20:48:25,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:25,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:25,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3970 4571 3934 [WARNING|trainer.py:803] 2025-04-26 20:48:26,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:26,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4572 [WARNING|trainer.py:803] 2025-04-26 20:48:27,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3971 [WARNING|trainer.py:803] 2025-04-26 20:48:27,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4573 3935 [WARNING|trainer.py:803] 2025-04-26 20:48:28,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:28,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:28,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3972 4574 3936 [WARNING|trainer.py:803] 2025-04-26 20:48:29,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:29,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4575 [WARNING|trainer.py:803] 2025-04-26 20:48:30,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 3973 [WARNING|trainer.py:803] 2025-04-26 20:48:30,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3937 4576 [WARNING|trainer.py:803] 2025-04-26 20:48:31,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:31,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:31,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3974 4577 3938 [WARNING|trainer.py:803] 2025-04-26 20:48:32,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:32,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4578 [WARNING|trainer.py:803] 2025-04-26 20:48:33,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3975 [WARNING|trainer.py:803] 2025-04-26 20:48:33,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3939 4579 [WARNING|trainer.py:803] 2025-04-26 20:48:34,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:34,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:34,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4580 3976 3940 [WARNING|trainer.py:803] 2025-04-26 20:48:35,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:36,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4581 [WARNING|trainer.py:803] 2025-04-26 20:48:36,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3977 [WARNING|trainer.py:803] 2025-04-26 20:48:37,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3941 4582 [WARNING|trainer.py:803] 2025-04-26 20:48:37,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:37,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:38,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3978 4583 3942 [WARNING|trainer.py:803] 2025-04-26 20:48:38,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:48:39,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4584 [WARNING|trainer.py:803] 2025-04-26 20:48:39,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3979 [WARNING|trainer.py:803] 2025-04-26 20:48:40,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4585 3943 [WARNING|trainer.py:803] 2025-04-26 20:48:40,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:48:41,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3980 [WARNING|trainer.py:803] 2025-04-26 20:48:41,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4586 [WARNING|trainer.py:803] 2025-04-26 20:48:42,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3944 [WARNING|trainer.py:803] 2025-04-26 20:48:42,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4587 3981 [WARNING|trainer.py:803] 2025-04-26 20:48:42,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:43,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4588 [WARNING|trainer.py:803] 2025-04-26 20:48:43,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3945 3982 [WARNING|trainer.py:803] 2025-04-26 20:48:44,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4589 [WARNING|trainer.py:803] 2025-04-26 20:48:44,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:44,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 3946 [WARNING|trainer.py:803] 2025-04-26 20:48:45,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4590 3983 [WARNING|trainer.py:803] 2025-04-26 20:48:46,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:46,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4591 [WARNING|trainer.py:803] 2025-04-26 20:48:46,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3947 [WARNING|trainer.py:803] 2025-04-26 20:48:47,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3984 4592 [WARNING|trainer.py:803] 2025-04-26 20:48:47,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:48,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3948 [WARNING|trainer.py:803] 2025-04-26 20:48:48,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4593 3985 [WARNING|trainer.py:803] 2025-04-26 20:48:49,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:49,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4594 [WARNING|trainer.py:803] 2025-04-26 20:48:49,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3949 [WARNING|trainer.py:803] 2025-04-26 20:48:50,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3986 4595 [WARNING|trainer.py:803] 2025-04-26 20:48:50,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3950 [WARNING|trainer.py:803] 2025-04-26 20:48:51,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:48:51,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4596 3987 [WARNING|trainer.py:803] 2025-04-26 20:48:52,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:52,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4597 3951 [WARNING|trainer.py:803] 2025-04-26 20:48:52,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3988 [WARNING|trainer.py:803] 2025-04-26 20:48:53,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:53,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4598 [WARNING|trainer.py:803] 2025-04-26 20:48:54,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3952 [WARNING|trainer.py:803] 2025-04-26 20:48:54,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4599 3989 [WARNING|trainer.py:803] 2025-04-26 20:48:55,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:55,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:55,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4600 3953 3990 [WARNING|trainer.py:803] 2025-04-26 20:48:56,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:48:56,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4601 [WARNING|trainer.py:803] 2025-04-26 20:48:57,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3954 [WARNING|trainer.py:803] 2025-04-26 20:48:57,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4602 3991 [WARNING|trainer.py:803] 2025-04-26 20:48:58,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:58,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:58,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4603 3955 3992 [WARNING|trainer.py:803] 2025-04-26 20:48:59,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:48:59,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4604 [WARNING|trainer.py:803] 2025-04-26 20:49:00,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3956 [WARNING|trainer.py:803] 2025-04-26 20:49:00,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4605 3993 [WARNING|trainer.py:803] 2025-04-26 20:49:01,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:01,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:01,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3957 4606 3994 [WARNING|trainer.py:803] 2025-04-26 20:49:02,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:02,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4607 [WARNING|trainer.py:803] 2025-04-26 20:49:03,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3958 [WARNING|trainer.py:803] 2025-04-26 20:49:03,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3995 4608 [WARNING|trainer.py:803] 2025-04-26 20:49:04,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:04,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
3959 NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:04,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4609 3996 [WARNING|trainer.py:803] 2025-04-26 20:49:05,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:05,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4610 [WARNING|trainer.py:803] 2025-04-26 20:49:06,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 3960 [WARNING|trainer.py:803] 2025-04-26 20:49:06,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3997 4611 [WARNING|trainer.py:803] 2025-04-26 20:49:07,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:07,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3961 [WARNING|trainer.py:803] 2025-04-26 20:49:07,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4612 3998 [WARNING|trainer.py:803] 2025-04-26 20:49:08,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:09,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4613 3962 [WARNING|trainer.py:803] 2025-04-26 20:49:09,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:10,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3999 [WARNING|trainer.py:803] 2025-04-26 20:49:10,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4614 3963 [WARNING|trainer.py:803] 2025-04-26 20:49:10,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:11,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4615 4000 [WARNING|trainer.py:803] 2025-04-26 20:49:11,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:12,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3964 4616 [WARNING|trainer.py:803] 2025-04-26 20:49:12,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:13,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4001 [WARNING|trainer.py:803] 2025-04-26 20:49:13,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4617 3965 [WARNING|trainer.py:803] 2025-04-26 20:49:14,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:14,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4618 [WARNING|trainer.py:803] 2025-04-26 20:49:14,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4002 3966 [WARNING|trainer.py:803] 2025-04-26 20:49:15,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4619 [WARNING|trainer.py:803] 2025-04-26 20:49:15,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:16,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:16,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4003 4620 3967 [WARNING|trainer.py:803] 2025-04-26 20:49:17,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:17,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4621 [WARNING|trainer.py:803] 2025-04-26 20:49:17,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4004 [WARNING|trainer.py:803] 2025-04-26 20:49:18,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3968 4622 [WARNING|trainer.py:803] 2025-04-26 20:49:18,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:19,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:19,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4005 4623 3969 [WARNING|trainer.py:803] 2025-04-26 20:49:20,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:20,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 20:49:20,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4006 [WARNING|trainer.py:803] 2025-04-26 20:49:21,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3970 4625 [WARNING|trainer.py:803] 2025-04-26 20:49:21,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:22,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:22,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4007 4626 3971 [WARNING|trainer.py:803] 2025-04-26 20:49:23,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:23,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4627 [WARNING|trainer.py:803] 2025-04-26 20:49:23,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4008 [WARNING|trainer.py:803] 2025-04-26 20:49:24,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3972 4628 [WARNING|trainer.py:803] 2025-04-26 20:49:25,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:25,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:25,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4009 4629 3973 [WARNING|trainer.py:803] 2025-04-26 20:49:26,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:26,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4630 [WARNING|trainer.py:803] 2025-04-26 20:49:27,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4010 3974 [WARNING|trainer.py:803] 2025-04-26 20:49:27,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 4631 [WARNING|trainer.py:803] 2025-04-26 20:49:28,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:28,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4011 [WARNING|trainer.py:803] 2025-04-26 20:49:28,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4632 3975 [WARNING|trainer.py:803] 2025-04-26 20:49:29,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4633 4012 [WARNING|trainer.py:803] 2025-04-26 20:49:30,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:30,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3976 [WARNING|trainer.py:803] 2025-04-26 20:49:31,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4634 [WARNING|trainer.py:803] 2025-04-26 20:49:31,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4013 [WARNING|trainer.py:803] 2025-04-26 20:49:32,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4635 3977 [WARNING|trainer.py:803] 2025-04-26 20:49:32,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:33,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4636 [WARNING|trainer.py:803] 2025-04-26 20:49:33,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4014 3978 [WARNING|trainer.py:803] 2025-04-26 20:49:34,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:34,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4637 [WARNING|trainer.py:803] 2025-04-26 20:49:34,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4015 [WARNING|trainer.py:803] 2025-04-26 20:49:35,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4638 3979 [WARNING|trainer.py:803] 2025-04-26 20:49:35,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:36,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:36,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4639 4016 3980 [WARNING|trainer.py:803] 2025-04-26 20:49:37,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:37,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4640 [WARNING|trainer.py:803] 2025-04-26 20:49:37,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4017 [WARNING|trainer.py:803] 2025-04-26 20:49:38,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 3981 [WARNING|trainer.py:803] 2025-04-26 20:49:38,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:39,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:49:39,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4642 4018 3982 [WARNING|trainer.py:803] 2025-04-26 20:49:40,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4643 [WARNING|trainer.py:803] 2025-04-26 20:49:40,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:40,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:49:41,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4019 4644 3983 [WARNING|trainer.py:803] 2025-04-26 20:49:42,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:42,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:42,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4645 4020 3984 [WARNING|trainer.py:803] 2025-04-26 20:49:43,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4646 [WARNING|trainer.py:803] 2025-04-26 20:49:43,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:43,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:44,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4647 3985 4021 [WARNING|trainer.py:803] 2025-04-26 20:49:45,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:45,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:45,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4648 3986 4022 [WARNING|trainer.py:803] 2025-04-26 20:49:46,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4649 [WARNING|trainer.py:803] 2025-04-26 20:49:47,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:47,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:47,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4650 3987 4023 [WARNING|trainer.py:803] 2025-04-26 20:49:48,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:48,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4651 [WARNING|trainer.py:803] 2025-04-26 20:49:48,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3988 [WARNING|trainer.py:803] 2025-04-26 20:49:49,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4024 4652 [WARNING|trainer.py:803] 2025-04-26 20:49:50,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:49:50,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:49:50,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3989 4653 4025 [WARNING|trainer.py:803] 2025-04-26 20:49:51,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:51,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4654 [WARNING|trainer.py:803] 2025-04-26 20:49:52,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3990 [WARNING|trainer.py:803] 2025-04-26 20:49:52,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4026 4655 [WARNING|trainer.py:803] 2025-04-26 20:49:52,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:53,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3991 [WARNING|trainer.py:803] 2025-04-26 20:49:53,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4656 4027 [WARNING|trainer.py:803] 2025-04-26 20:49:54,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:54,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4657 [WARNING|trainer.py:803] 2025-04-26 20:49:55,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 3992 [WARNING|trainer.py:803] 2025-04-26 20:49:55,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4658 4028 [WARNING|trainer.py:803] 2025-04-26 20:49:56,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:56,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 3993 [WARNING|trainer.py:803] 2025-04-26 20:49:56,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4659 4029 [WARNING|trainer.py:803] 2025-04-26 20:49:57,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:49:57,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4660 3994 [WARNING|trainer.py:803] 2025-04-26 20:49:58,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:58,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4661 4030 [WARNING|trainer.py:803] 2025-04-26 20:49:59,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:49:59,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3995 4662 [WARNING|trainer.py:803] 2025-04-26 20:50:00,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:50:00,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4031 [WARNING|trainer.py:803] 2025-04-26 20:50:00,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4663 3996 [WARNING|trainer.py:803] 2025-04-26 20:50:01,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:01,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4664 4032 [WARNING|trainer.py:803] 2025-04-26 20:50:02,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:50:03,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 3997 [WARNING|trainer.py:803] 2025-04-26 20:50:03,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4665 4033 [WARNING|trainer.py:803] 2025-04-26 20:50:03,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:04,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4666 3998 [WARNING|trainer.py:803] 2025-04-26 20:50:04,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:05,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4667 4034 [WARNING|trainer.py:803] 2025-04-26 20:50:05,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 3999 [WARNING|trainer.py:803] 2025-04-26 20:50:06,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:06,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4668 [WARNING|trainer.py:803] 2025-04-26 20:50:06,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4035 [WARNING|trainer.py:803] 2025-04-26 20:50:07,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4669 4000 [WARNING|trainer.py:803] 2025-04-26 20:50:07,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:08,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:08,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4036 4670 4001 [WARNING|trainer.py:803] 2025-04-26 20:50:09,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:09,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4671 [WARNING|trainer.py:803] 2025-04-26 20:50:10,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4037 [WARNING|trainer.py:803] 2025-04-26 20:50:10,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4672 4002 [WARNING|trainer.py:803] 2025-04-26 20:50:10,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:11,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:11,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4038 4673 4003 [WARNING|trainer.py:803] 2025-04-26 20:50:12,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:12,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4674 [WARNING|trainer.py:803] 2025-04-26 20:50:13,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4039 [WARNING|trainer.py:803] 2025-04-26 20:50:13,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4675 4004 [WARNING|trainer.py:803] 2025-04-26 20:50:14,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:14,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:14,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4040 4676 4005 [WARNING|trainer.py:803] 2025-04-26 20:50:15,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:50:15,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4677 [WARNING|trainer.py:803] 2025-04-26 20:50:16,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4041 [WARNING|trainer.py:803] 2025-04-26 20:50:16,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4006 4678 [WARNING|trainer.py:803] 2025-04-26 20:50:17,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:50:17,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:17,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4042 4679 4007 [WARNING|trainer.py:803] 2025-04-26 20:50:18,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:18,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4680 [WARNING|trainer.py:803] 2025-04-26 20:50:19,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4043 [WARNING|trainer.py:803] 2025-04-26 20:50:19,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4681 4008 [WARNING|trainer.py:803] 2025-04-26 20:50:20,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:20,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:20,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4044 4682 4009 [WARNING|trainer.py:803] 2025-04-26 20:50:21,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:21,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4683 [WARNING|trainer.py:803] 2025-04-26 20:50:22,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4045 [WARNING|trainer.py:803] 2025-04-26 20:50:22,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4010 4684 [WARNING|trainer.py:803] 2025-04-26 20:50:23,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:23,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:23,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4046 4685 4011 [WARNING|trainer.py:803] 2025-04-26 20:50:24,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:24,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4686 [WARNING|trainer.py:803] 2025-04-26 20:50:25,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4047 [WARNING|trainer.py:803] 2025-04-26 20:50:25,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4012 4687 [WARNING|trainer.py:803] 2025-04-26 20:50:26,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:26,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:26,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4688 4048 4013 [WARNING|trainer.py:803] 2025-04-26 20:50:28,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:28,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4689 [WARNING|trainer.py:803] 2025-04-26 20:50:28,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4049 [WARNING|trainer.py:803] 2025-04-26 20:50:29,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4014 4690 [WARNING|trainer.py:803] 2025-04-26 20:50:29,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:30,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:30,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4050 4691 4015 [WARNING|trainer.py:803] 2025-04-26 20:50:31,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:50:31,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4692 [WARNING|trainer.py:803] 2025-04-26 20:50:31,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4051 [WARNING|trainer.py:803] 2025-04-26 20:50:32,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4016 4693 [WARNING|trainer.py:803] 2025-04-26 20:50:32,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:33,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:33,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4052 4694 4017 [WARNING|trainer.py:803] 2025-04-26 20:50:34,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:34,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4695 [WARNING|trainer.py:803] 2025-04-26 20:50:34,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4053 [WARNING|trainer.py:803] 2025-04-26 20:50:35,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4696 4018 [WARNING|trainer.py:803] 2025-04-26 20:50:35,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:36,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4054 [WARNING|trainer.py:803] 2025-04-26 20:50:36,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4697 4019 [WARNING|trainer.py:803] 2025-04-26 20:50:37,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:37,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4698 4055 [WARNING|trainer.py:803] 2025-04-26 20:50:37,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:38,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4699 4020 [WARNING|trainer.py:803] 2025-04-26 20:50:38,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:39,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4056 [WARNING|trainer.py:803] 2025-04-26 20:50:39,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4700 [WARNING|trainer.py:803] 2025-04-26 20:50:40,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4021 [WARNING|trainer.py:803] 2025-04-26 20:50:40,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4701 4057 [WARNING|trainer.py:803] 2025-04-26 20:50:41,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:41,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4702 [WARNING|trainer.py:803] 2025-04-26 20:50:41,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4022 [WARNING|trainer.py:803] 2025-04-26 20:50:42,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4058 4703 [WARNING|trainer.py:803] 2025-04-26 20:50:42,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:50:43,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:43,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4023 4704 4059 [WARNING|trainer.py:803] 2025-04-26 20:50:44,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:44,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4705 [WARNING|trainer.py:803] 2025-04-26 20:50:45,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4024 [WARNING|trainer.py:803] 2025-04-26 20:50:45,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4060 4706 [WARNING|trainer.py:803] 2025-04-26 20:50:46,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:50:46,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:46,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4025 4707 4061 [WARNING|trainer.py:803] 2025-04-26 20:50:47,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:47,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4708 [WARNING|trainer.py:803] 2025-04-26 20:50:48,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4026 [WARNING|trainer.py:803] 2025-04-26 20:50:48,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4062 4709 [WARNING|trainer.py:803] 2025-04-26 20:50:49,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:49,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:49,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4710 4027 4063 [WARNING|trainer.py:803] 2025-04-26 20:50:50,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:51,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4711 [WARNING|trainer.py:803] 2025-04-26 20:50:51,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4028 [WARNING|trainer.py:803] 2025-04-26 20:50:51,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4712 4064 [WARNING|trainer.py:803] 2025-04-26 20:50:52,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:52,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:53,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4713 4029 4065 [WARNING|trainer.py:803] 2025-04-26 20:50:53,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:54,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4714 [WARNING|trainer.py:803] 2025-04-26 20:50:54,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4030 [WARNING|trainer.py:803] 2025-04-26 20:50:54,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4066 4715 [WARNING|trainer.py:803] 2025-04-26 20:50:55,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:50:55,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:50:55,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4716 4031 4067 [WARNING|trainer.py:803] 2025-04-26 20:50:57,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:57,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4717 [WARNING|trainer.py:803] 2025-04-26 20:50:57,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4032 [WARNING|trainer.py:803] 2025-04-26 20:50:58,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4718 4068 [WARNING|trainer.py:803] 2025-04-26 20:50:58,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:50:59,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:50:59,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4719 4033 4069 [WARNING|trainer.py:803] 2025-04-26 20:51:00,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:00,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4720 [WARNING|trainer.py:803] 2025-04-26 20:51:00,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4034 [WARNING|trainer.py:803] 2025-04-26 20:51:01,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4721 4070 [WARNING|trainer.py:803] 2025-04-26 20:51:01,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:02,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:02,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4722 4035 4071 [WARNING|trainer.py:803] 2025-04-26 20:51:03,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:03,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4723 [WARNING|trainer.py:803] 2025-04-26 20:51:03,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4036 [WARNING|trainer.py:803] 2025-04-26 20:51:04,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4072 4724 [WARNING|trainer.py:803] 2025-04-26 20:51:04,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:05,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:51:05,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4725 4037 4073 [WARNING|trainer.py:803] 2025-04-26 20:51:06,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4726 [WARNING|trainer.py:803] 2025-04-26 20:51:06,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:06,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4038 4074 [WARNING|trainer.py:803] 2025-04-26 20:51:07,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4727 [WARNING|trainer.py:803] 2025-04-26 20:51:07,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:08,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:08,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4728 4039 4075 [WARNING|trainer.py:803] 2025-04-26 20:51:09,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4729 [WARNING|trainer.py:803] 2025-04-26 20:51:09,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:09,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4040 [WARNING|trainer.py:803] 2025-04-26 20:51:10,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4076 4730 [WARNING|trainer.py:803] 2025-04-26 20:51:11,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:51:11,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:11,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4731 4041 4077 [WARNING|trainer.py:803] 2025-04-26 20:51:12,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:12,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4732 [WARNING|trainer.py:803] 2025-04-26 20:51:12,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4042 [WARNING|trainer.py:803] 2025-04-26 20:51:13,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4078 4733 [WARNING|trainer.py:803] 2025-04-26 20:51:14,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:14,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:14,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4734 4043 4079 [WARNING|trainer.py:803] 2025-04-26 20:51:15,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4735 [WARNING|trainer.py:803] 2025-04-26 20:51:15,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:15,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4044 [WARNING|trainer.py:803] 2025-04-26 20:51:16,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4080 4736 [WARNING|trainer.py:803] 2025-04-26 20:51:17,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:17,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:17,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4737 4045 4081 [WARNING|trainer.py:803] 2025-04-26 20:51:18,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4738 [WARNING|trainer.py:803] 2025-04-26 20:51:18,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:19,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4046 [WARNING|trainer.py:803] 2025-04-26 20:51:19,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4082 4739 [WARNING|trainer.py:803] 2025-04-26 20:51:20,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:20,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:20,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4740 4047 4083 [WARNING|trainer.py:803] 2025-04-26 20:51:21,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4741 [WARNING|trainer.py:803] 2025-04-26 20:51:21,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:22,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:22,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4048 4742 4084 [WARNING|trainer.py:803] 2025-04-26 20:51:23,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:23,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:23,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4743 4049 4085 [WARNING|trainer.py:803] 2025-04-26 20:51:24,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4744 [WARNING|trainer.py:803] 2025-04-26 20:51:25,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:25,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:25,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4050 4086 4745 [WARNING|trainer.py:803] 2025-04-26 20:51:26,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:51:26,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:26,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4746 4051 4087 [WARNING|trainer.py:803] 2025-04-26 20:51:27,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4747 [WARNING|trainer.py:803] 2025-04-26 20:51:28,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:28,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:28,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4052 4748 4088 [WARNING|trainer.py:803] 2025-04-26 20:51:29,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:29,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4749 [WARNING|trainer.py:803] 2025-04-26 20:51:29,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4053 4089 [WARNING|trainer.py:803] 2025-04-26 20:51:30,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4750 [WARNING|trainer.py:803] 2025-04-26 20:51:31,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:31,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:31,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4054 4751 4090 [WARNING|trainer.py:803] 2025-04-26 20:51:32,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:32,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4752 [WARNING|trainer.py:803] 2025-04-26 20:51:33,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4055 4091 [WARNING|trainer.py:803] 2025-04-26 20:51:33,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4753 [WARNING|trainer.py:803] 2025-04-26 20:51:34,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:34,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:34,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4754 4056 4092 [WARNING|trainer.py:803] 2025-04-26 20:51:35,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:35,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:36,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4755 4057 4093 [WARNING|trainer.py:803] 2025-04-26 20:51:36,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4756 [WARNING|trainer.py:803] 2025-04-26 20:51:37,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:37,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:37,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4757 4058 4094 [WARNING|trainer.py:803] 2025-04-26 20:51:38,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:39,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4758 [WARNING|trainer.py:803] 2025-04-26 20:51:39,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4059 4095 [WARNING|trainer.py:803] 2025-04-26 20:51:39,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4759 [WARNING|trainer.py:803] 2025-04-26 20:51:40,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:40,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:40,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4760 4060 4096 [WARNING|trainer.py:803] 2025-04-26 20:51:41,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4761 [WARNING|trainer.py:803] 2025-04-26 20:51:42,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:42,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4061 [WARNING|trainer.py:803] 2025-04-26 20:51:43,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4097 4762 [WARNING|trainer.py:803] 2025-04-26 20:51:43,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:51:43,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:44,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4763 4062 4098 [WARNING|trainer.py:803] 2025-04-26 20:51:45,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4764 [WARNING|trainer.py:803] 2025-04-26 20:51:45,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:45,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 20:51:46,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4063 4099 4765 [WARNING|trainer.py:803] 2025-04-26 20:51:46,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:47,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:47,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4766 4100 4064 [WARNING|trainer.py:803] 2025-04-26 20:51:48,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4767 [WARNING|trainer.py:803] 2025-04-26 20:51:48,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:48,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:49,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4768 4065 4101 [WARNING|trainer.py:803] 2025-04-26 20:51:50,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:50,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:50,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4769 4066 4102 [WARNING|trainer.py:803] 2025-04-26 20:51:51,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4770 [WARNING|trainer.py:803] 2025-04-26 20:51:51,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:51,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:52,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4771 4103 4067 [WARNING|trainer.py:803] 2025-04-26 20:51:53,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:51:53,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:53,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4772 4104 4068 [WARNING|trainer.py:803] 2025-04-26 20:51:54,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4773 [WARNING|trainer.py:803] 2025-04-26 20:51:54,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:54,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:51:55,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4774 4105 4069 [WARNING|trainer.py:803] 2025-04-26 20:51:56,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:56,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4775 [WARNING|trainer.py:803] 2025-04-26 20:51:56,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4106 4070 [WARNING|trainer.py:803] 2025-04-26 20:51:57,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4776 [WARNING|trainer.py:803] 2025-04-26 20:51:57,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:58,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:58,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4107 4777 4071 [WARNING|trainer.py:803] 2025-04-26 20:51:59,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:59,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:51:59,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4778 4108 4072 [WARNING|trainer.py:803] 2025-04-26 20:52:00,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4779 [WARNING|trainer.py:803] 2025-04-26 20:52:00,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:52:00,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:52:01,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4073 4109 4780 [WARNING|trainer.py:803] 2025-04-26 20:52:02,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:02,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:02,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4781 4074 4110 [WARNING|trainer.py:803] 2025-04-26 20:52:03,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4782 [WARNING|trainer.py:803] 2025-04-26 20:52:03,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:52:03,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4075 4111 [WARNING|trainer.py:803] 2025-04-26 20:52:04,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4783 [WARNING|trainer.py:803] 2025-04-26 20:52:05,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:05,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:05,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4784 4112 4076 [WARNING|trainer.py:803] 2025-04-26 20:52:06,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:06,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4785 [WARNING|trainer.py:803] 2025-04-26 20:52:06,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4113 4077 [WARNING|trainer.py:803] 2025-04-26 20:52:07,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4786 [WARNING|trainer.py:803] 2025-04-26 20:52:08,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:08,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:08,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4787 4114 4078 [WARNING|trainer.py:803] 2025-04-26 20:52:09,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:09,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4788 [WARNING|trainer.py:803] 2025-04-26 20:52:10,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4115 [WARNING|trainer.py:803] 2025-04-26 20:52:10,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4079 4789 [WARNING|trainer.py:803] 2025-04-26 20:52:11,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:11,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:11,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4790 4116 4080 [WARNING|trainer.py:803] 2025-04-26 20:52:12,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:12,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 [WARNING|trainer.py:803] 2025-04-26 20:52:13,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4117 [WARNING|trainer.py:803] 2025-04-26 20:52:13,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4081 4792 [WARNING|trainer.py:803] 2025-04-26 20:52:14,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:14,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:14,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4118 4793 4082 [WARNING|trainer.py:803] 2025-04-26 20:52:15,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:15,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4794 [WARNING|trainer.py:803] 2025-04-26 20:52:16,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4119 [WARNING|trainer.py:803] 2025-04-26 20:52:17,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4083 4795 [WARNING|trainer.py:803] 2025-04-26 20:52:17,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:17,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:18,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4796 4120 4084 [WARNING|trainer.py:803] 2025-04-26 20:52:19,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:19,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4797 [WARNING|trainer.py:803] 2025-04-26 20:52:19,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4121 [WARNING|trainer.py:803] 2025-04-26 20:52:20,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4085 4798 [WARNING|trainer.py:803] 2025-04-26 20:52:20,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:21,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:21,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4122 4799 4086 [WARNING|trainer.py:803] 2025-04-26 20:52:22,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:22,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4800 [WARNING|trainer.py:803] 2025-04-26 20:52:22,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4123 [WARNING|trainer.py:803] 2025-04-26 20:52:23,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4087 4801 [WARNING|trainer.py:803] 2025-04-26 20:52:23,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:24,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:24,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4124 4802 4088 [WARNING|trainer.py:803] 2025-04-26 20:52:25,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:25,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4803 4125 [WARNING|trainer.py:803] 2025-04-26 20:52:25,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:26,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4804 [WARNING|trainer.py:803] 2025-04-26 20:52:26,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4089 [WARNING|trainer.py:803] 2025-04-26 20:52:27,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4126 4805 [WARNING|trainer.py:803] 2025-04-26 20:52:27,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:28,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4090 [WARNING|trainer.py:803] 2025-04-26 20:52:28,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4806 4127 [WARNING|trainer.py:803] 2025-04-26 20:52:29,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:29,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4807 [WARNING|trainer.py:803] 2025-04-26 20:52:29,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4091 [WARNING|trainer.py:803] 2025-04-26 20:52:30,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4128 4808 [WARNING|trainer.py:803] 2025-04-26 20:52:30,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:31,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4092 [WARNING|trainer.py:803] 2025-04-26 20:52:31,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4809 4129 [WARNING|trainer.py:803] 2025-04-26 20:52:32,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:32,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4810 [WARNING|trainer.py:803] 2025-04-26 20:52:32,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4093 4130 [WARNING|trainer.py:803] 2025-04-26 20:52:33,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4811 [WARNING|trainer.py:803] 2025-04-26 20:52:33,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:34,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:34,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4094 4812 4131 [WARNING|trainer.py:803] 2025-04-26 20:52:35,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:52:35,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4813 [WARNING|trainer.py:803] 2025-04-26 20:52:35,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4095 4132 [WARNING|trainer.py:803] 2025-04-26 20:52:36,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4814 [WARNING|trainer.py:803] 2025-04-26 20:52:36,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:37,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:37,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4815 4096 4133 [WARNING|trainer.py:803] 2025-04-26 20:52:38,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:52:38,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4816 [WARNING|trainer.py:803] 2025-04-26 20:52:38,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4134 4097 [WARNING|trainer.py:803] 2025-04-26 20:52:39,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4817 [WARNING|trainer.py:803] 2025-04-26 20:52:40,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:40,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:52:40,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4818 4135 4098 [WARNING|trainer.py:803] 2025-04-26 20:52:41,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:41,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4819 [WARNING|trainer.py:803] 2025-04-26 20:52:41,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4136 [WARNING|trainer.py:803] 2025-04-26 20:52:42,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4099 4820 [WARNING|trainer.py:803] 2025-04-26 20:52:43,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:43,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:43,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4821 4137 4100 [WARNING|trainer.py:803] 2025-04-26 20:52:44,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4822 [WARNING|trainer.py:803] 2025-04-26 20:52:44,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:45,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4138 [WARNING|trainer.py:803] 2025-04-26 20:52:45,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4823 4101 [WARNING|trainer.py:803] 2025-04-26 20:52:46,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:46,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:46,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4824 4139 4102 [WARNING|trainer.py:803] 2025-04-26 20:52:47,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:47,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4825 [WARNING|trainer.py:803] 2025-04-26 20:52:48,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4140 [WARNING|trainer.py:803] 2025-04-26 20:52:48,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4826 4103 [WARNING|trainer.py:803] 2025-04-26 20:52:49,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:49,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:49,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4827 4141 4104 [WARNING|trainer.py:803] 2025-04-26 20:52:50,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:50,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4828 [WARNING|trainer.py:803] 2025-04-26 20:52:51,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4142 [WARNING|trainer.py:803] 2025-04-26 20:52:51,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4829 4105 [WARNING|trainer.py:803] 2025-04-26 20:52:52,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:52,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:52,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4830 4143 4106 [WARNING|trainer.py:803] 2025-04-26 20:52:53,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:53,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4831 [WARNING|trainer.py:803] 2025-04-26 20:52:54,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4144 [WARNING|trainer.py:803] 2025-04-26 20:52:54,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4832 4107 [WARNING|trainer.py:803] 2025-04-26 20:52:55,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:55,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:55,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4833 4145 4108 [WARNING|trainer.py:803] 2025-04-26 20:52:56,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:56,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4834 [WARNING|trainer.py:803] 2025-04-26 20:52:57,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4146 [WARNING|trainer.py:803] 2025-04-26 20:52:57,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4835 4109 [WARNING|trainer.py:803] 2025-04-26 20:52:58,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:52:58,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:58,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4836 4147 4110 [WARNING|trainer.py:803] 2025-04-26 20:52:59,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:52:59,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4837 [WARNING|trainer.py:803] 2025-04-26 20:53:00,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4148 [WARNING|trainer.py:803] 2025-04-26 20:53:00,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4111 4838 [WARNING|trainer.py:803] 2025-04-26 20:53:01,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:01,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:01,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4839 4149 4112 [WARNING|trainer.py:803] 2025-04-26 20:53:02,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:53:02,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4840 [WARNING|trainer.py:803] 2025-04-26 20:53:03,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4150 [WARNING|trainer.py:803] 2025-04-26 20:53:03,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4113 4841 [WARNING|trainer.py:803] 2025-04-26 20:53:04,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:04,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:04,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4151 4842 4114 [WARNING|trainer.py:803] 2025-04-26 20:53:05,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:05,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4843 [WARNING|trainer.py:803] 2025-04-26 20:53:06,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4152 [WARNING|trainer.py:803] 2025-04-26 20:53:06,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4115 4844 [WARNING|trainer.py:803] 2025-04-26 20:53:07,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:07,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4153 [WARNING|trainer.py:803] 2025-04-26 20:53:07,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4845 4116 [WARNING|trainer.py:803] 2025-04-26 20:53:08,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:08,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4846 [WARNING|trainer.py:803] 2025-04-26 20:53:09,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4154 [WARNING|trainer.py:803] 2025-04-26 20:53:09,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4117 4847 [WARNING|trainer.py:803] 2025-04-26 20:53:10,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:10,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4155 [WARNING|trainer.py:803] 2025-04-26 20:53:10,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4848 4118 [WARNING|trainer.py:803] 2025-04-26 20:53:11,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:11,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4849 [WARNING|trainer.py:803] 2025-04-26 20:53:12,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4156 [WARNING|trainer.py:803] 2025-04-26 20:53:12,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4119 4850 [WARNING|trainer.py:803] 2025-04-26 20:53:13,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:13,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4157 [WARNING|trainer.py:803] 2025-04-26 20:53:13,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4851 [WARNING|trainer.py:803] 2025-04-26 20:53:14,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4120 [WARNING|trainer.py:803] 2025-04-26 20:53:14,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4852 4158 [WARNING|trainer.py:803] 2025-04-26 20:53:15,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:15,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4853 4121 [WARNING|trainer.py:803] 2025-04-26 20:53:16,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:16,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:16,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4159 4854 4122 [WARNING|trainer.py:803] 2025-04-26 20:53:17,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:17,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4855 4160 [WARNING|trainer.py:803] 2025-04-26 20:53:18,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:18,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4856 4123 [WARNING|trainer.py:803] 2025-04-26 20:53:19,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4161 [WARNING|trainer.py:803] 2025-04-26 20:53:19,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:20,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4857 [WARNING|trainer.py:803] 2025-04-26 20:53:20,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4124 [WARNING|trainer.py:803] 2025-04-26 20:53:20,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4858 4162 [WARNING|trainer.py:803] 2025-04-26 20:53:21,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:21,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4859 4125 [WARNING|trainer.py:803] 2025-04-26 20:53:22,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:22,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4163 [WARNING|trainer.py:803] 2025-04-26 20:53:23,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4860 4126 [WARNING|trainer.py:803] 2025-04-26 20:53:23,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:23,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4861 4164 [WARNING|trainer.py:803] 2025-04-26 20:53:24,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:24,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4862 [WARNING|trainer.py:803] 2025-04-26 20:53:25,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4127 4165 [WARNING|trainer.py:803] 2025-04-26 20:53:26,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:26,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4863 4128 [WARNING|trainer.py:803] 2025-04-26 20:53:26,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:27,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4864 4166 [WARNING|trainer.py:803] 2025-04-26 20:53:27,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:28,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4865 [WARNING|trainer.py:803] 2025-04-26 20:53:28,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4129 4167 [WARNING|trainer.py:803] 2025-04-26 20:53:29,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:29,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4866 4130 [WARNING|trainer.py:803] 2025-04-26 20:53:29,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:30,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4867 4168 [WARNING|trainer.py:803] 2025-04-26 20:53:30,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:31,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4868 [WARNING|trainer.py:803] 2025-04-26 20:53:31,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4131 4169 [WARNING|trainer.py:803] 2025-04-26 20:53:32,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:32,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4869 4132 [WARNING|trainer.py:803] 2025-04-26 20:53:32,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:33,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4870 4170 [WARNING|trainer.py:803] 2025-04-26 20:53:33,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:34,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4133 4871 [WARNING|trainer.py:803] 2025-04-26 20:53:34,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4171 [WARNING|trainer.py:803] 2025-04-26 20:53:35,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:35,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4872 4134 [WARNING|trainer.py:803] 2025-04-26 20:53:35,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:36,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4873 4172 [WARNING|trainer.py:803] 2025-04-26 20:53:36,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:37,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4135 [WARNING|trainer.py:803] 2025-04-26 20:53:37,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4874 4173 [WARNING|trainer.py:803] 2025-04-26 20:53:38,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:38,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4875 [WARNING|trainer.py:803] 2025-04-26 20:53:38,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4136 [WARNING|trainer.py:803] 2025-04-26 20:53:39,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4876 4174 [WARNING|trainer.py:803] 2025-04-26 20:53:39,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:40,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:40,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4137 4877 4175 [WARNING|trainer.py:803] 2025-04-26 20:53:41,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:41,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4878 4138 [WARNING|trainer.py:803] 2025-04-26 20:53:41,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:42,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4879 4176 [WARNING|trainer.py:803] 2025-04-26 20:53:42,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4139 [WARNING|trainer.py:803] 2025-04-26 20:53:43,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:43,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4880 4177 [WARNING|trainer.py:803] 2025-04-26 20:53:44,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:44,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4881 4140 [WARNING|trainer.py:803] 2025-04-26 20:53:44,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:45,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4882 4178 [WARNING|trainer.py:803] 2025-04-26 20:53:45,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4141 [WARNING|trainer.py:803] 2025-04-26 20:53:46,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:46,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4883 4179 [WARNING|trainer.py:803] 2025-04-26 20:53:47,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:47,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4884 4142 [WARNING|trainer.py:803] 2025-04-26 20:53:47,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:48,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4180 4885 [WARNING|trainer.py:803] 2025-04-26 20:53:48,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:49,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4143 [WARNING|trainer.py:803] 2025-04-26 20:53:49,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4886 4181 [WARNING|trainer.py:803] 2025-04-26 20:53:50,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:50,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4887 4144 [WARNING|trainer.py:803] 2025-04-26 20:53:50,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:51,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4182 4888 [WARNING|trainer.py:803] 2025-04-26 20:53:51,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:52,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4145 [WARNING|trainer.py:803] 2025-04-26 20:53:52,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4889 4183 [WARNING|trainer.py:803] 2025-04-26 20:53:53,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:53,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4890 [WARNING|trainer.py:803] 2025-04-26 20:53:53,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4146 [WARNING|trainer.py:803] 2025-04-26 20:53:54,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4184 4891 [WARNING|trainer.py:803] 2025-04-26 20:53:54,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:55,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4147 [WARNING|trainer.py:803] 2025-04-26 20:53:55,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4892 4185 [WARNING|trainer.py:803] 2025-04-26 20:53:56,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:56,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4893 [WARNING|trainer.py:803] 2025-04-26 20:53:56,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4148 [WARNING|trainer.py:803] 2025-04-26 20:53:57,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4186 4894 [WARNING|trainer.py:803] 2025-04-26 20:53:57,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:53:58,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:58,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4149 4895 4187 [WARNING|trainer.py:803] 2025-04-26 20:53:59,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:53:59,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4896 4150 [WARNING|trainer.py:803] 2025-04-26 20:54:00,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:00,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4897 4188 [WARNING|trainer.py:803] 2025-04-26 20:54:00,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4151 [WARNING|trainer.py:803] 2025-04-26 20:54:01,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:54:01,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4898 4189 [WARNING|trainer.py:803] 2025-04-26 20:54:02,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:02,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4899 4152 [WARNING|trainer.py:803] 2025-04-26 20:54:02,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:03,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4900 [WARNING|trainer.py:803] 2025-04-26 20:54:03,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4190 4153 [WARNING|trainer.py:803] 2025-04-26 20:54:04,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4901 [WARNING|trainer.py:803] 2025-04-26 20:54:04,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:05,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4191 [WARNING|trainer.py:803] 2025-04-26 20:54:05,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4902 4154 [WARNING|trainer.py:803] 2025-04-26 20:54:06,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:06,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:06,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4903 4192 4155 [WARNING|trainer.py:803] 2025-04-26 20:54:07,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4904 [WARNING|trainer.py:803] 2025-04-26 20:54:07,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:08,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4193 [WARNING|trainer.py:803] 2025-04-26 20:54:08,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4905 4156 [WARNING|trainer.py:803] 2025-04-26 20:54:09,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:09,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:09,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4906 4194 4157 [WARNING|trainer.py:803] 2025-04-26 20:54:10,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4907 [WARNING|trainer.py:803] 2025-04-26 20:54:10,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:11,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4195 [WARNING|trainer.py:803] 2025-04-26 20:54:11,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4908 4158 [WARNING|trainer.py:803] 2025-04-26 20:54:12,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:12,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4909 [WARNING|trainer.py:803] 2025-04-26 20:54:12,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4196 4159 [WARNING|trainer.py:803] 2025-04-26 20:54:13,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4910 [WARNING|trainer.py:803] 2025-04-26 20:54:13,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:14,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4197 [WARNING|trainer.py:803] 2025-04-26 20:54:14,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4911 4160 [WARNING|trainer.py:803] 2025-04-26 20:54:15,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:15,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:54:15,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4912 4198 4161 [WARNING|trainer.py:803] 2025-04-26 20:54:16,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4913 [WARNING|trainer.py:803] 2025-04-26 20:54:17,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:17,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4199 [WARNING|trainer.py:803] 2025-04-26 20:54:17,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4914 4162 [WARNING|trainer.py:803] 2025-04-26 20:54:18,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:18,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:18,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4915 4200 4163 [WARNING|trainer.py:803] 2025-04-26 20:54:19,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4916 [WARNING|trainer.py:803] 2025-04-26 20:54:20,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4201 [WARNING|trainer.py:803] 2025-04-26 20:54:20,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:20,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4917 4164 [WARNING|trainer.py:803] 2025-04-26 20:54:21,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4202 [WARNING|trainer.py:803] 2025-04-26 20:54:21,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4918 [WARNING|trainer.py:803] 2025-04-26 20:54:22,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:22,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4165 [WARNING|trainer.py:803] 2025-04-26 20:54:22,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4203 4919 [WARNING|trainer.py:803] 2025-04-26 20:54:23,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:23,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:23,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4920 4204 4166 [WARNING|trainer.py:803] 2025-04-26 20:54:24,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:25,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4921 [WARNING|trainer.py:803] 2025-04-26 20:54:25,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4205 4167 [WARNING|trainer.py:803] 2025-04-26 20:54:25,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4922 [WARNING|trainer.py:803] 2025-04-26 20:54:26,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:26,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4206 [WARNING|trainer.py:803] 2025-04-26 20:54:26,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4923 4168 [WARNING|trainer.py:803] 2025-04-26 20:54:27,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:27,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4207 [WARNING|trainer.py:803] 2025-04-26 20:54:28,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4924 [WARNING|trainer.py:803] 2025-04-26 20:54:28,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4169 [WARNING|trainer.py:803] 2025-04-26 20:54:28,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4925 4208 [WARNING|trainer.py:803] 2025-04-26 20:54:29,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:29,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:30,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4926 4170 4209 [WARNING|trainer.py:803] 2025-04-26 20:54:31,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:31,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4927 [WARNING|trainer.py:803] 2025-04-26 20:54:31,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4210 4171 [WARNING|trainer.py:803] 2025-04-26 20:54:32,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4928 [WARNING|trainer.py:803] 2025-04-26 20:54:32,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:32,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4211 [WARNING|trainer.py:803] 2025-04-26 20:54:33,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4929 4172 [WARNING|trainer.py:803] 2025-04-26 20:54:33,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:34,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4212 [WARNING|trainer.py:803] 2025-04-26 20:54:34,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4930 4173 [WARNING|trainer.py:803] 2025-04-26 20:54:34,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:35,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4931 4213 [WARNING|trainer.py:803] 2025-04-26 20:54:35,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:36,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:36,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4932 4174 4214 [WARNING|trainer.py:803] 2025-04-26 20:54:37,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:37,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4933 [WARNING|trainer.py:803] 2025-04-26 20:54:37,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4215 4175 [WARNING|trainer.py:803] 2025-04-26 20:54:38,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4934 [WARNING|trainer.py:803] 2025-04-26 20:54:38,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:38,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4216 [WARNING|trainer.py:803] 2025-04-26 20:54:39,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4935 4176 [WARNING|trainer.py:803] 2025-04-26 20:54:39,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:40,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:40,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4936 4217 4177 [WARNING|trainer.py:803] 2025-04-26 20:54:41,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:41,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4937 4218 [WARNING|trainer.py:803] 2025-04-26 20:54:41,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:42,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4178 4938 [WARNING|trainer.py:803] 2025-04-26 20:54:42,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4219 [WARNING|trainer.py:803] 2025-04-26 20:54:43,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:43,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4939 [WARNING|trainer.py:803] 2025-04-26 20:54:43,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4179 4220 [WARNING|trainer.py:803] 2025-04-26 20:54:44,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4940 [WARNING|trainer.py:803] 2025-04-26 20:54:44,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:44,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:45,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4180 4221 4941 [WARNING|trainer.py:803] 2025-04-26 20:54:46,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:46,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:46,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4942 4222 4181 [WARNING|trainer.py:803] 2025-04-26 20:54:47,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4943 [WARNING|trainer.py:803] 2025-04-26 20:54:47,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:47,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4223 [WARNING|trainer.py:803] 2025-04-26 20:54:48,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4182 4944 [WARNING|trainer.py:803] 2025-04-26 20:54:48,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4224 [WARNING|trainer.py:803] 2025-04-26 20:54:49,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:54:49,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4945 4183 [WARNING|trainer.py:803] 2025-04-26 20:54:49,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:50,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4225 4946 [WARNING|trainer.py:803] 2025-04-26 20:54:50,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:51,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 20:54:51,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4184 4947 4226 [WARNING|trainer.py:803] 2025-04-26 20:54:52,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:52,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:52,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4948 4227 4185 [WARNING|trainer.py:803] 2025-04-26 20:54:53,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4949 [WARNING|trainer.py:803] 2025-04-26 20:54:53,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:53,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4228 [WARNING|trainer.py:803] 2025-04-26 20:54:54,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4950 4186 [WARNING|trainer.py:803] 2025-04-26 20:54:54,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4229 [WARNING|trainer.py:803] 2025-04-26 20:54:55,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:55,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4951 [WARNING|trainer.py:803] 2025-04-26 20:54:55,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4187 [WARNING|trainer.py:803] 2025-04-26 20:54:56,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4230 4952 [WARNING|trainer.py:803] 2025-04-26 20:54:57,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:57,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:54:57,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4953 4231 4188 [WARNING|trainer.py:803] 2025-04-26 20:54:58,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:54:58,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4954 [WARNING|trainer.py:803] 2025-04-26 20:54:58,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4232 4189 [WARNING|trainer.py:803] 2025-04-26 20:54:59,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4955 [WARNING|trainer.py:803] 2025-04-26 20:54:59,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4233 [WARNING|trainer.py:803] 2025-04-26 20:55:00,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:00,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4956 [WARNING|trainer.py:803] 2025-04-26 20:55:00,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4190 4234 [WARNING|trainer.py:803] 2025-04-26 20:55:01,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4957 [WARNING|trainer.py:803] 2025-04-26 20:55:01,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:02,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:55:02,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4191 4235 4958 [WARNING|trainer.py:803] 2025-04-26 20:55:03,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:03,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:03,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4959 4236 4192 [WARNING|trainer.py:803] 2025-04-26 20:55:04,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4960 [WARNING|trainer.py:803] 2025-04-26 20:55:04,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:04,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4237 [WARNING|trainer.py:803] 2025-04-26 20:55:05,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4193 4961 [WARNING|trainer.py:803] 2025-04-26 20:55:05,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4238 [WARNING|trainer.py:803] 2025-04-26 20:55:06,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:06,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4962 4194 [WARNING|trainer.py:803] 2025-04-26 20:55:07,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4239 [WARNING|trainer.py:803] 2025-04-26 20:55:07,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4963 [WARNING|trainer.py:803] 2025-04-26 20:55:07,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:08,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:08,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4195 4964 4240 [WARNING|trainer.py:803] 2025-04-26 20:55:09,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:09,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:09,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4965 4241 4196 [WARNING|trainer.py:803] 2025-04-26 20:55:10,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4966 [WARNING|trainer.py:803] 2025-04-26 20:55:10,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:11,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4242 [WARNING|trainer.py:803] 2025-04-26 20:55:11,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4197 4967 [WARNING|trainer.py:803] 2025-04-26 20:55:12,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4243 [WARNING|trainer.py:803] 2025-04-26 20:55:12,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:12,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4968 4198 [WARNING|trainer.py:803] 2025-04-26 20:55:13,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:13,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4244 4969 [WARNING|trainer.py:803] 2025-04-26 20:55:14,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:14,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:14,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4199 4970 4245 [WARNING|trainer.py:803] 2025-04-26 20:55:15,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:15,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:15,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4971 4246 4200 [WARNING|trainer.py:803] 2025-04-26 20:55:16,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4972 [WARNING|trainer.py:803] 2025-04-26 20:55:17,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:17,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4247 4201 [WARNING|trainer.py:803] 2025-04-26 20:55:17,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4973 [WARNING|trainer.py:803] 2025-04-26 20:55:18,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:18,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4248 [WARNING|trainer.py:803] 2025-04-26 20:55:18,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4202 4974 [WARNING|trainer.py:803] 2025-04-26 20:55:19,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:19,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:19,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4249 4975 4203 [WARNING|trainer.py:803] 2025-04-26 20:55:20,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:20,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:20,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4976 4250 4204 [WARNING|trainer.py:803] 2025-04-26 20:55:21,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4977 [WARNING|trainer.py:803] 2025-04-26 20:55:22,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:22,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4251 4205 [WARNING|trainer.py:803] 2025-04-26 20:55:22,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4978 [WARNING|trainer.py:803] 2025-04-26 20:55:23,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:23,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4252 4206 [WARNING|trainer.py:803] 2025-04-26 20:55:23,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4979 [WARNING|trainer.py:803] 2025-04-26 20:55:24,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:24,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:24,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4253 4207 4980 [WARNING|trainer.py:803] 2025-04-26 20:55:25,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:25,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:25,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4981 4254 4208 [WARNING|trainer.py:803] 2025-04-26 20:55:26,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:26,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:27,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4982 4255 4209 [WARNING|trainer.py:803] 2025-04-26 20:55:27,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4983 [WARNING|trainer.py:803] 2025-04-26 20:55:28,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:28,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4256 4210 [WARNING|trainer.py:803] 2025-04-26 20:55:28,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4984 [WARNING|trainer.py:803] 2025-04-26 20:55:29,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:29,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4257 [WARNING|trainer.py:803] 2025-04-26 20:55:29,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4211 4985 [WARNING|trainer.py:803] 2025-04-26 20:55:30,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:30,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:30,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4258 4986 4212 [WARNING|trainer.py:803] 2025-04-26 20:55:31,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:32,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:32,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4987 4259 4213 [WARNING|trainer.py:803] 2025-04-26 20:55:33,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:33,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4988 [WARNING|trainer.py:803] 2025-04-26 20:55:33,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4260 4214 [WARNING|trainer.py:803] 2025-04-26 20:55:34,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4989 [WARNING|trainer.py:803] 2025-04-26 20:55:34,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:34,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4261 4215 [WARNING|trainer.py:803] 2025-04-26 20:55:35,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4990 [WARNING|trainer.py:803] 2025-04-26 20:55:35,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:35,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4262 [WARNING|trainer.py:803] 2025-04-26 20:55:36,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4991 4216 [WARNING|trainer.py:803] 2025-04-26 20:55:36,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:37,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:37,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4263 4992 4217 [WARNING|trainer.py:803] 2025-04-26 20:55:38,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:38,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4993 [WARNING|trainer.py:803] 2025-04-26 20:55:38,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4264 4218 [WARNING|trainer.py:803] 2025-04-26 20:55:39,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:39,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4994 [WARNING|trainer.py:803] 2025-04-26 20:55:39,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4265 4219 [WARNING|trainer.py:803] 2025-04-26 20:55:40,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4995 [WARNING|trainer.py:803] 2025-04-26 20:55:40,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:40,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4266 [WARNING|trainer.py:803] 2025-04-26 20:55:41,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4220 4996 [WARNING|trainer.py:803] 2025-04-26 20:55:41,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:42,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4267 [WARNING|trainer.py:803] 2025-04-26 20:55:42,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4997 4221 [WARNING|trainer.py:803] 2025-04-26 20:55:43,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:43,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:43,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4998 4268 4222 [WARNING|trainer.py:803] 2025-04-26 20:55:44,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:44,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4999 [WARNING|trainer.py:803] 2025-04-26 20:55:44,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4269 4223 [WARNING|trainer.py:803] 2025-04-26 20:55:45,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5000 [WARNING|trainer.py:803] 2025-04-26 20:55:45,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:45,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4270 4224 [WARNING|trainer.py:803] 2025-04-26 20:55:46,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5001 [WARNING|trainer.py:803] 2025-04-26 20:55:46,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:47,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4271 [WARNING|trainer.py:803] 2025-04-26 20:55:47,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4225 5002 [WARNING|trainer.py:803] 2025-04-26 20:55:47,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:48,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:55:48,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4272 5003 4226 [WARNING|trainer.py:803] 2025-04-26 20:55:49,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:49,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:49,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5004 4273 4227 [WARNING|trainer.py:803] 2025-04-26 20:55:50,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:50,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5005 [WARNING|trainer.py:803] 2025-04-26 20:55:50,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4274 4228 [WARNING|trainer.py:803] 2025-04-26 20:55:51,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5006 [WARNING|trainer.py:803] 2025-04-26 20:55:51,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:51,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4275 4229 [WARNING|trainer.py:803] 2025-04-26 20:55:52,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5007 [WARNING|trainer.py:803] 2025-04-26 20:55:52,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:53,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4276 [WARNING|trainer.py:803] 2025-04-26 20:55:53,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5008 4230 [WARNING|trainer.py:803] 2025-04-26 20:55:54,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:54,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:54,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4277 5009 4231 [WARNING|trainer.py:803] 2025-04-26 20:55:55,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:55:55,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5010 [WARNING|trainer.py:803] 2025-04-26 20:55:55,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4278 4232 [WARNING|trainer.py:803] 2025-04-26 20:55:56,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:56,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5011 [WARNING|trainer.py:803] 2025-04-26 20:55:56,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4279 4233 [WARNING|trainer.py:803] 2025-04-26 20:55:57,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5012 [WARNING|trainer.py:803] 2025-04-26 20:55:57,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4280 [WARNING|trainer.py:803] 2025-04-26 20:55:58,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:55:58,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4234 5013 [WARNING|trainer.py:803] 2025-04-26 20:55:58,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4281 [WARNING|trainer.py:803] 2025-04-26 20:55:59,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:55:59,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5014 4235 [WARNING|trainer.py:803] 2025-04-26 20:56:00,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:00,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4282 [WARNING|trainer.py:803] 2025-04-26 20:56:00,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5015 4236 [WARNING|trainer.py:803] 2025-04-26 20:56:01,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:01,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5016 [WARNING|trainer.py:803] 2025-04-26 20:56:01,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4283 4237 [WARNING|trainer.py:803] 2025-04-26 20:56:02,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5017 [WARNING|trainer.py:803] 2025-04-26 20:56:02,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:03,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4284 4238 [WARNING|trainer.py:803] 2025-04-26 20:56:03,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5018 [WARNING|trainer.py:803] 2025-04-26 20:56:03,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:04,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4285 [WARNING|trainer.py:803] 2025-04-26 20:56:04,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5019 4239 [WARNING|trainer.py:803] 2025-04-26 20:56:05,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:05,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:56:05,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4286 5020 4240 [WARNING|trainer.py:803] 2025-04-26 20:56:06,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:56:06,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5021 [WARNING|trainer.py:803] 2025-04-26 20:56:06,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4287 4241 [WARNING|trainer.py:803] 2025-04-26 20:56:07,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:07,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5022 [WARNING|trainer.py:803] 2025-04-26 20:56:07,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4288 4242 [WARNING|trainer.py:803] 2025-04-26 20:56:08,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5023 [WARNING|trainer.py:803] 2025-04-26 20:56:08,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4289 [WARNING|trainer.py:803] 2025-04-26 20:56:09,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:09,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4243 5024 [WARNING|trainer.py:803] 2025-04-26 20:56:10,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4290 [WARNING|trainer.py:803] 2025-04-26 20:56:10,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:10,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5025 4244 [WARNING|trainer.py:803] 2025-04-26 20:56:11,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:11,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4291 [WARNING|trainer.py:803] 2025-04-26 20:56:11,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5026 4245 [WARNING|trainer.py:803] 2025-04-26 20:56:12,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:12,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5027 [WARNING|trainer.py:803] 2025-04-26 20:56:12,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4292 4246 [WARNING|trainer.py:803] 2025-04-26 20:56:13,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:13,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5028 [WARNING|trainer.py:803] 2025-04-26 20:56:14,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4293 4247 [WARNING|trainer.py:803] 2025-04-26 20:56:14,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5029 [WARNING|trainer.py:803] 2025-04-26 20:56:15,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:15,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4294 [WARNING|trainer.py:803] 2025-04-26 20:56:15,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4248 5030 [WARNING|trainer.py:803] 2025-04-26 20:56:16,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:16,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4295 [WARNING|trainer.py:803] 2025-04-26 20:56:16,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5031 4249 [WARNING|trainer.py:803] 2025-04-26 20:56:17,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:17,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:17,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5032 4296 4250 [WARNING|trainer.py:803] 2025-04-26 20:56:18,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:18,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5033 [WARNING|trainer.py:803] 2025-04-26 20:56:19,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4297 4251 [WARNING|trainer.py:803] 2025-04-26 20:56:19,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5034 [WARNING|trainer.py:803] 2025-04-26 20:56:20,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:56:20,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4298 4252 [WARNING|trainer.py:803] 2025-04-26 20:56:20,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5035 [WARNING|trainer.py:803] 2025-04-26 20:56:21,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:21,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4299 [WARNING|trainer.py:803] 2025-04-26 20:56:21,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5036 4253 [WARNING|trainer.py:803] 2025-04-26 20:56:22,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:22,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:22,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4300 5037 4254 [WARNING|trainer.py:803] 2025-04-26 20:56:23,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:23,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5038 [WARNING|trainer.py:803] 2025-04-26 20:56:24,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4301 4255 [WARNING|trainer.py:803] 2025-04-26 20:56:24,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5039 [WARNING|trainer.py:803] 2025-04-26 20:56:25,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:25,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4302 4256 [WARNING|trainer.py:803] 2025-04-26 20:56:25,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5040 [WARNING|trainer.py:803] 2025-04-26 20:56:26,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:26,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4303 [WARNING|trainer.py:803] 2025-04-26 20:56:26,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5041 4257 [WARNING|trainer.py:803] 2025-04-26 20:56:27,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:27,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:27,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4304 5042 4258 [WARNING|trainer.py:803] 2025-04-26 20:56:28,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:28,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5043 [WARNING|trainer.py:803] 2025-04-26 20:56:29,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4305 4259 [WARNING|trainer.py:803] 2025-04-26 20:56:29,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:30,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5044 [WARNING|trainer.py:803] 2025-04-26 20:56:30,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4306 4260 [WARNING|trainer.py:803] 2025-04-26 20:56:30,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5045 [WARNING|trainer.py:803] 2025-04-26 20:56:31,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:31,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4307 [WARNING|trainer.py:803] 2025-04-26 20:56:31,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4261 5046 [WARNING|trainer.py:803] 2025-04-26 20:56:32,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:32,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:32,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4308 5047 4262 [WARNING|trainer.py:803] 2025-04-26 20:56:33,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:33,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:34,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5048 4309 4263 [WARNING|trainer.py:803] 2025-04-26 20:56:34,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:56:35,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5049 [WARNING|trainer.py:803] 2025-04-26 20:56:35,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4310 4264 [WARNING|trainer.py:803] 2025-04-26 20:56:35,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5050 [WARNING|trainer.py:803] 2025-04-26 20:56:36,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:36,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4311 4265 [WARNING|trainer.py:803] 2025-04-26 20:56:37,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5051 [WARNING|trainer.py:803] 2025-04-26 20:56:37,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:37,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4312 [WARNING|trainer.py:803] 2025-04-26 20:56:38,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4266 5052 [WARNING|trainer.py:803] 2025-04-26 20:56:38,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:39,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:39,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4313 5053 4267 [WARNING|trainer.py:803] 2025-04-26 20:56:39,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:40,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5054 [WARNING|trainer.py:803] 2025-04-26 20:56:40,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4314 4268 [WARNING|trainer.py:803] 2025-04-26 20:56:41,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:41,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5055 [WARNING|trainer.py:803] 2025-04-26 20:56:41,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4315 4269 [WARNING|trainer.py:803] 2025-04-26 20:56:42,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5056 [WARNING|trainer.py:803] 2025-04-26 20:56:42,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:42,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4316 [WARNING|trainer.py:803] 2025-04-26 20:56:43,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4270 5057 [WARNING|trainer.py:803] 2025-04-26 20:56:43,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:43,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:44,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4317 5058 4271 [WARNING|trainer.py:803] 2025-04-26 20:56:45,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:45,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:45,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5059 4318 4272 [WARNING|trainer.py:803] 2025-04-26 20:56:46,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:46,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5060 [WARNING|trainer.py:803] 2025-04-26 20:56:46,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4319 4273 [WARNING|trainer.py:803] 2025-04-26 20:56:47,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5061 [WARNING|trainer.py:803] 2025-04-26 20:56:47,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:47,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4320 4274 [WARNING|trainer.py:803] 2025-04-26 20:56:48,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5062 [WARNING|trainer.py:803] 2025-04-26 20:56:48,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:49,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:49,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4321 5063 4275 [WARNING|trainer.py:803] 2025-04-26 20:56:50,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:50,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:50,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5064 4322 4276 [WARNING|trainer.py:803] 2025-04-26 20:56:51,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:51,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5065 [WARNING|trainer.py:803] 2025-04-26 20:56:51,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4323 4277 [WARNING|trainer.py:803] 2025-04-26 20:56:52,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5066 [WARNING|trainer.py:803] 2025-04-26 20:56:52,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:52,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4324 4278 [WARNING|trainer.py:803] 2025-04-26 20:56:53,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5067 [WARNING|trainer.py:803] 2025-04-26 20:56:53,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:54,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4325 [WARNING|trainer.py:803] 2025-04-26 20:56:54,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5068 4279 [WARNING|trainer.py:803] 2025-04-26 20:56:55,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:55,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:55,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4326 5069 4280 [WARNING|trainer.py:803] 2025-04-26 20:56:56,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:56:56,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5070 [WARNING|trainer.py:803] 2025-04-26 20:56:56,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4327 4281 [WARNING|trainer.py:803] 2025-04-26 20:56:57,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5071 [WARNING|trainer.py:803] 2025-04-26 20:56:57,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:56:57,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4328 4282 [WARNING|trainer.py:803] 2025-04-26 20:56:58,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5072 [WARNING|trainer.py:803] 2025-04-26 20:56:58,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:56:59,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4329 [WARNING|trainer.py:803] 2025-04-26 20:56:59,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4283 5073 [WARNING|trainer.py:803] 2025-04-26 20:57:00,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:00,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:00,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4330 5074 4284 [WARNING|trainer.py:803] 2025-04-26 20:57:01,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:01,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5075 [WARNING|trainer.py:803] 2025-04-26 20:57:01,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4331 4285 [WARNING|trainer.py:803] 2025-04-26 20:57:02,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:02,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5076 [WARNING|trainer.py:803] 2025-04-26 20:57:02,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4332 4286 [WARNING|trainer.py:803] 2025-04-26 20:57:03,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5077 [WARNING|trainer.py:803] 2025-04-26 20:57:03,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:57:04,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4333 [WARNING|trainer.py:803] 2025-04-26 20:57:04,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4287 5078 [WARNING|trainer.py:803] 2025-04-26 20:57:04,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:05,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4334 [WARNING|trainer.py:803] 2025-04-26 20:57:05,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5079 4288 [WARNING|trainer.py:803] 2025-04-26 20:57:06,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:06,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:06,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4335 5080 4289 [WARNING|trainer.py:803] 2025-04-26 20:57:07,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:07,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5081 [WARNING|trainer.py:803] 2025-04-26 20:57:07,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4336 4290 [WARNING|trainer.py:803] 2025-04-26 20:57:08,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:08,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5082 [WARNING|trainer.py:803] 2025-04-26 20:57:09,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4337 4291 [WARNING|trainer.py:803] 2025-04-26 20:57:09,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5083 [WARNING|trainer.py:803] 2025-04-26 20:57:09,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:10,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4338 [WARNING|trainer.py:803] 2025-04-26 20:57:10,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4292 5084 [WARNING|trainer.py:803] 2025-04-26 20:57:11,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4339 [WARNING|trainer.py:803] 2025-04-26 20:57:11,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:11,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5085 4293 [WARNING|trainer.py:803] 2025-04-26 20:57:12,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:12,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4340 [WARNING|trainer.py:803] 2025-04-26 20:57:12,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5086 4294 [WARNING|trainer.py:803] 2025-04-26 20:57:13,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:13,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5087 4341 [WARNING|trainer.py:803] 2025-04-26 20:57:14,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4295 [WARNING|trainer.py:803] 2025-04-26 20:57:14,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:14,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5088 4342 [WARNING|trainer.py:803] 2025-04-26 20:57:15,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:15,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4296 5089 [WARNING|trainer.py:803] 2025-04-26 20:57:16,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4343 [WARNING|trainer.py:803] 2025-04-26 20:57:16,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:16,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5090 4297 [WARNING|trainer.py:803] 2025-04-26 20:57:17,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4344 [WARNING|trainer.py:803] 2025-04-26 20:57:17,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:17,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5091 4298 [WARNING|trainer.py:803] 2025-04-26 20:57:18,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:18,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5092 4345 [WARNING|trainer.py:803] 2025-04-26 20:57:19,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4299 [WARNING|trainer.py:803] 2025-04-26 20:57:19,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:19,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5093 4346 [WARNING|trainer.py:803] 2025-04-26 20:57:20,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4300 [WARNING|trainer.py:803] 2025-04-26 20:57:20,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5094 [WARNING|trainer.py:803] 2025-04-26 20:57:20,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4347 [WARNING|trainer.py:803] 2025-04-26 20:57:21,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:21,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5095 4301 [WARNING|trainer.py:803] 2025-04-26 20:57:22,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4348 [WARNING|trainer.py:803] 2025-04-26 20:57:22,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:22,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5096 4302 [WARNING|trainer.py:803] 2025-04-26 20:57:23,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:23,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4349 5097 [WARNING|trainer.py:803] 2025-04-26 20:57:24,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4303 [WARNING|trainer.py:803] 2025-04-26 20:57:24,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:24,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5098 4350 [WARNING|trainer.py:803] 2025-04-26 20:57:25,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4304 [WARNING|trainer.py:803] 2025-04-26 20:57:25,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:25,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5099 4351 [WARNING|trainer.py:803] 2025-04-26 20:57:26,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:26,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4305 5100 [WARNING|trainer.py:803] 2025-04-26 20:57:26,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4352 [WARNING|trainer.py:803] 2025-04-26 20:57:27,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:27,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5101 4306 [WARNING|trainer.py:803] 2025-04-26 20:57:28,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4353 [WARNING|trainer.py:803] 2025-04-26 20:57:28,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:57:29,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5102 4307 [WARNING|trainer.py:803] 2025-04-26 20:57:29,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4354 [WARNING|trainer.py:803] 2025-04-26 20:57:30,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5103 [WARNING|trainer.py:803] 2025-04-26 20:57:30,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4308 [WARNING|trainer.py:803] 2025-04-26 20:57:30,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:31,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4355 5104 [WARNING|trainer.py:803] 2025-04-26 20:57:31,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4309 [WARNING|trainer.py:803] 2025-04-26 20:57:32,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:32,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4356 5105 [WARNING|trainer.py:803] 2025-04-26 20:57:32,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:57:33,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4310 [WARNING|trainer.py:803] 2025-04-26 20:57:33,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4357 5106 [WARNING|trainer.py:803] 2025-04-26 20:57:34,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:34,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4311 [WARNING|trainer.py:803] 2025-04-26 20:57:34,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4358 5107 [WARNING|trainer.py:803] 2025-04-26 20:57:35,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:35,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:35,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4312 5108 4359 [WARNING|trainer.py:803] 2025-04-26 20:57:36,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:36,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:36,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4313 5109 4360 [WARNING|trainer.py:803] 2025-04-26 20:57:37,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:37,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:37,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5110 4314 4361 [WARNING|trainer.py:803] 2025-04-26 20:57:39,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:39,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:39,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5111 4315 4362 [WARNING|trainer.py:803] 2025-04-26 20:57:40,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:57:40,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5112 [WARNING|trainer.py:803] 2025-04-26 20:57:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4316 4363 [WARNING|trainer.py:803] 2025-04-26 20:57:41,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5113 [WARNING|trainer.py:803] 2025-04-26 20:57:41,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:41,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4317 4364 [WARNING|trainer.py:803] 2025-04-26 20:57:42,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5114 [WARNING|trainer.py:803] 2025-04-26 20:57:42,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:42,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4365 4318 [WARNING|trainer.py:803] 2025-04-26 20:57:43,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5115 [WARNING|trainer.py:803] 2025-04-26 20:57:43,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:44,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4366 [WARNING|trainer.py:803] 2025-04-26 20:57:44,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4319 5116 [WARNING|trainer.py:803] 2025-04-26 20:57:45,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:45,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4367 [WARNING|trainer.py:803] 2025-04-26 20:57:45,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4320 5117 [WARNING|trainer.py:803] 2025-04-26 20:57:46,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:46,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:46,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4368 5118 4321 [WARNING|trainer.py:803] 2025-04-26 20:57:47,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:57:47,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:57:47,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4369 5119 4322 [WARNING|trainer.py:803] 2025-04-26 20:57:48,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:48,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4370 [WARNING|trainer.py:803] 2025-04-26 20:57:49,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5120 4323 [WARNING|trainer.py:803] 2025-04-26 20:57:49,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:50,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4371 5121 [WARNING|trainer.py:803] 2025-04-26 20:57:50,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4324 [WARNING|trainer.py:803] 2025-04-26 20:57:51,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:51,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5122 4372 [WARNING|trainer.py:803] 2025-04-26 20:57:51,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4325 [WARNING|trainer.py:803] 2025-04-26 20:57:52,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:57:52,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5123 4373 [WARNING|trainer.py:803] 2025-04-26 20:57:52,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4326 [WARNING|trainer.py:803] 2025-04-26 20:57:53,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:53,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5124 4374 [WARNING|trainer.py:803] 2025-04-26 20:57:54,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4327 [WARNING|trainer.py:803] 2025-04-26 20:57:54,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:54,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5125 4375 [WARNING|trainer.py:803] 2025-04-26 20:57:55,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:55,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4328 [WARNING|trainer.py:803] 2025-04-26 20:57:55,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5126 4376 [WARNING|trainer.py:803] 2025-04-26 20:57:56,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:56,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4329 5127 [WARNING|trainer.py:803] 2025-04-26 20:57:57,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4377 [WARNING|trainer.py:803] 2025-04-26 20:57:57,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:57:57,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4330 5128 [WARNING|trainer.py:803] 2025-04-26 20:57:58,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4378 [WARNING|trainer.py:803] 2025-04-26 20:57:59,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:57:59,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5129 4331 [WARNING|trainer.py:803] 2025-04-26 20:57:59,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4379 [WARNING|trainer.py:803] 2025-04-26 20:58:00,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:58:00,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5130 [WARNING|trainer.py:803] 2025-04-26 20:58:00,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4332 4380 [WARNING|trainer.py:803] 2025-04-26 20:58:01,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5131 [WARNING|trainer.py:803] 2025-04-26 20:58:01,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:58:01,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4333 4381 [WARNING|trainer.py:803] 2025-04-26 20:58:02,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5132 [WARNING|trainer.py:803] 2025-04-26 20:58:02,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:03,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4334 4382 [WARNING|trainer.py:803] 2025-04-26 20:58:03,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5133 [WARNING|trainer.py:803] 2025-04-26 20:58:04,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:04,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4335 4383 [WARNING|trainer.py:803] 2025-04-26 20:58:04,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5134 [WARNING|trainer.py:803] 2025-04-26 20:58:05,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:05,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4336 [WARNING|trainer.py:803] 2025-04-26 20:58:05,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4384 5135 [WARNING|trainer.py:803] 2025-04-26 20:58:06,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:06,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:06,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4337 4385 5136 [WARNING|trainer.py:803] 2025-04-26 20:58:07,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:07,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:08,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4338 4386 5137 [WARNING|trainer.py:803] 2025-04-26 20:58:09,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:58:09,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:09,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4339 4387 5138 [WARNING|trainer.py:803] 2025-04-26 20:58:10,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:10,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:10,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4388 4340 5139 [WARNING|trainer.py:803] 2025-04-26 20:58:11,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:11,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:11,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4389 4341 5140 [WARNING|trainer.py:803] 2025-04-26 20:58:12,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:12,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:12,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4390 5141 4342 [WARNING|trainer.py:803] 2025-04-26 20:58:13,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:13,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:14,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5142 4391 4343 [WARNING|trainer.py:803] 2025-04-26 20:58:15,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:15,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:15,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4392 5143 4344 [WARNING|trainer.py:803] 2025-04-26 20:58:16,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:16,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:58:16,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5144 4393 4345 [WARNING|trainer.py:803] 2025-04-26 20:58:17,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:17,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5145 [WARNING|trainer.py:803] 2025-04-26 20:58:17,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4394 4346 [WARNING|trainer.py:803] 2025-04-26 20:58:18,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:18,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5146 4395 [WARNING|trainer.py:803] 2025-04-26 20:58:19,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4347 [WARNING|trainer.py:803] 2025-04-26 20:58:19,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:19,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5147 4396 [WARNING|trainer.py:803] 2025-04-26 20:58:20,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4348 [WARNING|trainer.py:803] 2025-04-26 20:58:20,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:21,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5148 4397 [WARNING|trainer.py:803] 2025-04-26 20:58:21,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4349 [WARNING|trainer.py:803] 2025-04-26 20:58:21,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5149 [WARNING|trainer.py:803] 2025-04-26 20:58:22,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4398 [WARNING|trainer.py:803] 2025-04-26 20:58:22,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:23,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4350 5150 [WARNING|trainer.py:803] 2025-04-26 20:58:23,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4399 [WARNING|trainer.py:803] 2025-04-26 20:58:23,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:24,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4351 5151 [WARNING|trainer.py:803] 2025-04-26 20:58:24,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4400 [WARNING|trainer.py:803] 2025-04-26 20:58:25,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:25,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4352 5152 [WARNING|trainer.py:803] 2025-04-26 20:58:25,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4401 [WARNING|trainer.py:803] 2025-04-26 20:58:26,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:26,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4353 5153 [WARNING|trainer.py:803] 2025-04-26 20:58:26,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4402 [WARNING|trainer.py:803] 2025-04-26 20:58:27,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:27,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4354 5154 [WARNING|trainer.py:803] 2025-04-26 20:58:28,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4403 [WARNING|trainer.py:803] 2025-04-26 20:58:28,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:28,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5155 4355 [WARNING|trainer.py:803] 2025-04-26 20:58:29,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4404 [WARNING|trainer.py:803] 2025-04-26 20:58:29,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:29,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5156 4356 [WARNING|trainer.py:803] 2025-04-26 20:58:30,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4405 [WARNING|trainer.py:803] 2025-04-26 20:58:31,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:31,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5157 4357 [WARNING|trainer.py:803] 2025-04-26 20:58:31,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4406 [WARNING|trainer.py:803] 2025-04-26 20:58:32,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5158 [WARNING|trainer.py:803] 2025-04-26 20:58:32,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4358 [WARNING|trainer.py:803] 2025-04-26 20:58:32,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4407 [WARNING|trainer.py:803] 2025-04-26 20:58:33,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5159 [WARNING|trainer.py:803] 2025-04-26 20:58:33,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4359 [WARNING|trainer.py:803] 2025-04-26 20:58:34,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:34,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4408 5160 [WARNING|trainer.py:803] 2025-04-26 20:58:34,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4360 [WARNING|trainer.py:803] 2025-04-26 20:58:35,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:35,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4409 5161 [WARNING|trainer.py:803] 2025-04-26 20:58:35,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4361 [WARNING|trainer.py:803] 2025-04-26 20:58:36,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:36,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4410 5162 [WARNING|trainer.py:803] 2025-04-26 20:58:37,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4362 [WARNING|trainer.py:803] 2025-04-26 20:58:37,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:37,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5163 4411 [WARNING|trainer.py:803] 2025-04-26 20:58:38,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4363 [WARNING|trainer.py:803] 2025-04-26 20:58:38,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:58:38,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5164 4412 [WARNING|trainer.py:803] 2025-04-26 20:58:39,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:39,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4364 [WARNING|trainer.py:803] 2025-04-26 20:58:40,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5165 4413 [WARNING|trainer.py:803] 2025-04-26 20:58:40,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:41,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4365 [WARNING|trainer.py:803] 2025-04-26 20:58:41,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5166 4414 [WARNING|trainer.py:803] 2025-04-26 20:58:41,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:42,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4366 [WARNING|trainer.py:803] 2025-04-26 20:58:42,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5167 4415 [WARNING|trainer.py:803] 2025-04-26 20:58:43,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:43,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4367 [WARNING|trainer.py:803] 2025-04-26 20:58:43,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5168 4416 [WARNING|trainer.py:803] 2025-04-26 20:58:44,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:44,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4368 5169 [WARNING|trainer.py:803] 2025-04-26 20:58:44,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4417 [WARNING|trainer.py:803] 2025-04-26 20:58:45,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:58:45,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4369 5170 [WARNING|trainer.py:803] 2025-04-26 20:58:45,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4418 [WARNING|trainer.py:803] 2025-04-26 20:58:46,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:46,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5171 4370 [WARNING|trainer.py:803] 2025-04-26 20:58:47,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4419 [WARNING|trainer.py:803] 2025-04-26 20:58:47,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:58:47,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5172 4371 [WARNING|trainer.py:803] 2025-04-26 20:58:48,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4420 [WARNING|trainer.py:803] 2025-04-26 20:58:49,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:49,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5173 4372 [WARNING|trainer.py:803] 2025-04-26 20:58:49,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4421 [WARNING|trainer.py:803] 2025-04-26 20:58:50,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:50,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5174 4373 [WARNING|trainer.py:803] 2025-04-26 20:58:50,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4422 [WARNING|trainer.py:803] 2025-04-26 20:58:51,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5175 [WARNING|trainer.py:803] 2025-04-26 20:58:51,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4374 [WARNING|trainer.py:803] 2025-04-26 20:58:51,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4423 [WARNING|trainer.py:803] 2025-04-26 20:58:52,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5176 [WARNING|trainer.py:803] 2025-04-26 20:58:52,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:53,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4375 4424 [WARNING|trainer.py:803] 2025-04-26 20:58:53,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5177 [WARNING|trainer.py:803] 2025-04-26 20:58:53,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:54,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4376 4425 [WARNING|trainer.py:803] 2025-04-26 20:58:54,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5178 [WARNING|trainer.py:803] 2025-04-26 20:58:55,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:55,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4377 4426 [WARNING|trainer.py:803] 2025-04-26 20:58:55,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5179 [WARNING|trainer.py:803] 2025-04-26 20:58:56,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:58:56,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4378 4427 [WARNING|trainer.py:803] 2025-04-26 20:58:56,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5180 [WARNING|trainer.py:803] 2025-04-26 20:58:57,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:57,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4379 [WARNING|trainer.py:803] 2025-04-26 20:58:58,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4428 5181 [WARNING|trainer.py:803] 2025-04-26 20:58:58,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:58:58,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4380 [WARNING|trainer.py:803] 2025-04-26 20:58:59,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4429 5182 [WARNING|trainer.py:803] 2025-04-26 20:58:59,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:00,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4381 [WARNING|trainer.py:803] 2025-04-26 20:59:00,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4430 5183 [WARNING|trainer.py:803] 2025-04-26 20:59:01,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:01,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4382 [WARNING|trainer.py:803] 2025-04-26 20:59:01,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4431 5184 [WARNING|trainer.py:803] 2025-04-26 20:59:02,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:02,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4383 4432 [WARNING|trainer.py:803] 2025-04-26 20:59:02,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5185 [WARNING|trainer.py:803] 2025-04-26 20:59:03,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:03,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4384 4433 [WARNING|trainer.py:803] 2025-04-26 20:59:04,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5186 [WARNING|trainer.py:803] 2025-04-26 20:59:04,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:04,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4385 [WARNING|trainer.py:803] 2025-04-26 20:59:05,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4434 5187 [WARNING|trainer.py:803] 2025-04-26 20:59:05,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:06,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4386 [WARNING|trainer.py:803] 2025-04-26 20:59:06,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4435 5188 [WARNING|trainer.py:803] 2025-04-26 20:59:07,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:07,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4387 [WARNING|trainer.py:803] 2025-04-26 20:59:07,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4436 5189 [WARNING|trainer.py:803] 2025-04-26 20:59:08,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:08,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:08,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4388 4437 5190 [WARNING|trainer.py:803] 2025-04-26 20:59:09,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:09,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:09,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4389 4438 5191 [WARNING|trainer.py:803] 2025-04-26 20:59:10,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:10,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:10,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4390 4439 5192 [WARNING|trainer.py:803] 2025-04-26 20:59:11,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:11,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:12,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4440 4391 5193 [WARNING|trainer.py:803] 2025-04-26 20:59:13,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:13,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:13,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4441 5194 4392 [WARNING|trainer.py:803] 2025-04-26 20:59:14,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:14,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:14,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5195 4442 4393 [WARNING|trainer.py:803] 2025-04-26 20:59:15,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:15,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:15,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5196 4443 4394 [WARNING|trainer.py:803] 2025-04-26 20:59:16,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:16,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:16,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5197 4444 4395 [WARNING|trainer.py:803] 2025-04-26 20:59:17,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:59:17,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:17,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5198 4445 4396 [WARNING|trainer.py:803] 2025-04-26 20:59:18,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:19,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:19,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5199 4446 4397 [WARNING|trainer.py:803] 2025-04-26 20:59:20,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:20,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:20,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5200 4447 4398 [WARNING|trainer.py:803] 2025-04-26 20:59:21,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:59:21,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:21,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5201 4448 4399 [WARNING|trainer.py:803] 2025-04-26 20:59:22,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:22,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:22,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5202 4449 4400 [WARNING|trainer.py:803] 2025-04-26 20:59:23,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:23,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:23,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5203 4450 4401 [WARNING|trainer.py:803] 2025-04-26 20:59:24,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:24,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:25,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5204 4451 4402 [WARNING|trainer.py:803] 2025-04-26 20:59:26,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:26,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5205 [WARNING|trainer.py:803] 2025-04-26 20:59:26,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4452 4403 [WARNING|trainer.py:803] 2025-04-26 20:59:27,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:27,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5206 [WARNING|trainer.py:803] 2025-04-26 20:59:27,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4453 4404 [WARNING|trainer.py:803] 2025-04-26 20:59:28,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 20:59:28,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5207 [WARNING|trainer.py:803] 2025-04-26 20:59:28,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4454 4405 [WARNING|trainer.py:803] 2025-04-26 20:59:29,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:29,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5208 [WARNING|trainer.py:803] 2025-04-26 20:59:29,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4455 4406 [WARNING|trainer.py:803] 2025-04-26 20:59:30,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:30,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5209 [WARNING|trainer.py:803] 2025-04-26 20:59:31,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4456 4407 [WARNING|trainer.py:803] 2025-04-26 20:59:31,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:32,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5210 [WARNING|trainer.py:803] 2025-04-26 20:59:32,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4457 4408 [WARNING|trainer.py:803] 2025-04-26 20:59:32,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:33,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5211 4458 [WARNING|trainer.py:803] 2025-04-26 20:59:33,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4409 [WARNING|trainer.py:803] 2025-04-26 20:59:34,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 20:59:34,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5212 4459 [WARNING|trainer.py:803] 2025-04-26 20:59:34,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4410 [WARNING|trainer.py:803] 2025-04-26 20:59:35,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5213 [WARNING|trainer.py:803] 2025-04-26 20:59:35,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4460 [WARNING|trainer.py:803] 2025-04-26 20:59:35,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4411 [WARNING|trainer.py:803] 2025-04-26 20:59:36,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5214 [WARNING|trainer.py:803] 2025-04-26 20:59:36,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4461 [WARNING|trainer.py:803] 2025-04-26 20:59:37,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4412 [WARNING|trainer.py:803] 2025-04-26 20:59:37,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:37,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5215 4462 [WARNING|trainer.py:803] 2025-04-26 20:59:38,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4413 [WARNING|trainer.py:803] 2025-04-26 20:59:38,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:39,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5216 4463 [WARNING|trainer.py:803] 2025-04-26 20:59:39,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4414 [WARNING|trainer.py:803] 2025-04-26 20:59:39,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:40,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5217 4464 [WARNING|trainer.py:803] 2025-04-26 20:59:40,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4415 [WARNING|trainer.py:803] 2025-04-26 20:59:41,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 20:59:41,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5218 4465 [WARNING|trainer.py:803] 2025-04-26 20:59:41,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4416 [WARNING|trainer.py:803] 2025-04-26 20:59:42,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 20:59:42,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5219 4466 [WARNING|trainer.py:803] 2025-04-26 20:59:43,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4417 [WARNING|trainer.py:803] 2025-04-26 20:59:43,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:43,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5220 4467 [WARNING|trainer.py:803] 2025-04-26 20:59:44,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4418 [WARNING|trainer.py:803] 2025-04-26 20:59:44,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 20:59:44,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5221 4468 [WARNING|trainer.py:803] 2025-04-26 20:59:45,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:45,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4419 [WARNING|trainer.py:803] 2025-04-26 20:59:46,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5222 4469 [WARNING|trainer.py:803] 2025-04-26 20:59:46,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:46,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4420 [WARNING|trainer.py:803] 2025-04-26 20:59:47,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5223 4470 [WARNING|trainer.py:803] 2025-04-26 20:59:47,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:48,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4421 [WARNING|trainer.py:803] 2025-04-26 20:59:48,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5224 4471 [WARNING|trainer.py:803] 2025-04-26 20:59:49,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:49,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4422 [WARNING|trainer.py:803] 2025-04-26 20:59:49,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5225 4472 [WARNING|trainer.py:803] 2025-04-26 20:59:50,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:50,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4423 [WARNING|trainer.py:803] 2025-04-26 20:59:50,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5226 4473 [WARNING|trainer.py:803] 2025-04-26 20:59:51,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:51,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4424 [WARNING|trainer.py:803] 2025-04-26 20:59:51,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5227 4474 [WARNING|trainer.py:803] 2025-04-26 20:59:52,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:52,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4425 [WARNING|trainer.py:803] 2025-04-26 20:59:53,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5228 4475 [WARNING|trainer.py:803] 2025-04-26 20:59:53,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:54,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4426 [WARNING|trainer.py:803] 2025-04-26 20:59:54,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5229 4476 [WARNING|trainer.py:803] 2025-04-26 20:59:55,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:55,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4427 [WARNING|trainer.py:803] 2025-04-26 20:59:55,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5230 4477 [WARNING|trainer.py:803] 2025-04-26 20:59:56,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:56,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4428 [WARNING|trainer.py:803] 2025-04-26 20:59:56,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5231 4478 [WARNING|trainer.py:803] 2025-04-26 20:59:57,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:57,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4429 [WARNING|trainer.py:803] 2025-04-26 20:59:57,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5232 4479 [WARNING|trainer.py:803] 2025-04-26 20:59:58,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 20:59:58,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4430 [WARNING|trainer.py:803] 2025-04-26 20:59:58,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5233 4480 [WARNING|trainer.py:803] 2025-04-26 20:59:59,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 20:59:59,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:00,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4431 5234 4481 [WARNING|trainer.py:803] 2025-04-26 21:00:00,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:01,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:01,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4432 5235 4482 [WARNING|trainer.py:803] 2025-04-26 21:00:02,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:02,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:02,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4433 5236 4483 [WARNING|trainer.py:803] 2025-04-26 21:00:03,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:03,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:03,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5237 4434 4484 [WARNING|trainer.py:803] 2025-04-26 21:00:04,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:04,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:04,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5238 4435 4485 [WARNING|trainer.py:803] 2025-04-26 21:00:05,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:05,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:06,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5239 4436 4486 [WARNING|trainer.py:803] 2025-04-26 21:00:06,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:07,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:07,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5240 4437 4487 [WARNING|trainer.py:803] 2025-04-26 21:00:08,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:08,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:08,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5241 4438 4488 [WARNING|trainer.py:803] 2025-04-26 21:00:09,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:09,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5242 [WARNING|trainer.py:803] 2025-04-26 21:00:09,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4439 4489 [WARNING|trainer.py:803] 2025-04-26 21:00:10,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:10,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5243 [WARNING|trainer.py:803] 2025-04-26 21:00:10,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4440 4490 [WARNING|trainer.py:803] 2025-04-26 21:00:11,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:11,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5244 [WARNING|trainer.py:803] 2025-04-26 21:00:11,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4441 4491 [WARNING|trainer.py:803] 2025-04-26 21:00:12,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:13,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:13,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5245 4442 4492 [WARNING|trainer.py:803] 2025-04-26 21:00:14,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:14,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:14,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5246 4443 4493 [WARNING|trainer.py:803] 2025-04-26 21:00:15,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:15,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:15,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5247 4444 4494 [WARNING|trainer.py:803] 2025-04-26 21:00:16,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:16,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5248 [WARNING|trainer.py:803] 2025-04-26 21:00:16,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4445 4495 [WARNING|trainer.py:803] 2025-04-26 21:00:17,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:17,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5249 [WARNING|trainer.py:803] 2025-04-26 21:00:18,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4446 4496 [WARNING|trainer.py:803] 2025-04-26 21:00:18,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5250 [WARNING|trainer.py:803] 2025-04-26 21:00:19,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:19,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4447 4497 [WARNING|trainer.py:803] 2025-04-26 21:00:19,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5251 [WARNING|trainer.py:803] 2025-04-26 21:00:20,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:20,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4448 4498 [WARNING|trainer.py:803] 2025-04-26 21:00:20,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5252 [WARNING|trainer.py:803] 2025-04-26 21:00:21,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:21,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4449 4499 [WARNING|trainer.py:803] 2025-04-26 21:00:22,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5253 [WARNING|trainer.py:803] 2025-04-26 21:00:22,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:22,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4450 [WARNING|trainer.py:803] 2025-04-26 21:00:23,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4500 5254 [WARNING|trainer.py:803] 2025-04-26 21:00:23,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:24,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4451 [WARNING|trainer.py:803] 2025-04-26 21:00:24,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4501 5255 [WARNING|trainer.py:803] 2025-04-26 21:00:25,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:25,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4452 [WARNING|trainer.py:803] 2025-04-26 21:00:25,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4502 5256 [WARNING|trainer.py:803] 2025-04-26 21:00:26,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:26,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4453 [WARNING|trainer.py:803] 2025-04-26 21:00:26,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4503 5257 [WARNING|trainer.py:803] 2025-04-26 21:00:27,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:27,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4454 [WARNING|trainer.py:803] 2025-04-26 21:00:27,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4504 5258 [WARNING|trainer.py:803] 2025-04-26 21:00:28,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:28,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4455 [WARNING|trainer.py:803] 2025-04-26 21:00:29,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4505 5259 [WARNING|trainer.py:803] 2025-04-26 21:00:29,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:30,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4456 [WARNING|trainer.py:803] 2025-04-26 21:00:30,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4506 5260 [WARNING|trainer.py:803] 2025-04-26 21:00:30,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:31,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4457 [WARNING|trainer.py:803] 2025-04-26 21:00:31,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4507 5261 [WARNING|trainer.py:803] 2025-04-26 21:00:32,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:32,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:32,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4458 5262 4508 [WARNING|trainer.py:803] 2025-04-26 21:00:33,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 21:00:33,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:33,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4459 5263 4509 [WARNING|trainer.py:803] 2025-04-26 21:00:34,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:34,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:34,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4460 5264 4510 [WARNING|trainer.py:803] 2025-04-26 21:00:35,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:36,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4461 5265 [WARNING|trainer.py:803] 2025-04-26 21:00:36,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4511 [WARNING|trainer.py:803] 2025-04-26 21:00:36,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:37,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4462 5266 [WARNING|trainer.py:803] 2025-04-26 21:00:37,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4512 [WARNING|trainer.py:803] 2025-04-26 21:00:38,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:38,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4463 5267 [WARNING|trainer.py:803] 2025-04-26 21:00:38,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4513 [WARNING|trainer.py:803] 2025-04-26 21:00:39,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:39,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5268 4464 [WARNING|trainer.py:803] 2025-04-26 21:00:39,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4514 [WARNING|trainer.py:803] 2025-04-26 21:00:40,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:40,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5269 4465 [WARNING|trainer.py:803] 2025-04-26 21:00:41,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4515 [WARNING|trainer.py:803] 2025-04-26 21:00:41,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:41,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5270 4466 [WARNING|trainer.py:803] 2025-04-26 21:00:42,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4516 [WARNING|trainer.py:803] 2025-04-26 21:00:42,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:42,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5271 4467 [WARNING|trainer.py:803] 2025-04-26 21:00:43,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4517 [WARNING|trainer.py:803] 2025-04-26 21:00:44,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:44,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5272 4468 [WARNING|trainer.py:803] 2025-04-26 21:00:44,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4518 [WARNING|trainer.py:803] 2025-04-26 21:00:45,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:00:45,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5273 4469 [WARNING|trainer.py:803] 2025-04-26 21:00:45,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4519 [WARNING|trainer.py:803] 2025-04-26 21:00:46,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5274 [WARNING|trainer.py:803] 2025-04-26 21:00:46,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4470 [WARNING|trainer.py:803] 2025-04-26 21:00:47,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:47,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4520 5275 [WARNING|trainer.py:803] 2025-04-26 21:00:47,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4471 [WARNING|trainer.py:803] 2025-04-26 21:00:48,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:48,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4521 5276 [WARNING|trainer.py:803] 2025-04-26 21:00:48,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4472 [WARNING|trainer.py:803] 2025-04-26 21:00:49,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:49,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4522 5277 [WARNING|trainer.py:803] 2025-04-26 21:00:50,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4473 [WARNING|trainer.py:803] 2025-04-26 21:00:50,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:50,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5278 4523 [WARNING|trainer.py:803] 2025-04-26 21:00:51,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4474 [WARNING|trainer.py:803] 2025-04-26 21:00:51,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:51,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5279 4524 [WARNING|trainer.py:803] 2025-04-26 21:00:52,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4475 [WARNING|trainer.py:803] 2025-04-26 21:00:52,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:53,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5280 4525 [WARNING|trainer.py:803] 2025-04-26 21:00:53,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4476 [WARNING|trainer.py:803] 2025-04-26 21:00:54,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:00:54,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5281 4526 [WARNING|trainer.py:803] 2025-04-26 21:00:54,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4477 [WARNING|trainer.py:803] 2025-04-26 21:00:55,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:55,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5282 4527 [WARNING|trainer.py:803] 2025-04-26 21:00:56,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4478 [WARNING|trainer.py:803] 2025-04-26 21:00:56,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5283 [WARNING|trainer.py:803] 2025-04-26 21:00:56,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4528 [WARNING|trainer.py:803] 2025-04-26 21:00:57,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:57,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4479 5284 [WARNING|trainer.py:803] 2025-04-26 21:00:57,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4529 [WARNING|trainer.py:803] 2025-04-26 21:00:58,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:00:58,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4480 5285 [WARNING|trainer.py:803] 2025-04-26 21:00:59,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4530 [WARNING|trainer.py:803] 2025-04-26 21:00:59,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:00:59,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4481 5286 [WARNING|trainer.py:803] 2025-04-26 21:01:00,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4531 [WARNING|trainer.py:803] 2025-04-26 21:01:00,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:00,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4482 5287 [WARNING|trainer.py:803] 2025-04-26 21:01:01,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4532 [WARNING|trainer.py:803] 2025-04-26 21:01:01,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:02,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4483 5288 [WARNING|trainer.py:803] 2025-04-26 21:01:02,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4533 [WARNING|trainer.py:803] 2025-04-26 21:01:03,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:03,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4484 5289 [WARNING|trainer.py:803] 2025-04-26 21:01:03,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4534 [WARNING|trainer.py:803] 2025-04-26 21:01:04,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:04,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4485 5290 [WARNING|trainer.py:803] 2025-04-26 21:01:04,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4535 [WARNING|trainer.py:803] 2025-04-26 21:01:05,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:05,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4486 5291 [WARNING|trainer.py:803] 2025-04-26 21:01:06,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4536 [WARNING|trainer.py:803] 2025-04-26 21:01:06,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:06,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5292 4487 [WARNING|trainer.py:803] 2025-04-26 21:01:07,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4537 [WARNING|trainer.py:803] 2025-04-26 21:01:07,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:07,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5293 4488 [WARNING|trainer.py:803] 2025-04-26 21:01:08,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4538 [WARNING|trainer.py:803] 2025-04-26 21:01:09,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:09,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5294 4489 [WARNING|trainer.py:803] 2025-04-26 21:01:09,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4539 [WARNING|trainer.py:803] 2025-04-26 21:01:10,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:01:10,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5295 4490 [WARNING|trainer.py:803] 2025-04-26 21:01:10,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4540 [WARNING|trainer.py:803] 2025-04-26 21:01:11,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:11,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5296 4491 [WARNING|trainer.py:803] 2025-04-26 21:01:12,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4541 [WARNING|trainer.py:803] 2025-04-26 21:01:12,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:12,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5297 4492 [WARNING|trainer.py:803] 2025-04-26 21:01:13,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4542 [WARNING|trainer.py:803] 2025-04-26 21:01:13,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:01:13,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5298 [WARNING|trainer.py:803] 2025-04-26 21:01:14,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4493 4543 [WARNING|trainer.py:803] 2025-04-26 21:01:14,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5299 [WARNING|trainer.py:803] 2025-04-26 21:01:15,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:15,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4494 4544 [WARNING|trainer.py:803] 2025-04-26 21:01:16,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5300 [WARNING|trainer.py:803] 2025-04-26 21:01:16,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:16,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4495 4545 [WARNING|trainer.py:803] 2025-04-26 21:01:17,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5301 [WARNING|trainer.py:803] 2025-04-26 21:01:17,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:17,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4496 4546 [WARNING|trainer.py:803] 2025-04-26 21:01:18,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5302 [WARNING|trainer.py:803] 2025-04-26 21:01:18,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:19,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4497 4547 [WARNING|trainer.py:803] 2025-04-26 21:01:19,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5303 [WARNING|trainer.py:803] 2025-04-26 21:01:20,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:20,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4498 4548 [WARNING|trainer.py:803] 2025-04-26 21:01:20,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5304 [WARNING|trainer.py:803] 2025-04-26 21:01:21,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:21,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4499 4549 [WARNING|trainer.py:803] 2025-04-26 21:01:21,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5305 [WARNING|trainer.py:803] 2025-04-26 21:01:22,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:22,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:23,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4550 4500 5306 [WARNING|trainer.py:803] 2025-04-26 21:01:23,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:23,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:24,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4551 4501 5307 [WARNING|trainer.py:803] 2025-04-26 21:01:25,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:25,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:01:25,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4552 4502 5308 [WARNING|trainer.py:803] 2025-04-26 21:01:26,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:26,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:26,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4553 4503 5309 [WARNING|trainer.py:803] 2025-04-26 21:01:27,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:27,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:27,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4554 4504 5310 [WARNING|trainer.py:803] 2025-04-26 21:01:28,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:28,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:28,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4555 5311 4505 [WARNING|trainer.py:803] 2025-04-26 21:01:29,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:29,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:30,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5312 4556 4506 [WARNING|trainer.py:803] 2025-04-26 21:01:31,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:31,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:31,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5313 4557 4507 [WARNING|trainer.py:803] 2025-04-26 21:01:32,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:32,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:32,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5314 4558 4508 [WARNING|trainer.py:803] 2025-04-26 21:01:33,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:01:33,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:01:33,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5315 4559 4509 [WARNING|trainer.py:803] 2025-04-26 21:01:34,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:34,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5316 [WARNING|trainer.py:803] 2025-04-26 21:01:34,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4560 4510 [WARNING|trainer.py:803] 2025-04-26 21:01:35,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:35,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5317 [WARNING|trainer.py:803] 2025-04-26 21:01:36,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4561 4511 [WARNING|trainer.py:803] 2025-04-26 21:01:36,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5318 [WARNING|trainer.py:803] 2025-04-26 21:01:37,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:37,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4562 4512 [WARNING|trainer.py:803] 2025-04-26 21:01:37,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5319 [WARNING|trainer.py:803] 2025-04-26 21:01:38,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:38,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4563 4513 [WARNING|trainer.py:803] 2025-04-26 21:01:38,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 5320 [WARNING|trainer.py:803] 2025-04-26 21:01:39,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:39,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4564 4514 [WARNING|trainer.py:803] 2025-04-26 21:01:40,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5321 [WARNING|trainer.py:803] 2025-04-26 21:01:40,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:40,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4565 4515 [WARNING|trainer.py:803] 2025-04-26 21:01:41,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5322 [WARNING|trainer.py:803] 2025-04-26 21:01:41,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:42,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4566 4516 [WARNING|trainer.py:803] 2025-04-26 21:01:42,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5323 [WARNING|trainer.py:803] 2025-04-26 21:01:42,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:43,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4567 4517 [WARNING|trainer.py:803] 2025-04-26 21:01:43,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5324 [WARNING|trainer.py:803] 2025-04-26 21:01:44,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:44,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4568 4518 [WARNING|trainer.py:803] 2025-04-26 21:01:44,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5325 [WARNING|trainer.py:803] 2025-04-26 21:01:45,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:45,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4569 [WARNING|trainer.py:803] 2025-04-26 21:01:45,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4519 5326 [WARNING|trainer.py:803] 2025-04-26 21:01:46,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:46,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4570 [WARNING|trainer.py:803] 2025-04-26 21:01:47,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4520 5327 [WARNING|trainer.py:803] 2025-04-26 21:01:47,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:47,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4571 [WARNING|trainer.py:803] 2025-04-26 21:01:48,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4521 5328 [WARNING|trainer.py:803] 2025-04-26 21:01:48,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:49,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4572 [WARNING|trainer.py:803] 2025-04-26 21:01:49,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4522 5329 [WARNING|trainer.py:803] 2025-04-26 21:01:50,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:50,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4573 [WARNING|trainer.py:803] 2025-04-26 21:01:50,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5330 4523 [WARNING|trainer.py:803] 2025-04-26 21:01:51,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:51,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:51,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4574 5331 4524 [WARNING|trainer.py:803] 2025-04-26 21:01:52,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:52,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:01:52,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4575 5332 4525 [WARNING|trainer.py:803] 2025-04-26 21:01:53,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:53,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4576 [WARNING|trainer.py:803] 2025-04-26 21:01:53,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5333 4526 [WARNING|trainer.py:803] 2025-04-26 21:01:54,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:55,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4577 [WARNING|trainer.py:803] 2025-04-26 21:01:55,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5334 4527 [WARNING|trainer.py:803] 2025-04-26 21:01:55,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:56,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4578 [WARNING|trainer.py:803] 2025-04-26 21:01:56,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5335 4528 [WARNING|trainer.py:803] 2025-04-26 21:01:57,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:01:57,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4579 [WARNING|trainer.py:803] 2025-04-26 21:01:57,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5336 4529 [WARNING|trainer.py:803] 2025-04-26 21:01:58,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:58,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4580 [WARNING|trainer.py:803] 2025-04-26 21:01:58,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5337 4530 [WARNING|trainer.py:803] 2025-04-26 21:01:59,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:01:59,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4581 5338 [WARNING|trainer.py:803] 2025-04-26 21:01:59,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4531 [WARNING|trainer.py:803] 2025-04-26 21:02:00,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:00,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5339 4582 [WARNING|trainer.py:803] 2025-04-26 21:02:01,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4532 [WARNING|trainer.py:803] 2025-04-26 21:02:01,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:02:01,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5340 4583 [WARNING|trainer.py:803] 2025-04-26 21:02:02,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4533 [WARNING|trainer.py:803] 2025-04-26 21:02:02,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:03,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5341 4584 [WARNING|trainer.py:803] 2025-04-26 21:02:03,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4534 [WARNING|trainer.py:803] 2025-04-26 21:02:04,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:04,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5342 4585 [WARNING|trainer.py:803] 2025-04-26 21:02:04,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4535 [WARNING|trainer.py:803] 2025-04-26 21:02:05,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:05,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5343 [WARNING|trainer.py:803] 2025-04-26 21:02:05,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4586 4536 [WARNING|trainer.py:803] 2025-04-26 21:02:06,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:06,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5344 4587 [WARNING|trainer.py:803] 2025-04-26 21:02:07,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4537 [WARNING|trainer.py:803] 2025-04-26 21:02:07,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:07,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5345 4588 [WARNING|trainer.py:803] 2025-04-26 21:02:08,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4538 [WARNING|trainer.py:803] 2025-04-26 21:02:08,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:08,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5346 4589 [WARNING|trainer.py:803] 2025-04-26 21:02:09,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4539 [WARNING|trainer.py:803] 2025-04-26 21:02:09,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:10,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5347 4590 [WARNING|trainer.py:803] 2025-04-26 21:02:10,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4540 [WARNING|trainer.py:803] 2025-04-26 21:02:11,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:11,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5348 4591 [WARNING|trainer.py:803] 2025-04-26 21:02:11,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4541 [WARNING|trainer.py:803] 2025-04-26 21:02:12,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5349 [WARNING|trainer.py:803] 2025-04-26 21:02:12,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4592 [WARNING|trainer.py:803] 2025-04-26 21:02:12,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4542 [WARNING|trainer.py:803] 2025-04-26 21:02:13,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5350 [WARNING|trainer.py:803] 2025-04-26 21:02:13,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4593 [WARNING|trainer.py:803] 2025-04-26 21:02:14,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4543 [WARNING|trainer.py:803] 2025-04-26 21:02:14,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5351 [WARNING|trainer.py:803] 2025-04-26 21:02:14,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4594 [WARNING|trainer.py:803] 2025-04-26 21:02:15,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:15,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4544 5352 [WARNING|trainer.py:803] 2025-04-26 21:02:16,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4595 [WARNING|trainer.py:803] 2025-04-26 21:02:16,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:16,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4545 5353 [WARNING|trainer.py:803] 2025-04-26 21:02:17,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4596 [WARNING|trainer.py:803] 2025-04-26 21:02:17,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:17,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4546 5354 [WARNING|trainer.py:803] 2025-04-26 21:02:18,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4597 [WARNING|trainer.py:803] 2025-04-26 21:02:18,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:18,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4547 5355 [WARNING|trainer.py:803] 2025-04-26 21:02:19,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4598 [WARNING|trainer.py:803] 2025-04-26 21:02:19,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:20,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4548 5356 [WARNING|trainer.py:803] 2025-04-26 21:02:20,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4599 [WARNING|trainer.py:803] 2025-04-26 21:02:21,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:21,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4549 5357 [WARNING|trainer.py:803] 2025-04-26 21:02:21,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4600 [WARNING|trainer.py:803] 2025-04-26 21:02:22,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:22,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4550 5358 [WARNING|trainer.py:803] 2025-04-26 21:02:23,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4601 [WARNING|trainer.py:803] 2025-04-26 21:02:23,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:23,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5359 4551 [WARNING|trainer.py:803] 2025-04-26 21:02:24,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:02:24,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4602 [WARNING|trainer.py:803] 2025-04-26 21:02:24,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5360 4552 [WARNING|trainer.py:803] 2025-04-26 21:02:25,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:25,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4603 [WARNING|trainer.py:803] 2025-04-26 21:02:25,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5361 4553 [WARNING|trainer.py:803] 2025-04-26 21:02:26,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:27,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4604 [WARNING|trainer.py:803] 2025-04-26 21:02:27,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5362 4554 [WARNING|trainer.py:803] 2025-04-26 21:02:27,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:28,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:02:28,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4605 5363 4555 [WARNING|trainer.py:803] 2025-04-26 21:02:29,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:29,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5364 [WARNING|trainer.py:803] 2025-04-26 21:02:29,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4606 4556 [WARNING|trainer.py:803] 2025-04-26 21:02:30,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:02:30,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5365 [WARNING|trainer.py:803] 2025-04-26 21:02:30,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4607 4557 [WARNING|trainer.py:803] 2025-04-26 21:02:31,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:31,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5366 [WARNING|trainer.py:803] 2025-04-26 21:02:31,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4608 4558 [WARNING|trainer.py:803] 2025-04-26 21:02:32,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:32,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5367 [WARNING|trainer.py:803] 2025-04-26 21:02:33,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4609 4559 [WARNING|trainer.py:803] 2025-04-26 21:02:33,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5368 [WARNING|trainer.py:803] 2025-04-26 21:02:34,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:34,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4610 4560 [WARNING|trainer.py:803] 2025-04-26 21:02:34,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5369 [WARNING|trainer.py:803] 2025-04-26 21:02:35,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:35,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4611 4561 [WARNING|trainer.py:803] 2025-04-26 21:02:35,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5370 [WARNING|trainer.py:803] 2025-04-26 21:02:36,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:36,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4612 4562 [WARNING|trainer.py:803] 2025-04-26 21:02:37,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5371 [WARNING|trainer.py:803] 2025-04-26 21:02:37,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:37,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4613 4563 [WARNING|trainer.py:803] 2025-04-26 21:02:38,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5372 [WARNING|trainer.py:803] 2025-04-26 21:02:38,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:39,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4614 [WARNING|trainer.py:803] 2025-04-26 21:02:39,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4564 5373 [WARNING|trainer.py:803] 2025-04-26 21:02:40,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:40,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4615 [WARNING|trainer.py:803] 2025-04-26 21:02:40,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4565 5374 [WARNING|trainer.py:803] 2025-04-26 21:02:41,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:41,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:41,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4566 4616 5375 [WARNING|trainer.py:803] 2025-04-26 21:02:42,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:42,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:42,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4567 5376 4617 [WARNING|trainer.py:803] 2025-04-26 21:02:43,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:44,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:44,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4568 5377 4618 [WARNING|trainer.py:803] 2025-04-26 21:02:44,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:45,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4569 [WARNING|trainer.py:803] 2025-04-26 21:02:45,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5378 4619 [WARNING|trainer.py:803] 2025-04-26 21:02:46,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:46,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4570 [WARNING|trainer.py:803] 2025-04-26 21:02:46,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5379 4620 [WARNING|trainer.py:803] 2025-04-26 21:02:47,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:47,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4571 5380 [WARNING|trainer.py:803] 2025-04-26 21:02:47,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4621 [WARNING|trainer.py:803] 2025-04-26 21:02:48,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:48,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5381 4572 [WARNING|trainer.py:803] 2025-04-26 21:02:49,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4622 [WARNING|trainer.py:803] 2025-04-26 21:02:49,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:49,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5382 4573 [WARNING|trainer.py:803] 2025-04-26 21:02:50,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4623 [WARNING|trainer.py:803] 2025-04-26 21:02:50,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:50,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5383 4574 [WARNING|trainer.py:803] 2025-04-26 21:02:51,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 21:02:52,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:52,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5384 4575 [WARNING|trainer.py:803] 2025-04-26 21:02:52,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4625 [WARNING|trainer.py:803] 2025-04-26 21:02:53,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:02:53,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5385 4576 [WARNING|trainer.py:803] 2025-04-26 21:02:53,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4626 [WARNING|trainer.py:803] 2025-04-26 21:02:54,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:54,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5386 4577 [WARNING|trainer.py:803] 2025-04-26 21:02:55,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4627 [WARNING|trainer.py:803] 2025-04-26 21:02:55,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:55,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5387 4578 [WARNING|trainer.py:803] 2025-04-26 21:02:56,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4628 [WARNING|trainer.py:803] 2025-04-26 21:02:56,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:02:56,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5388 4579 [WARNING|trainer.py:803] 2025-04-26 21:02:57,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4629 [WARNING|trainer.py:803] 2025-04-26 21:02:57,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5389 [WARNING|trainer.py:803] 2025-04-26 21:02:58,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4580 [WARNING|trainer.py:803] 2025-04-26 21:02:58,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:02:58,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4630 5390 [WARNING|trainer.py:803] 2025-04-26 21:02:59,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4581 [WARNING|trainer.py:803] 2025-04-26 21:02:59,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:03:00,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4631 5391 [WARNING|trainer.py:803] 2025-04-26 21:03:00,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4582 [WARNING|trainer.py:803] 2025-04-26 21:03:01,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:01,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4632 5392 [WARNING|trainer.py:803] 2025-04-26 21:03:01,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4583 [WARNING|trainer.py:803] 2025-04-26 21:03:02,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:02,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5393 4633 [WARNING|trainer.py:803] 2025-04-26 21:03:02,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4584 [WARNING|trainer.py:803] 2025-04-26 21:03:03,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:03,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5394 4634 [WARNING|trainer.py:803] 2025-04-26 21:03:04,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4585 [WARNING|trainer.py:803] 2025-04-26 21:03:04,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:04,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5395 4635 [WARNING|trainer.py:803] 2025-04-26 21:03:05,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4586 [WARNING|trainer.py:803] 2025-04-26 21:03:05,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:05,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5396 4636 [WARNING|trainer.py:803] 2025-04-26 21:03:06,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4587 [WARNING|trainer.py:803] 2025-04-26 21:03:06,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:07,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5397 4637 [WARNING|trainer.py:803] 2025-04-26 21:03:07,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:07,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4588 5398 [WARNING|trainer.py:803] 2025-04-26 21:03:08,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4638 [WARNING|trainer.py:803] 2025-04-26 21:03:08,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:09,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4589 5399 [WARNING|trainer.py:803] 2025-04-26 21:03:09,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4639 [WARNING|trainer.py:803] 2025-04-26 21:03:09,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:10,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4590 5400 [WARNING|trainer.py:803] 2025-04-26 21:03:10,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4640 [WARNING|trainer.py:803] 2025-04-26 21:03:11,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:11,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4591 5401 [WARNING|trainer.py:803] 2025-04-26 21:03:11,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 [WARNING|trainer.py:803] 2025-04-26 21:03:12,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:12,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4592 [WARNING|trainer.py:803] 2025-04-26 21:03:13,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5402 4642 [WARNING|trainer.py:803] 2025-04-26 21:03:13,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4593 [WARNING|trainer.py:803] 2025-04-26 21:03:13,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:03:14,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5403 4643 [WARNING|trainer.py:803] 2025-04-26 21:03:14,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4594 [WARNING|trainer.py:803] 2025-04-26 21:03:15,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:15,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5404 4644 [WARNING|trainer.py:803] 2025-04-26 21:03:15,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4595 [WARNING|trainer.py:803] 2025-04-26 21:03:16,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:03:16,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5405 4645 [WARNING|trainer.py:803] 2025-04-26 21:03:17,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4596 [WARNING|trainer.py:803] 2025-04-26 21:03:17,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:17,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4646 [WARNING|trainer.py:803] 2025-04-26 21:03:18,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5406 4597 [WARNING|trainer.py:803] 2025-04-26 21:03:19,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:19,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:19,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4647 5407 4598 [WARNING|trainer.py:803] 2025-04-26 21:03:20,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:20,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:20,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4648 5408 4599 [WARNING|trainer.py:803] 2025-04-26 21:03:21,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:21,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:21,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4649 4600 5409 [WARNING|trainer.py:803] 2025-04-26 21:03:22,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:23,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:23,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4650 4601 5410 [WARNING|trainer.py:803] 2025-04-26 21:03:23,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:24,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4651 [WARNING|trainer.py:803] 2025-04-26 21:03:24,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4602 5411 [WARNING|trainer.py:803] 2025-04-26 21:03:25,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:25,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4652 [WARNING|trainer.py:803] 2025-04-26 21:03:25,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4603 5412 [WARNING|trainer.py:803] 2025-04-26 21:03:26,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:26,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4653 [WARNING|trainer.py:803] 2025-04-26 21:03:26,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4604 5413 [WARNING|trainer.py:803] 2025-04-26 21:03:27,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:27,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4654 [WARNING|trainer.py:803] 2025-04-26 21:03:28,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4605 5414 [WARNING|trainer.py:803] 2025-04-26 21:03:28,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4655 [WARNING|trainer.py:803] 2025-04-26 21:03:29,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:29,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4606 5415 [WARNING|trainer.py:803] 2025-04-26 21:03:29,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4656 [WARNING|trainer.py:803] 2025-04-26 21:03:30,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4607 [WARNING|trainer.py:803] 2025-04-26 21:03:30,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:03:31,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5416 4657 [WARNING|trainer.py:803] 2025-04-26 21:03:31,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4608 [WARNING|trainer.py:803] 2025-04-26 21:03:31,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:03:32,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5417 4658 [WARNING|trainer.py:803] 2025-04-26 21:03:32,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4609 [WARNING|trainer.py:803] 2025-04-26 21:03:33,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:03:33,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5418 4659 [WARNING|trainer.py:803] 2025-04-26 21:03:33,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4610 [WARNING|trainer.py:803] 2025-04-26 21:03:34,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:03:34,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5419 [WARNING|trainer.py:803] 2025-04-26 21:03:35,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4660 4611 [WARNING|trainer.py:803] 2025-04-26 21:03:35,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:03:35,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:36,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5420 4661 4612 [WARNING|trainer.py:803] 2025-04-26 21:03:37,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:37,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:37,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4662 5421 4613 [WARNING|trainer.py:803] 2025-04-26 21:03:38,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:38,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:03:38,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4663 5422 4614 [WARNING|trainer.py:803] 2025-04-26 21:03:39,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:39,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:03:39,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4664 5423 4615 [WARNING|trainer.py:803] 2025-04-26 21:03:40,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:41,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:41,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4665 4616 5424 [WARNING|trainer.py:803] 2025-04-26 21:03:41,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:42,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4666 [WARNING|trainer.py:803] 2025-04-26 21:03:42,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4617 5425 [WARNING|trainer.py:803] 2025-04-26 21:03:43,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4667 [WARNING|trainer.py:803] 2025-04-26 21:03:43,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:43,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4618 5426 [WARNING|trainer.py:803] 2025-04-26 21:03:44,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4668 [WARNING|trainer.py:803] 2025-04-26 21:03:44,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:44,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4619 5427 [WARNING|trainer.py:803] 2025-04-26 21:03:45,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4669 [WARNING|trainer.py:803] 2025-04-26 21:03:45,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:46,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4620 [WARNING|trainer.py:803] 2025-04-26 21:03:46,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5428 4670 [WARNING|trainer.py:803] 2025-04-26 21:03:47,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4621 [WARNING|trainer.py:803] 2025-04-26 21:03:47,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:03:47,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5429 4671 [WARNING|trainer.py:803] 2025-04-26 21:03:48,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4622 [WARNING|trainer.py:803] 2025-04-26 21:03:48,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:03:49,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5430 4672 [WARNING|trainer.py:803] 2025-04-26 21:03:49,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4623 [WARNING|trainer.py:803] 2025-04-26 21:03:50,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:03:50,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4673 5431 [WARNING|trainer.py:803] 2025-04-26 21:03:50,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4624 [WARNING|trainer.py:803] 2025-04-26 21:03:51,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:51,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4674 [WARNING|trainer.py:803] 2025-04-26 21:03:51,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5432 4625 [WARNING|trainer.py:803] 2025-04-26 21:03:52,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:52,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4675 [WARNING|trainer.py:803] 2025-04-26 21:03:53,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5433 4626 [WARNING|trainer.py:803] 2025-04-26 21:03:53,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:54,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4676 [WARNING|trainer.py:803] 2025-04-26 21:03:54,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5434 4627 [WARNING|trainer.py:803] 2025-04-26 21:03:55,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:03:55,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4677 [WARNING|trainer.py:803] 2025-04-26 21:03:55,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5435 4628 [WARNING|trainer.py:803] 2025-04-26 21:03:56,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4678 [WARNING|trainer.py:803] 2025-04-26 21:03:56,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:56,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4629 5436 [WARNING|trainer.py:803] 2025-04-26 21:03:57,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4679 [WARNING|trainer.py:803] 2025-04-26 21:03:57,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:03:58,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4630 5437 [WARNING|trainer.py:803] 2025-04-26 21:03:58,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4680 [WARNING|trainer.py:803] 2025-04-26 21:03:59,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:03:59,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4631 5438 [WARNING|trainer.py:803] 2025-04-26 21:03:59,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4681 [WARNING|trainer.py:803] 2025-04-26 21:04:00,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:00,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4632 5439 [WARNING|trainer.py:803] 2025-04-26 21:04:01,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4682 [WARNING|trainer.py:803] 2025-04-26 21:04:01,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:01,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4633 5440 [WARNING|trainer.py:803] 2025-04-26 21:04:02,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4683 [WARNING|trainer.py:803] 2025-04-26 21:04:02,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4634 [WARNING|trainer.py:803] 2025-04-26 21:04:03,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:03,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5441 4684 [WARNING|trainer.py:803] 2025-04-26 21:04:03,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4635 [WARNING|trainer.py:803] 2025-04-26 21:04:04,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:04,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5442 4685 [WARNING|trainer.py:803] 2025-04-26 21:04:05,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4636 [WARNING|trainer.py:803] 2025-04-26 21:04:05,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:05,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5443 4686 [WARNING|trainer.py:803] 2025-04-26 21:04:06,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4637 [WARNING|trainer.py:803] 2025-04-26 21:04:07,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:04:07,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4687 [WARNING|trainer.py:803] 2025-04-26 21:04:07,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5444 4638 [WARNING|trainer.py:803] 2025-04-26 21:04:08,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:08,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4688 [WARNING|trainer.py:803] 2025-04-26 21:04:08,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5445 4639 [WARNING|trainer.py:803] 2025-04-26 21:04:09,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:09,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4689 [WARNING|trainer.py:803] 2025-04-26 21:04:09,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5446 4640 [WARNING|trainer.py:803] 2025-04-26 21:04:10,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:11,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4690 [WARNING|trainer.py:803] 2025-04-26 21:04:11,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4641 5447 [WARNING|trainer.py:803] 2025-04-26 21:04:11,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4691 [WARNING|trainer.py:803] 2025-04-26 21:04:12,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:12,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4642 5448 [WARNING|trainer.py:803] 2025-04-26 21:04:13,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4692 [WARNING|trainer.py:803] 2025-04-26 21:04:13,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:13,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4643 5449 [WARNING|trainer.py:803] 2025-04-26 21:04:14,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4693 [WARNING|trainer.py:803] 2025-04-26 21:04:14,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:14,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4644 5450 [WARNING|trainer.py:803] 2025-04-26 21:04:15,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:15,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4694 4645 [WARNING|trainer.py:803] 2025-04-26 21:04:16,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:16,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5451 4695 [WARNING|trainer.py:803] 2025-04-26 21:04:17,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4646 [WARNING|trainer.py:803] 2025-04-26 21:04:17,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:17,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5452 4696 [WARNING|trainer.py:803] 2025-04-26 21:04:18,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4647 [WARNING|trainer.py:803] 2025-04-26 21:04:19,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:19,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5453 4697 [WARNING|trainer.py:803] 2025-04-26 21:04:19,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4648 [WARNING|trainer.py:803] 2025-04-26 21:04:20,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:20,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4698 [WARNING|trainer.py:803] 2025-04-26 21:04:20,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5454 4649 [WARNING|trainer.py:803] 2025-04-26 21:04:21,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:21,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4699 [WARNING|trainer.py:803] 2025-04-26 21:04:21,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5455 4650 [WARNING|trainer.py:803] 2025-04-26 21:04:22,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:22,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4700 [WARNING|trainer.py:803] 2025-04-26 21:04:23,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5456 4651 [WARNING|trainer.py:803] 2025-04-26 21:04:23,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4701 [WARNING|trainer.py:803] 2025-04-26 21:04:24,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:24,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4652 5457 [WARNING|trainer.py:803] 2025-04-26 21:04:25,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4702 [WARNING|trainer.py:803] 2025-04-26 21:04:25,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:25,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4653 5458 [WARNING|trainer.py:803] 2025-04-26 21:04:26,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4703 [WARNING|trainer.py:803] 2025-04-26 21:04:26,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:26,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4654 5459 [WARNING|trainer.py:803] 2025-04-26 21:04:27,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4704 [WARNING|trainer.py:803] 2025-04-26 21:04:27,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:28,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4655 5460 [WARNING|trainer.py:803] 2025-04-26 21:04:28,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4705 [WARNING|trainer.py:803] 2025-04-26 21:04:29,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:29,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4656 [WARNING|trainer.py:803] 2025-04-26 21:04:29,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5461 4706 [WARNING|trainer.py:803] 2025-04-26 21:04:30,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4657 [WARNING|trainer.py:803] 2025-04-26 21:04:30,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:04:30,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5462 4707 [WARNING|trainer.py:803] 2025-04-26 21:04:31,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4658 [WARNING|trainer.py:803] 2025-04-26 21:04:31,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:32,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5463 4708 [WARNING|trainer.py:803] 2025-04-26 21:04:32,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4659 [WARNING|trainer.py:803] 2025-04-26 21:04:33,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:04:33,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4709 5464 [WARNING|trainer.py:803] 2025-04-26 21:04:33,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4660 [WARNING|trainer.py:803] 2025-04-26 21:04:34,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:34,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4710 5465 [WARNING|trainer.py:803] 2025-04-26 21:04:35,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4661 [WARNING|trainer.py:803] 2025-04-26 21:04:35,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:35,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4711 [WARNING|trainer.py:803] 2025-04-26 21:04:36,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5466 4662 [WARNING|trainer.py:803] 2025-04-26 21:04:36,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:37,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4712 [WARNING|trainer.py:803] 2025-04-26 21:04:37,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5467 4663 [WARNING|trainer.py:803] 2025-04-26 21:04:38,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:38,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4713 [WARNING|trainer.py:803] 2025-04-26 21:04:38,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5468 4664 [WARNING|trainer.py:803] 2025-04-26 21:04:39,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4714 [WARNING|trainer.py:803] 2025-04-26 21:04:39,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:39,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5469 4665 [WARNING|trainer.py:803] 2025-04-26 21:04:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4715 [WARNING|trainer.py:803] 2025-04-26 21:04:40,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:04:41,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5470 4666 [WARNING|trainer.py:803] 2025-04-26 21:04:41,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4716 [WARNING|trainer.py:803] 2025-04-26 21:04:42,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:42,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5471 4667 [WARNING|trainer.py:803] 2025-04-26 21:04:42,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4717 [WARNING|trainer.py:803] 2025-04-26 21:04:43,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:04:43,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4668 5472 [WARNING|trainer.py:803] 2025-04-26 21:04:43,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4718 [WARNING|trainer.py:803] 2025-04-26 21:04:44,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:44,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4669 [WARNING|trainer.py:803] 2025-04-26 21:04:45,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5473 4719 [WARNING|trainer.py:803] 2025-04-26 21:04:45,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:45,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4670 [WARNING|trainer.py:803] 2025-04-26 21:04:46,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5474 4720 [WARNING|trainer.py:803] 2025-04-26 21:04:47,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:47,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4671 [WARNING|trainer.py:803] 2025-04-26 21:04:47,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5475 4721 [WARNING|trainer.py:803] 2025-04-26 21:04:48,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:48,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4672 [WARNING|trainer.py:803] 2025-04-26 21:04:48,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5476 4722 [WARNING|trainer.py:803] 2025-04-26 21:04:49,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4673 [WARNING|trainer.py:803] 2025-04-26 21:04:49,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:04:49,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4723 5477 [WARNING|trainer.py:803] 2025-04-26 21:04:50,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4674 [WARNING|trainer.py:803] 2025-04-26 21:04:51,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:51,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4724 5478 [WARNING|trainer.py:803] 2025-04-26 21:04:51,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4675 [WARNING|trainer.py:803] 2025-04-26 21:04:52,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:52,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4725 5479 [WARNING|trainer.py:803] 2025-04-26 21:04:53,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4676 [WARNING|trainer.py:803] 2025-04-26 21:04:53,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:53,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4726 5480 [WARNING|trainer.py:803] 2025-04-26 21:04:54,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4677 [WARNING|trainer.py:803] 2025-04-26 21:04:54,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:04:54,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4727 5481 [WARNING|trainer.py:803] 2025-04-26 21:04:55,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4678 [WARNING|trainer.py:803] 2025-04-26 21:04:55,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4728 [WARNING|trainer.py:803] 2025-04-26 21:04:56,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:56,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5482 4679 [WARNING|trainer.py:803] 2025-04-26 21:04:56,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4729 [WARNING|trainer.py:803] 2025-04-26 21:04:57,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:04:57,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5483 4680 [WARNING|trainer.py:803] 2025-04-26 21:04:58,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4730 [WARNING|trainer.py:803] 2025-04-26 21:04:58,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:04:58,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5484 [WARNING|trainer.py:803] 2025-04-26 21:04:59,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4681 4731 [WARNING|trainer.py:803] 2025-04-26 21:05:00,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:00,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:00,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4682 5485 4732 [WARNING|trainer.py:803] 2025-04-26 21:05:01,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:01,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:01,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4683 5486 4733 [WARNING|trainer.py:803] 2025-04-26 21:05:02,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:02,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:02,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4684 4734 5487 [WARNING|trainer.py:803] 2025-04-26 21:05:03,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:04,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4685 [WARNING|trainer.py:803] 2025-04-26 21:05:04,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4735 5488 [WARNING|trainer.py:803] 2025-04-26 21:05:04,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:05,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4686 [WARNING|trainer.py:803] 2025-04-26 21:05:05,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4736 5489 [WARNING|trainer.py:803] 2025-04-26 21:05:06,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:06,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4687 [WARNING|trainer.py:803] 2025-04-26 21:05:06,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4737 5490 [WARNING|trainer.py:803] 2025-04-26 21:05:07,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:07,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4688 [WARNING|trainer.py:803] 2025-04-26 21:05:07,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4738 5491 [WARNING|trainer.py:803] 2025-04-26 21:05:08,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:08,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4689 4739 [WARNING|trainer.py:803] 2025-04-26 21:05:09,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5492 [WARNING|trainer.py:803] 2025-04-26 21:05:09,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:10,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4690 4740 [WARNING|trainer.py:803] 2025-04-26 21:05:10,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:10,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5493 [WARNING|trainer.py:803] 2025-04-26 21:05:11,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4691 4741 [WARNING|trainer.py:803] 2025-04-26 21:05:11,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:12,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5494 [WARNING|trainer.py:803] 2025-04-26 21:05:12,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4692 4742 [WARNING|trainer.py:803] 2025-04-26 21:05:13,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:13,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:13,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5495 4693 4743 [WARNING|trainer.py:803] 2025-04-26 21:05:14,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:14,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:14,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5496 4694 4744 [WARNING|trainer.py:803] 2025-04-26 21:05:15,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:15,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:15,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4695 5497 4745 [WARNING|trainer.py:803] 2025-04-26 21:05:16,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:17,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:17,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4696 5498 4746 [WARNING|trainer.py:803] 2025-04-26 21:05:18,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:18,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:05:18,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4697 4747 5499 [WARNING|trainer.py:803] 2025-04-26 21:05:19,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:19,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:19,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4698 4748 5500 [WARNING|trainer.py:803] 2025-04-26 21:05:20,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:20,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:20,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4699 4749 5501 [WARNING|trainer.py:803] 2025-04-26 21:05:21,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:21,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4700 [WARNING|trainer.py:803] 2025-04-26 21:05:22,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4750 5502 [WARNING|trainer.py:803] 2025-04-26 21:05:22,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:23,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4701 [WARNING|trainer.py:803] 2025-04-26 21:05:23,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4751 5503 [WARNING|trainer.py:803] 2025-04-26 21:05:24,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:24,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4702 [WARNING|trainer.py:803] 2025-04-26 21:05:24,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4752 5504 [WARNING|trainer.py:803] 2025-04-26 21:05:25,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:25,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4703 4753 [WARNING|trainer.py:803] 2025-04-26 21:05:25,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5505 [WARNING|trainer.py:803] 2025-04-26 21:05:26,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:26,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4704 4754 [WARNING|trainer.py:803] 2025-04-26 21:05:27,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5506 [WARNING|trainer.py:803] 2025-04-26 21:05:27,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:27,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4705 4755 [WARNING|trainer.py:803] 2025-04-26 21:05:28,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5507 [WARNING|trainer.py:803] 2025-04-26 21:05:28,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:29,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4706 4756 [WARNING|trainer.py:803] 2025-04-26 21:05:29,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5508 [WARNING|trainer.py:803] 2025-04-26 21:05:30,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:30,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4707 4757 [WARNING|trainer.py:803] 2025-04-26 21:05:30,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5509 [WARNING|trainer.py:803] 2025-04-26 21:05:31,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:31,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4708 4758 [WARNING|trainer.py:803] 2025-04-26 21:05:31,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5510 [WARNING|trainer.py:803] 2025-04-26 21:05:32,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:32,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4709 4759 [WARNING|trainer.py:803] 2025-04-26 21:05:33,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5511 [WARNING|trainer.py:803] 2025-04-26 21:05:33,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:33,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4710 4760 [WARNING|trainer.py:803] 2025-04-26 21:05:34,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5512 [WARNING|trainer.py:803] 2025-04-26 21:05:34,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:35,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4711 4761 [WARNING|trainer.py:803] 2025-04-26 21:05:35,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5513 [WARNING|trainer.py:803] 2025-04-26 21:05:36,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:36,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4712 4762 [WARNING|trainer.py:803] 2025-04-26 21:05:36,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5514 [WARNING|trainer.py:803] 2025-04-26 21:05:37,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:37,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4713 4763 [WARNING|trainer.py:803] 2025-04-26 21:05:38,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5515 [WARNING|trainer.py:803] 2025-04-26 21:05:38,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:38,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4714 4764 [WARNING|trainer.py:803] 2025-04-26 21:05:39,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5516 [WARNING|trainer.py:803] 2025-04-26 21:05:39,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:39,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4765 4715 [WARNING|trainer.py:803] 2025-04-26 21:05:40,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5517 [WARNING|trainer.py:803] 2025-04-26 21:05:40,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:40,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4766 4716 [WARNING|trainer.py:803] 2025-04-26 21:05:41,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:42,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:42,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5518 4767 4717 [WARNING|trainer.py:803] 2025-04-26 21:05:43,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:05:43,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:43,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5519 4768 4718 [WARNING|trainer.py:803] 2025-04-26 21:05:44,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:44,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:44,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5520 4769 4719 [WARNING|trainer.py:803] 2025-04-26 21:05:45,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:45,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:45,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5521 4770 4720 [WARNING|trainer.py:803] 2025-04-26 21:05:46,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:46,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:47,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4771 5522 4721 [WARNING|trainer.py:803] 2025-04-26 21:05:48,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:48,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:05:48,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4772 5523 4722 [WARNING|trainer.py:803] 2025-04-26 21:05:49,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:49,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:49,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4773 5524 4723 [WARNING|trainer.py:803] 2025-04-26 21:05:50,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:50,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:50,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4774 5525 4724 [WARNING|trainer.py:803] 2025-04-26 21:05:51,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:51,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:05:51,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4775 4725 5526 [WARNING|trainer.py:803] 2025-04-26 21:05:52,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:53,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:53,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4776 4726 5527 [WARNING|trainer.py:803] 2025-04-26 21:05:54,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:54,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:54,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4777 4727 5528 [WARNING|trainer.py:803] 2025-04-26 21:05:55,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:55,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:55,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4778 4728 5529 [WARNING|trainer.py:803] 2025-04-26 21:05:56,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:56,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:56,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4779 4729 5530 [WARNING|trainer.py:803] 2025-04-26 21:05:57,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:05:57,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4780 [WARNING|trainer.py:803] 2025-04-26 21:05:58,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4730 5531 [WARNING|trainer.py:803] 2025-04-26 21:05:58,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:05:59,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4781 [WARNING|trainer.py:803] 2025-04-26 21:05:59,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4731 5532 [WARNING|trainer.py:803] 2025-04-26 21:06:00,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:00,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4782 [WARNING|trainer.py:803] 2025-04-26 21:06:00,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4732 5533 [WARNING|trainer.py:803] 2025-04-26 21:06:01,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:01,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4783 [WARNING|trainer.py:803] 2025-04-26 21:06:01,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4733 5534 [WARNING|trainer.py:803] 2025-04-26 21:06:02,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:02,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4784 [WARNING|trainer.py:803] 2025-04-26 21:06:03,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4734 5535 [WARNING|trainer.py:803] 2025-04-26 21:06:03,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:04,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4785 [WARNING|trainer.py:803] 2025-04-26 21:06:04,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4735 5536 [WARNING|trainer.py:803] 2025-04-26 21:06:04,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:05,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4786 [WARNING|trainer.py:803] 2025-04-26 21:06:05,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4736 5537 [WARNING|trainer.py:803] 2025-04-26 21:06:06,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:06,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4787 [WARNING|trainer.py:803] 2025-04-26 21:06:06,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4737 [WARNING|trainer.py:803] 2025-04-26 21:06:07,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5538 [WARNING|trainer.py:803] 2025-04-26 21:06:07,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4788 4738 [WARNING|trainer.py:803] 2025-04-26 21:06:08,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:08,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:08,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5539 4789 4739 [WARNING|trainer.py:803] 2025-04-26 21:06:09,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:09,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:10,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4790 5540 4740 [WARNING|trainer.py:803] 2025-04-26 21:06:10,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:11,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:06:11,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 4741 5541 [WARNING|trainer.py:803] 2025-04-26 21:06:12,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:12,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4792 [WARNING|trainer.py:803] 2025-04-26 21:06:12,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4742 5542 [WARNING|trainer.py:803] 2025-04-26 21:06:13,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:13,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4793 4743 [WARNING|trainer.py:803] 2025-04-26 21:06:14,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:14,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5543 [WARNING|trainer.py:803] 2025-04-26 21:06:14,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4794 4744 [WARNING|trainer.py:803] 2025-04-26 21:06:15,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:15,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4795 5544 [WARNING|trainer.py:803] 2025-04-26 21:06:16,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4745 [WARNING|trainer.py:803] 2025-04-26 21:06:16,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:17,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4796 [WARNING|trainer.py:803] 2025-04-26 21:06:17,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5545 4746 [WARNING|trainer.py:803] 2025-04-26 21:06:18,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4797 [WARNING|trainer.py:803] 2025-04-26 21:06:18,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:18,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4747 5546 [WARNING|trainer.py:803] 2025-04-26 21:06:19,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4798 [WARNING|trainer.py:803] 2025-04-26 21:06:19,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:20,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4748 [WARNING|trainer.py:803] 2025-04-26 21:06:20,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5547 4799 [WARNING|trainer.py:803] 2025-04-26 21:06:21,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4749 [WARNING|trainer.py:803] 2025-04-26 21:06:21,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:21,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4800 [WARNING|trainer.py:803] 2025-04-26 21:06:22,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5548 4750 [WARNING|trainer.py:803] 2025-04-26 21:06:22,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:23,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4801 [WARNING|trainer.py:803] 2025-04-26 21:06:23,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5549 4751 [WARNING|trainer.py:803] 2025-04-26 21:06:24,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:24,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4802 [WARNING|trainer.py:803] 2025-04-26 21:06:24,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5550 4752 [WARNING|trainer.py:803] 2025-04-26 21:06:25,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4803 [WARNING|trainer.py:803] 2025-04-26 21:06:25,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:25,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5551 4753 [WARNING|trainer.py:803] 2025-04-26 21:06:26,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4804 [WARNING|trainer.py:803] 2025-04-26 21:06:27,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:27,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4754 5552 [WARNING|trainer.py:803] 2025-04-26 21:06:27,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4805 [WARNING|trainer.py:803] 2025-04-26 21:06:28,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:28,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4755 5553 [WARNING|trainer.py:803] 2025-04-26 21:06:28,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4806 [WARNING|trainer.py:803] 2025-04-26 21:06:29,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:29,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4756 5554 [WARNING|trainer.py:803] 2025-04-26 21:06:30,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4807 [WARNING|trainer.py:803] 2025-04-26 21:06:30,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:30,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4757 [WARNING|trainer.py:803] 2025-04-26 21:06:31,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5555 4808 [WARNING|trainer.py:803] 2025-04-26 21:06:31,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4758 [WARNING|trainer.py:803] 2025-04-26 21:06:32,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:32,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5556 4809 [WARNING|trainer.py:803] 2025-04-26 21:06:33,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 4759 [WARNING|trainer.py:803] 2025-04-26 21:06:33,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:33,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5557 4810 [WARNING|trainer.py:803] 2025-04-26 21:06:34,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4760 [WARNING|trainer.py:803] 2025-04-26 21:06:34,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:06:34,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4811 5558 [WARNING|trainer.py:803] 2025-04-26 21:06:35,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4761 [WARNING|trainer.py:803] 2025-04-26 21:06:35,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:36,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4812 [WARNING|trainer.py:803] 2025-04-26 21:06:36,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5559 4762 [WARNING|trainer.py:803] 2025-04-26 21:06:37,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:37,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4813 [WARNING|trainer.py:803] 2025-04-26 21:06:37,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5560 4763 [WARNING|trainer.py:803] 2025-04-26 21:06:38,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4814 [WARNING|trainer.py:803] 2025-04-26 21:06:38,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:06:38,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5561 4764 [WARNING|trainer.py:803] 2025-04-26 21:06:39,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4815 [WARNING|trainer.py:803] 2025-04-26 21:06:40,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:40,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4765 5562 [WARNING|trainer.py:803] 2025-04-26 21:06:40,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4816 [WARNING|trainer.py:803] 2025-04-26 21:06:41,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:41,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4766 5563 [WARNING|trainer.py:803] 2025-04-26 21:06:41,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4817 [WARNING|trainer.py:803] 2025-04-26 21:06:42,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:42,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4767 [WARNING|trainer.py:803] 2025-04-26 21:06:43,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5564 4818 [WARNING|trainer.py:803] 2025-04-26 21:06:43,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:44,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4768 [WARNING|trainer.py:803] 2025-04-26 21:06:44,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5565 4819 [WARNING|trainer.py:803] 2025-04-26 21:06:44,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4769 [WARNING|trainer.py:803] 2025-04-26 21:06:45,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:45,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5566 4820 [WARNING|trainer.py:803] 2025-04-26 21:06:46,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4770 [WARNING|trainer.py:803] 2025-04-26 21:06:46,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:46,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4821 5567 [WARNING|trainer.py:803] 2025-04-26 21:06:47,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4771 [WARNING|trainer.py:803] 2025-04-26 21:06:47,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:47,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4822 5568 [WARNING|trainer.py:803] 2025-04-26 21:06:48,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4772 [WARNING|trainer.py:803] 2025-04-26 21:06:48,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:49,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4823 [WARNING|trainer.py:803] 2025-04-26 21:06:49,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5569 4773 [WARNING|trainer.py:803] 2025-04-26 21:06:50,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4824 [WARNING|trainer.py:803] 2025-04-26 21:06:50,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:06:50,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5570 4774 [WARNING|trainer.py:803] 2025-04-26 21:06:51,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4825 [WARNING|trainer.py:803] 2025-04-26 21:06:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:52,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5571 4775 [WARNING|trainer.py:803] 2025-04-26 21:06:52,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4826 [WARNING|trainer.py:803] 2025-04-26 21:06:53,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:53,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5572 4776 [WARNING|trainer.py:803] 2025-04-26 21:06:53,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4827 [WARNING|trainer.py:803] 2025-04-26 21:06:54,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:54,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5573 4777 [WARNING|trainer.py:803] 2025-04-26 21:06:54,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4828 [WARNING|trainer.py:803] 2025-04-26 21:06:55,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:55,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5574 [WARNING|trainer.py:803] 2025-04-26 21:06:56,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4778 4829 [WARNING|trainer.py:803] 2025-04-26 21:06:56,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:06:56,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:57,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5575 4779 4830 [WARNING|trainer.py:803] 2025-04-26 21:06:58,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:06:58,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:06:58,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4780 5576 4831 [WARNING|trainer.py:803] 2025-04-26 21:06:59,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:59,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:06:59,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4781 4832 5577 [WARNING|trainer.py:803] 2025-04-26 21:07:00,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:00,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:00,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4782 4833 5578 [WARNING|trainer.py:803] 2025-04-26 21:07:01,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:01,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4783 [WARNING|trainer.py:803] 2025-04-26 21:07:02,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4834 5579 [WARNING|trainer.py:803] 2025-04-26 21:07:02,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:03,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4784 4835 [WARNING|trainer.py:803] 2025-04-26 21:07:03,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5580 [WARNING|trainer.py:803] 2025-04-26 21:07:04,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:04,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4785 4836 [WARNING|trainer.py:803] 2025-04-26 21:07:04,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:05,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5581 [WARNING|trainer.py:803] 2025-04-26 21:07:05,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4786 4837 [WARNING|trainer.py:803] 2025-04-26 21:07:06,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:06,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5582 [WARNING|trainer.py:803] 2025-04-26 21:07:06,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4787 4838 [WARNING|trainer.py:803] 2025-04-26 21:07:07,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:07,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5583 [WARNING|trainer.py:803] 2025-04-26 21:07:07,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4788 4839 [WARNING|trainer.py:803] 2025-04-26 21:07:08,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:08,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:09,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5584 4789 4840 [WARNING|trainer.py:803] 2025-04-26 21:07:10,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:10,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:10,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5585 4790 4841 [WARNING|trainer.py:803] 2025-04-26 21:07:11,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:11,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:11,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4791 4842 5586 [WARNING|trainer.py:803] 2025-04-26 21:07:12,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:12,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:12,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4792 4843 5587 [WARNING|trainer.py:803] 2025-04-26 21:07:13,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:13,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:13,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4844 4793 5588 [WARNING|trainer.py:803] 2025-04-26 21:07:15,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:15,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:15,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4845 4794 5589 [WARNING|trainer.py:803] 2025-04-26 21:07:16,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:16,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4846 4795 [WARNING|trainer.py:803] 2025-04-26 21:07:16,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5590 [WARNING|trainer.py:803] 2025-04-26 21:07:17,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:17,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4847 4796 [WARNING|trainer.py:803] 2025-04-26 21:07:17,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5591 [WARNING|trainer.py:803] 2025-04-26 21:07:18,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:18,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4848 4797 [WARNING|trainer.py:803] 2025-04-26 21:07:19,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5592 [WARNING|trainer.py:803] 2025-04-26 21:07:19,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:19,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4849 4798 [WARNING|trainer.py:803] 2025-04-26 21:07:20,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5593 [WARNING|trainer.py:803] 2025-04-26 21:07:20,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:21,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4850 4799 [WARNING|trainer.py:803] 2025-04-26 21:07:21,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:22,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5594 [WARNING|trainer.py:803] 2025-04-26 21:07:22,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4851 4800 [WARNING|trainer.py:803] 2025-04-26 21:07:23,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:23,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5595 [WARNING|trainer.py:803] 2025-04-26 21:07:23,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4852 4801 [WARNING|trainer.py:803] 2025-04-26 21:07:24,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:24,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:24,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5596 4853 4802 [WARNING|trainer.py:803] 2025-04-26 21:07:25,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:25,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:25,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5597 4854 4803 [WARNING|trainer.py:803] 2025-04-26 21:07:26,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:26,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:27,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4855 5598 4804 [WARNING|trainer.py:803] 2025-04-26 21:07:28,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:28,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:07:28,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4856 5599 4805 [WARNING|trainer.py:803] 2025-04-26 21:07:29,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:29,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:29,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4857 5600 4806 [WARNING|trainer.py:803] 2025-04-26 21:07:30,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:30,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:30,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4858 4807 5601 [WARNING|trainer.py:803] 2025-04-26 21:07:31,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:31,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:32,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4859 4808 5602 [WARNING|trainer.py:803] 2025-04-26 21:07:32,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:33,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:33,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4860 4809 5603 [WARNING|trainer.py:803] 2025-04-26 21:07:34,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:34,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4861 [WARNING|trainer.py:803] 2025-04-26 21:07:34,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4810 5604 [WARNING|trainer.py:803] 2025-04-26 21:07:35,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:35,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4862 [WARNING|trainer.py:803] 2025-04-26 21:07:35,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4811 5605 [WARNING|trainer.py:803] 2025-04-26 21:07:36,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:36,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4863 [WARNING|trainer.py:803] 2025-04-26 21:07:37,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4812 5606 [WARNING|trainer.py:803] 2025-04-26 21:07:37,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:37,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4864 [WARNING|trainer.py:803] 2025-04-26 21:07:38,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4813 5607 [WARNING|trainer.py:803] 2025-04-26 21:07:38,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:39,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4865 [WARNING|trainer.py:803] 2025-04-26 21:07:39,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4814 5608 [WARNING|trainer.py:803] 2025-04-26 21:07:40,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:40,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4866 4815 [WARNING|trainer.py:803] 2025-04-26 21:07:40,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5609 [WARNING|trainer.py:803] 2025-04-26 21:07:41,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:41,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4867 4816 [WARNING|trainer.py:803] 2025-04-26 21:07:41,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5610 [WARNING|trainer.py:803] 2025-04-26 21:07:42,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:07:42,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4868 4817 [WARNING|trainer.py:803] 2025-04-26 21:07:43,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:43,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5611 [WARNING|trainer.py:803] 2025-04-26 21:07:43,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4869 4818 [WARNING|trainer.py:803] 2025-04-26 21:07:44,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:44,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5612 4870 [WARNING|trainer.py:803] 2025-04-26 21:07:45,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4819 [WARNING|trainer.py:803] 2025-04-26 21:07:45,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:07:45,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5613 4871 [WARNING|trainer.py:803] 2025-04-26 21:07:46,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4820 [WARNING|trainer.py:803] 2025-04-26 21:07:46,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:47,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5614 4872 [WARNING|trainer.py:803] 2025-04-26 21:07:47,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4821 [WARNING|trainer.py:803] 2025-04-26 21:07:48,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:48,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4873 [WARNING|trainer.py:803] 2025-04-26 21:07:48,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5615 4822 [WARNING|trainer.py:803] 2025-04-26 21:07:49,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:49,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4874 [WARNING|trainer.py:803] 2025-04-26 21:07:49,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5616 4823 [WARNING|trainer.py:803] 2025-04-26 21:07:50,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:50,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4875 [WARNING|trainer.py:803] 2025-04-26 21:07:51,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5617 4824 [WARNING|trainer.py:803] 2025-04-26 21:07:51,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:52,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4876 [WARNING|trainer.py:803] 2025-04-26 21:07:52,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5618 4825 [WARNING|trainer.py:803] 2025-04-26 21:07:53,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:53,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4877 [WARNING|trainer.py:803] 2025-04-26 21:07:53,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5619 4826 [WARNING|trainer.py:803] 2025-04-26 21:07:54,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4878 [WARNING|trainer.py:803] 2025-04-26 21:07:54,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:07:54,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4827 5620 [WARNING|trainer.py:803] 2025-04-26 21:07:55,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4879 [WARNING|trainer.py:803] 2025-04-26 21:07:55,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:55,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4828 5621 [WARNING|trainer.py:803] 2025-04-26 21:07:56,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4880 [WARNING|trainer.py:803] 2025-04-26 21:07:57,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:07:57,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4829 5622 [WARNING|trainer.py:803] 2025-04-26 21:07:57,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4881 [WARNING|trainer.py:803] 2025-04-26 21:07:58,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:07:58,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4830 [WARNING|trainer.py:803] 2025-04-26 21:07:58,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5623 4882 [WARNING|trainer.py:803] 2025-04-26 21:07:59,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4831 [WARNING|trainer.py:803] 2025-04-26 21:07:59,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:08:00,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5624 4883 [WARNING|trainer.py:803] 2025-04-26 21:08:00,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4832 [WARNING|trainer.py:803] 2025-04-26 21:08:01,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:01,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5625 4884 [WARNING|trainer.py:803] 2025-04-26 21:08:01,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4833 [WARNING|trainer.py:803] 2025-04-26 21:08:02,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:02,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4885 5626 [WARNING|trainer.py:803] 2025-04-26 21:08:02,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4834 [WARNING|trainer.py:803] 2025-04-26 21:08:03,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:03,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4886 5627 [WARNING|trainer.py:803] 2025-04-26 21:08:04,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4835 [WARNING|trainer.py:803] 2025-04-26 21:08:04,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:04,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4887 [WARNING|trainer.py:803] 2025-04-26 21:08:05,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5628 4836 [WARNING|trainer.py:803] 2025-04-26 21:08:05,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:06,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4888 [WARNING|trainer.py:803] 2025-04-26 21:08:06,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5629 4837 [WARNING|trainer.py:803] 2025-04-26 21:08:07,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:07,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4889 [WARNING|trainer.py:803] 2025-04-26 21:08:07,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5630 4838 [WARNING|trainer.py:803] 2025-04-26 21:08:08,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:08,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4890 [WARNING|trainer.py:803] 2025-04-26 21:08:08,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5631 4839 [WARNING|trainer.py:803] 2025-04-26 21:08:09,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4891 [WARNING|trainer.py:803] 2025-04-26 21:08:09,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:09,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4840 5632 [WARNING|trainer.py:803] 2025-04-26 21:08:10,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4892 [WARNING|trainer.py:803] 2025-04-26 21:08:11,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:11,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4841 5633 [WARNING|trainer.py:803] 2025-04-26 21:08:11,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4893 [WARNING|trainer.py:803] 2025-04-26 21:08:12,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:12,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4842 5634 [WARNING|trainer.py:803] 2025-04-26 21:08:12,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4894 [WARNING|trainer.py:803] 2025-04-26 21:08:13,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:13,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4843 [WARNING|trainer.py:803] 2025-04-26 21:08:14,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5635 4895 [WARNING|trainer.py:803] 2025-04-26 21:08:14,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4844 [WARNING|trainer.py:803] 2025-04-26 21:08:15,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:15,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5636 4896 [WARNING|trainer.py:803] 2025-04-26 21:08:15,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4845 [WARNING|trainer.py:803] 2025-04-26 21:08:16,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:08:16,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4897 5637 [WARNING|trainer.py:803] 2025-04-26 21:08:17,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4846 [WARNING|trainer.py:803] 2025-04-26 21:08:17,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:08:17,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4898 5638 [WARNING|trainer.py:803] 2025-04-26 21:08:18,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4847 [WARNING|trainer.py:803] 2025-04-26 21:08:18,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:18,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4899 5639 [WARNING|trainer.py:803] 2025-04-26 21:08:19,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4848 [WARNING|trainer.py:803] 2025-04-26 21:08:19,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:20,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4900 [WARNING|trainer.py:803] 2025-04-26 21:08:20,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5640 4849 [WARNING|trainer.py:803] 2025-04-26 21:08:21,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:08:21,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4901 [WARNING|trainer.py:803] 2025-04-26 21:08:21,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5641 4850 [WARNING|trainer.py:803] 2025-04-26 21:08:22,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4902 [WARNING|trainer.py:803] 2025-04-26 21:08:22,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:22,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5642 4851 [WARNING|trainer.py:803] 2025-04-26 21:08:23,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4903 [WARNING|trainer.py:803] 2025-04-26 21:08:24,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:08:24,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4852 5643 [WARNING|trainer.py:803] 2025-04-26 21:08:24,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4904 [WARNING|trainer.py:803] 2025-04-26 21:08:25,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:25,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4853 5644 [WARNING|trainer.py:803] 2025-04-26 21:08:25,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4905 [WARNING|trainer.py:803] 2025-04-26 21:08:26,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:26,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4854 5645 [WARNING|trainer.py:803] 2025-04-26 21:08:27,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4906 [WARNING|trainer.py:803] 2025-04-26 21:08:27,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:27,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4855 5646 [WARNING|trainer.py:803] 2025-04-26 21:08:28,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4907 [WARNING|trainer.py:803] 2025-04-26 21:08:28,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:29,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4856 [WARNING|trainer.py:803] 2025-04-26 21:08:29,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5647 4908 [WARNING|trainer.py:803] 2025-04-26 21:08:30,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:30,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4857 [WARNING|trainer.py:803] 2025-04-26 21:08:30,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5648 4909 [WARNING|trainer.py:803] 2025-04-26 21:08:31,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:31,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4858 [WARNING|trainer.py:803] 2025-04-26 21:08:31,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5649 4910 [WARNING|trainer.py:803] 2025-04-26 21:08:32,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4859 [WARNING|trainer.py:803] 2025-04-26 21:08:32,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:32,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4911 5650 [WARNING|trainer.py:803] 2025-04-26 21:08:33,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4860 [WARNING|trainer.py:803] 2025-04-26 21:08:34,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:08:34,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4912 5651 [WARNING|trainer.py:803] 2025-04-26 21:08:34,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4861 [WARNING|trainer.py:803] 2025-04-26 21:08:35,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:35,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4913 5652 [WARNING|trainer.py:803] 2025-04-26 21:08:36,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4862 [WARNING|trainer.py:803] 2025-04-26 21:08:36,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:36,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4914 5653 [WARNING|trainer.py:803] 2025-04-26 21:08:37,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4863 [WARNING|trainer.py:803] 2025-04-26 21:08:37,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:37,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4915 5654 [WARNING|trainer.py:803] 2025-04-26 21:08:38,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4864 [WARNING|trainer.py:803] 2025-04-26 21:08:38,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:39,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4916 [WARNING|trainer.py:803] 2025-04-26 21:08:39,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5655 4865 [WARNING|trainer.py:803] 2025-04-26 21:08:40,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4917 [WARNING|trainer.py:803] 2025-04-26 21:08:40,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:40,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5656 4866 [WARNING|trainer.py:803] 2025-04-26 21:08:41,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4918 [WARNING|trainer.py:803] 2025-04-26 21:08:41,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:08:41,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5657 4867 [WARNING|trainer.py:803] 2025-04-26 21:08:42,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4919 [WARNING|trainer.py:803] 2025-04-26 21:08:42,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:43,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5658 4868 [WARNING|trainer.py:803] 2025-04-26 21:08:43,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4920 [WARNING|trainer.py:803] 2025-04-26 21:08:44,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:08:44,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4869 5659 [WARNING|trainer.py:803] 2025-04-26 21:08:44,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4921 [WARNING|trainer.py:803] 2025-04-26 21:08:45,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:45,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4870 5660 [WARNING|trainer.py:803] 2025-04-26 21:08:45,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4922 [WARNING|trainer.py:803] 2025-04-26 21:08:46,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:46,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4871 [WARNING|trainer.py:803] 2025-04-26 21:08:47,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5661 4923 [WARNING|trainer.py:803] 2025-04-26 21:08:47,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:47,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4872 [WARNING|trainer.py:803] 2025-04-26 21:08:48,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5662 4924 [WARNING|trainer.py:803] 2025-04-26 21:08:48,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:49,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4873 [WARNING|trainer.py:803] 2025-04-26 21:08:49,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5663 4925 [WARNING|trainer.py:803] 2025-04-26 21:08:50,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4874 [WARNING|trainer.py:803] 2025-04-26 21:08:50,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:50,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5664 4926 [WARNING|trainer.py:803] 2025-04-26 21:08:51,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4875 [WARNING|trainer.py:803] 2025-04-26 21:08:51,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:51,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5665 4927 [WARNING|trainer.py:803] 2025-04-26 21:08:52,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4876 [WARNING|trainer.py:803] 2025-04-26 21:08:53,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:53,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5666 4928 [WARNING|trainer.py:803] 2025-04-26 21:08:53,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4877 [WARNING|trainer.py:803] 2025-04-26 21:08:54,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:08:54,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4929 5667 [WARNING|trainer.py:803] 2025-04-26 21:08:54,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4878 [WARNING|trainer.py:803] 2025-04-26 21:08:55,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:55,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4930 5668 [WARNING|trainer.py:803] 2025-04-26 21:08:55,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4879 [WARNING|trainer.py:803] 2025-04-26 21:08:56,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:56,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4931 [WARNING|trainer.py:803] 2025-04-26 21:08:57,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5669 4880 [WARNING|trainer.py:803] 2025-04-26 21:08:57,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:08:57,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4932 [WARNING|trainer.py:803] 2025-04-26 21:08:58,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5670 4881 [WARNING|trainer.py:803] 2025-04-26 21:08:58,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:08:59,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4933 [WARNING|trainer.py:803] 2025-04-26 21:08:59,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5671 4882 [WARNING|trainer.py:803] 2025-04-26 21:09:00,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4934 [WARNING|trainer.py:803] 2025-04-26 21:09:00,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:00,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5672 4883 [WARNING|trainer.py:803] 2025-04-26 21:09:01,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4935 [WARNING|trainer.py:803] 2025-04-26 21:09:01,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:01,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4884 5673 [WARNING|trainer.py:803] 2025-04-26 21:09:02,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4936 [WARNING|trainer.py:803] 2025-04-26 21:09:03,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:03,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4885 5674 [WARNING|trainer.py:803] 2025-04-26 21:09:03,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4937 [WARNING|trainer.py:803] 2025-04-26 21:09:04,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:04,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4886 [WARNING|trainer.py:803] 2025-04-26 21:09:04,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5675 4938 [WARNING|trainer.py:803] 2025-04-26 21:09:05,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:05,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 4887 [WARNING|trainer.py:803] 2025-04-26 21:09:06,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5676 4939 [WARNING|trainer.py:803] 2025-04-26 21:09:06,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4888 [WARNING|trainer.py:803] 2025-04-26 21:09:06,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:07,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5677 4940 [WARNING|trainer.py:803] 2025-04-26 21:09:07,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4889 [WARNING|trainer.py:803] 2025-04-26 21:09:08,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:08,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 4941 5678 [WARNING|trainer.py:803] 2025-04-26 21:09:08,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4890 [WARNING|trainer.py:803] 2025-04-26 21:09:09,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:09,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4942 5679 [WARNING|trainer.py:803] 2025-04-26 21:09:10,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4891 [WARNING|trainer.py:803] 2025-04-26 21:09:10,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:10,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 4943 [WARNING|trainer.py:803] 2025-04-26 21:09:11,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5680 4892 [WARNING|trainer.py:803] 2025-04-26 21:09:11,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:12,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4944 [WARNING|trainer.py:803] 2025-04-26 21:09:12,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5681 4893 [WARNING|trainer.py:803] 2025-04-26 21:09:13,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4945 [WARNING|trainer.py:803] 2025-04-26 21:09:13,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:13,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5682 4894 [WARNING|trainer.py:803] 2025-04-26 21:09:14,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4946 [WARNING|trainer.py:803] 2025-04-26 21:09:14,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:14,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5683 4895 [WARNING|trainer.py:803] 2025-04-26 21:09:15,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4947 [WARNING|trainer.py:803] 2025-04-26 21:09:15,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 21:09:15,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes NoYes 4896 5684 [WARNING|trainer.py:803] 2025-04-26 21:09:16,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4948 [WARNING|trainer.py:803] 2025-04-26 21:09:17,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:17,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4897 5685 [WARNING|trainer.py:803] 2025-04-26 21:09:17,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4949 [WARNING|trainer.py:803] 2025-04-26 21:09:18,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:09:18,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4898 [WARNING|trainer.py:803] 2025-04-26 21:09:18,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5686 4950 [WARNING|trainer.py:803] 2025-04-26 21:09:19,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:19,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4899 [WARNING|trainer.py:803] 2025-04-26 21:09:20,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5687 4951 [WARNING|trainer.py:803] 2025-04-26 21:09:20,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4900 [WARNING|trainer.py:803] 2025-04-26 21:09:21,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:21,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5688 4952 [WARNING|trainer.py:803] 2025-04-26 21:09:21,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4901 [WARNING|trainer.py:803] 2025-04-26 21:09:22,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:22,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5689 4953 [WARNING|trainer.py:803] 2025-04-26 21:09:22,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 4902 [WARNING|trainer.py:803] 2025-04-26 21:09:23,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:23,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4954 5690 [WARNING|trainer.py:803] 2025-04-26 21:09:24,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4903 [WARNING|trainer.py:803] 2025-04-26 21:09:24,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:24,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4955 5691 [WARNING|trainer.py:803] 2025-04-26 21:09:25,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4904 [WARNING|trainer.py:803] 2025-04-26 21:09:25,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:26,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 4956 [WARNING|trainer.py:803] 2025-04-26 21:09:26,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5692 4905 [WARNING|trainer.py:803] 2025-04-26 21:09:27,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:27,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4957 [WARNING|trainer.py:803] 2025-04-26 21:09:27,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5693 4906 [WARNING|trainer.py:803] 2025-04-26 21:09:28,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4958 [WARNING|trainer.py:803] 2025-04-26 21:09:28,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:28,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5694 4907 [WARNING|trainer.py:803] 2025-04-26 21:09:29,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4959 [WARNING|trainer.py:803] 2025-04-26 21:09:29,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:30,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5695 4908 [WARNING|trainer.py:803] 2025-04-26 21:09:30,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4960 [WARNING|trainer.py:803] 2025-04-26 21:09:31,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:31,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5696 4909 [WARNING|trainer.py:803] 2025-04-26 21:09:31,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4961 [WARNING|trainer.py:803] 2025-04-26 21:09:32,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x66148e80] moov atom not found
[21:09:32] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
[WARNING|trainer.py:803] 2025-04-26 21:09:32,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4910 5697 [WARNING|trainer.py:803] 2025-04-26 21:09:32,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4962 [WARNING|trainer.py:803] 2025-04-26 21:09:33,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:33,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4911 [WARNING|trainer.py:803] 2025-04-26 21:09:34,098 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. Yes 5698 4963 [WARNING|trainer.py:803] 2025-04-26 21:09:34,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:09:34,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4912 [WARNING|trainer.py:803] 2025-04-26 21:09:35,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5699 4964 [WARNING|trainer.py:803] 2025-04-26 21:09:36,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:36,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4913 [WARNING|trainer.py:803] 2025-04-26 21:09:36,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5700 4965 [WARNING|trainer.py:803] 2025-04-26 21:09:37,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:37,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4914 [WARNING|trainer.py:803] 2025-04-26 21:09:37,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4966 [WARNING|trainer.py:803] 2025-04-26 21:09:38,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5701 4915 [WARNING|trainer.py:803] 2025-04-26 21:09:38,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4967 [WARNING|trainer.py:803] 2025-04-26 21:09:39,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:39,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4916 [WARNING|trainer.py:803] 2025-04-26 21:09:40,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4968 5702 [WARNING|trainer.py:803] 2025-04-26 21:09:40,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4917 [WARNING|trainer.py:803] 2025-04-26 21:09:41,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:41,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4969 [WARNING|trainer.py:803] 2025-04-26 21:09:41,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4918 [WARNING|trainer.py:803] 2025-04-26 21:09:42,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5703 4970 [WARNING|trainer.py:803] 2025-04-26 21:09:43,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:43,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:43,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4919 4971 [WARNING|trainer.py:803] 2025-04-26 21:09:44,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5704 [WARNING|trainer.py:803] 2025-04-26 21:09:44,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4920 4972 [WARNING|trainer.py:803] 2025-04-26 21:09:45,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:09:45,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:45,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4921 4973 5705 [WARNING|trainer.py:803] 2025-04-26 21:09:46,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:47,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4922 [WARNING|trainer.py:803] 2025-04-26 21:09:47,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4974 [WARNING|trainer.py:803] 2025-04-26 21:09:47,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5706 [WARNING|trainer.py:803] 2025-04-26 21:09:48,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4923 4975 [WARNING|trainer.py:803] 2025-04-26 21:09:49,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:49,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:49,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4924 4976 5707 [WARNING|trainer.py:803] 2025-04-26 21:09:50,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:50,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4925 [WARNING|trainer.py:803] 2025-04-26 21:09:50,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4977 [WARNING|trainer.py:803] 2025-04-26 21:09:51,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5708 [WARNING|trainer.py:803] 2025-04-26 21:09:51,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4926 4978 [WARNING|trainer.py:803] 2025-04-26 21:09:52,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:52,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:52,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4927 4979 5709 [WARNING|trainer.py:803] 2025-04-26 21:09:53,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:54,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4928 [WARNING|trainer.py:803] 2025-04-26 21:09:54,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4980 [WARNING|trainer.py:803] 2025-04-26 21:09:55,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:55,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5710 4929 4981 [WARNING|trainer.py:803] 2025-04-26 21:09:56,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:56,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:56,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4930 4982 5711 [WARNING|trainer.py:803] 2025-04-26 21:09:57,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:09:57,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4931 4983 [WARNING|trainer.py:803] 2025-04-26 21:09:58,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:58,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:09:58,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5712 4932 4984 [WARNING|trainer.py:803] 2025-04-26 21:09:59,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:00,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:00,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4933 4985 5713 [WARNING|trainer.py:803] 2025-04-26 21:10:01,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:01,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4934 4986 [WARNING|trainer.py:803] 2025-04-26 21:10:01,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:02,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:02,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4935 4987 5714 [WARNING|trainer.py:803] 2025-04-26 21:10:03,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:03,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:03,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4936 4988 5715 [WARNING|trainer.py:803] 2025-04-26 21:10:04,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:04,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4937 4989 [WARNING|trainer.py:803] 2025-04-26 21:10:05,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:06,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:06,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4938 4990 5716 [WARNING|trainer.py:803] 2025-04-26 21:10:07,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:07,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:07,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4939 4991 [WARNING|trainer.py:803] 2025-04-26 21:10:08,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5717 [WARNING|trainer.py:803] 2025-04-26 21:10:08,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4940 4992 [WARNING|trainer.py:803] 2025-04-26 21:10:09,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:09,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 21:10:09,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4941 4993 5718 [WARNING|trainer.py:803] 2025-04-26 21:10:10,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:10,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:11,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4942 4994 [WARNING|trainer.py:803] 2025-04-26 21:10:11,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5719 [WARNING|trainer.py:803] 2025-04-26 21:10:12,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4943 4995 [WARNING|trainer.py:803] 2025-04-26 21:10:12,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:13,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:13,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4944 4996 5720 [WARNING|trainer.py:803] 2025-04-26 21:10:14,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:14,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:14,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4945 4997 5721 [WARNING|trainer.py:803] 2025-04-26 21:10:15,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:15,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4946 4998 [WARNING|trainer.py:803] 2025-04-26 21:10:16,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:16,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:16,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4947 4999 5722 [WARNING|trainer.py:803] 2025-04-26 21:10:17,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:18,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4948 [WARNING|trainer.py:803] 2025-04-26 21:10:18,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5000 [WARNING|trainer.py:803] 2025-04-26 21:10:19,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:19,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5723 4949 5001 [WARNING|trainer.py:803] 2025-04-26 21:10:20,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:20,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:20,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4950 5002 5724 [WARNING|trainer.py:803] 2025-04-26 21:10:21,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:21,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4951 [WARNING|trainer.py:803] 2025-04-26 21:10:22,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5003 [WARNING|trainer.py:803] 2025-04-26 21:10:22,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:22,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4952 5725 5004 [WARNING|trainer.py:803] 2025-04-26 21:10:23,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:23,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:24,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4953 5005 5726 [WARNING|trainer.py:803] 2025-04-26 21:10:25,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:25,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4954 5006 [WARNING|trainer.py:803] 2025-04-26 21:10:25,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:26,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:26,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 4955 5727 5007 [WARNING|trainer.py:803] 2025-04-26 21:10:27,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:27,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:27,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4956 5008 [WARNING|trainer.py:803] 2025-04-26 21:10:28,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5728 [WARNING|trainer.py:803] 2025-04-26 21:10:29,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4957 5009 [WARNING|trainer.py:803] 2025-04-26 21:10:29,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:29,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:30,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4958 5729 5010 [WARNING|trainer.py:803] 2025-04-26 21:10:31,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:31,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:31,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4959 5011 [WARNING|trainer.py:803] 2025-04-26 21:10:32,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5730 [WARNING|trainer.py:803] 2025-04-26 21:10:32,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 4960 5012 [WARNING|trainer.py:803] 2025-04-26 21:10:33,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:33,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:33,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4961 5013 5731 [WARNING|trainer.py:803] 2025-04-26 21:10:34,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:34,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:35,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4962 5014 5732 [WARNING|trainer.py:803] 2025-04-26 21:10:35,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:10:36,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4963 5015 [WARNING|trainer.py:803] 2025-04-26 21:10:36,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:37,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:37,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4964 5733 5016 [WARNING|trainer.py:803] 2025-04-26 21:10:38,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:38,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:38,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4965 5017 5734 [WARNING|trainer.py:803] 2025-04-26 21:10:39,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:39,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4966 5018 [WARNING|trainer.py:803] 2025-04-26 21:10:40,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:40,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:40,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 4967 5735 5019 [WARNING|trainer.py:803] 2025-04-26 21:10:41,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:42,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 21:10:42,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes NoYes 4968 5020 [WARNING|trainer.py:803] 2025-04-26 21:10:43,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5736 [WARNING|trainer.py:803] 2025-04-26 21:10:43,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4969 5021 [WARNING|trainer.py:803] 2025-04-26 21:10:44,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:44,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:44,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4970 5022 5737 [WARNING|trainer.py:803] 2025-04-26 21:10:45,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:45,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4971 [WARNING|trainer.py:803] 2025-04-26 21:10:46,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5023 [WARNING|trainer.py:803] 2025-04-26 21:10:46,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5738 [WARNING|trainer.py:803] 2025-04-26 21:10:47,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4972 5024 [WARNING|trainer.py:803] 2025-04-26 21:10:47,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:47,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:48,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4973 5025 5739 [WARNING|trainer.py:803] 2025-04-26 21:10:49,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:49,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4974 [WARNING|trainer.py:803] 2025-04-26 21:10:49,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5026 [WARNING|trainer.py:803] 2025-04-26 21:10:50,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:50,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5740 4975 5027 [WARNING|trainer.py:803] 2025-04-26 21:10:51,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:51,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:51,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4976 5028 5741 [WARNING|trainer.py:803] 2025-04-26 21:10:52,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:52,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4977 [WARNING|trainer.py:803] 2025-04-26 21:10:53,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5029 [WARNING|trainer.py:803] 2025-04-26 21:10:53,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5742 [WARNING|trainer.py:803] 2025-04-26 21:10:54,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4978 5030 [WARNING|trainer.py:803] 2025-04-26 21:10:54,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:10:55,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:55,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4979 5031 5743 [WARNING|trainer.py:803] 2025-04-26 21:10:56,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:56,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4980 5032 [WARNING|trainer.py:803] 2025-04-26 21:10:56,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:10:57,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:10:57,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4981 5744 5033 [WARNING|trainer.py:803] 2025-04-26 21:10:58,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:58,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:10:58,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4982 5034 5745 [WARNING|trainer.py:803] 2025-04-26 21:10:59,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:00,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4983 5035 [WARNING|trainer.py:803] 2025-04-26 21:11:00,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:01,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:01,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4984 5036 5746 [WARNING|trainer.py:803] 2025-04-26 21:11:02,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:02,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4985 [WARNING|trainer.py:803] 2025-04-26 21:11:02,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5037 [WARNING|trainer.py:803] 2025-04-26 21:11:03,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:03,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5747 4986 5038 [WARNING|trainer.py:803] 2025-04-26 21:11:04,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:04,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:04,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4987 5039 5748 [WARNING|trainer.py:803] 2025-04-26 21:11:05,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:06,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4988 5040 [WARNING|trainer.py:803] 2025-04-26 21:11:06,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:07,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:07,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4989 5749 5041 [WARNING|trainer.py:803] 2025-04-26 21:11:08,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:08,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:08,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4990 5042 5750 [WARNING|trainer.py:803] 2025-04-26 21:11:09,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:09,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4991 5043 [WARNING|trainer.py:803] 2025-04-26 21:11:10,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:11:10,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:10,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4992 5751 5044 [WARNING|trainer.py:803] 2025-04-26 21:11:11,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:11,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:11:12,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4993 5045 [WARNING|trainer.py:803] 2025-04-26 21:11:12,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5752 [WARNING|trainer.py:803] 2025-04-26 21:11:13,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4994 5046 [WARNING|trainer.py:803] 2025-04-26 21:11:13,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:14,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:14,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4995 5753 5047 [WARNING|trainer.py:803] 2025-04-26 21:11:15,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:15,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:15,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 4996 5048 5754 [WARNING|trainer.py:803] 2025-04-26 21:11:16,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4997 [WARNING|trainer.py:803] 2025-04-26 21:11:16,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5049 [WARNING|trainer.py:803] 2025-04-26 21:11:17,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:17,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 4998 [WARNING|trainer.py:803] 2025-04-26 21:11:18,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5755 5050 [WARNING|trainer.py:803] 2025-04-26 21:11:18,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:19,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 4999 [WARNING|trainer.py:803] 2025-04-26 21:11:19,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5051 [WARNING|trainer.py:803] 2025-04-26 21:11:20,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5756 5000 [WARNING|trainer.py:803] 2025-04-26 21:11:20,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5052 [WARNING|trainer.py:803] 2025-04-26 21:11:21,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:11:21,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5001 [WARNING|trainer.py:803] 2025-04-26 21:11:21,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5757 5053 [WARNING|trainer.py:803] 2025-04-26 21:11:22,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:22,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5002 [WARNING|trainer.py:803] 2025-04-26 21:11:22,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5054 [WARNING|trainer.py:803] 2025-04-26 21:11:23,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5758 5003 [WARNING|trainer.py:803] 2025-04-26 21:11:24,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5055 [WARNING|trainer.py:803] 2025-04-26 21:11:24,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:24,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5004 [WARNING|trainer.py:803] 2025-04-26 21:11:25,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5759 5056 [WARNING|trainer.py:803] 2025-04-26 21:11:26,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5005 [WARNING|trainer.py:803] 2025-04-26 21:11:26,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:26,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5057 [WARNING|trainer.py:803] 2025-04-26 21:11:27,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5760 5006 [WARNING|trainer.py:803] 2025-04-26 21:11:27,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5058 [WARNING|trainer.py:803] 2025-04-26 21:11:28,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:28,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5007 [WARNING|trainer.py:803] 2025-04-26 21:11:28,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5761 5059 [WARNING|trainer.py:803] 2025-04-26 21:11:29,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:29,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5008 [WARNING|trainer.py:803] 2025-04-26 21:11:30,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5060 [WARNING|trainer.py:803] 2025-04-26 21:11:30,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5762 5009 [WARNING|trainer.py:803] 2025-04-26 21:11:31,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5061 [WARNING|trainer.py:803] 2025-04-26 21:11:31,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:32,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5010 [WARNING|trainer.py:803] 2025-04-26 21:11:32,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5763 5062 [WARNING|trainer.py:803] 2025-04-26 21:11:33,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:33,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5011 [WARNING|trainer.py:803] 2025-04-26 21:11:33,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5063 [WARNING|trainer.py:803] 2025-04-26 21:11:34,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5764 5012 [WARNING|trainer.py:803] 2025-04-26 21:11:34,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5064 [WARNING|trainer.py:803] 2025-04-26 21:11:35,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:11:35,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5013 [WARNING|trainer.py:803] 2025-04-26 21:11:35,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5765 5065 [WARNING|trainer.py:803] 2025-04-26 21:11:36,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5014 [WARNING|trainer.py:803] 2025-04-26 21:11:37,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:37,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5066 [WARNING|trainer.py:803] 2025-04-26 21:11:38,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5766 5015 [WARNING|trainer.py:803] 2025-04-26 21:11:38,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5067 [WARNING|trainer.py:803] 2025-04-26 21:11:38,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:39,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5016 [WARNING|trainer.py:803] 2025-04-26 21:11:39,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5068 5767 [WARNING|trainer.py:803] 2025-04-26 21:11:40,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5017 [WARNING|trainer.py:803] 2025-04-26 21:11:40,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:40,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5069 [WARNING|trainer.py:803] 2025-04-26 21:11:41,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5768 5018 [WARNING|trainer.py:803] 2025-04-26 21:11:41,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5070 [WARNING|trainer.py:803] 2025-04-26 21:11:42,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:42,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5019 [WARNING|trainer.py:803] 2025-04-26 21:11:43,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5769 5071 [WARNING|trainer.py:803] 2025-04-26 21:11:43,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:11:44,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5020 [WARNING|trainer.py:803] 2025-04-26 21:11:44,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5072 [WARNING|trainer.py:803] 2025-04-26 21:11:45,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5770 5021 [WARNING|trainer.py:803] 2025-04-26 21:11:45,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5073 [WARNING|trainer.py:803] 2025-04-26 21:11:46,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:46,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5022 [WARNING|trainer.py:803] 2025-04-26 21:11:46,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5074 5771 [WARNING|trainer.py:803] 2025-04-26 21:11:47,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5023 [WARNING|trainer.py:803] 2025-04-26 21:11:47,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:47,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5075 [WARNING|trainer.py:803] 2025-04-26 21:11:48,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5772 5024 [WARNING|trainer.py:803] 2025-04-26 21:11:49,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5076 [WARNING|trainer.py:803] 2025-04-26 21:11:49,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:49,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5025 [WARNING|trainer.py:803] 2025-04-26 21:11:50,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5077 5773 [WARNING|trainer.py:803] 2025-04-26 21:11:51,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:51,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5026 [WARNING|trainer.py:803] 2025-04-26 21:11:51,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5078 [WARNING|trainer.py:803] 2025-04-26 21:11:52,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5774 [WARNING|trainer.py:803] 2025-04-26 21:11:52,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5027 5079 [WARNING|trainer.py:803] 2025-04-26 21:11:53,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:53,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:53,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5028 5080 5775 [WARNING|trainer.py:803] 2025-04-26 21:11:54,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:11:55,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5029 5081 [WARNING|trainer.py:803] 2025-04-26 21:11:55,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:11:55,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:56,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5030 5776 5082 [WARNING|trainer.py:803] 2025-04-26 21:11:57,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:57,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5031 [WARNING|trainer.py:803] 2025-04-26 21:11:57,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5083 [WARNING|trainer.py:803] 2025-04-26 21:11:58,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5777 5032 [WARNING|trainer.py:803] 2025-04-26 21:11:58,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5084 [WARNING|trainer.py:803] 2025-04-26 21:11:59,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:11:59,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5033 [WARNING|trainer.py:803] 2025-04-26 21:11:59,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5778 5085 [WARNING|trainer.py:803] 2025-04-26 21:12:00,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:00,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5034 [WARNING|trainer.py:803] 2025-04-26 21:12:01,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5086 [WARNING|trainer.py:803] 2025-04-26 21:12:01,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5779 5035 [WARNING|trainer.py:803] 2025-04-26 21:12:02,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5087 [WARNING|trainer.py:803] 2025-04-26 21:12:02,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:03,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:03,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5036 5088 5780 [WARNING|trainer.py:803] 2025-04-26 21:12:04,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:04,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5037 5089 [WARNING|trainer.py:803] 2025-04-26 21:12:05,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:12:05,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:05,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5038 5781 5090 [WARNING|trainer.py:803] 2025-04-26 21:12:06,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:06,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:06,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5039 5091 [WARNING|trainer.py:803] 2025-04-26 21:12:07,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5782 [WARNING|trainer.py:803] 2025-04-26 21:12:08,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5040 5092 [WARNING|trainer.py:803] 2025-04-26 21:12:08,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:12:08,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:09,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5041 5093 5783 [WARNING|trainer.py:803] 2025-04-26 21:12:10,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:10,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5042 [WARNING|trainer.py:803] 2025-04-26 21:12:10,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5094 [WARNING|trainer.py:803] 2025-04-26 21:12:11,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:11,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5784 5043 5095 [WARNING|trainer.py:803] 2025-04-26 21:12:12,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:12:12,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:12,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5044 5096 5785 [WARNING|trainer.py:803] 2025-04-26 21:12:13,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:14,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5045 5097 [WARNING|trainer.py:803] 2025-04-26 21:12:14,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:14,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:15,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5046 5786 5098 [WARNING|trainer.py:803] 2025-04-26 21:12:16,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:16,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:16,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5047 5099 [WARNING|trainer.py:803] 2025-04-26 21:12:17,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:17,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5787 5048 5100 [WARNING|trainer.py:803] 2025-04-26 21:12:18,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:18,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:12:18,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5049 5101 5788 [WARNING|trainer.py:803] 2025-04-26 21:12:19,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:20,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5050 [WARNING|trainer.py:803] 2025-04-26 21:12:20,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5102 [WARNING|trainer.py:803] 2025-04-26 21:12:20,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5789 5051 [WARNING|trainer.py:803] 2025-04-26 21:12:21,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:21,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5103 [WARNING|trainer.py:803] 2025-04-26 21:12:22,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5052 5790 [WARNING|trainer.py:803] 2025-04-26 21:12:22,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5104 [WARNING|trainer.py:803] 2025-04-26 21:12:23,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:23,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5053 [WARNING|trainer.py:803] 2025-04-26 21:12:24,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:12:24,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5105 5791 5054 [WARNING|trainer.py:803] 2025-04-26 21:12:25,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:25,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:25,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5106 5055 5792 [WARNING|trainer.py:803] 2025-04-26 21:12:26,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:26,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:27,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5056 5107 [WARNING|trainer.py:803] 2025-04-26 21:12:28,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5793 [WARNING|trainer.py:803] 2025-04-26 21:12:28,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5057 5108 [WARNING|trainer.py:803] 2025-04-26 21:12:29,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:29,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:29,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5058 5794 5109 [WARNING|trainer.py:803] 2025-04-26 21:12:30,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5059 [WARNING|trainer.py:803] 2025-04-26 21:12:30,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:31,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5110 [WARNING|trainer.py:803] 2025-04-26 21:12:31,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5795 5060 [WARNING|trainer.py:803] 2025-04-26 21:12:32,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:32,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:32,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5111 5061 5796 [WARNING|trainer.py:803] 2025-04-26 21:12:33,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:34,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5112 5062 [WARNING|trainer.py:803] 2025-04-26 21:12:34,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:12:35,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:35,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5063 5113 5797 [WARNING|trainer.py:803] 2025-04-26 21:12:36,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:36,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:36,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5064 5114 5798 [WARNING|trainer.py:803] 2025-04-26 21:12:37,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:37,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5065 [WARNING|trainer.py:803] 2025-04-26 21:12:38,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5115 [WARNING|trainer.py:803] 2025-04-26 21:12:38,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:39,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5066 5799 5116 [WARNING|trainer.py:803] 2025-04-26 21:12:40,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:40,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5067 [WARNING|trainer.py:803] 2025-04-26 21:12:40,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5117 5800 [WARNING|trainer.py:803] 2025-04-26 21:12:41,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5068 [WARNING|trainer.py:803] 2025-04-26 21:12:42,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:42,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:12:42,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5118 5069 5801 [WARNING|trainer.py:803] 2025-04-26 21:12:43,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:12:43,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5119 [WARNING|trainer.py:803] 2025-04-26 21:12:44,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5070 [WARNING|trainer.py:803] 2025-04-26 21:12:44,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:44,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5802 5071 5120 [WARNING|trainer.py:803] 2025-04-26 21:12:45,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:46,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:12:46,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5072 5121 5803 [WARNING|trainer.py:803] 2025-04-26 21:12:47,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:47,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5073 [WARNING|trainer.py:803] 2025-04-26 21:12:47,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5122 [WARNING|trainer.py:803] 2025-04-26 21:12:48,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5074 [WARNING|trainer.py:803] 2025-04-26 21:12:48,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5804 5123 [WARNING|trainer.py:803] 2025-04-26 21:12:49,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:49,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5075 [WARNING|trainer.py:803] 2025-04-26 21:12:50,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5805 5124 [WARNING|trainer.py:803] 2025-04-26 21:12:50,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5076 [WARNING|trainer.py:803] 2025-04-26 21:12:51,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:51,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:52,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5125 5077 5806 [WARNING|trainer.py:803] 2025-04-26 21:12:53,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:53,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:12:53,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5126 5078 5807 [WARNING|trainer.py:803] 2025-04-26 21:12:54,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:54,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5079 5127 [WARNING|trainer.py:803] 2025-04-26 21:12:55,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:12:55,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:55,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5080 5808 5128 [WARNING|trainer.py:803] 2025-04-26 21:12:56,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:12:57,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:12:57,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5081 5129 5809 [WARNING|trainer.py:803] 2025-04-26 21:12:58,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5082 [WARNING|trainer.py:803] 2025-04-26 21:12:58,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:12:58,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5130 [WARNING|trainer.py:803] 2025-04-26 21:12:59,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5083 5810 [WARNING|trainer.py:803] 2025-04-26 21:12:59,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5131 [WARNING|trainer.py:803] 2025-04-26 21:13:00,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:00,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5084 [WARNING|trainer.py:803] 2025-04-26 21:13:01,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5811 [WARNING|trainer.py:803] 2025-04-26 21:13:01,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5132 5085 [WARNING|trainer.py:803] 2025-04-26 21:13:02,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:02,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:02,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5133 5086 5812 [WARNING|trainer.py:803] 2025-04-26 21:13:03,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:04,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:04,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5087 5134 5813 [WARNING|trainer.py:803] 2025-04-26 21:13:05,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:05,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5088 5135 [WARNING|trainer.py:803] 2025-04-26 21:13:06,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:06,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:06,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5089 5814 5136 [WARNING|trainer.py:803] 2025-04-26 21:13:07,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:07,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:08,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5090 5137 5815 [WARNING|trainer.py:803] 2025-04-26 21:13:08,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5091 [WARNING|trainer.py:803] 2025-04-26 21:13:09,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:09,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5138 [WARNING|trainer.py:803] 2025-04-26 21:13:10,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5092 [WARNING|trainer.py:803] 2025-04-26 21:13:10,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5816 5139 [WARNING|trainer.py:803] 2025-04-26 21:13:11,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5093 [WARNING|trainer.py:803] 2025-04-26 21:13:11,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:12,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:13:12,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5817 5140 5094 [WARNING|trainer.py:803] 2025-04-26 21:13:13,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:13,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:13:13,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5141 5095 5818 [WARNING|trainer.py:803] 2025-04-26 21:13:14,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:14,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5096 5142 [WARNING|trainer.py:803] 2025-04-26 21:13:15,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:16,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:16,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5097 5819 5143 [WARNING|trainer.py:803] 2025-04-26 21:13:17,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:17,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5098 [WARNING|trainer.py:803] 2025-04-26 21:13:17,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5144 [WARNING|trainer.py:803] 2025-04-26 21:13:18,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5820 5099 [WARNING|trainer.py:803] 2025-04-26 21:13:19,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:19,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5145 [WARNING|trainer.py:803] 2025-04-26 21:13:19,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5100 [WARNING|trainer.py:803] 2025-04-26 21:13:20,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5821 [WARNING|trainer.py:803] 2025-04-26 21:13:20,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5146 5101 [WARNING|trainer.py:803] 2025-04-26 21:13:21,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:21,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:22,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5147 5822 5102 [WARNING|trainer.py:803] 2025-04-26 21:13:23,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:23,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:23,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5148 5103 [WARNING|trainer.py:803] 2025-04-26 21:13:24,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5823 [WARNING|trainer.py:803] 2025-04-26 21:13:25,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5149 [WARNING|trainer.py:803] 2025-04-26 21:13:25,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5104 [WARNING|trainer.py:803] 2025-04-26 21:13:26,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:26,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5824 5150 5105 [WARNING|trainer.py:803] 2025-04-26 21:13:27,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:27,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:27,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5151 5825 5106 [WARNING|trainer.py:803] 2025-04-26 21:13:28,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:29,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:29,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5152 5107 [WARNING|trainer.py:803] 2025-04-26 21:13:30,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5826 [WARNING|trainer.py:803] 2025-04-26 21:13:30,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5153 [WARNING|trainer.py:803] 2025-04-26 21:13:31,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5108 [WARNING|trainer.py:803] 2025-04-26 21:13:31,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:32,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5154 5827 5109 [WARNING|trainer.py:803] 2025-04-26 21:13:32,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:33,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:13:33,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5155 5110 5828 [WARNING|trainer.py:803] 2025-04-26 21:13:34,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:34,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5156 [WARNING|trainer.py:803] 2025-04-26 21:13:34,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5111 [WARNING|trainer.py:803] 2025-04-26 21:13:35,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5829 [WARNING|trainer.py:803] 2025-04-26 21:13:36,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5157 [WARNING|trainer.py:803] 2025-04-26 21:13:36,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5112 [WARNING|trainer.py:803] 2025-04-26 21:13:37,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:37,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5158 5830 5113 [WARNING|trainer.py:803] 2025-04-26 21:13:38,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:38,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:38,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5159 5114 5831 [WARNING|trainer.py:803] 2025-04-26 21:13:39,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:13:40,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5160 [WARNING|trainer.py:803] 2025-04-26 21:13:40,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5115 [WARNING|trainer.py:803] 2025-04-26 21:13:41,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5832 5161 [WARNING|trainer.py:803] 2025-04-26 21:13:41,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5116 [WARNING|trainer.py:803] 2025-04-26 21:13:42,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:13:42,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5162 [WARNING|trainer.py:803] 2025-04-26 21:13:42,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5833 5117 [WARNING|trainer.py:803] 2025-04-26 21:13:43,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:44,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5163 [WARNING|trainer.py:803] 2025-04-26 21:13:44,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5118 5834 [WARNING|trainer.py:803] 2025-04-26 21:13:45,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:45,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5164 [WARNING|trainer.py:803] 2025-04-26 21:13:46,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5119 [WARNING|trainer.py:803] 2025-04-26 21:13:46,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:47,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5165 5835 5120 [WARNING|trainer.py:803] 2025-04-26 21:13:47,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:48,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:13:48,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5166 5121 5836 [WARNING|trainer.py:803] 2025-04-26 21:13:49,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:49,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:49,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5167 5122 [WARNING|trainer.py:803] 2025-04-26 21:13:50,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5837 [WARNING|trainer.py:803] 2025-04-26 21:13:51,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5168 [WARNING|trainer.py:803] 2025-04-26 21:13:51,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5123 [WARNING|trainer.py:803] 2025-04-26 21:13:52,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5838 5169 [WARNING|trainer.py:803] 2025-04-26 21:13:52,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5124 [WARNING|trainer.py:803] 2025-04-26 21:13:53,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:13:53,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5170 [WARNING|trainer.py:803] 2025-04-26 21:13:54,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5839 5125 [WARNING|trainer.py:803] 2025-04-26 21:13:54,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:55,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5171 [WARNING|trainer.py:803] 2025-04-26 21:13:55,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5126 [WARNING|trainer.py:803] 2025-04-26 21:13:56,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5840 5172 [WARNING|trainer.py:803] 2025-04-26 21:13:56,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:13:57,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5127 [WARNING|trainer.py:803] 2025-04-26 21:13:57,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5841 5173 [WARNING|trainer.py:803] 2025-04-26 21:13:58,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5128 [WARNING|trainer.py:803] 2025-04-26 21:13:58,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:13:59,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5174 [WARNING|trainer.py:803] 2025-04-26 21:13:59,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5842 5129 [WARNING|trainer.py:803] 2025-04-26 21:14:00,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:00,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:00,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5175 5130 [WARNING|trainer.py:803] 2025-04-26 21:14:01,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5843 5176 [WARNING|trainer.py:803] 2025-04-26 21:14:02,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:02,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5131 [WARNING|trainer.py:803] 2025-04-26 21:14:03,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5177 [WARNING|trainer.py:803] 2025-04-26 21:14:03,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5844 5132 [WARNING|trainer.py:803] 2025-04-26 21:14:04,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:04,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5178 [WARNING|trainer.py:803] 2025-04-26 21:14:05,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5845 5133 [WARNING|trainer.py:803] 2025-04-26 21:14:05,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5179 [WARNING|trainer.py:803] 2025-04-26 21:14:06,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:06,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5134 [WARNING|trainer.py:803] 2025-04-26 21:14:07,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5846 5180 [WARNING|trainer.py:803] 2025-04-26 21:14:07,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:08,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5135 [WARNING|trainer.py:803] 2025-04-26 21:14:08,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5847 5181 [WARNING|trainer.py:803] 2025-04-26 21:14:09,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5136 [WARNING|trainer.py:803] 2025-04-26 21:14:09,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:09,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5182 [WARNING|trainer.py:803] 2025-04-26 21:14:10,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5848 5137 [WARNING|trainer.py:803] 2025-04-26 21:14:11,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:14:11,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:11,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5183 5138 [WARNING|trainer.py:803] 2025-04-26 21:14:12,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5849 [WARNING|trainer.py:803] 2025-04-26 21:14:13,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5184 [WARNING|trainer.py:803] 2025-04-26 21:14:13,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5139 [WARNING|trainer.py:803] 2025-04-26 21:14:14,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:14,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5185 5850 5140 [WARNING|trainer.py:803] 2025-04-26 21:14:15,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:15,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:16,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5186 5141 5851 [WARNING|trainer.py:803] 2025-04-26 21:14:16,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:17,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5187 [WARNING|trainer.py:803] 2025-04-26 21:14:17,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5142 [WARNING|trainer.py:803] 2025-04-26 21:14:18,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5852 [WARNING|trainer.py:803] 2025-04-26 21:14:18,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5188 5143 [WARNING|trainer.py:803] 2025-04-26 21:14:19,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:19,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5189 [WARNING|trainer.py:803] 2025-04-26 21:14:20,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5853 5144 [WARNING|trainer.py:803] 2025-04-26 21:14:21,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:14:21,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:21,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5190 5145 5854 [WARNING|trainer.py:803] 2025-04-26 21:14:22,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:23,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5191 [WARNING|trainer.py:803] 2025-04-26 21:14:23,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5146 [WARNING|trainer.py:803] 2025-04-26 21:14:23,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5855 [WARNING|trainer.py:803] 2025-04-26 21:14:24,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5192 [WARNING|trainer.py:803] 2025-04-26 21:14:24,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5147 [WARNING|trainer.py:803] 2025-04-26 21:14:25,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:25,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5856 5193 5148 [WARNING|trainer.py:803] 2025-04-26 21:14:26,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:26,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:27,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5194 5857 5149 [WARNING|trainer.py:803] 2025-04-26 21:14:28,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:28,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:14:28,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5195 5150 [WARNING|trainer.py:803] 2025-04-26 21:14:29,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5858 [WARNING|trainer.py:803] 2025-04-26 21:14:29,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5196 5151 [WARNING|trainer.py:803] 2025-04-26 21:14:30,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:30,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:31,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5859 5197 5152 [WARNING|trainer.py:803] 2025-04-26 21:14:32,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:14:32,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:32,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5198 5153 5860 [WARNING|trainer.py:803] 2025-04-26 21:14:33,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:34,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:34,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5199 5154 [WARNING|trainer.py:803] 2025-04-26 21:14:35,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5861 [WARNING|trainer.py:803] 2025-04-26 21:14:35,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5200 [WARNING|trainer.py:803] 2025-04-26 21:14:35,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5155 [WARNING|trainer.py:803] 2025-04-26 21:14:36,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:36,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5862 5201 5156 [WARNING|trainer.py:803] 2025-04-26 21:14:37,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:37,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:38,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5202 5157 5863 [WARNING|trainer.py:803] 2025-04-26 21:14:39,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:39,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:39,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5203 5158 [WARNING|trainer.py:803] 2025-04-26 21:14:40,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5864 [WARNING|trainer.py:803] 2025-04-26 21:14:40,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5204 5159 [WARNING|trainer.py:803] 2025-04-26 21:14:41,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:14:42,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:42,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5205 5865 5160 [WARNING|trainer.py:803] 2025-04-26 21:14:43,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:43,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:43,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5206 5161 5866 [WARNING|trainer.py:803] 2025-04-26 21:14:44,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:14:45,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:45,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5207 5162 [WARNING|trainer.py:803] 2025-04-26 21:14:46,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5867 [WARNING|trainer.py:803] 2025-04-26 21:14:46,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5208 5163 [WARNING|trainer.py:803] 2025-04-26 21:14:47,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:47,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:48,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5209 5868 5164 [WARNING|trainer.py:803] 2025-04-26 21:14:49,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:49,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:49,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5210 5165 [WARNING|trainer.py:803] 2025-04-26 21:14:50,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5869 [WARNING|trainer.py:803] 2025-04-26 21:14:50,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5211 [WARNING|trainer.py:803] 2025-04-26 21:14:51,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5166 [WARNING|trainer.py:803] 2025-04-26 21:14:51,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 21:14:52,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5870 5212 5167 [WARNING|trainer.py:803] 2025-04-26 21:14:53,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:14:53,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:53,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5213 5871 5168 [WARNING|trainer.py:803] 2025-04-26 21:14:54,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:55,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:55,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5214 5169 5872 [WARNING|trainer.py:803] 2025-04-26 21:14:55,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:56,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5215 [WARNING|trainer.py:803] 2025-04-26 21:14:56,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5170 [WARNING|trainer.py:803] 2025-04-26 21:14:57,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5873 [WARNING|trainer.py:803] 2025-04-26 21:14:57,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5216 5171 [WARNING|trainer.py:803] 2025-04-26 21:14:58,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:14:58,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:14:59,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5217 5874 5172 [WARNING|trainer.py:803] 2025-04-26 21:15:00,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:00,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:15:00,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5218 5173 5875 [WARNING|trainer.py:803] 2025-04-26 21:15:01,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:02,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:02,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5219 5174 [WARNING|trainer.py:803] 2025-04-26 21:15:03,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5876 [WARNING|trainer.py:803] 2025-04-26 21:15:03,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5220 [WARNING|trainer.py:803] 2025-04-26 21:15:03,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5175 [WARNING|trainer.py:803] 2025-04-26 21:15:04,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:04,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5877 5221 5176 [WARNING|trainer.py:803] 2025-04-26 21:15:05,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:05,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:06,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5222 5177 5878 [WARNING|trainer.py:803] 2025-04-26 21:15:07,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:07,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:07,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5223 5178 5879 [WARNING|trainer.py:803] 2025-04-26 21:15:08,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:09,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5224 [WARNING|trainer.py:803] 2025-04-26 21:15:09,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5179 [WARNING|trainer.py:803] 2025-04-26 21:15:10,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:10,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5225 5880 5180 [WARNING|trainer.py:803] 2025-04-26 21:15:11,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:11,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:11,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5226 5181 5881 [WARNING|trainer.py:803] 2025-04-26 21:15:12,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:13,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5227 [WARNING|trainer.py:803] 2025-04-26 21:15:13,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5182 [WARNING|trainer.py:803] 2025-04-26 21:15:14,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5882 [WARNING|trainer.py:803] 2025-04-26 21:15:14,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5228 5183 [WARNING|trainer.py:803] 2025-04-26 21:15:15,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:15,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:15,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5229 5883 5184 [WARNING|trainer.py:803] 2025-04-26 21:15:17,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:17,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:17,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5230 5185 5884 [WARNING|trainer.py:803] 2025-04-26 21:15:18,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:15:18,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5231 [WARNING|trainer.py:803] 2025-04-26 21:15:19,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5186 [WARNING|trainer.py:803] 2025-04-26 21:15:19,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5885 [WARNING|trainer.py:803] 2025-04-26 21:15:20,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5232 5187 [WARNING|trainer.py:803] 2025-04-26 21:15:20,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:21,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:21,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5233 5886 5188 [WARNING|trainer.py:803] 2025-04-26 21:15:22,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:22,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:23,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5234 5189 [WARNING|trainer.py:803] 2025-04-26 21:15:23,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5887 [WARNING|trainer.py:803] 2025-04-26 21:15:24,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5235 [WARNING|trainer.py:803] 2025-04-26 21:15:24,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5190 [WARNING|trainer.py:803] 2025-04-26 21:15:25,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:25,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5236 5888 5191 [WARNING|trainer.py:803] 2025-04-26 21:15:26,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:26,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:27,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5237 5192 5889 [WARNING|trainer.py:803] 2025-04-26 21:15:28,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:28,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5238 [WARNING|trainer.py:803] 2025-04-26 21:15:28,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5193 [WARNING|trainer.py:803] 2025-04-26 21:15:29,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:29,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5239 5890 5194 [WARNING|trainer.py:803] 2025-04-26 21:15:30,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:30,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:31,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5240 5891 5195 [WARNING|trainer.py:803] 2025-04-26 21:15:32,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:32,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:32,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5241 5196 [WARNING|trainer.py:803] 2025-04-26 21:15:33,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5892 5242 [WARNING|trainer.py:803] 2025-04-26 21:15:34,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:34,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5197 [WARNING|trainer.py:803] 2025-04-26 21:15:34,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5243 [WARNING|trainer.py:803] 2025-04-26 21:15:35,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5893 5198 [WARNING|trainer.py:803] 2025-04-26 21:15:36,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:36,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5244 [WARNING|trainer.py:803] 2025-04-26 21:15:36,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5199 5894 [WARNING|trainer.py:803] 2025-04-26 21:15:37,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:15:38,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5245 [WARNING|trainer.py:803] 2025-04-26 21:15:38,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5200 [WARNING|trainer.py:803] 2025-04-26 21:15:39,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5895 5246 [WARNING|trainer.py:803] 2025-04-26 21:15:39,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5201 [WARNING|trainer.py:803] 2025-04-26 21:15:40,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:40,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5247 [WARNING|trainer.py:803] 2025-04-26 21:15:41,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5896 5202 [WARNING|trainer.py:803] 2025-04-26 21:15:41,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:15:42,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:42,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5248 5203 [WARNING|trainer.py:803] 2025-04-26 21:15:43,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5897 5249 [WARNING|trainer.py:803] 2025-04-26 21:15:43,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:44,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5204 [WARNING|trainer.py:803] 2025-04-26 21:15:44,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5250 [WARNING|trainer.py:803] 2025-04-26 21:15:45,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5898 5205 [WARNING|trainer.py:803] 2025-04-26 21:15:46,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:46,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5251 [WARNING|trainer.py:803] 2025-04-26 21:15:46,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5899 5206 [WARNING|trainer.py:803] 2025-04-26 21:15:47,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5252 [WARNING|trainer.py:803] 2025-04-26 21:15:48,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:48,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5207 [WARNING|trainer.py:803] 2025-04-26 21:15:48,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5900 5253 [WARNING|trainer.py:803] 2025-04-26 21:15:49,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:49,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5208 [WARNING|trainer.py:803] 2025-04-26 21:15:50,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5254 5901 [WARNING|trainer.py:803] 2025-04-26 21:15:50,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5209 [WARNING|trainer.py:803] 2025-04-26 21:15:51,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:51,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5255 [WARNING|trainer.py:803] 2025-04-26 21:15:52,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5210 5902 [WARNING|trainer.py:803] 2025-04-26 21:15:52,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5256 [WARNING|trainer.py:803] 2025-04-26 21:15:53,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:15:53,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5211 [WARNING|trainer.py:803] 2025-04-26 21:15:54,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5903 5257 [WARNING|trainer.py:803] 2025-04-26 21:15:55,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:15:55,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5212 [WARNING|trainer.py:803] 2025-04-26 21:15:55,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5258 5904 [WARNING|trainer.py:803] 2025-04-26 21:15:56,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5213 [WARNING|trainer.py:803] 2025-04-26 21:15:57,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:15:57,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5259 [WARNING|trainer.py:803] 2025-04-26 21:15:57,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5905 5214 [WARNING|trainer.py:803] 2025-04-26 21:15:58,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:15:59,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5260 [WARNING|trainer.py:803] 2025-04-26 21:15:59,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5215 [WARNING|trainer.py:803] 2025-04-26 21:15:59,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5906 5261 [WARNING|trainer.py:803] 2025-04-26 21:16:00,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:00,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5216 [WARNING|trainer.py:803] 2025-04-26 21:16:01,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5907 5262 [WARNING|trainer.py:803] 2025-04-26 21:16:01,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:02,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5217 [WARNING|trainer.py:803] 2025-04-26 21:16:02,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5263 [WARNING|trainer.py:803] 2025-04-26 21:16:03,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5908 5218 [WARNING|trainer.py:803] 2025-04-26 21:16:04,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:04,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5264 [WARNING|trainer.py:803] 2025-04-26 21:16:04,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5909 5219 [WARNING|trainer.py:803] 2025-04-26 21:16:05,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5265 [WARNING|trainer.py:803] 2025-04-26 21:16:06,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:06,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:06,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5220 5910 5266 [WARNING|trainer.py:803] 2025-04-26 21:16:07,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:08,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:08,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5221 5267 [WARNING|trainer.py:803] 2025-04-26 21:16:09,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5911 [WARNING|trainer.py:803] 2025-04-26 21:16:09,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5222 [WARNING|trainer.py:803] 2025-04-26 21:16:09,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5268 [WARNING|trainer.py:803] 2025-04-26 21:16:10,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5912 [WARNING|trainer.py:803] 2025-04-26 21:16:10,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5223 5269 [WARNING|trainer.py:803] 2025-04-26 21:16:11,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:11,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:12,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5913 5224 5270 [WARNING|trainer.py:803] 2025-04-26 21:16:13,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:13,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:13,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5225 5271 5914 [WARNING|trainer.py:803] 2025-04-26 21:16:14,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:14,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:15,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5226 5272 5915 [WARNING|trainer.py:803] 2025-04-26 21:16:16,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:16,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5227 [WARNING|trainer.py:803] 2025-04-26 21:16:16,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5273 [WARNING|trainer.py:803] 2025-04-26 21:16:17,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5916 [WARNING|trainer.py:803] 2025-04-26 21:16:17,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5228 5274 [WARNING|trainer.py:803] 2025-04-26 21:16:18,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:18,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:19,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5229 5917 5275 [WARNING|trainer.py:803] 2025-04-26 21:16:20,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:20,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:16:20,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5230 5276 5918 [WARNING|trainer.py:803] 2025-04-26 21:16:21,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:21,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:22,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5231 5277 5919 [WARNING|trainer.py:803] 2025-04-26 21:16:23,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:23,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5232 5278 [WARNING|trainer.py:803] 2025-04-26 21:16:23,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:16:24,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:24,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5920 5233 5279 [WARNING|trainer.py:803] 2025-04-26 21:16:25,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:25,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:26,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5234 5921 5280 [WARNING|trainer.py:803] 2025-04-26 21:16:27,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:27,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:27,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5235 5281 5922 [WARNING|trainer.py:803] 2025-04-26 21:16:28,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:28,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:29,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5236 5282 [WARNING|trainer.py:803] 2025-04-26 21:16:30,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:30,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5923 5237 5283 [WARNING|trainer.py:803] 2025-04-26 21:16:31,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:31,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:31,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5924 5238 5284 [WARNING|trainer.py:803] 2025-04-26 21:16:32,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:32,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:32,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5285 5239 5925 [WARNING|trainer.py:803] 2025-04-26 21:16:34,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:34,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:34,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5240 5286 5926 [WARNING|trainer.py:803] 2025-04-26 21:16:35,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:35,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5241 5287 [WARNING|trainer.py:803] 2025-04-26 21:16:36,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:37,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:16:37,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5927 5242 5288 [WARNING|trainer.py:803] 2025-04-26 21:16:38,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:38,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:38,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5928 5243 5289 [WARNING|trainer.py:803] 2025-04-26 21:16:39,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:39,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:40,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5244 5290 5929 [WARNING|trainer.py:803] 2025-04-26 21:16:41,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:41,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:41,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5245 5291 5930 [WARNING|trainer.py:803] 2025-04-26 21:16:42,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:42,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5246 [WARNING|trainer.py:803] 2025-04-26 21:16:43,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5292 [WARNING|trainer.py:803] 2025-04-26 21:16:44,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5931 [WARNING|trainer.py:803] 2025-04-26 21:16:44,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5247 5293 [WARNING|trainer.py:803] 2025-04-26 21:16:45,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:45,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:45,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5932 5248 5294 [WARNING|trainer.py:803] 2025-04-26 21:16:46,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:47,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:47,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5249 5295 5933 [WARNING|trainer.py:803] 2025-04-26 21:16:48,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:48,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:48,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5250 5296 5934 [WARNING|trainer.py:803] 2025-04-26 21:16:49,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:16:49,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:50,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5251 5297 5935 [WARNING|trainer.py:803] 2025-04-26 21:16:51,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:16:51,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5252 5298 [WARNING|trainer.py:803] 2025-04-26 21:16:51,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:52,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:52,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5936 5253 5299 [WARNING|trainer.py:803] 2025-04-26 21:16:53,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:53,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:54,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5254 5937 5300 [WARNING|trainer.py:803] 2025-04-26 21:16:55,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:55,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:55,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5255 5301 5938 [WARNING|trainer.py:803] 2025-04-26 21:16:56,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:16:56,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:57,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5256 5302 5939 [WARNING|trainer.py:803] 2025-04-26 21:16:58,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:16:58,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5257 [WARNING|trainer.py:803] 2025-04-26 21:16:58,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5303 [WARNING|trainer.py:803] 2025-04-26 21:16:59,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:16:59,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5940 5258 5304 [WARNING|trainer.py:803] 2025-04-26 21:17:00,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:01,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:01,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5259 5941 5305 [WARNING|trainer.py:803] 2025-04-26 21:17:02,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:17:02,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:02,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5260 5306 5942 [WARNING|trainer.py:803] 2025-04-26 21:17:03,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:03,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:04,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5261 5307 5943 [WARNING|trainer.py:803] 2025-04-26 21:17:05,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:05,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5262 5308 [WARNING|trainer.py:803] 2025-04-26 21:17:06,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:06,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:06,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5944 5263 5309 [WARNING|trainer.py:803] 2025-04-26 21:17:07,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:08,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:08,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5945 5264 5310 [WARNING|trainer.py:803] 2025-04-26 21:17:09,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:09,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:09,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5265 5311 5946 [WARNING|trainer.py:803] 2025-04-26 21:17:10,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:10,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:11,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5266 5312 5947 [WARNING|trainer.py:803] 2025-04-26 21:17:12,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:12,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5267 5313 [WARNING|trainer.py:803] 2025-04-26 21:17:12,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:13,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:13,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5948 5268 5314 [WARNING|trainer.py:803] 2025-04-26 21:17:14,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:15,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:15,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5949 5269 5315 [WARNING|trainer.py:803] 2025-04-26 21:17:16,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:16,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:16,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5316 5270 5950 [WARNING|trainer.py:803] 2025-04-26 21:17:17,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:17,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:18,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5317 5271 5951 [WARNING|trainer.py:803] 2025-04-26 21:17:19,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:19,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5318 5272 [WARNING|trainer.py:803] 2025-04-26 21:17:20,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:20,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:20,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5952 5319 5273 [WARNING|trainer.py:803] 2025-04-26 21:17:21,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:22,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:17:22,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5953 5274 5320 [WARNING|trainer.py:803] 2025-04-26 21:17:23,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:23,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:17:23,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5275 5321 5954 [WARNING|trainer.py:803] 2025-04-26 21:17:24,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:24,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:25,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5276 5322 5955 [WARNING|trainer.py:803] 2025-04-26 21:17:26,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:26,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5277 5323 [WARNING|trainer.py:803] 2025-04-26 21:17:26,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:27,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:27,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5956 5278 5324 [WARNING|trainer.py:803] 2025-04-26 21:17:28,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:28,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:29,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5279 5325 5957 [WARNING|trainer.py:803] 2025-04-26 21:17:30,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:30,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:30,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5326 5280 5958 [WARNING|trainer.py:803] 2025-04-26 21:17:31,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:31,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5327 [WARNING|trainer.py:803] 2025-04-26 21:17:32,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5281 [WARNING|trainer.py:803] 2025-04-26 21:17:33,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:33,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5959 5328 5282 [WARNING|trainer.py:803] 2025-04-26 21:17:34,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:34,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:34,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5960 5329 5283 [WARNING|trainer.py:803] 2025-04-26 21:17:35,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 21:17:35,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:35,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5330 5284 5961 [WARNING|trainer.py:803] 2025-04-26 21:17:37,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:37,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:37,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5331 5285 5962 [WARNING|trainer.py:803] 2025-04-26 21:17:38,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:38,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5332 [WARNING|trainer.py:803] 2025-04-26 21:17:39,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5286 [WARNING|trainer.py:803] 2025-04-26 21:17:39,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5963 [WARNING|trainer.py:803] 2025-04-26 21:17:40,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5333 5287 [WARNING|trainer.py:803] 2025-04-26 21:17:41,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:41,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:41,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5964 5334 5288 [WARNING|trainer.py:803] 2025-04-26 21:17:42,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:42,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:43,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5335 5965 5289 [WARNING|trainer.py:803] 2025-04-26 21:17:44,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:44,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:44,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5336 5290 5966 [WARNING|trainer.py:803] 2025-04-26 21:17:45,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:45,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5337 [WARNING|trainer.py:803] 2025-04-26 21:17:46,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5291 [WARNING|trainer.py:803] 2025-04-26 21:17:46,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5967 [WARNING|trainer.py:803] 2025-04-26 21:17:47,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5338 5292 [WARNING|trainer.py:803] 2025-04-26 21:17:47,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:48,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:17:48,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5968 5339 5293 [WARNING|trainer.py:803] 2025-04-26 21:17:49,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:49,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:50,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5340 5294 5969 [WARNING|trainer.py:803] 2025-04-26 21:17:51,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:51,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:17:51,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5341 5295 5970 [WARNING|trainer.py:803] 2025-04-26 21:17:52,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:52,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5342 [WARNING|trainer.py:803] 2025-04-26 21:17:53,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5296 [WARNING|trainer.py:803] 2025-04-26 21:17:53,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:54,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5971 5343 5297 [WARNING|trainer.py:803] 2025-04-26 21:17:55,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:55,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:17:55,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5344 5972 5298 [WARNING|trainer.py:803] 2025-04-26 21:17:56,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:56,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:17:57,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5345 5299 5973 [WARNING|trainer.py:803] 2025-04-26 21:17:57,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:17:58,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5346 [WARNING|trainer.py:803] 2025-04-26 21:17:58,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5300 [WARNING|trainer.py:803] 2025-04-26 21:17:59,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5974 5347 [WARNING|trainer.py:803] 2025-04-26 21:17:59,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5301 [WARNING|trainer.py:803] 2025-04-26 21:18:00,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:00,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5348 [WARNING|trainer.py:803] 2025-04-26 21:18:01,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5975 5302 [WARNING|trainer.py:803] 2025-04-26 21:18:02,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:02,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5349 [WARNING|trainer.py:803] 2025-04-26 21:18:02,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5303 5976 [WARNING|trainer.py:803] 2025-04-26 21:18:03,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5350 [WARNING|trainer.py:803] 2025-04-26 21:18:04,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:04,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5304 [WARNING|trainer.py:803] 2025-04-26 21:18:04,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5977 5351 [WARNING|trainer.py:803] 2025-04-26 21:18:05,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:06,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 5305 NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:06,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5352 5978 [WARNING|trainer.py:803] 2025-04-26 21:18:06,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5306 [WARNING|trainer.py:803] 2025-04-26 21:18:07,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:07,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5353 [WARNING|trainer.py:803] 2025-04-26 21:18:08,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5979 5307 [WARNING|trainer.py:803] 2025-04-26 21:18:08,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:09,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5354 [WARNING|trainer.py:803] 2025-04-26 21:18:09,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5308 [WARNING|trainer.py:803] 2025-04-26 21:18:10,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5980 5355 [WARNING|trainer.py:803] 2025-04-26 21:18:11,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:11,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5309 [WARNING|trainer.py:803] 2025-04-26 21:18:11,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5356 5981 [WARNING|trainer.py:803] 2025-04-26 21:18:12,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5310 [WARNING|trainer.py:803] 2025-04-26 21:18:13,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 21:18:13,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5357 [WARNING|trainer.py:803] 2025-04-26 21:18:13,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5982 5311 [WARNING|trainer.py:803] 2025-04-26 21:18:14,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5358 [WARNING|trainer.py:803] 2025-04-26 21:18:15,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:18:15,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5312 [WARNING|trainer.py:803] 2025-04-26 21:18:15,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5983 5359 [WARNING|trainer.py:803] 2025-04-26 21:18:16,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:16,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5313 [WARNING|trainer.py:803] 2025-04-26 21:18:17,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5360 [WARNING|trainer.py:803] 2025-04-26 21:18:17,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5984 [WARNING|trainer.py:803] 2025-04-26 21:18:18,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5314 [WARNING|trainer.py:803] 2025-04-26 21:18:18,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5361 [WARNING|trainer.py:803] 2025-04-26 21:18:19,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5985 [WARNING|trainer.py:803] 2025-04-26 21:18:19,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5315 5362 [WARNING|trainer.py:803] 2025-04-26 21:18:20,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:18:20,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:21,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5316 5363 5986 [WARNING|trainer.py:803] 2025-04-26 21:18:22,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:22,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5317 [WARNING|trainer.py:803] 2025-04-26 21:18:22,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5364 [WARNING|trainer.py:803] 2025-04-26 21:18:23,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5987 [WARNING|trainer.py:803] 2025-04-26 21:18:23,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5318 5365 [WARNING|trainer.py:803] 2025-04-26 21:18:24,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:24,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:25,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5319 5988 5366 [WARNING|trainer.py:803] 2025-04-26 21:18:26,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:26,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:18:26,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5320 5367 5989 [WARNING|trainer.py:803] 2025-04-26 21:18:27,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:18:28,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:28,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5321 5368 [WARNING|trainer.py:803] 2025-04-26 21:18:29,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:29,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5990 5322 5369 [WARNING|trainer.py:803] 2025-04-26 21:18:30,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:30,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:30,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5323 5370 5991 [WARNING|trainer.py:803] 2025-04-26 21:18:31,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:32,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:32,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5324 5371 5992 [WARNING|trainer.py:803] 2025-04-26 21:18:33,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:33,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5325 5372 [WARNING|trainer.py:803] 2025-04-26 21:18:34,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:34,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:34,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5993 5326 5373 [WARNING|trainer.py:803] 2025-04-26 21:18:35,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:36,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:36,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5994 5327 5374 [WARNING|trainer.py:803] 2025-04-26 21:18:37,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:37,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:37,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5328 5375 5995 [WARNING|trainer.py:803] 2025-04-26 21:18:38,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:39,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:39,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5329 5376 5996 [WARNING|trainer.py:803] 2025-04-26 21:18:40,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:40,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5330 5377 [WARNING|trainer.py:803] 2025-04-26 21:18:41,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:41,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:41,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5997 5331 5378 [WARNING|trainer.py:803] 2025-04-26 21:18:42,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:43,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:43,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5332 5379 5998 [WARNING|trainer.py:803] 2025-04-26 21:18:44,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:44,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:44,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5333 5380 5999 [WARNING|trainer.py:803] 2025-04-26 21:18:45,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:45,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5334 5381 [WARNING|trainer.py:803] 2025-04-26 21:18:46,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:47,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:47,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6000 5335 5382 [WARNING|trainer.py:803] 2025-04-26 21:18:48,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:48,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:48,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5336 6001 5383 [WARNING|trainer.py:803] 2025-04-26 21:18:49,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:50,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:50,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5337 5384 6002 [WARNING|trainer.py:803] 2025-04-26 21:18:51,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:51,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:18:51,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5338 5385 [WARNING|trainer.py:803] 2025-04-26 21:18:52,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:52,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6003 5339 5386 [WARNING|trainer.py:803] 2025-04-26 21:18:53,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:54,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:54,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5340 6004 5387 [WARNING|trainer.py:803] 2025-04-26 21:18:55,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:18:55,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:18:55,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5341 5388 6005 [WARNING|trainer.py:803] 2025-04-26 21:18:56,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:57,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:57,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5342 5389 [WARNING|trainer.py:803] 2025-04-26 21:18:58,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6006 [WARNING|trainer.py:803] 2025-04-26 21:18:58,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5343 5390 [WARNING|trainer.py:803] 2025-04-26 21:18:59,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:18:59,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:18:59,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6007 5344 5391 [WARNING|trainer.py:803] 2025-04-26 21:19:00,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:00,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:01,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5345 6008 5392 [WARNING|trainer.py:803] 2025-04-26 21:19:02,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:02,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:02,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5346 5393 6009 [WARNING|trainer.py:803] 2025-04-26 21:19:03,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:03,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5347 [WARNING|trainer.py:803] 2025-04-26 21:19:04,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5394 [WARNING|trainer.py:803] 2025-04-26 21:19:05,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6010 [WARNING|trainer.py:803] 2025-04-26 21:19:05,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5348 5395 [WARNING|trainer.py:803] 2025-04-26 21:19:06,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:06,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:06,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6011 5349 5396 [WARNING|trainer.py:803] 2025-04-26 21:19:07,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:07,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:08,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5350 5397 6012 [WARNING|trainer.py:803] 2025-04-26 21:19:09,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:09,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:09,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5351 5398 6013 [WARNING|trainer.py:803] 2025-04-26 21:19:10,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:10,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5352 [WARNING|trainer.py:803] 2025-04-26 21:19:11,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5399 [WARNING|trainer.py:803] 2025-04-26 21:19:11,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6014 [WARNING|trainer.py:803] 2025-04-26 21:19:12,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5353 5400 [WARNING|trainer.py:803] 2025-04-26 21:19:13,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:13,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:13,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5354 6015 5401 [WARNING|trainer.py:803] 2025-04-26 21:19:14,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:14,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:15,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5355 6016 5402 [WARNING|trainer.py:803] 2025-04-26 21:19:16,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:16,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5356 [WARNING|trainer.py:803] 2025-04-26 21:19:16,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6017 5403 [WARNING|trainer.py:803] 2025-04-26 21:19:17,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 5357 [WARNING|trainer.py:803] 2025-04-26 21:19:18,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:18,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:18,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5404 6018 5358 [WARNING|trainer.py:803] 2025-04-26 21:19:19,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:19:19,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:20,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5405 5359 6019 [WARNING|trainer.py:803] 2025-04-26 21:19:21,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:21,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:21,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5406 5360 6020 [WARNING|trainer.py:803] 2025-04-26 21:19:22,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:22,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:23,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5407 5361 6021 [WARNING|trainer.py:803] 2025-04-26 21:19:24,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:24,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5362 5408 [WARNING|trainer.py:803] 2025-04-26 21:19:24,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:25,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:25,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6022 5363 5409 [WARNING|trainer.py:803] 2025-04-26 21:19:26,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:26,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:27,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5364 5410 6023 [WARNING|trainer.py:803] 2025-04-26 21:19:28,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:28,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:28,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5365 5411 [WARNING|trainer.py:803] 2025-04-26 21:19:29,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6024 [WARNING|trainer.py:803] 2025-04-26 21:19:30,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5366 5412 [WARNING|trainer.py:803] 2025-04-26 21:19:30,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:31,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:31,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6025 5367 5413 [WARNING|trainer.py:803] 2025-04-26 21:19:32,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:32,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5368 [WARNING|trainer.py:803] 2025-04-26 21:19:33,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6026 5414 [WARNING|trainer.py:803] 2025-04-26 21:19:33,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:34,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5369 [WARNING|trainer.py:803] 2025-04-26 21:19:34,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6027 [WARNING|trainer.py:803] 2025-04-26 21:19:35,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5415 5370 [WARNING|trainer.py:803] 2025-04-26 21:19:35,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:36,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:36,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6028 5416 5371 [WARNING|trainer.py:803] 2025-04-26 21:19:37,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:37,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:38,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5417 5372 6029 [WARNING|trainer.py:803] 2025-04-26 21:19:39,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:39,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:39,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5418 5373 [WARNING|trainer.py:803] 2025-04-26 21:19:40,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6030 [WARNING|trainer.py:803] 2025-04-26 21:19:40,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5419 5374 [WARNING|trainer.py:803] 2025-04-26 21:19:41,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:42,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:19:42,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6031 5420 5375 [WARNING|trainer.py:803] 2025-04-26 21:19:43,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:43,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:43,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6032 5421 5376 [WARNING|trainer.py:803] 2025-04-26 21:19:45,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:45,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:45,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5377 5422 6033 [WARNING|trainer.py:803] 2025-04-26 21:19:46,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:46,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:46,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5378 5423 [WARNING|trainer.py:803] 2025-04-26 21:19:47,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6034 [WARNING|trainer.py:803] 2025-04-26 21:19:48,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5379 5424 [WARNING|trainer.py:803] 2025-04-26 21:19:48,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:49,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:49,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6035 5380 5425 [WARNING|trainer.py:803] 2025-04-26 21:19:50,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:50,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:51,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5381 6036 5426 [WARNING|trainer.py:803] 2025-04-26 21:19:51,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:52,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5382 [WARNING|trainer.py:803] 2025-04-26 21:19:52,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6037 5427 [WARNING|trainer.py:803] 2025-04-26 21:19:53,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5383 [WARNING|trainer.py:803] 2025-04-26 21:19:53,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:54,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:54,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5428 6038 5384 [WARNING|trainer.py:803] 2025-04-26 21:19:55,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:19:55,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:19:56,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5429 5385 6039 [WARNING|trainer.py:803] 2025-04-26 21:19:57,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:57,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:57,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5430 5386 6040 [WARNING|trainer.py:803] 2025-04-26 21:19:58,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:19:58,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:19:59,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5431 5387 [WARNING|trainer.py:803] 2025-04-26 21:20:00,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6041 [WARNING|trainer.py:803] 2025-04-26 21:20:00,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5388 5432 [WARNING|trainer.py:803] 2025-04-26 21:20:00,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:01,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:01,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6042 5389 5433 [WARNING|trainer.py:803] 2025-04-26 21:20:02,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:02,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:03,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5390 5434 6043 [WARNING|trainer.py:803] 2025-04-26 21:20:04,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:04,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:04,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5391 5435 6044 [WARNING|trainer.py:803] 2025-04-26 21:20:05,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:06,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5392 [WARNING|trainer.py:803] 2025-04-26 21:20:06,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5436 [WARNING|trainer.py:803] 2025-04-26 21:20:07,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6045 [WARNING|trainer.py:803] 2025-04-26 21:20:07,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5393 [WARNING|trainer.py:803] 2025-04-26 21:20:08,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5437 [WARNING|trainer.py:803] 2025-04-26 21:20:08,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6046 5394 [WARNING|trainer.py:803] 2025-04-26 21:20:09,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5438 [WARNING|trainer.py:803] 2025-04-26 21:20:09,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:09,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5395 [WARNING|trainer.py:803] 2025-04-26 21:20:10,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6047 5439 [WARNING|trainer.py:803] 2025-04-26 21:20:11,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5396 [WARNING|trainer.py:803] 2025-04-26 21:20:11,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:12,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:12,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5440 6048 5397 [WARNING|trainer.py:803] 2025-04-26 21:20:13,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:13,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:14,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5441 6049 5398 [WARNING|trainer.py:803] 2025-04-26 21:20:15,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:15,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:15,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5442 5399 6050 [WARNING|trainer.py:803] 2025-04-26 21:20:16,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:16,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:17,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5443 5400 [WARNING|trainer.py:803] 2025-04-26 21:20:18,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:18,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6051 5444 5401 [WARNING|trainer.py:803] 2025-04-26 21:20:19,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:19,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:19,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6052 5445 5402 [WARNING|trainer.py:803] 2025-04-26 21:20:21,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:21,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:21,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5446 6053 5403 [WARNING|trainer.py:803] 2025-04-26 21:20:22,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:22,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:22,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5447 5404 6054 [WARNING|trainer.py:803] 2025-04-26 21:20:24,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:24,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:24,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5448 5405 6055 [WARNING|trainer.py:803] 2025-04-26 21:20:25,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:25,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:26,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5449 5406 6056 [WARNING|trainer.py:803] 2025-04-26 21:20:27,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:27,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:27,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5450 5407 6057 [WARNING|trainer.py:803] 2025-04-26 21:20:28,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:28,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:29,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5408 5451 6058 [WARNING|trainer.py:803] 2025-04-26 21:20:30,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:30,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5409 [WARNING|trainer.py:803] 2025-04-26 21:20:30,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5452 [WARNING|trainer.py:803] 2025-04-26 21:20:31,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:31,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6059 5410 5453 [WARNING|trainer.py:803] 2025-04-26 21:20:32,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:33,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:33,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6060 5411 5454 [WARNING|trainer.py:803] 2025-04-26 21:20:34,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:34,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:34,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6061 5412 5455 [WARNING|trainer.py:803] 2025-04-26 21:20:35,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:36,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:36,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6062 5413 5456 [WARNING|trainer.py:803] 2025-04-26 21:20:37,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:37,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:37,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6063 5414 5457 [WARNING|trainer.py:803] 2025-04-26 21:20:39,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:39,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:39,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5415 6064 5458 [WARNING|trainer.py:803] 2025-04-26 21:20:40,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:40,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:40,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5416 5459 6065 [WARNING|trainer.py:803] 2025-04-26 21:20:42,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:42,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5417 [WARNING|trainer.py:803] 2025-04-26 21:20:42,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5460 6066 [WARNING|trainer.py:803] 2025-04-26 21:20:43,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:43,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5461 5418 [WARNING|trainer.py:803] 2025-04-26 21:20:44,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:45,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6067 [WARNING|trainer.py:803] 2025-04-26 21:20:45,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5462 5419 [WARNING|trainer.py:803] 2025-04-26 21:20:46,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:46,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:46,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6068 5463 5420 [WARNING|trainer.py:803] 2025-04-26 21:20:47,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:48,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:48,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6069 5464 5421 [WARNING|trainer.py:803] 2025-04-26 21:20:49,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:49,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:20:49,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5422 5465 6070 [WARNING|trainer.py:803] 2025-04-26 21:20:51,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:51,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:20:51,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5423 5466 6071 [WARNING|trainer.py:803] 2025-04-26 21:20:52,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:20:52,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:53,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5467 5424 [WARNING|trainer.py:803] 2025-04-26 21:20:54,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:54,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6072 5425 5468 [WARNING|trainer.py:803] 2025-04-26 21:20:55,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:55,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:55,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6073 5469 5426 [WARNING|trainer.py:803] 2025-04-26 21:20:56,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:57,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6074 [WARNING|trainer.py:803] 2025-04-26 21:20:57,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5470 5427 [WARNING|trainer.py:803] 2025-04-26 21:20:58,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:20:58,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:20:58,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5471 6075 5428 [WARNING|trainer.py:803] 2025-04-26 21:21:00,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:21:00,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:00,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5472 5429 6076 [WARNING|trainer.py:803] 2025-04-26 21:21:01,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:02,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:02,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5473 5430 6077 [WARNING|trainer.py:803] 2025-04-26 21:21:03,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:03,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5474 [WARNING|trainer.py:803] 2025-04-26 21:21:03,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5431 [WARNING|trainer.py:803] 2025-04-26 21:21:04,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6078 [WARNING|trainer.py:803] 2025-04-26 21:21:05,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5475 [WARNING|trainer.py:803] 2025-04-26 21:21:05,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5432 [WARNING|trainer.py:803] 2025-04-26 21:21:06,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6079 [WARNING|trainer.py:803] 2025-04-26 21:21:06,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5476 [WARNING|trainer.py:803] 2025-04-26 21:21:07,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5433 [WARNING|trainer.py:803] 2025-04-26 21:21:07,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:08,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6080 5477 5434 [WARNING|trainer.py:803] 2025-04-26 21:21:09,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:09,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5478 [WARNING|trainer.py:803] 2025-04-26 21:21:09,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6081 5435 [WARNING|trainer.py:803] 2025-04-26 21:21:10,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:10,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5479 [WARNING|trainer.py:803] 2025-04-26 21:21:11,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6082 5436 [WARNING|trainer.py:803] 2025-04-26 21:21:12,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:12,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5480 [WARNING|trainer.py:803] 2025-04-26 21:21:12,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5437 [WARNING|trainer.py:803] 2025-04-26 21:21:13,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6083 5481 [WARNING|trainer.py:803] 2025-04-26 21:21:14,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:14,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5438 [WARNING|trainer.py:803] 2025-04-26 21:21:15,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5482 6084 [WARNING|trainer.py:803] 2025-04-26 21:21:15,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:16,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5439 [WARNING|trainer.py:803] 2025-04-26 21:21:16,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5483 [WARNING|trainer.py:803] 2025-04-26 21:21:17,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6085 5440 [WARNING|trainer.py:803] 2025-04-26 21:21:18,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:18,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5484 [WARNING|trainer.py:803] 2025-04-26 21:21:18,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6086 5441 [WARNING|trainer.py:803] 2025-04-26 21:21:19,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5485 [WARNING|trainer.py:803] 2025-04-26 21:21:20,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:20,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5442 [WARNING|trainer.py:803] 2025-04-26 21:21:21,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6087 [WARNING|trainer.py:803] 2025-04-26 21:21:21,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5486 [WARNING|trainer.py:803] 2025-04-26 21:21:22,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5443 [WARNING|trainer.py:803] 2025-04-26 21:21:22,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6088 5487 [WARNING|trainer.py:803] 2025-04-26 21:21:23,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:23,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5444 [WARNING|trainer.py:803] 2025-04-26 21:21:24,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6089 5488 [WARNING|trainer.py:803] 2025-04-26 21:21:24,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:21:25,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5445 [WARNING|trainer.py:803] 2025-04-26 21:21:25,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6090 5489 [WARNING|trainer.py:803] 2025-04-26 21:21:26,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:21:26,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:27,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5446 6091 5490 [WARNING|trainer.py:803] 2025-04-26 21:21:27,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:28,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:21:28,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5447 5491 [WARNING|trainer.py:803] 2025-04-26 21:21:29,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6092 [WARNING|trainer.py:803] 2025-04-26 21:21:30,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5448 [WARNING|trainer.py:803] 2025-04-26 21:21:30,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5492 [WARNING|trainer.py:803] 2025-04-26 21:21:30,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6093 [WARNING|trainer.py:803] 2025-04-26 21:21:31,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5449 [WARNING|trainer.py:803] 2025-04-26 21:21:32,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5493 [WARNING|trainer.py:803] 2025-04-26 21:21:32,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6094 [WARNING|trainer.py:803] 2025-04-26 21:21:33,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5450 [WARNING|trainer.py:803] 2025-04-26 21:21:33,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5494 [WARNING|trainer.py:803] 2025-04-26 21:21:34,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:34,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6095 5451 5495 [WARNING|trainer.py:803] 2025-04-26 21:21:35,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:35,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:36,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5452 6096 5496 [WARNING|trainer.py:803] 2025-04-26 21:21:37,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:21:37,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:21:37,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5453 5497 6097 [WARNING|trainer.py:803] 2025-04-26 21:21:38,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:39,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:39,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5454 5498 [WARNING|trainer.py:803] 2025-04-26 21:21:40,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6098 [WARNING|trainer.py:803] 2025-04-26 21:21:40,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5455 5499 [WARNING|trainer.py:803] 2025-04-26 21:21:41,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:41,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:42,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6099 5456 5500 [WARNING|trainer.py:803] 2025-04-26 21:21:42,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:43,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:43,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6100 5457 5501 [WARNING|trainer.py:803] 2025-04-26 21:21:44,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:44,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:45,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5458 5502 6101 [WARNING|trainer.py:803] 2025-04-26 21:21:46,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:46,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:46,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5459 5503 6102 [WARNING|trainer.py:803] 2025-04-26 21:21:47,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:47,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:21:48,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5460 5504 6103 [WARNING|trainer.py:803] 2025-04-26 21:21:49,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:49,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5461 5505 [WARNING|trainer.py:803] 2025-04-26 21:21:50,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:50,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:50,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6104 5506 5462 [WARNING|trainer.py:803] 2025-04-26 21:21:51,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:52,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:52,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6105 5463 5507 [WARNING|trainer.py:803] 2025-04-26 21:21:53,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:53,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:21:53,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5508 5464 6106 [WARNING|trainer.py:803] 2025-04-26 21:21:55,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:55,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:21:55,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5509 5465 6107 [WARNING|trainer.py:803] 2025-04-26 21:21:56,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:56,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5510 [WARNING|trainer.py:803] 2025-04-26 21:21:57,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5466 [WARNING|trainer.py:803] 2025-04-26 21:21:58,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:21:58,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6108 5511 5467 [WARNING|trainer.py:803] 2025-04-26 21:21:59,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:21:59,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6109 [WARNING|trainer.py:803] 2025-04-26 21:21:59,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5512 5468 [WARNING|trainer.py:803] 2025-04-26 21:22:00,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:00,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:01,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5513 6110 5469 [WARNING|trainer.py:803] 2025-04-26 21:22:02,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:02,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:02,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5514 6111 5470 [WARNING|trainer.py:803] 2025-04-26 21:22:04,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:04,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:04,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5515 6112 5471 [WARNING|trainer.py:803] 2025-04-26 21:22:05,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:05,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:05,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5516 5472 6113 [WARNING|trainer.py:803] 2025-04-26 21:22:06,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:07,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5517 [WARNING|trainer.py:803] 2025-04-26 21:22:07,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5473 [WARNING|trainer.py:803] 2025-04-26 21:22:08,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6114 [WARNING|trainer.py:803] 2025-04-26 21:22:08,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5518 5474 [WARNING|trainer.py:803] 2025-04-26 21:22:09,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:09,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:10,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5519 6115 5475 [WARNING|trainer.py:803] 2025-04-26 21:22:11,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:22:11,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:11,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5520 6116 5476 [WARNING|trainer.py:803] 2025-04-26 21:22:12,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:13,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5521 [WARNING|trainer.py:803] 2025-04-26 21:22:13,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6117 5477 [WARNING|trainer.py:803] 2025-04-26 21:22:14,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5522 [WARNING|trainer.py:803] 2025-04-26 21:22:14,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:14,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5478 [WARNING|trainer.py:803] 2025-04-26 21:22:15,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6118 5523 [WARNING|trainer.py:803] 2025-04-26 21:22:16,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:17,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5479 [WARNING|trainer.py:803] 2025-04-26 21:22:17,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5524 [WARNING|trainer.py:803] 2025-04-26 21:22:17,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6119 [WARNING|trainer.py:803] 2025-04-26 21:22:18,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5480 [WARNING|trainer.py:803] 2025-04-26 21:22:18,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5525 [WARNING|trainer.py:803] 2025-04-26 21:22:19,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:20,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6120 5481 5526 [WARNING|trainer.py:803] 2025-04-26 21:22:20,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:20,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:21,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5482 6121 5527 [WARNING|trainer.py:803] 2025-04-26 21:22:22,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:22,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:22:22,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5483 5528 [WARNING|trainer.py:803] 2025-04-26 21:22:24,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6122 [WARNING|trainer.py:803] 2025-04-26 21:22:24,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5484 [WARNING|trainer.py:803] 2025-04-26 21:22:24,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5529 [WARNING|trainer.py:803] 2025-04-26 21:22:25,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6123 [WARNING|trainer.py:803] 2025-04-26 21:22:25,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5485 5530 [WARNING|trainer.py:803] 2025-04-26 21:22:26,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:22:27,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:27,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6124 5486 5531 [WARNING|trainer.py:803] 2025-04-26 21:22:28,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:28,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:22:28,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5532 5487 6125 [WARNING|trainer.py:803] 2025-04-26 21:22:30,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:22:30,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:30,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5533 5488 6126 [WARNING|trainer.py:803] 2025-04-26 21:22:31,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:31,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:32,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5534 5489 [WARNING|trainer.py:803] 2025-04-26 21:22:33,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:33,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6127 5535 5490 [WARNING|trainer.py:803] 2025-04-26 21:22:34,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:34,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:34,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6128 5536 5491 [WARNING|trainer.py:803] 2025-04-26 21:22:35,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:22:36,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:36,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5537 6129 5492 [WARNING|trainer.py:803] 2025-04-26 21:22:37,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:37,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:37,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5493 5538 6130 [WARNING|trainer.py:803] 2025-04-26 21:22:39,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:39,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:39,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5494 6131 5539 [WARNING|trainer.py:803] 2025-04-26 21:22:40,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:40,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:41,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5495 6132 5540 [WARNING|trainer.py:803] 2025-04-26 21:22:42,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:42,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5496 [WARNING|trainer.py:803] 2025-04-26 21:22:42,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:43,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6133 5541 5497 [WARNING|trainer.py:803] 2025-04-26 21:22:44,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:44,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:45,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6134 5542 5498 [WARNING|trainer.py:803] 2025-04-26 21:22:46,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:22:46,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:46,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5499 5543 6135 [WARNING|trainer.py:803] 2025-04-26 21:22:48,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:48,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:48,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5500 5544 6136 [WARNING|trainer.py:803] 2025-04-26 21:22:49,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:49,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:50,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5501 5545 [WARNING|trainer.py:803] 2025-04-26 21:22:51,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6137 [WARNING|trainer.py:803] 2025-04-26 21:22:51,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5502 [WARNING|trainer.py:803] 2025-04-26 21:22:52,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5546 [WARNING|trainer.py:803] 2025-04-26 21:22:52,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6138 5503 [WARNING|trainer.py:803] 2025-04-26 21:22:53,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:22:53,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:54,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5547 5504 6139 [WARNING|trainer.py:803] 2025-04-26 21:22:55,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:55,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:22:55,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5505 5548 6140 [WARNING|trainer.py:803] 2025-04-26 21:22:57,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:57,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5506 [WARNING|trainer.py:803] 2025-04-26 21:22:57,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5549 [WARNING|trainer.py:803] 2025-04-26 21:22:58,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:22:58,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6141 5507 5550 [WARNING|trainer.py:803] 2025-04-26 21:22:59,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:22:59,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:00,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6142 5508 5551 [WARNING|trainer.py:803] 2025-04-26 21:23:01,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:01,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:01,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5509 5552 6143 [WARNING|trainer.py:803] 2025-04-26 21:23:02,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:03,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:03,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5510 5553 6144 [WARNING|trainer.py:803] 2025-04-26 21:23:04,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:04,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5511 [WARNING|trainer.py:803] 2025-04-26 21:23:04,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5554 [WARNING|trainer.py:803] 2025-04-26 21:23:05,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6145 [WARNING|trainer.py:803] 2025-04-26 21:23:06,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5512 [WARNING|trainer.py:803] 2025-04-26 21:23:06,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5555 [WARNING|trainer.py:803] 2025-04-26 21:23:07,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6146 5513 [WARNING|trainer.py:803] 2025-04-26 21:23:07,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:08,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5556 [WARNING|trainer.py:803] 2025-04-26 21:23:08,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5514 [WARNING|trainer.py:803] 2025-04-26 21:23:09,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6147 5557 [WARNING|trainer.py:803] 2025-04-26 21:23:09,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:10,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5515 [WARNING|trainer.py:803] 2025-04-26 21:23:10,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:11,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6148 5558 5516 [WARNING|trainer.py:803] 2025-04-26 21:23:12,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:12,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:12,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5559 6149 5517 [WARNING|trainer.py:803] 2025-04-26 21:23:13,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:14,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:14,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5560 5518 6150 [WARNING|trainer.py:803] 2025-04-26 21:23:15,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:15,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:15,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5561 5519 6151 [WARNING|trainer.py:803] 2025-04-26 21:23:17,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:17,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5562 [WARNING|trainer.py:803] 2025-04-26 21:23:17,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5520 [WARNING|trainer.py:803] 2025-04-26 21:23:18,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:18,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6152 5563 5521 [WARNING|trainer.py:803] 2025-04-26 21:23:19,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:20,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:20,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6153 5564 5522 [WARNING|trainer.py:803] 2025-04-26 21:23:21,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:23:21,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:21,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5565 5523 6154 [WARNING|trainer.py:803] 2025-04-26 21:23:23,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:23,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:23:23,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5566 5524 6155 [WARNING|trainer.py:803] 2025-04-26 21:23:24,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:24,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:25,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5525 5567 6156 [WARNING|trainer.py:803] 2025-04-26 21:23:26,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:26,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5526 [WARNING|trainer.py:803] 2025-04-26 21:23:26,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5568 [WARNING|trainer.py:803] 2025-04-26 21:23:27,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6157 [WARNING|trainer.py:803] 2025-04-26 21:23:27,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5527 5569 [WARNING|trainer.py:803] 2025-04-26 21:23:28,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:29,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:29,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5528 5570 6158 [WARNING|trainer.py:803] 2025-04-26 21:23:30,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:30,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:30,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5529 5571 6159 [WARNING|trainer.py:803] 2025-04-26 21:23:32,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:32,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:23:32,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5530 5572 [WARNING|trainer.py:803] 2025-04-26 21:23:33,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6160 [WARNING|trainer.py:803] 2025-04-26 21:23:33,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5531 5573 [WARNING|trainer.py:803] 2025-04-26 21:23:34,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:34,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:35,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6161 5532 5574 [WARNING|trainer.py:803] 2025-04-26 21:23:36,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:36,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:36,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5533 6162 5575 [WARNING|trainer.py:803] 2025-04-26 21:23:37,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:37,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:38,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5534 5576 6163 [WARNING|trainer.py:803] 2025-04-26 21:23:39,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:39,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:39,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5535 5577 [WARNING|trainer.py:803] 2025-04-26 21:23:40,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6164 [WARNING|trainer.py:803] 2025-04-26 21:23:41,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5536 [WARNING|trainer.py:803] 2025-04-26 21:23:41,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5578 [WARNING|trainer.py:803] 2025-04-26 21:23:42,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6165 [WARNING|trainer.py:803] 2025-04-26 21:23:42,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5537 [WARNING|trainer.py:803] 2025-04-26 21:23:43,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5579 [WARNING|trainer.py:803] 2025-04-26 21:23:43,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6166 [WARNING|trainer.py:803] 2025-04-26 21:23:44,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5538 [WARNING|trainer.py:803] 2025-04-26 21:23:45,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5580 [WARNING|trainer.py:803] 2025-04-26 21:23:45,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:45,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6167 5581 5539 [WARNING|trainer.py:803] 2025-04-26 21:23:46,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:23:47,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:47,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6168 5582 5540 [WARNING|trainer.py:803] 2025-04-26 21:23:48,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:48,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:49,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6169 5583 5541 [WARNING|trainer.py:803] 2025-04-26 21:23:50,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:50,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:23:50,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6170 5584 5542 [WARNING|trainer.py:803] 2025-04-26 21:23:51,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:51,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:52,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6171 5585 [WARNING|trainer.py:803] 2025-04-26 21:23:53,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:53,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5543 5586 6172 [WARNING|trainer.py:803] 2025-04-26 21:23:54,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:55,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:23:55,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5544 5587 6173 [WARNING|trainer.py:803] 2025-04-26 21:23:56,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:56,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:23:56,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5545 5588 6174 [WARNING|trainer.py:803] 2025-04-26 21:23:57,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:23:58,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:23:58,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5546 5589 6175 [WARNING|trainer.py:803] 2025-04-26 21:23:59,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:23:59,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:00,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5590 5547 6176 [WARNING|trainer.py:803] 2025-04-26 21:24:01,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:24:01,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:01,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5591 6177 5548 [WARNING|trainer.py:803] 2025-04-26 21:24:02,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:24:03,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:03,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5592 5549 [WARNING|trainer.py:803] 2025-04-26 21:24:04,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6178 [WARNING|trainer.py:803] 2025-04-26 21:24:04,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5593 [WARNING|trainer.py:803] 2025-04-26 21:24:05,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 5550 [WARNING|trainer.py:803] 2025-04-26 21:24:05,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6179 [WARNING|trainer.py:803] 2025-04-26 21:24:06,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5594 5551 [WARNING|trainer.py:803] 2025-04-26 21:24:07,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:07,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:08,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5595 6180 5552 [WARNING|trainer.py:803] 2025-04-26 21:24:08,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:09,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:09,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5596 6181 5553 [WARNING|trainer.py:803] 2025-04-26 21:24:10,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:11,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:11,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5597 5554 [WARNING|trainer.py:803] 2025-04-26 21:24:11,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6182 [WARNING|trainer.py:803] 2025-04-26 21:24:12,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5598 [WARNING|trainer.py:803] 2025-04-26 21:24:12,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5555 [WARNING|trainer.py:803] 2025-04-26 21:24:13,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6183 5599 [WARNING|trainer.py:803] 2025-04-26 21:24:14,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:14,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5556 [WARNING|trainer.py:803] 2025-04-26 21:24:14,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6184 5600 [WARNING|trainer.py:803] 2025-04-26 21:24:15,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:16,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5557 [WARNING|trainer.py:803] 2025-04-26 21:24:16,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5601 [WARNING|trainer.py:803] 2025-04-26 21:24:17,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6185 [WARNING|trainer.py:803] 2025-04-26 21:24:17,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5558 [WARNING|trainer.py:803] 2025-04-26 21:24:18,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5602 [WARNING|trainer.py:803] 2025-04-26 21:24:18,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6186 [WARNING|trainer.py:803] 2025-04-26 21:24:19,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5559 [WARNING|trainer.py:803] 2025-04-26 21:24:19,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5603 [WARNING|trainer.py:803] 2025-04-26 21:24:20,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6187 [WARNING|trainer.py:803] 2025-04-26 21:24:20,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5560 5604 [WARNING|trainer.py:803] 2025-04-26 21:24:21,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:21,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:24:22,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5561 6188 5605 [WARNING|trainer.py:803] 2025-04-26 21:24:23,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:23,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:23,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5562 6189 5606 [WARNING|trainer.py:803] 2025-04-26 21:24:24,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:24:25,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:25,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5563 5607 6190 [WARNING|trainer.py:803] 2025-04-26 21:24:26,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:26,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:27,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5564 5608 6191 [WARNING|trainer.py:803] 2025-04-26 21:24:28,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:28,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:28,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5565 5609 [WARNING|trainer.py:803] 2025-04-26 21:24:29,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:29,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6192 5566 5610 [WARNING|trainer.py:803] 2025-04-26 21:24:30,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:31,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:31,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6193 5567 5611 [WARNING|trainer.py:803] 2025-04-26 21:24:32,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:32,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:32,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5568 5612 6194 [WARNING|trainer.py:803] 2025-04-26 21:24:34,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:34,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:34,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5613 5569 6195 [WARNING|trainer.py:803] 2025-04-26 21:24:35,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:35,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:36,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5614 5570 [WARNING|trainer.py:803] 2025-04-26 21:24:37,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:37,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6196 5615 5571 [WARNING|trainer.py:803] 2025-04-26 21:24:38,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:38,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:38,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5616 5572 6197 [WARNING|trainer.py:803] 2025-04-26 21:24:40,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:40,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:40,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5617 5573 6198 [WARNING|trainer.py:803] 2025-04-26 21:24:41,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:41,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:42,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5574 5618 6199 [WARNING|trainer.py:803] 2025-04-26 21:24:43,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:24:43,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:24:43,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5575 5619 6200 [WARNING|trainer.py:803] 2025-04-26 21:24:44,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:44,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5620 5576 [WARNING|trainer.py:803] 2025-04-26 21:24:45,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:46,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:46,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6201 5621 5577 [WARNING|trainer.py:803] 2025-04-26 21:24:47,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:47,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:47,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6202 5622 5578 [WARNING|trainer.py:803] 2025-04-26 21:24:49,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:49,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:49,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5623 5579 6203 [WARNING|trainer.py:803] 2025-04-26 21:24:50,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:50,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:50,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5624 5580 6204 [WARNING|trainer.py:803] 2025-04-26 21:24:52,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:52,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:52,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5625 5581 6205 [WARNING|trainer.py:803] 2025-04-26 21:24:53,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:53,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:24:53,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5626 5582 6206 [WARNING|trainer.py:803] 2025-04-26 21:24:55,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:24:55,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:55,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5627 5583 6207 [WARNING|trainer.py:803] 2025-04-26 21:24:56,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:56,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5628 [WARNING|trainer.py:803] 2025-04-26 21:24:57,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5584 [WARNING|trainer.py:803] 2025-04-26 21:24:57,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:24:58,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6208 5629 5585 [WARNING|trainer.py:803] 2025-04-26 21:24:59,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:24:59,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:00,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5630 6209 5586 [WARNING|trainer.py:803] 2025-04-26 21:25:00,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:01,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:01,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5631 6210 5587 [WARNING|trainer.py:803] 2025-04-26 21:25:02,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:02,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:03,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5632 6211 5588 [WARNING|trainer.py:803] 2025-04-26 21:25:03,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:04,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5633 [WARNING|trainer.py:803] 2025-04-26 21:25:04,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6212 5589 [WARNING|trainer.py:803] 2025-04-26 21:25:05,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5634 [WARNING|trainer.py:803] 2025-04-26 21:25:06,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:06,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6213 5590 [WARNING|trainer.py:803] 2025-04-26 21:25:06,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5635 [WARNING|trainer.py:803] 2025-04-26 21:25:07,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:07,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5591 [WARNING|trainer.py:803] 2025-04-26 21:25:08,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6214 5636 [WARNING|trainer.py:803] 2025-04-26 21:25:09,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:25:09,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:09,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5592 6215 5637 [WARNING|trainer.py:803] 2025-04-26 21:25:10,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:10,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:11,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5593 6216 5638 [WARNING|trainer.py:803] 2025-04-26 21:25:12,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:12,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:12,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5594 6217 5639 [WARNING|trainer.py:803] 2025-04-26 21:25:13,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:14,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:14,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5595 6218 5640 [WARNING|trainer.py:803] 2025-04-26 21:25:15,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:15,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:15,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5596 5641 6219 [WARNING|trainer.py:803] 2025-04-26 21:25:16,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:17,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:17,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5597 5642 6220 [WARNING|trainer.py:803] 2025-04-26 21:25:18,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:18,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5598 [WARNING|trainer.py:803] 2025-04-26 21:25:19,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5643 6221 [WARNING|trainer.py:803] 2025-04-26 21:25:19,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:25:20,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:20,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5599 5644 6222 [WARNING|trainer.py:803] 2025-04-26 21:25:21,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:21,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5600 [WARNING|trainer.py:803] 2025-04-26 21:25:22,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5645 [WARNING|trainer.py:803] 2025-04-26 21:25:22,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:23,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6223 5601 5646 [WARNING|trainer.py:803] 2025-04-26 21:25:24,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:24,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:24,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6224 5602 5647 [WARNING|trainer.py:803] 2025-04-26 21:25:25,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:25:26,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:26,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6225 5603 5648 [WARNING|trainer.py:803] 2025-04-26 21:25:27,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:27,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:27,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6226 5604 5649 [WARNING|trainer.py:803] 2025-04-26 21:25:28,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:28,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:29,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6227 5605 5650 [WARNING|trainer.py:803] 2025-04-26 21:25:30,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:30,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:30,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5606 6228 5651 [WARNING|trainer.py:803] 2025-04-26 21:25:31,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:32,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:32,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5607 5652 6229 [WARNING|trainer.py:803] 2025-04-26 21:25:33,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:33,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:33,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5608 5653 6230 [WARNING|trainer.py:803] 2025-04-26 21:25:34,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:35,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:35,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5609 5654 6231 [WARNING|trainer.py:803] 2025-04-26 21:25:36,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:36,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:36,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5610 5655 6232 [WARNING|trainer.py:803] 2025-04-26 21:25:37,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:37,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:38,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5656 5611 6233 [WARNING|trainer.py:803] 2025-04-26 21:25:39,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:25:39,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:39,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5612 5657 6234 [WARNING|trainer.py:803] 2025-04-26 21:25:40,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:40,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:41,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5613 5658 6235 [WARNING|trainer.py:803] 2025-04-26 21:25:42,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:42,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5614 [WARNING|trainer.py:803] 2025-04-26 21:25:43,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5659 6236 [WARNING|trainer.py:803] 2025-04-26 21:25:43,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:43,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5615 5660 [WARNING|trainer.py:803] 2025-04-26 21:25:44,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:45,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:45,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6237 5616 5661 [WARNING|trainer.py:803] 2025-04-26 21:25:46,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:46,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:46,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6238 5617 5662 [WARNING|trainer.py:803] 2025-04-26 21:25:48,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:48,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:48,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5618 6239 5663 [WARNING|trainer.py:803] 2025-04-26 21:25:49,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:49,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:25:49,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5619 5664 6240 [WARNING|trainer.py:803] 2025-04-26 21:25:51,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:25:51,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:51,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5620 5665 6241 [WARNING|trainer.py:803] 2025-04-26 21:25:52,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:52,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:53,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5666 5621 6242 [WARNING|trainer.py:803] 2025-04-26 21:25:54,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:54,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:54,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5667 5622 6243 [WARNING|trainer.py:803] 2025-04-26 21:25:55,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:25:55,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:25:56,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5668 5623 6244 [WARNING|trainer.py:803] 2025-04-26 21:25:57,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:57,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5669 [WARNING|trainer.py:803] 2025-04-26 21:25:57,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5624 6245 [WARNING|trainer.py:803] 2025-04-26 21:25:58,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:25:58,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:25:59,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5670 5625 6246 [WARNING|trainer.py:803] 2025-04-26 21:26:00,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:00,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:00,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5671 5626 6247 [WARNING|trainer.py:803] 2025-04-26 21:26:01,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:01,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5672 [WARNING|trainer.py:803] 2025-04-26 21:26:02,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5627 [WARNING|trainer.py:803] 2025-04-26 21:26:03,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6248 [WARNING|trainer.py:803] 2025-04-26 21:26:03,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5673 5628 [WARNING|trainer.py:803] 2025-04-26 21:26:04,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:04,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:04,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6249 5674 5629 [WARNING|trainer.py:803] 2025-04-26 21:26:05,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:06,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:06,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6250 5675 5630 [WARNING|trainer.py:803] 2025-04-26 21:26:07,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:07,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:07,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6251 5631 5676 [WARNING|trainer.py:803] 2025-04-26 21:26:08,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:09,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:09,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6252 5677 5632 [WARNING|trainer.py:803] 2025-04-26 21:26:10,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:10,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:10,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5633 6253 5678 [WARNING|trainer.py:803] 2025-04-26 21:26:12,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:12,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:12,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6254 5679 5634 [WARNING|trainer.py:803] 2025-04-26 21:26:13,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:13,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:13,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6255 5680 5635 [WARNING|trainer.py:803] 2025-04-26 21:26:14,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:15,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:15,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5681 6256 5636 [WARNING|trainer.py:803] 2025-04-26 21:26:16,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:16,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:26:16,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5682 6257 5637 [WARNING|trainer.py:803] 2025-04-26 21:26:17,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:18,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:26:18,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5683 5638 6258 [WARNING|trainer.py:803] 2025-04-26 21:26:19,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:19,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:19,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5684 5639 6259 [WARNING|trainer.py:803] 2025-04-26 21:26:20,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:21,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:21,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5685 5640 6260 [WARNING|trainer.py:803] 2025-04-26 21:26:22,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:22,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:26:23,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5686 5641 6261 [WARNING|trainer.py:803] 2025-04-26 21:26:23,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:24,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5687 [WARNING|trainer.py:803] 2025-04-26 21:26:24,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5642 [WARNING|trainer.py:803] 2025-04-26 21:26:25,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6262 [WARNING|trainer.py:803] 2025-04-26 21:26:25,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5688 5643 [WARNING|trainer.py:803] 2025-04-26 21:26:26,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:26,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:27,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6263 5689 5644 [WARNING|trainer.py:803] 2025-04-26 21:26:28,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:28,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:28,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6264 5690 5645 [WARNING|trainer.py:803] 2025-04-26 21:26:29,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:26:29,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:30,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5691 6265 5646 [WARNING|trainer.py:803] 2025-04-26 21:26:31,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:26:31,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:31,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5692 6266 5647 [WARNING|trainer.py:803] 2025-04-26 21:26:32,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:33,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:33,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5693 6267 5648 [WARNING|trainer.py:803] 2025-04-26 21:26:34,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:34,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:34,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5694 5649 6268 [WARNING|trainer.py:803] 2025-04-26 21:26:35,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:36,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5695
[WARNING|trainer.py:803] 2025-04-26 21:26:36,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
YesYes 5650
[WARNING|trainer.py:803] 2025-04-26 21:26:37,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes 6269
[WARNING|trainer.py:803] 2025-04-26 21:26:37,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 5696
[WARNING|trainer.py:803] 2025-04-26 21:26:38,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 5651
[WARNING|trainer.py:803] 2025-04-26 21:26:38,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x89738440] moov atom not found
[21:26:38] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
6270
[WARNING|trainer.py:803] 2025-04-26 21:26:39,273 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. NoYes 5697 [WARNING|trainer.py:803] 2025-04-26 21:26:39,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5652 [WARNING|trainer.py:803] 2025-04-26 21:26:40,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6271 [WARNING|trainer.py:803] 2025-04-26 21:26:40,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5698 5653 [WARNING|trainer.py:803] 2025-04-26 21:26:41,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:41,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:42,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5699 6272 5654 [WARNING|trainer.py:803] 2025-04-26 21:26:43,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:43,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:43,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5700 5655 6273 [WARNING|trainer.py:803] 2025-04-26 21:26:44,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:45,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:45,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5656 5701 6274 [WARNING|trainer.py:803] 2025-04-26 21:26:46,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:26:46,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:47,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5657 [WARNING|trainer.py:803] 2025-04-26 21:26:48,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6275 5702 5658 [WARNING|trainer.py:803] 2025-04-26 21:26:48,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:49,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:49,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6276 5659 5703 [WARNING|trainer.py:803] 2025-04-26 21:26:50,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:51,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:51,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6277 5660 [WARNING|trainer.py:803] 2025-04-26 21:26:52,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5704 [WARNING|trainer.py:803] 2025-04-26 21:26:52,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6278 5661 [WARNING|trainer.py:803] 2025-04-26 21:26:53,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:53,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:53,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6279 5705 5662 [WARNING|trainer.py:803] 2025-04-26 21:26:55,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:26:55,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:26:55,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6280 5663 5706 [WARNING|trainer.py:803] 2025-04-26 21:26:56,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:26:56,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:26:57,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5664 6281 [WARNING|trainer.py:803] 2025-04-26 21:26:58,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5707 [WARNING|trainer.py:803] 2025-04-26 21:26:58,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5665 [WARNING|trainer.py:803] 2025-04-26 21:26:59,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6282 [WARNING|trainer.py:803] 2025-04-26 21:26:59,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:00,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5666 5708 6283 [WARNING|trainer.py:803] 2025-04-26 21:27:01,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:27:01,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5667 [WARNING|trainer.py:803] 2025-04-26 21:27:02,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5709 [WARNING|trainer.py:803] 2025-04-26 21:27:02,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6284 5668 [WARNING|trainer.py:803] 2025-04-26 21:27:03,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:03,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:04,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6285 5710 5669 [WARNING|trainer.py:803] 2025-04-26 21:27:05,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:05,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:05,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5670 5711 6286 [WARNING|trainer.py:803] 2025-04-26 21:27:07,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:07,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:07,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5671 [WARNING|trainer.py:803] 2025-04-26 21:27:08,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5712 6287 5672 [WARNING|trainer.py:803] 2025-04-26 21:27:09,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:09,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:10,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6288 5673 5713 [WARNING|trainer.py:803] 2025-04-26 21:27:11,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:11,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:11,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6289 5674 [WARNING|trainer.py:803] 2025-04-26 21:27:12,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5714 [WARNING|trainer.py:803] 2025-04-26 21:27:13,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:13,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5675 6290 [WARNING|trainer.py:803] 2025-04-26 21:27:14,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:15,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5715 5676 6291 [WARNING|trainer.py:803] 2025-04-26 21:27:16,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:16,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:16,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5677 5716 6292 [WARNING|trainer.py:803] 2025-04-26 21:27:17,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:18,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5678 [WARNING|trainer.py:803] 2025-04-26 21:27:18,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6293 5717 [WARNING|trainer.py:803] 2025-04-26 21:27:19,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5679 [WARNING|trainer.py:803] 2025-04-26 21:27:20,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:20,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:20,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6294 5718 5680 [WARNING|trainer.py:803] 2025-04-26 21:27:21,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:22,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:22,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6295 5681 5719 [WARNING|trainer.py:803] 2025-04-26 21:27:23,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:23,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:24,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6296 5682 [WARNING|trainer.py:803] 2025-04-26 21:27:25,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:27:25,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5720 5683 6297 [WARNING|trainer.py:803] 2025-04-26 21:27:26,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:26,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:26,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5684 5721 6298 [WARNING|trainer.py:803] 2025-04-26 21:27:28,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:28,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:28,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5685 5722 6299 [WARNING|trainer.py:803] 2025-04-26 21:27:29,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:30,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5686 [WARNING|trainer.py:803] 2025-04-26 21:27:30,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6300 [WARNING|trainer.py:803] 2025-04-26 21:27:31,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5723 5687 [WARNING|trainer.py:803] 2025-04-26 21:27:32,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6301 [WARNING|trainer.py:803] 2025-04-26 21:27:32,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:32,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:33,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5688 6302 5724 [WARNING|trainer.py:803] 2025-04-26 21:27:34,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:34,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:34,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6303 5689 [WARNING|trainer.py:803] 2025-04-26 21:27:35,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:35,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6304 5725 5690 [WARNING|trainer.py:803] 2025-04-26 21:27:36,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:36,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6305 [WARNING|trainer.py:803] 2025-04-26 21:27:37,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:37,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5691 5726 6306 [WARNING|trainer.py:803] 2025-04-26 21:27:38,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:27:38,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:38,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6307 5692 [WARNING|trainer.py:803] 2025-04-26 21:27:40,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5727 [WARNING|trainer.py:803] 2025-04-26 21:27:40,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6308 5693 [WARNING|trainer.py:803] 2025-04-26 21:27:40,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:41,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6309 [WARNING|trainer.py:803] 2025-04-26 21:27:41,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5728 [WARNING|trainer.py:803] 2025-04-26 21:27:42,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5694 6310 [WARNING|trainer.py:803] 2025-04-26 21:27:43,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:43,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:43,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6311 5695 5729 [WARNING|trainer.py:803] 2025-04-26 21:27:44,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes
[WARNING|trainer.py:803] 2025-04-26 21:27:44,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoYes 6312
[WARNING|trainer.py:803] 2025-04-26 21:27:45,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 5696
[WARNING|trainer.py:803] 2025-04-26 21:27:45,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo 6313
[WARNING|trainer.py:803] 2025-04-26 21:27:46,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
NoNo
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x3b860a80] moov atom not found
[21:27:46] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, Invalid data found when processing input
Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4... sharegpt4v_instruct_gpt4-vision_cap100k
Traceback (most recent call last):
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 903, in __getitem__
    ret = self.video_get_item(data_item)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 809, in video_get_item
    sampled_video = self.load_video_fast(video_path)
  File "/home/wangjiarui/AIGV_2025/train/train_qa.py", line 688, in load_video_fast
    video_reader = VideoReader(video_path)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/decord/video_reader.py", line 57, in __init__
    raise RuntimeError("Error reading " + uri + "...")
RuntimeError: Error reading /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4...
Failed to load video: /home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4, the dataset is: sharegpt4v_instruct_gpt4-vision_cap100k
5730
[WARNING|trainer.py:803] 2025-04-26 21:27:46,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
YesYes 5697 6314
[WARNING|trainer.py:803] 2025-04-26 21:27:47,236 >> Trainer.tokenizer is now deprecated.
You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:47,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:47,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6315 5698 5731 [WARNING|trainer.py:803] 2025-04-26 21:27:48,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6316 [WARNING|trainer.py:803] 2025-04-26 21:27:49,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:49,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5699 [WARNING|trainer.py:803] 2025-04-26 21:27:50,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6317 5732 [WARNING|trainer.py:803] 2025-04-26 21:27:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:51,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:51,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6318 5700 [WARNING|trainer.py:803] 2025-04-26 21:27:52,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:52,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6319 5733 [WARNING|trainer.py:803] 2025-04-26 21:27:53,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:53,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
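Note on the `moov atom not found` error logged above for `/home/wangjiarui/AIGV6K/Videos300/Pixverse/10297.mp4`: the message generally means the MP4 container has no `moov` box, which usually points to a truncated or incompletely written file. Because the same path fails at both 21:26:38 and 21:27:46, it may be worth scanning the dataset once before training. The following is a minimal, stdlib-only sketch of such a scan, not the check that `train_qa.py` performs; the helper names `has_moov_atom` and `find_unreadable_videos` are hypothetical, and the code only walks top-level boxes.

```python
import struct
from pathlib import Path


def has_moov_atom(path):
    """Return True if the file contains a top-level 'moov' box (ISO BMFF / MP4)."""
    size_on_disk = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + 8 <= size_on_disk:
            f.seek(offset)
            header = f.read(8)
            if len(header) < 8:
                return False
            # Each box starts with a 4-byte big-endian size and a 4-byte type.
            box_size, box_type = struct.unpack(">I4s", header)
            header_len = 8
            if box_size == 1:
                # A size of 1 means a 64-bit size follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    return False
                box_size = struct.unpack(">Q", ext)[0]
                header_len = 16
            elif box_size == 0:
                # A size of 0 means the box runs to the end of the file.
                box_size = size_on_disk - offset
            if box_type == b"moov":
                return True
            if box_size < header_len:
                # Corrupt header: stop instead of looping forever.
                return False
            offset += box_size
    return False


def find_unreadable_videos(paths):
    """Hypothetical helper: return paths that are missing or lack a 'moov' box."""
    bad = []
    for p in paths:
        try:
            if not has_moov_atom(p):
                bad.append(str(p))
        except OSError:
            # Missing or unreadable files are reported as well.
            bad.append(str(p))
    return bad
```

Running `find_unreadable_videos` over the video paths in the annotation file would list candidates to re-download or drop. Passing this check only shows that the container index is present; a file can still fail to decode in `decord.VideoReader`.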
NoYes 5701 6320 [WARNING|trainer.py:803] 2025-04-26 21:27:54,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:27:54,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5734 6321 [WARNING|trainer.py:803] 2025-04-26 21:27:55,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:27:55,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5702 6322 [WARNING|trainer.py:803] 2025-04-26 21:27:56,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:56,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5735 6323 [WARNING|trainer.py:803] 2025-04-26 21:27:57,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:27:57,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5703 6324 5736 [WARNING|trainer.py:803] 2025-04-26 21:27:58,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:27:58,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6325 [WARNING|trainer.py:803] 2025-04-26 21:27:59,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:00,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5704 6326 [WARNING|trainer.py:803] 2025-04-26 21:28:00,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5737 [WARNING|trainer.py:803] 2025-04-26 21:28:01,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6327 [WARNING|trainer.py:803] 2025-04-26 21:28:01,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:02,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5705 6328 5738 [WARNING|trainer.py:803] 2025-04-26 21:28:03,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:03,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6329 [WARNING|trainer.py:803] 2025-04-26 21:28:03,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:04,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5706 6330 5739 [WARNING|trainer.py:803] 2025-04-26 21:28:05,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:05,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6331 [WARNING|trainer.py:803] 2025-04-26 21:28:05,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5707 [WARNING|trainer.py:803] 2025-04-26 21:28:06,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6332 5740 [WARNING|trainer.py:803] 2025-04-26 21:28:07,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:07,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:08,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6333 5708 [WARNING|trainer.py:803] 2025-04-26 21:28:08,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5741 6334 [WARNING|trainer.py:803] 2025-04-26 21:28:09,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:09,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:10,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6335 5709 [WARNING|trainer.py:803] 2025-04-26 21:28:11,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5742 6336 [WARNING|trainer.py:803] 2025-04-26 21:28:11,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:11,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:12,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6337 5710 5743 [WARNING|trainer.py:803] 2025-04-26 21:28:13,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6338 [WARNING|trainer.py:803] 2025-04-26 21:28:13,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:14,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:14,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6339 5711 5744 [WARNING|trainer.py:803] 2025-04-26 21:28:15,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:15,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6340 [WARNING|trainer.py:803] 2025-04-26 21:28:16,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:16,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6341 5712 5745 [WARNING|trainer.py:803] 2025-04-26 21:28:17,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:17,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6342 [WARNING|trainer.py:803] 2025-04-26 21:28:18,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:18,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5713 6343 5746 [WARNING|trainer.py:803] 2025-04-26 21:28:20,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:20,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6344 [WARNING|trainer.py:803] 2025-04-26 21:28:20,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:21,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5714 6345 5747 [WARNING|trainer.py:803] 2025-04-26 21:28:22,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:22,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6346 [WARNING|trainer.py:803] 2025-04-26 21:28:22,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:23,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5715 6347 5748 [WARNING|trainer.py:803] 2025-04-26 21:28:24,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:24,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:24,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6348 5716 [WARNING|trainer.py:803] 2025-04-26 21:28:25,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5749 6349 [WARNING|trainer.py:803] 2025-04-26 21:28:26,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:26,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:26,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6350 5717 5750 [WARNING|trainer.py:803] 2025-04-26 21:28:27,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6351 [WARNING|trainer.py:803] 2025-04-26 21:28:28,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:28,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:29,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6352 5718 5751 [WARNING|trainer.py:803] 2025-04-26 21:28:30,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6353 [WARNING|trainer.py:803] 2025-04-26 21:28:30,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:30,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:31,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6354 5719 5752 [WARNING|trainer.py:803] 2025-04-26 21:28:32,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6355 [WARNING|trainer.py:803] 2025-04-26 21:28:32,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:32,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:33,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6356 5720 5753 [WARNING|trainer.py:803] 2025-04-26 21:28:34,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:34,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6357 [WARNING|trainer.py:803] 2025-04-26 21:28:34,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:35,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6358 5721 5754 [WARNING|trainer.py:803] 2025-04-26 21:28:36,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:36,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:36,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6359 [WARNING|trainer.py:803] 2025-04-26 21:28:37,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5722 6360 5755 [WARNING|trainer.py:803] 2025-04-26 21:28:38,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:38,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:38,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6361 [WARNING|trainer.py:803] 2025-04-26 21:28:40,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5756 5723 6362 [WARNING|trainer.py:803] 2025-04-26 21:28:41,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:41,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:41,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6363 5757 5724 [WARNING|trainer.py:803] 2025-04-26 21:28:42,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6364 [WARNING|trainer.py:803] 2025-04-26 21:28:42,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:28:43,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:43,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6365 5758 5725 [WARNING|trainer.py:803] 2025-04-26 21:28:44,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:44,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6366 [WARNING|trainer.py:803] 2025-04-26 21:28:45,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:45,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6367 5759 5726 [WARNING|trainer.py:803] 2025-04-26 21:28:46,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:47,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6368 [WARNING|trainer.py:803] 2025-04-26 21:28:47,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:48,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5760 6369 5727 [WARNING|trainer.py:803] 2025-04-26 21:28:49,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:49,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:49,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6370 5761 [WARNING|trainer.py:803] 2025-04-26 21:28:50,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6371 5728 [WARNING|trainer.py:803] 2025-04-26 21:28:51,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:28:51,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:51,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6372 5762 [WARNING|trainer.py:803] 2025-04-26 21:28:52,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5729 6373 [WARNING|trainer.py:803] 2025-04-26 21:28:53,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:53,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:53,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6374 5763 [WARNING|trainer.py:803] 2025-04-26 21:28:54,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5730 [WARNING|trainer.py:803] 2025-04-26 21:28:55,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6375 [WARNING|trainer.py:803] 2025-04-26 21:28:55,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:55,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6376 5764 5731 [WARNING|trainer.py:803] 2025-04-26 21:28:57,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:28:57,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6377 [WARNING|trainer.py:803] 2025-04-26 21:28:57,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:28:58,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6378 5765 5732 [WARNING|trainer.py:803] 2025-04-26 21:28:59,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:28:59,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6379 [WARNING|trainer.py:803] 2025-04-26 21:28:59,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:00,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5766 6380 5733 [WARNING|trainer.py:803] 2025-04-26 21:29:01,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:01,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6381 [WARNING|trainer.py:803] 2025-04-26 21:29:01,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5767 [WARNING|trainer.py:803] 2025-04-26 21:29:02,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6382 5734 [WARNING|trainer.py:803] 2025-04-26 21:29:03,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:03,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:29:03,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6383 5768 [WARNING|trainer.py:803] 2025-04-26 21:29:04,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6384 5735 [WARNING|trainer.py:803] 2025-04-26 21:29:05,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:05,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:06,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6385 5769 [WARNING|trainer.py:803] 2025-04-26 21:29:07,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:07,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6386 5736 [WARNING|trainer.py:803] 2025-04-26 21:29:08,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:08,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6387 5770 [WARNING|trainer.py:803] 2025-04-26 21:29:09,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:09,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5737 6388 [WARNING|trainer.py:803] 2025-04-26 21:29:10,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:29:10,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5771 6389 [WARNING|trainer.py:803] 2025-04-26 21:29:11,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:29:11,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5738 6390 [WARNING|trainer.py:803] 2025-04-26 21:29:12,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:12,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5772 6391 [WARNING|trainer.py:803] 2025-04-26 21:29:13,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5739 [WARNING|trainer.py:803] 2025-04-26 21:29:13,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6392 [WARNING|trainer.py:803] 2025-04-26 21:29:14,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5773 [WARNING|trainer.py:803] 2025-04-26 21:29:14,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6393 [WARNING|trainer.py:803] 2025-04-26 21:29:15,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5740 [WARNING|trainer.py:803] 2025-04-26 21:29:16,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6394 [WARNING|trainer.py:803] 2025-04-26 21:29:16,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5774 [WARNING|trainer.py:803] 2025-04-26 21:29:17,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6395 [WARNING|trainer.py:803] 2025-04-26 21:29:17,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5741 [WARNING|trainer.py:803] 2025-04-26 21:29:18,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6396 [WARNING|trainer.py:803] 2025-04-26 21:29:18,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5775 [WARNING|trainer.py:803] 2025-04-26 21:29:19,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6397 [WARNING|trainer.py:803] 2025-04-26 21:29:19,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5742 [WARNING|trainer.py:803] 2025-04-26 21:29:20,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:20,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6398 5776 [WARNING|trainer.py:803] 2025-04-26 21:29:21,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:21,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6399 5743 [WARNING|trainer.py:803] 2025-04-26 21:29:22,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:22,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6400 5777 [WARNING|trainer.py:803] 2025-04-26 21:29:23,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:23,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6401 5744 [WARNING|trainer.py:803] 2025-04-26 21:29:24,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:29:24,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5778 6402 [WARNING|trainer.py:803] 2025-04-26 21:29:25,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:26,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5745 6403 [WARNING|trainer.py:803] 2025-04-26 21:29:26,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5779 [WARNING|trainer.py:803] 2025-04-26 21:29:27,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6404 [WARNING|trainer.py:803] 2025-04-26 21:29:27,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:28,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5746 6405 [WARNING|trainer.py:803] 2025-04-26 21:29:29,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:29,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5780 6406 [WARNING|trainer.py:803] 2025-04-26 21:29:30,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:30,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5747 6407 [WARNING|trainer.py:803] 2025-04-26 21:29:31,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5781 [WARNING|trainer.py:803] 2025-04-26 21:29:31,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6408 [WARNING|trainer.py:803] 2025-04-26 21:29:32,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5748 [WARNING|trainer.py:803] 2025-04-26 21:29:32,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6409 [WARNING|trainer.py:803] 2025-04-26 21:29:33,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5782 [WARNING|trainer.py:803] 2025-04-26 21:29:33,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6410 [WARNING|trainer.py:803] 2025-04-26 21:29:34,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5749 [WARNING|trainer.py:803] 2025-04-26 21:29:34,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6411 [WARNING|trainer.py:803] 2025-04-26 21:29:35,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5783 [WARNING|trainer.py:803] 2025-04-26 21:29:36,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6412 [WARNING|trainer.py:803] 2025-04-26 21:29:36,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5750 [WARNING|trainer.py:803] 2025-04-26 21:29:37,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6413 [WARNING|trainer.py:803] 2025-04-26 21:29:37,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5784 [WARNING|trainer.py:803] 2025-04-26 21:29:38,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6414 5751 [WARNING|trainer.py:803] 2025-04-26 21:29:38,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:29:39,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6415 [WARNING|trainer.py:803] 2025-04-26 21:29:39,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5785 [WARNING|trainer.py:803] 2025-04-26 21:29:40,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6416 5752 [WARNING|trainer.py:803] 2025-04-26 21:29:40,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:41,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:41,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6417 5786 [WARNING|trainer.py:803] 2025-04-26 21:29:42,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 5753 6418 [WARNING|trainer.py:803] 2025-04-26 21:29:43,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:43,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:43,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6419 5787 5754 [WARNING|trainer.py:803] 2025-04-26 21:29:45,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6420 [WARNING|trainer.py:803] 2025-04-26 21:29:45,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:45,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:46,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6421 5788 5755 [WARNING|trainer.py:803] 2025-04-26 21:29:47,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6422 [WARNING|trainer.py:803] 2025-04-26 21:29:47,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:29:47,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:48,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6423 5789 5756 [WARNING|trainer.py:803] 2025-04-26 21:29:49,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:29:49,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6424 [WARNING|trainer.py:803] 2025-04-26 21:29:50,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:29:50,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5790 6425 5757 [WARNING|trainer.py:803] 2025-04-26 21:29:51,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:51,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6426 [WARNING|trainer.py:803] 2025-04-26 21:29:52,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:52,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5791 6427 5758 [WARNING|trainer.py:803] 2025-04-26 21:29:53,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:29:53,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:54,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6428 5792 [WARNING|trainer.py:803] 2025-04-26 21:29:55,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5759 6429 [WARNING|trainer.py:803] 2025-04-26 21:29:55,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:29:56,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:29:56,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6430 5793 5760 [WARNING|trainer.py:803] 2025-04-26 21:29:57,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6431 [WARNING|trainer.py:803] 2025-04-26 21:29:57,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:29:58,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:29:58,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6432 5794 5761 [WARNING|trainer.py:803] 2025-04-26 21:29:59,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:29:59,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6433 [WARNING|trainer.py:803] 2025-04-26 21:30:00,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:00,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6434 5795 5762 [WARNING|trainer.py:803] 2025-04-26 21:30:01,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:02,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:02,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6435 [WARNING|trainer.py:803] 2025-04-26 21:30:02,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5796 6436 5763 [WARNING|trainer.py:803] 2025-04-26 21:30:04,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:04,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:04,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6437 [WARNING|trainer.py:803] 2025-04-26 21:30:05,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5797 5764 6438 [WARNING|trainer.py:803] 2025-04-26 21:30:06,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:06,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:06,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6439 [WARNING|trainer.py:803] 2025-04-26 21:30:07,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5798 5765 6440 [WARNING|trainer.py:803] 2025-04-26 21:30:08,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:08,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:08,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6441 5799 5766 [WARNING|trainer.py:803] 2025-04-26 21:30:09,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6442 [WARNING|trainer.py:803] 2025-04-26 21:30:10,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:10,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:10,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6443 5800 5767 [WARNING|trainer.py:803] 2025-04-26 21:30:11,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6444 [WARNING|trainer.py:803] 2025-04-26 21:30:12,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:12,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:13,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6445 5801 5768 [WARNING|trainer.py:803] 2025-04-26 21:30:14,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6446 [WARNING|trainer.py:803] 2025-04-26 21:30:14,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:14,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:15,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6447 5769 5802 [WARNING|trainer.py:803] 2025-04-26 21:30:16,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:16,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6448 [WARNING|trainer.py:803] 2025-04-26 21:30:16,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:17,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6449 5803 5770 [WARNING|trainer.py:803] 2025-04-26 21:30:18,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:18,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:18,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6450 [WARNING|trainer.py:803] 2025-04-26 21:30:19,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6451 5771 5804 [WARNING|trainer.py:803] 2025-04-26 21:30:20,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:20,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:30:21,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6452 [WARNING|trainer.py:803] 2025-04-26 21:30:21,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5772 5805 6453 [WARNING|trainer.py:803] 2025-04-26 21:30:23,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:23,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:23,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6454 5806 [WARNING|trainer.py:803] 2025-04-26 21:30:24,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5773 6455 [WARNING|trainer.py:803] 2025-04-26 21:30:25,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:25,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:25,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6456 5774 5807 [WARNING|trainer.py:803] 2025-04-26 21:30:26,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6457 [WARNING|trainer.py:803] 2025-04-26 21:30:27,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:27,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:27,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6458 5808 5775 [WARNING|trainer.py:803] 2025-04-26 21:30:28,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6459 [WARNING|trainer.py:803] 2025-04-26 21:30:29,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:30:29,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:29,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6460 5809 5776 [WARNING|trainer.py:803] 2025-04-26 21:30:30,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:30:31,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6461 [WARNING|trainer.py:803] 2025-04-26 21:30:31,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:32,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6462 5810 5777 [WARNING|trainer.py:803] 2025-04-26 21:30:33,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:33,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:33,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6463 [WARNING|trainer.py:803] 2025-04-26 21:30:34,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5811 5778 6464 [WARNING|trainer.py:803] 2025-04-26 21:30:35,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:30:35,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:35,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6465 5812 [WARNING|trainer.py:803] 2025-04-26 21:30:36,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5779 6466 [WARNING|trainer.py:803] 2025-04-26 21:30:37,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:37,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:37,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6467 5813 [WARNING|trainer.py:803] 2025-04-26 21:30:38,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5780 6468 [WARNING|trainer.py:803] 2025-04-26 21:30:39,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:39,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:30:40,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6469 5814 [WARNING|trainer.py:803] 2025-04-26 21:30:41,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5781 [WARNING|trainer.py:803] 2025-04-26 21:30:41,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6470 [WARNING|trainer.py:803] 2025-04-26 21:30:42,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:30:42,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6471 5815 5782 [WARNING|trainer.py:803] 2025-04-26 21:30:43,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:43,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6472 [WARNING|trainer.py:803] 2025-04-26 21:30:44,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:44,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6473 5816 [WARNING|trainer.py:803] 2025-04-26 21:30:45,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:45,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5783 6474 [WARNING|trainer.py:803] 2025-04-26 21:30:46,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:46,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5817 6475 [WARNING|trainer.py:803] 2025-04-26 21:30:47,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:47,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5784 6476 [WARNING|trainer.py:803] 2025-04-26 21:30:48,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:48,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5818 6477 5785 [WARNING|trainer.py:803] 2025-04-26 21:30:50,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:50,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6478 [WARNING|trainer.py:803] 2025-04-26 21:30:50,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:51,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5819 6479 5786 [WARNING|trainer.py:803] 2025-04-26 21:30:52,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:30:52,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6480 [WARNING|trainer.py:803] 2025-04-26 21:30:52,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:53,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5820 6481 [WARNING|trainer.py:803] 2025-04-26 21:30:54,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5787 [WARNING|trainer.py:803] 2025-04-26 21:30:54,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6482 [WARNING|trainer.py:803] 2025-04-26 21:30:55,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:55,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5821 6483 5788 [WARNING|trainer.py:803] 2025-04-26 21:30:56,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:56,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6484 [WARNING|trainer.py:803] 2025-04-26 21:30:57,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:30:57,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6485 5822 5789 [WARNING|trainer.py:803] 2025-04-26 21:30:58,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:30:59,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:30:59,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6486 [WARNING|trainer.py:803] 2025-04-26 21:31:00,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5790 5823 6487 [WARNING|trainer.py:803] 2025-04-26 21:31:01,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:01,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:01,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6488 5824 [WARNING|trainer.py:803] 2025-04-26 21:31:02,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5791 6489 [WARNING|trainer.py:803] 2025-04-26 21:31:03,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:03,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:03,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6490 5792 5825 [WARNING|trainer.py:803] 2025-04-26 21:31:04,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6491 [WARNING|trainer.py:803] 2025-04-26 21:31:05,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:05,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:05,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6492 5793 5826 [WARNING|trainer.py:803] 2025-04-26 21:31:06,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6493 [WARNING|trainer.py:803] 2025-04-26 21:31:07,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:07,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:07,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6494 5794 [WARNING|trainer.py:803] 2025-04-26 21:31:09,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5827 6495 [WARNING|trainer.py:803] 2025-04-26 21:31:09,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:09,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:10,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6496 5795 5828 [WARNING|trainer.py:803] 2025-04-26 21:31:11,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6497 [WARNING|trainer.py:803] 2025-04-26 21:31:11,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:11,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:12,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6498 5796 5829 [WARNING|trainer.py:803] 2025-04-26 21:31:13,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:31:13,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6499 [WARNING|trainer.py:803] 2025-04-26 21:31:13,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:14,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5797 6500 5830 [WARNING|trainer.py:803] 2025-04-26 21:31:15,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:15,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:16,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6501 [WARNING|trainer.py:803] 2025-04-26 21:31:16,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5798 6502 5831 [WARNING|trainer.py:803] 2025-04-26 21:31:17,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:31:18,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:18,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6503 5799 [WARNING|trainer.py:803] 2025-04-26 21:31:19,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6504 5832 [WARNING|trainer.py:803] 2025-04-26 21:31:19,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:20,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:20,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6505 5800 [WARNING|trainer.py:803] 2025-04-26 21:31:21,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5833 6506 [WARNING|trainer.py:803] 2025-04-26 21:31:21,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:22,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:22,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6507 5801 [WARNING|trainer.py:803] 2025-04-26 21:31:23,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5834 6508 [WARNING|trainer.py:803] 2025-04-26 21:31:24,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:24,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:31:24,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6509 5802 5835 [WARNING|trainer.py:803] 2025-04-26 21:31:25,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:26,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6510 [WARNING|trainer.py:803] 2025-04-26 21:31:26,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:27,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6511 5803 5836 [WARNING|trainer.py:803] 2025-04-26 21:31:28,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:28,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6512 [WARNING|trainer.py:803] 2025-04-26 21:31:28,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:29,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6513 5804 5837 [WARNING|trainer.py:803] 2025-04-26 21:31:30,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:30,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6514 [WARNING|trainer.py:803] 2025-04-26 21:31:30,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:31,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5805 6515 5838 [WARNING|trainer.py:803] 2025-04-26 21:31:32,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:32,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6516 [WARNING|trainer.py:803] 2025-04-26 21:31:32,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5806 [WARNING|trainer.py:803] 2025-04-26 21:31:33,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6517 5839 [WARNING|trainer.py:803] 2025-04-26 21:31:34,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:31:34,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:35,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6518 5807 [WARNING|trainer.py:803] 2025-04-26 21:31:36,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5840 6519 [WARNING|trainer.py:803] 2025-04-26 21:31:36,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:31:37,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:37,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6520 5808 5841 [WARNING|trainer.py:803] 2025-04-26 21:31:38,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6521 [WARNING|trainer.py:803] 2025-04-26 21:31:38,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 21:31:39,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:39,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6522 5809 5842 [WARNING|trainer.py:803] 2025-04-26 21:31:40,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:40,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6523 [WARNING|trainer.py:803] 2025-04-26 21:31:41,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:41,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6524 5810 5843 [WARNING|trainer.py:803] 2025-04-26 21:31:42,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:42,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6525 [WARNING|trainer.py:803] 2025-04-26 21:31:43,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:43,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5811 6526 5844 [WARNING|trainer.py:803] 2025-04-26 21:31:44,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:45,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6527 [WARNING|trainer.py:803] 2025-04-26 21:31:45,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5812 [WARNING|trainer.py:803] 2025-04-26 21:31:46,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6528 5845 [WARNING|trainer.py:803] 2025-04-26 21:31:46,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:47,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6529 [WARNING|trainer.py:803] 2025-04-26 21:31:47,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5813 [WARNING|trainer.py:803] 2025-04-26 21:31:48,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6530 5846 [WARNING|trainer.py:803] 2025-04-26 21:31:48,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:49,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:49,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6531 5814 [WARNING|trainer.py:803] 2025-04-26 21:31:50,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5847 6532 [WARNING|trainer.py:803] 2025-04-26 21:31:51,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:51,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:51,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6533 5815 [WARNING|trainer.py:803] 2025-04-26 21:31:52,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5848 [WARNING|trainer.py:803] 2025-04-26 21:31:53,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6534 [WARNING|trainer.py:803] 2025-04-26 21:31:53,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:31:53,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6535 5816 [WARNING|trainer.py:803] 2025-04-26 21:31:55,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5849 [WARNING|trainer.py:803] 2025-04-26 21:31:55,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6536 [WARNING|trainer.py:803] 2025-04-26 21:31:55,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:31:56,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5817 6537 5850 [WARNING|trainer.py:803] 2025-04-26 21:31:57,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:57,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6538 [WARNING|trainer.py:803] 2025-04-26 21:31:58,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:31:58,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6539 5818 [WARNING|trainer.py:803] 2025-04-26 21:31:59,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:31:59,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 5851 6540 [WARNING|trainer.py:803] 2025-04-26 21:32:00,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:00,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5819 6541 [WARNING|trainer.py:803] 2025-04-26 21:32:01,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:01,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5852 6542 [WARNING|trainer.py:803] 2025-04-26 21:32:02,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:02,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5820 6543 [WARNING|trainer.py:803] 2025-04-26 21:32:03,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:03,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5853 6544 [WARNING|trainer.py:803] 2025-04-26 21:32:04,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:04,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6545 5821 5854 [WARNING|trainer.py:803] 2025-04-26 21:32:06,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:32:06,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6546 [WARNING|trainer.py:803] 2025-04-26 21:32:06,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:07,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6547 5822 5855 [WARNING|trainer.py:803] 2025-04-26 21:32:08,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:32:08,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6548 [WARNING|trainer.py:803] 2025-04-26 21:32:08,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:09,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5823 6549 5856 [WARNING|trainer.py:803] 2025-04-26 21:32:10,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:32:10,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:10,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6550 [WARNING|trainer.py:803] 2025-04-26 21:32:11,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5824 6551 5857 [WARNING|trainer.py:803] 2025-04-26 21:32:12,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:12,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:12,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6552 5825 [WARNING|trainer.py:803] 2025-04-26 21:32:13,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6553 5858 [WARNING|trainer.py:803] 2025-04-26 21:32:14,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:14,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6554 [WARNING|trainer.py:803] 2025-04-26 21:32:15,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5826 [WARNING|trainer.py:803] 2025-04-26 21:32:15,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6555 5859 [WARNING|trainer.py:803] 2025-04-26 21:32:16,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:17,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:17,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6556 [WARNING|trainer.py:803] 2025-04-26 21:32:18,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5827 6557 5860 [WARNING|trainer.py:803] 2025-04-26 21:32:19,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:19,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:19,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6558 5828 [WARNING|trainer.py:803] 2025-04-26 21:32:20,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 5861 6559 [WARNING|trainer.py:803] 2025-04-26 21:32:21,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:32:21,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:21,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6560 5829 [WARNING|trainer.py:803] 2025-04-26 21:32:22,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5862 6561 [WARNING|trainer.py:803] 2025-04-26 21:32:23,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:23,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:23,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6562 5830 [WARNING|trainer.py:803] 2025-04-26 21:32:24,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5863 6563 [WARNING|trainer.py:803] 2025-04-26 21:32:25,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:25,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:25,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6564 5831 [WARNING|trainer.py:803] 2025-04-26 21:32:27,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5864 6565 [WARNING|trainer.py:803] 2025-04-26 21:32:27,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:28,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:28,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6566 5832 [WARNING|trainer.py:803] 2025-04-26 21:32:29,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5865 6567 [WARNING|trainer.py:803] 2025-04-26 21:32:29,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:32:30,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:30,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6568 5833 5866 [WARNING|trainer.py:803] 2025-04-26 21:32:31,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6569 [WARNING|trainer.py:803] 2025-04-26 21:32:31,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:32,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:32,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6570 5834 5867 [WARNING|trainer.py:803] 2025-04-26 21:32:33,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6571 [WARNING|trainer.py:803] 2025-04-26 21:32:34,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:34,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:34,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6572 5835 5868 [WARNING|trainer.py:803] 2025-04-26 21:32:35,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6573 [WARNING|trainer.py:803] 2025-04-26 21:32:36,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:36,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:36,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6574 5836 5869 [WARNING|trainer.py:803] 2025-04-26 21:32:37,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6575 [WARNING|trainer.py:803] 2025-04-26 21:32:38,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:38,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:39,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6576 5870 5837 [WARNING|trainer.py:803] 2025-04-26 21:32:40,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6577 [WARNING|trainer.py:803] 2025-04-26 21:32:40,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:40,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:32:41,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6578 5838 5871 [WARNING|trainer.py:803] 2025-04-26 21:32:42,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6579 [WARNING|trainer.py:803] 2025-04-26 21:32:42,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:42,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:43,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6580 5872 5839 [WARNING|trainer.py:803] 2025-04-26 21:32:44,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:44,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:44,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6581 [WARNING|trainer.py:803] 2025-04-26 21:32:45,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5873 5840 6582 [WARNING|trainer.py:803] 2025-04-26 21:32:46,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:46,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:46,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6583 5874 [WARNING|trainer.py:803] 2025-04-26 21:32:47,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5841 6584 [WARNING|trainer.py:803] 2025-04-26 21:32:48,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:32:48,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:32:48,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6585 5875 5842 [WARNING|trainer.py:803] 2025-04-26 21:32:50,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6586 [WARNING|trainer.py:803] 2025-04-26 21:32:50,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:50,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:51,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6587 5876 5843 [WARNING|trainer.py:803] 2025-04-26 21:32:52,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6588 [WARNING|trainer.py:803] 2025-04-26 21:32:52,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:32:53,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:53,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6589 5877 5844 [WARNING|trainer.py:803] 2025-04-26 21:32:54,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6590 [WARNING|trainer.py:803] 2025-04-26 21:32:54,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:55,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:55,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6591 5878 5845 [WARNING|trainer.py:803] 2025-04-26 21:32:56,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:56,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6592 [WARNING|trainer.py:803] 2025-04-26 21:32:57,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:32:57,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6593 5879 5846 [WARNING|trainer.py:803] 2025-04-26 21:32:58,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:32:59,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6594 [WARNING|trainer.py:803] 2025-04-26 21:32:59,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:00,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6595 5880 5847 [WARNING|trainer.py:803] 2025-04-26 21:33:01,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:01,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:01,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6596 [WARNING|trainer.py:803] 2025-04-26 21:33:02,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6597 5881 5848 [WARNING|trainer.py:803] 2025-04-26 21:33:03,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:03,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6598 [WARNING|trainer.py:803] 2025-04-26 21:33:03,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:04,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6599 5882 5849 [WARNING|trainer.py:803] 2025-04-26 21:33:05,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:05,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:05,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6600 [WARNING|trainer.py:803] 2025-04-26 21:33:06,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6601 5883 5850 [WARNING|trainer.py:803] 2025-04-26 21:33:07,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:07,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:07,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6602 [WARNING|trainer.py:803] 2025-04-26 21:33:08,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5884 5851 6603 [WARNING|trainer.py:803] 2025-04-26 21:33:09,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:09,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:10,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6604 5885 [WARNING|trainer.py:803] 2025-04-26 21:33:11,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5852 6605 [WARNING|trainer.py:803] 2025-04-26 21:33:11,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:12,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:12,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6606 5886 [WARNING|trainer.py:803] 2025-04-26 21:33:13,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5853 6607 [WARNING|trainer.py:803] 2025-04-26 21:33:13,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:14,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:14,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6608 5887 5854 [WARNING|trainer.py:803] 2025-04-26 21:33:15,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6609 [WARNING|trainer.py:803] 2025-04-26 21:33:16,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:16,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:16,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6610 5855 [WARNING|trainer.py:803] 2025-04-26 21:33:17,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5888 6611 [WARNING|trainer.py:803] 2025-04-26 21:33:18,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:18,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:18,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6612 5856 [WARNING|trainer.py:803] 2025-04-26 21:33:20,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5889 6613 [WARNING|trainer.py:803] 2025-04-26 21:33:20,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:20,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:21,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6614 5857 [WARNING|trainer.py:803] 2025-04-26 21:33:22,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5890 6615 [WARNING|trainer.py:803] 2025-04-26 21:33:22,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:33:23,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:23,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6616 5858 5891 [WARNING|trainer.py:803] 2025-04-26 21:33:24,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6617 [WARNING|trainer.py:803] 2025-04-26 21:33:24,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:25,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:25,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6618 5859 5892 [WARNING|trainer.py:803] 2025-04-26 21:33:26,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6619 [WARNING|trainer.py:803] 2025-04-26 21:33:26,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:27,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:27,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6620 5860 5893 [WARNING|trainer.py:803] 2025-04-26 21:33:28,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6621 [WARNING|trainer.py:803] 2025-04-26 21:33:29,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:29,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:29,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6622 5861 5894 [WARNING|trainer.py:803] 2025-04-26 21:33:30,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6623 [WARNING|trainer.py:803] 2025-04-26 21:33:31,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:33:31,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:32,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6624 5862 5895 [WARNING|trainer.py:803] 2025-04-26 21:33:33,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6625 [WARNING|trainer.py:803] 2025-04-26 21:33:33,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:33,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:34,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6626 5863 5896 [WARNING|trainer.py:803] 2025-04-26 21:33:35,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6627 [WARNING|trainer.py:803] 2025-04-26 21:33:35,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:35,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:36,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6628 5864 5897 [WARNING|trainer.py:803] 2025-04-26 21:33:37,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6629 [WARNING|trainer.py:803] 2025-04-26 21:33:38,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:38,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:38,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6630 5898 5865 [WARNING|trainer.py:803] 2025-04-26 21:33:39,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6631 [WARNING|trainer.py:803] 2025-04-26 21:33:40,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:33:40,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:40,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6632 5899 5866 [WARNING|trainer.py:803] 2025-04-26 21:33:41,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6633 [WARNING|trainer.py:803] 2025-04-26 21:33:42,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:42,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:42,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6634 5900 5867 [WARNING|trainer.py:803] 2025-04-26 21:33:44,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6635 [WARNING|trainer.py:803] 2025-04-26 21:33:44,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:44,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:45,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6636 5901 5868 [WARNING|trainer.py:803] 2025-04-26 21:33:46,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:33:46,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6637 [WARNING|trainer.py:803] 2025-04-26 21:33:46,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:47,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5902 6638 5869 [WARNING|trainer.py:803] 2025-04-26 21:33:48,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:48,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6639 [WARNING|trainer.py:803] 2025-04-26 21:33:48,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5903 [WARNING|trainer.py:803] 2025-04-26 21:33:49,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6640 5870 [WARNING|trainer.py:803] 2025-04-26 21:33:50,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:50,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6641 [WARNING|trainer.py:803] 2025-04-26 21:33:51,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5904 [WARNING|trainer.py:803] 2025-04-26 21:33:51,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6642 5871 [WARNING|trainer.py:803] 2025-04-26 21:33:52,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:52,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:53,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6643 5905 [WARNING|trainer.py:803] 2025-04-26 21:33:54,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5872 6644 [WARNING|trainer.py:803] 2025-04-26 21:33:54,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:55,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:33:55,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6645 5906 [WARNING|trainer.py:803] 2025-04-26 21:33:56,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5873 [WARNING|trainer.py:803] 2025-04-26 21:33:56,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6646 [WARNING|trainer.py:803] 2025-04-26 21:33:57,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:33:57,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5907 6647 5874 [WARNING|trainer.py:803] 2025-04-26 21:33:58,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:33:58,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6648 [WARNING|trainer.py:803] 2025-04-26 21:33:59,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 5908 [WARNING|trainer.py:803] 2025-04-26 21:33:59,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6649 [WARNING|trainer.py:803] 2025-04-26 21:34:00,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5875 [WARNING|trainer.py:803] 2025-04-26 21:34:00,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6650 [WARNING|trainer.py:803] 2025-04-26 21:34:01,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5909 [WARNING|trainer.py:803] 2025-04-26 21:34:01,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6651 [WARNING|trainer.py:803] 2025-04-26 21:34:02,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5876 [WARNING|trainer.py:803] 2025-04-26 21:34:03,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:03,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6652 5910 [WARNING|trainer.py:803] 2025-04-26 21:34:04,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:04,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6653 5877 [WARNING|trainer.py:803] 2025-04-26 21:34:05,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:05,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6654 5911 [WARNING|trainer.py:803] 2025-04-26 21:34:06,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:06,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5878 6655 [WARNING|trainer.py:803] 2025-04-26 21:34:07,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:07,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5912 6656 [WARNING|trainer.py:803] 2025-04-26 21:34:08,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:08,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5879 6657 5913 [WARNING|trainer.py:803] 2025-04-26 21:34:09,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:09,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6658 [WARNING|trainer.py:803] 2025-04-26 21:34:10,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:10,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5880 6659 5914 [WARNING|trainer.py:803] 2025-04-26 21:34:11,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:34:11,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6660 [WARNING|trainer.py:803] 2025-04-26 21:34:12,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:13,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 5881 6661 5915 [WARNING|trainer.py:803] 2025-04-26 21:34:14,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:34:14,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:14,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6662 [WARNING|trainer.py:803] 2025-04-26 21:34:15,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5916 5882 6663 [WARNING|trainer.py:803] 2025-04-26 21:34:16,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:16,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:16,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6664 5917 [WARNING|trainer.py:803] 2025-04-26 21:34:17,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5883 6665 [WARNING|trainer.py:803] 2025-04-26 21:34:18,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:34:18,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:18,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6666 5918 [WARNING|trainer.py:803] 2025-04-26 21:34:19,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5884 6667 [WARNING|trainer.py:803] 2025-04-26 21:34:20,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:20,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:20,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6668 5919 5885 [WARNING|trainer.py:803] 2025-04-26 21:34:21,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6669 [WARNING|trainer.py:803] 2025-04-26 21:34:22,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:22,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:23,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6670 5920 5886 [WARNING|trainer.py:803] 2025-04-26 21:34:24,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6671 [WARNING|trainer.py:803] 2025-04-26 21:34:24,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:34:24,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:25,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6672 5921 5887 [WARNING|trainer.py:803] 2025-04-26 21:34:26,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:26,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6673 [WARNING|trainer.py:803] 2025-04-26 21:34:26,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5922 [WARNING|trainer.py:803] 2025-04-26 21:34:27,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6674 [WARNING|trainer.py:803] 2025-04-26 21:34:28,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5888 [WARNING|trainer.py:803] 2025-04-26 21:34:28,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6675 [WARNING|trainer.py:803] 2025-04-26 21:34:29,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5923 [WARNING|trainer.py:803] 2025-04-26 21:34:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6676 [WARNING|trainer.py:803] 2025-04-26 21:34:30,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5889 [WARNING|trainer.py:803] 2025-04-26 21:34:30,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6677 [WARNING|trainer.py:803] 2025-04-26 21:34:31,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5924 [WARNING|trainer.py:803] 2025-04-26 21:34:32,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6678 [WARNING|trainer.py:803] 2025-04-26 21:34:32,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5890 [WARNING|trainer.py:803] 2025-04-26 21:34:33,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6679 [WARNING|trainer.py:803] 2025-04-26 21:34:33,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5925 [WARNING|trainer.py:803] 2025-04-26 21:34:34,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6680 [WARNING|trainer.py:803] 2025-04-26 21:34:34,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5891 [WARNING|trainer.py:803] 2025-04-26 21:34:35,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6681 [WARNING|trainer.py:803] 2025-04-26 21:34:35,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5926 [WARNING|trainer.py:803] 2025-04-26 21:34:36,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6682 [WARNING|trainer.py:803] 2025-04-26 21:34:36,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5892 [WARNING|trainer.py:803] 2025-04-26 21:34:37,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6683 5927 [WARNING|trainer.py:803] 2025-04-26 21:34:38,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:38,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:38,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6684 5893 5928 [WARNING|trainer.py:803] 2025-04-26 21:34:39,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6685 [WARNING|trainer.py:803] 2025-04-26 21:34:40,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:40,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:40,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6686 5894 5929 [WARNING|trainer.py:803] 2025-04-26 21:34:42,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6687 [WARNING|trainer.py:803] 2025-04-26 21:34:42,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:42,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:43,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6688 5895 5930 [WARNING|trainer.py:803] 2025-04-26 21:34:44,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6689 [WARNING|trainer.py:803] 2025-04-26 21:34:44,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:44,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:45,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6690 5931 5896 [WARNING|trainer.py:803] 2025-04-26 21:34:46,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:46,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6691 [WARNING|trainer.py:803] 2025-04-26 21:34:46,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:47,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5932 6692 5897 [WARNING|trainer.py:803] 2025-04-26 21:34:48,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:34:48,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6693 [WARNING|trainer.py:803] 2025-04-26 21:34:48,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5933 [WARNING|trainer.py:803] 2025-04-26 21:34:49,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6694 5898 [WARNING|trainer.py:803] 2025-04-26 21:34:50,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:50,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:50,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6695 5934 [WARNING|trainer.py:803] 2025-04-26 21:34:51,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5899 6696 [WARNING|trainer.py:803] 2025-04-26 21:34:52,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:52,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:34:53,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6697 5935 [WARNING|trainer.py:803] 2025-04-26 21:34:54,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5900 [WARNING|trainer.py:803] 2025-04-26 21:34:54,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6698 [WARNING|trainer.py:803] 2025-04-26 21:34:55,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:34:55,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5936 6699 [WARNING|trainer.py:803] 2025-04-26 21:34:56,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5901 [WARNING|trainer.py:803] 2025-04-26 21:34:56,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6700 [WARNING|trainer.py:803] 2025-04-26 21:34:57,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:34:57,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5937 6701 [WARNING|trainer.py:803] 2025-04-26 21:34:58,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5902 [WARNING|trainer.py:803] 2025-04-26 21:34:58,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6702 [WARNING|trainer.py:803] 2025-04-26 21:34:59,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5938 [WARNING|trainer.py:803] 2025-04-26 21:34:59,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6703 [WARNING|trainer.py:803] 2025-04-26 21:35:00,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5903 [WARNING|trainer.py:803] 2025-04-26 21:35:00,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6704 [WARNING|trainer.py:803] 2025-04-26 21:35:01,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5939 [WARNING|trainer.py:803] 2025-04-26 21:35:02,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6705 [WARNING|trainer.py:803] 2025-04-26 21:35:02,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5904 [WARNING|trainer.py:803] 2025-04-26 21:35:03,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:03,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6706 5940 [WARNING|trainer.py:803] 2025-04-26 21:35:04,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:35:04,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5905 6707 [WARNING|trainer.py:803] 2025-04-26 21:35:05,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5941 [WARNING|trainer.py:803] 2025-04-26 21:35:05,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6708 [WARNING|trainer.py:803] 2025-04-26 21:35:06,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5906 [WARNING|trainer.py:803] 2025-04-26 21:35:06,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6709 [WARNING|trainer.py:803] 2025-04-26 21:35:07,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5942 [WARNING|trainer.py:803] 2025-04-26 21:35:07,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6710 [WARNING|trainer.py:803] 2025-04-26 21:35:08,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5907 [WARNING|trainer.py:803] 2025-04-26 21:35:08,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6711 [WARNING|trainer.py:803] 2025-04-26 21:35:09,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5943 [WARNING|trainer.py:803] 2025-04-26 21:35:10,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6712 [WARNING|trainer.py:803] 2025-04-26 21:35:10,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5908 [WARNING|trainer.py:803] 2025-04-26 21:35:11,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6713 [WARNING|trainer.py:803] 2025-04-26 21:35:11,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5944 [WARNING|trainer.py:803] 2025-04-26 21:35:12,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:12,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6714 5909 [WARNING|trainer.py:803] 2025-04-26 21:35:13,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5945 [WARNING|trainer.py:803] 2025-04-26 21:35:13,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6715 [WARNING|trainer.py:803] 2025-04-26 21:35:14,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:14,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5910 6716 5946 [WARNING|trainer.py:803] 2025-04-26 21:35:15,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:15,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6717 [WARNING|trainer.py:803] 2025-04-26 21:35:16,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5911 [WARNING|trainer.py:803] 2025-04-26 21:35:16,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6718 5947 [WARNING|trainer.py:803] 2025-04-26 21:35:17,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:17,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:35:18,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6719 5912 [WARNING|trainer.py:803] 2025-04-26 21:35:19,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5948 6720 [WARNING|trainer.py:803] 2025-04-26 21:35:19,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:19,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:35:20,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6721 5913 5949 [WARNING|trainer.py:803] 2025-04-26 21:35:21,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:21,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6722 [WARNING|trainer.py:803] 2025-04-26 21:35:21,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:22,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 5914 6723 5950 [WARNING|trainer.py:803] 2025-04-26 21:35:23,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:23,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6724 [WARNING|trainer.py:803] 2025-04-26 21:35:24,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5915 [WARNING|trainer.py:803] 2025-04-26 21:35:24,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6725 5951 [WARNING|trainer.py:803] 2025-04-26 21:35:25,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:35:25,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:26,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6726 5916 [WARNING|trainer.py:803] 2025-04-26 21:35:26,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5952 6727 [WARNING|trainer.py:803] 2025-04-26 21:35:27,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:27,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:28,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6728 5917 5953 [WARNING|trainer.py:803] 2025-04-26 21:35:29,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:29,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6729 [WARNING|trainer.py:803] 2025-04-26 21:35:29,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:30,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6730 5918 5954 [WARNING|trainer.py:803] 2025-04-26 21:35:31,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 21:35:31,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6731 [WARNING|trainer.py:803] 2025-04-26 21:35:31,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:32,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5919 6732 5955 [WARNING|trainer.py:803] 2025-04-26 21:35:33,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:33,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:35:33,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6733 [WARNING|trainer.py:803] 2025-04-26 21:35:34,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5920 5956 6734 [WARNING|trainer.py:803] 2025-04-26 21:35:35,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:35:35,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:35,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6735 5921 5957 [WARNING|trainer.py:803] 2025-04-26 21:35:37,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6736 [WARNING|trainer.py:803] 2025-04-26 21:35:37,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:37,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:38,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6737 5958 5922 [WARNING|trainer.py:803] 2025-04-26 21:35:39,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6738 [WARNING|trainer.py:803] 2025-04-26 21:35:39,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:39,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6739 5959 5923 [WARNING|trainer.py:803] 2025-04-26 21:35:41,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6740 [WARNING|trainer.py:803] 2025-04-26 21:35:41,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:41,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:42,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5960 6741 5924 [WARNING|trainer.py:803] 2025-04-26 21:35:43,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No [WARNING|trainer.py:803] 2025-04-26 21:35:43,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6742 [WARNING|trainer.py:803] 2025-04-26 21:35:43,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5961 [WARNING|trainer.py:803] 2025-04-26 21:35:44,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6743 5925 [WARNING|trainer.py:803] 2025-04-26 21:35:45,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:45,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:35:45,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6744 5962 [WARNING|trainer.py:803] 2025-04-26 21:35:47,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5926 6745 [WARNING|trainer.py:803] 2025-04-26 21:35:47,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:48,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:48,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6746 5963 5927 [WARNING|trainer.py:803] 2025-04-26 21:35:49,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:49,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6747 [WARNING|trainer.py:803] 2025-04-26 21:35:50,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:50,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5964 6748 5928 [WARNING|trainer.py:803] 2025-04-26 21:35:51,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:51,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6749 [WARNING|trainer.py:803] 2025-04-26 21:35:51,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5965 [WARNING|trainer.py:803] 2025-04-26 21:35:52,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6750 5929 [WARNING|trainer.py:803] 2025-04-26 21:35:53,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:53,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:53,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6751 5966 [WARNING|trainer.py:803] 2025-04-26 21:35:54,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5930 6752 [WARNING|trainer.py:803] 2025-04-26 21:35:55,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:55,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:56,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6753 5967 5931 [WARNING|trainer.py:803] 2025-04-26 21:35:57,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:35:57,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6754 [WARNING|trainer.py:803] 2025-04-26 21:35:57,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:35:58,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5968 6755 5932 [WARNING|trainer.py:803] 2025-04-26 21:35:59,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:35:59,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6756 [WARNING|trainer.py:803] 2025-04-26 21:35:59,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:00,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 5969 6757 5933 [WARNING|trainer.py:803] 2025-04-26 21:36:01,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:01,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:01,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6758 5970 [WARNING|trainer.py:803] 2025-04-26 21:36:02,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5934 6759 [WARNING|trainer.py:803] 2025-04-26 21:36:03,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:03,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:03,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6760 5971 5935 [WARNING|trainer.py:803] 2025-04-26 21:36:04,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6761 [WARNING|trainer.py:803] 2025-04-26 21:36:05,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:05,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:06,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6762 5972 5936 [WARNING|trainer.py:803] 2025-04-26 21:36:07,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6763 [WARNING|trainer.py:803] 2025-04-26 21:36:07,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:07,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:08,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6764 5973 5937 [WARNING|trainer.py:803] 2025-04-26 21:36:09,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:09,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6765 [WARNING|trainer.py:803] 2025-04-26 21:36:09,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:10,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6766 5938 5974 [WARNING|trainer.py:803] 2025-04-26 21:36:11,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:11,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:11,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6767 5939 [WARNING|trainer.py:803] 2025-04-26 21:36:12,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6768 5975 [WARNING|trainer.py:803] 2025-04-26 21:36:13,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:13,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:13,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6769 5940 5976 [WARNING|trainer.py:803] 2025-04-26 21:36:14,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6770 [WARNING|trainer.py:803] 2025-04-26 21:36:15,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:15,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:16,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6771 5941 5977 [WARNING|trainer.py:803] 2025-04-26 21:36:17,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6772 [WARNING|trainer.py:803] 2025-04-26 21:36:17,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:17,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:18,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6773 5942 5978 [WARNING|trainer.py:803] 2025-04-26 21:36:19,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6774 [WARNING|trainer.py:803] 2025-04-26 21:36:19,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:19,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:20,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6775 5943 5979 [WARNING|trainer.py:803] 2025-04-26 21:36:21,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:21,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:21,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6776 [WARNING|trainer.py:803] 2025-04-26 21:36:22,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5944 5980 6777 [WARNING|trainer.py:803] 2025-04-26 21:36:23,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:23,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:23,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6778 5945 [WARNING|trainer.py:803] 2025-04-26 21:36:24,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5981 6779 [WARNING|trainer.py:803] 2025-04-26 21:36:25,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:25,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:25,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6780 5946 [WARNING|trainer.py:803] 2025-04-26 21:36:27,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5982 6781 [WARNING|trainer.py:803] 2025-04-26 21:36:27,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:27,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:28,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6782 5947 5983 [WARNING|trainer.py:803] 2025-04-26 21:36:29,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6783 [WARNING|trainer.py:803] 2025-04-26 21:36:29,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:30,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:30,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6784 5948 5984 [WARNING|trainer.py:803] 2025-04-26 21:36:31,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:31,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6785 [WARNING|trainer.py:803] 2025-04-26 21:36:32,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:32,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5949 6786 5985 [WARNING|trainer.py:803] 2025-04-26 21:36:33,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:33,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6787 [WARNING|trainer.py:803] 2025-04-26 21:36:34,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:36:34,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5950 6788 5986 [WARNING|trainer.py:803] 2025-04-26 21:36:35,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:35,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6789 [WARNING|trainer.py:803] 2025-04-26 21:36:36,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5951 [WARNING|trainer.py:803] 2025-04-26 21:36:36,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6790 [WARNING|trainer.py:803] 2025-04-26 21:36:37,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5987 [WARNING|trainer.py:803] 2025-04-26 21:36:38,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6791 [WARNING|trainer.py:803] 2025-04-26 21:36:38,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5952 [WARNING|trainer.py:803] 2025-04-26 21:36:39,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6792 [WARNING|trainer.py:803] 2025-04-26 21:36:39,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5988 [WARNING|trainer.py:803] 2025-04-26 21:36:40,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:40,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6793 5953 [WARNING|trainer.py:803] 2025-04-26 21:36:41,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:41,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5989 6794 [WARNING|trainer.py:803] 2025-04-26 21:36:42,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:36:42,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5954 6795 [WARNING|trainer.py:803] 2025-04-26 21:36:43,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:43,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6796 5990 5955 [WARNING|trainer.py:803] 2025-04-26 21:36:44,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:44,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6797 [WARNING|trainer.py:803] 2025-04-26 21:36:45,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:45,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5991 6798 5956 [WARNING|trainer.py:803] 2025-04-26 21:36:46,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:47,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6799 [WARNING|trainer.py:803] 2025-04-26 21:36:47,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5992 [WARNING|trainer.py:803] 2025-04-26 21:36:48,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6800 5957 [WARNING|trainer.py:803] 2025-04-26 21:36:48,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:49,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6801 [WARNING|trainer.py:803] 2025-04-26 21:36:49,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5993 [WARNING|trainer.py:803] 2025-04-26 21:36:50,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6802 5958 [WARNING|trainer.py:803] 2025-04-26 21:36:51,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:51,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:51,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6803 5994 [WARNING|trainer.py:803] 2025-04-26 21:36:52,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5959 6804 [WARNING|trainer.py:803] 2025-04-26 21:36:53,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:53,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:53,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6805 5995 5960 [WARNING|trainer.py:803] 2025-04-26 21:36:54,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6806 [WARNING|trainer.py:803] 2025-04-26 21:36:55,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:55,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:36:55,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6807 5996 5961 [WARNING|trainer.py:803] 2025-04-26 21:36:56,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:56,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6808 [WARNING|trainer.py:803] 2025-04-26 21:36:57,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:36:57,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5997 6809 5962 [WARNING|trainer.py:803] 2025-04-26 21:36:59,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:36:59,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6810 [WARNING|trainer.py:803] 2025-04-26 21:36:59,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5998 [WARNING|trainer.py:803] 2025-04-26 21:37:00,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6811 5963 [WARNING|trainer.py:803] 2025-04-26 21:37:00,978 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:01,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:01,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6812 5999 [WARNING|trainer.py:803] 2025-04-26 21:37:02,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5964 6813 [WARNING|trainer.py:803] 2025-04-26 21:37:02,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:03,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:03,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6814 6000 5965 [WARNING|trainer.py:803] 2025-04-26 21:37:04,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6815 [WARNING|trainer.py:803] 2025-04-26 21:37:04,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:05,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:37:05,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6816 6001 5966 [WARNING|trainer.py:803] 2025-04-26 21:37:06,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:06,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6817 [WARNING|trainer.py:803] 2025-04-26 21:37:07,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:07,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6818 6002 5967 [WARNING|trainer.py:803] 2025-04-26 21:37:08,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:09,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6819 [WARNING|trainer.py:803] 2025-04-26 21:37:09,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:09,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6820 5968 6003 [WARNING|trainer.py:803] 2025-04-26 21:37:11,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:11,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6821 [WARNING|trainer.py:803] 2025-04-26 21:37:11,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:12,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6004 6822 5969 [WARNING|trainer.py:803] 2025-04-26 21:37:13,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:13,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:13,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6823 [WARNING|trainer.py:803] 2025-04-26 21:37:14,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6005 5970 6824 [WARNING|trainer.py:803] 2025-04-26 21:37:15,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:15,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:15,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6825 6006 [WARNING|trainer.py:803] 2025-04-26 21:37:16,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5971 6826 [WARNING|trainer.py:803] 2025-04-26 21:37:17,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:17,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:17,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6827 6007 [WARNING|trainer.py:803] 2025-04-26 21:37:18,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5972 6828 [WARNING|trainer.py:803] 2025-04-26 21:37:19,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:19,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:19,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6829 6008 [WARNING|trainer.py:803] 2025-04-26 21:37:20,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5973 [WARNING|trainer.py:803] 2025-04-26 21:37:20,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6830 [WARNING|trainer.py:803] 2025-04-26 21:37:21,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:21,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6831 6009 [WARNING|trainer.py:803] 2025-04-26 21:37:23,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:23,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5974 6832 [WARNING|trainer.py:803] 2025-04-26 21:37:23,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6010 [WARNING|trainer.py:803] 2025-04-26 21:37:24,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6833 [WARNING|trainer.py:803] 2025-04-26 21:37:24,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:25,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5975 6834 6011 [WARNING|trainer.py:803] 2025-04-26 21:37:26,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:26,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6835 [WARNING|trainer.py:803] 2025-04-26 21:37:26,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5976 [WARNING|trainer.py:803] 2025-04-26 21:37:27,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6836 6012 [WARNING|trainer.py:803] 2025-04-26 21:37:28,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:28,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6837 [WARNING|trainer.py:803] 2025-04-26 21:37:29,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 5977 [WARNING|trainer.py:803] 2025-04-26 21:37:29,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6838 6013 [WARNING|trainer.py:803] 2025-04-26 21:37:30,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:30,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:30,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6839 5978 [WARNING|trainer.py:803] 2025-04-26 21:37:31,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6840 [WARNING|trainer.py:803] 2025-04-26 21:37:32,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6014 [WARNING|trainer.py:803] 2025-04-26 21:37:32,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:33,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6841 5979 [WARNING|trainer.py:803] 2025-04-26 21:37:33,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:34,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6015 6842 [WARNING|trainer.py:803] 2025-04-26 21:37:34,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:35,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6843 5980 6016 [WARNING|trainer.py:803] 2025-04-26 21:37:36,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:36,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6844 [WARNING|trainer.py:803] 2025-04-26 21:37:36,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:37,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6845 5981 6017 [WARNING|trainer.py:803] 2025-04-26 21:37:38,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:38,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6846 [WARNING|trainer.py:803] 2025-04-26 21:37:38,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:39,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 5982 6847 6018 [WARNING|trainer.py:803] 2025-04-26 21:37:40,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:40,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6848 [WARNING|trainer.py:803] 2025-04-26 21:37:40,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:41,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 5983 6849 6019 [WARNING|trainer.py:803] 2025-04-26 21:37:42,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:42,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:42,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6850 5984 [WARNING|trainer.py:803] 2025-04-26 21:37:43,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6020 6851 [WARNING|trainer.py:803] 2025-04-26 21:37:44,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:44,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:44,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6852 6021 5985 [WARNING|trainer.py:803] 2025-04-26 21:37:45,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6853 [WARNING|trainer.py:803] 2025-04-26 21:37:46,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:46,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:47,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6854 6022 [WARNING|trainer.py:803] 2025-04-26 21:37:48,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5986 6855 [WARNING|trainer.py:803] 2025-04-26 21:37:48,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:49,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:49,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6856 6023 5987 [WARNING|trainer.py:803] 2025-04-26 21:37:50,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6857 [WARNING|trainer.py:803] 2025-04-26 21:37:50,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:51,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:37:51,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6858 5988 6024 [WARNING|trainer.py:803] 2025-04-26 21:37:52,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6859 [WARNING|trainer.py:803] 2025-04-26 21:37:53,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:53,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:37:53,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6860 6025 5989 [WARNING|trainer.py:803] 2025-04-26 21:37:54,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6861 [WARNING|trainer.py:803] 2025-04-26 21:37:55,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:37:55,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:55,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6862 6026 5990 [WARNING|trainer.py:803] 2025-04-26 21:37:56,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6863 [WARNING|trainer.py:803] 2025-04-26 21:37:57,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:57,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:37:57,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6027 6864 5991 [WARNING|trainer.py:803] 2025-04-26 21:37:58,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:37:59,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6865 [WARNING|trainer.py:803] 2025-04-26 21:37:59,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:00,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6028 6866 5992 [WARNING|trainer.py:803] 2025-04-26 21:38:01,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:01,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6867 [WARNING|trainer.py:803] 2025-04-26 21:38:01,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:02,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6029 6868 5993 [WARNING|trainer.py:803] 2025-04-26 21:38:03,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:03,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:03,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6869 [WARNING|trainer.py:803] 2025-04-26 21:38:04,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6030 5994 6870 [WARNING|trainer.py:803] 2025-04-26 21:38:05,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:05,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:05,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6871 5995 [WARNING|trainer.py:803] 2025-04-26 21:38:06,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6031 6872 [WARNING|trainer.py:803] 2025-04-26 21:38:07,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:07,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:08,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6873 6032 5996 [WARNING|trainer.py:803] 2025-04-26 21:38:09,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6874 [WARNING|trainer.py:803] 2025-04-26 21:38:09,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:09,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:10,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6875 6033 5997 [WARNING|trainer.py:803] 2025-04-26 21:38:11,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:11,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6876 [WARNING|trainer.py:803] 2025-04-26 21:38:11,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:12,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6877 5998 6034 [WARNING|trainer.py:803] 2025-04-26 21:38:13,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:13,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:13,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6878 [WARNING|trainer.py:803] 2025-04-26 21:38:14,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 5999 6879 6035 [WARNING|trainer.py:803] 2025-04-26 21:38:15,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:15,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:15,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6880 6000 [WARNING|trainer.py:803] 2025-04-26 21:38:16,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6036 6881 [WARNING|trainer.py:803] 2025-04-26 21:38:17,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:17,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:18,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6882 6001 6037 [WARNING|trainer.py:803] 2025-04-26 21:38:19,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6883 [WARNING|trainer.py:803] 2025-04-26 21:38:19,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:19,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:20,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6884 6002 6038 [WARNING|trainer.py:803] 2025-04-26 21:38:21,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6885 [WARNING|trainer.py:803] 2025-04-26 21:38:21,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:21,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:22,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6886 6039 6003 [WARNING|trainer.py:803] 2025-04-26 21:38:23,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6887 [WARNING|trainer.py:803] 2025-04-26 21:38:23,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:24,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:24,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6040 6888 6004 [WARNING|trainer.py:803] 2025-04-26 21:38:25,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:25,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:25,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6889 [WARNING|trainer.py:803] 2025-04-26 21:38:26,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6041 6005 6890 [WARNING|trainer.py:803] 2025-04-26 21:38:27,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:27,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:28,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6891 6042 6006 [WARNING|trainer.py:803] 2025-04-26 21:38:29,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6892 [WARNING|trainer.py:803] 2025-04-26 21:38:29,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:29,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:30,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6893 6007 6043 [WARNING|trainer.py:803] 2025-04-26 21:38:31,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:31,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6894 [WARNING|trainer.py:803] 2025-04-26 21:38:31,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:32,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6895 6008 6044 [WARNING|trainer.py:803] 2025-04-26 21:38:33,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:33,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6896 [WARNING|trainer.py:803] 2025-04-26 21:38:34,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:34,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6009 6897 6045 [WARNING|trainer.py:803] 2025-04-26 21:38:35,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:35,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:35,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6898 6010 [WARNING|trainer.py:803] 2025-04-26 21:38:37,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6046 6899 [WARNING|trainer.py:803] 2025-04-26 21:38:37,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:37,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:38,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6900 6011 [WARNING|trainer.py:803] 2025-04-26 21:38:39,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6047 [WARNING|trainer.py:803] 2025-04-26 21:38:39,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6901 [WARNING|trainer.py:803] 2025-04-26 21:38:40,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:40,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6902 6012 [WARNING|trainer.py:803] 2025-04-26 21:38:41,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6048 6903 [WARNING|trainer.py:803] 2025-04-26 21:38:41,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:38:42,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:42,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6904 6013 6049 [WARNING|trainer.py:803] 2025-04-26 21:38:43,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:38:43,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6905 [WARNING|trainer.py:803] 2025-04-26 21:38:44,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:44,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6906 6014 [WARNING|trainer.py:803] 2025-04-26 21:38:45,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6050 [WARNING|trainer.py:803] 2025-04-26 21:38:45,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6907 [WARNING|trainer.py:803] 2025-04-26 21:38:46,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:38:46,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6908 6015 [WARNING|trainer.py:803] 2025-04-26 21:38:47,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:38:47,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6051 6909 [WARNING|trainer.py:803] 2025-04-26 21:38:48,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:38:48,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6016 6910 [WARNING|trainer.py:803] 2025-04-26 21:38:49,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:38:49,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6052 6911 [WARNING|trainer.py:803] 2025-04-26 21:38:50,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6017 [WARNING|trainer.py:803] 2025-04-26 21:38:50,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6912 [WARNING|trainer.py:803] 2025-04-26 21:38:51,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6053 [WARNING|trainer.py:803] 2025-04-26 21:38:51,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6913 [WARNING|trainer.py:803] 2025-04-26 21:38:52,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6018 [WARNING|trainer.py:803] 2025-04-26 21:38:52,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6914 [WARNING|trainer.py:803] 2025-04-26 21:38:53,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6054 [WARNING|trainer.py:803] 2025-04-26 21:38:53,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6915 [WARNING|trainer.py:803] 2025-04-26 21:38:54,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6019 [WARNING|trainer.py:803] 2025-04-26 21:38:54,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6916 [WARNING|trainer.py:803] 2025-04-26 21:38:55,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6055 [WARNING|trainer.py:803] 2025-04-26 21:38:55,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6917 [WARNING|trainer.py:803] 2025-04-26 21:38:56,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6020 [WARNING|trainer.py:803] 2025-04-26 21:38:56,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6918 [WARNING|trainer.py:803] 2025-04-26 21:38:57,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6056 [WARNING|trainer.py:803] 2025-04-26 21:38:57,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6919 [WARNING|trainer.py:803] 2025-04-26 21:38:58,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6021 [WARNING|trainer.py:803] 2025-04-26 21:38:58,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6920 [WARNING|trainer.py:803] 2025-04-26 21:38:59,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6057 [WARNING|trainer.py:803] 2025-04-26 21:38:59,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6921 [WARNING|trainer.py:803] 2025-04-26 21:39:00,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6022 [WARNING|trainer.py:803] 2025-04-26 21:39:00,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6922 [WARNING|trainer.py:803] 2025-04-26 21:39:01,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6058 [WARNING|trainer.py:803] 2025-04-26 21:39:01,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:02,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6923 6023 [WARNING|trainer.py:803] 2025-04-26 21:39:03,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6924 6059 [WARNING|trainer.py:803] 2025-04-26 21:39:03,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:04,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:04,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6925 6024 6060 [WARNING|trainer.py:803] 2025-04-26 21:39:05,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6926 [WARNING|trainer.py:803] 2025-04-26 21:39:05,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:05,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:06,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6927 6061 6025 [WARNING|trainer.py:803] 2025-04-26 21:39:07,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6928 [WARNING|trainer.py:803] 2025-04-26 21:39:07,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:07,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:08,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6929 6062 6026 [WARNING|trainer.py:803] 2025-04-26 21:39:09,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6930 [WARNING|trainer.py:803] 2025-04-26 21:39:09,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:09,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:10,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6931 6063 6027 [WARNING|trainer.py:803] 2025-04-26 21:39:11,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:11,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:11,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6932 [WARNING|trainer.py:803] 2025-04-26 21:39:12,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6064 6933 6028 [WARNING|trainer.py:803] 2025-04-26 21:39:13,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:13,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6934 [WARNING|trainer.py:803] 2025-04-26 21:39:13,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:14,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6935 6065 6029 [WARNING|trainer.py:803] 2025-04-26 21:39:15,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6936 [WARNING|trainer.py:803] 2025-04-26 21:39:15,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:16,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:16,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6937 6066 6030 [WARNING|trainer.py:803] 2025-04-26 21:39:17,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:17,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6938 [WARNING|trainer.py:803] 2025-04-26 21:39:18,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:18,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6067 6939 [WARNING|trainer.py:803] 2025-04-26 21:39:19,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6031 [WARNING|trainer.py:803] 2025-04-26 21:39:19,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6940 6068 [WARNING|trainer.py:803] 2025-04-26 21:39:20,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:20,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6941 [WARNING|trainer.py:803] 2025-04-26 21:39:21,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6032 [WARNING|trainer.py:803] 2025-04-26 21:39:21,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6942 [WARNING|trainer.py:803] 2025-04-26 21:39:22,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6069 [WARNING|trainer.py:803] 2025-04-26 21:39:22,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6943 [WARNING|trainer.py:803] 2025-04-26 21:39:23,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6033 [WARNING|trainer.py:803] 2025-04-26 21:39:23,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6944 [WARNING|trainer.py:803] 2025-04-26 21:39:24,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6070 [WARNING|trainer.py:803] 2025-04-26 21:39:24,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6945 [WARNING|trainer.py:803] 2025-04-26 21:39:25,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6034 [WARNING|trainer.py:803] 2025-04-26 21:39:25,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6946 [WARNING|trainer.py:803] 2025-04-26 21:39:26,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6071 [WARNING|trainer.py:803] 2025-04-26 21:39:27,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6947 [WARNING|trainer.py:803] 2025-04-26 21:39:27,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6035 [WARNING|trainer.py:803] 2025-04-26 21:39:28,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6948 [WARNING|trainer.py:803] 2025-04-26 21:39:28,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6072 [WARNING|trainer.py:803] 2025-04-26 21:39:29,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6949 [WARNING|trainer.py:803] 2025-04-26 21:39:29,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6036 [WARNING|trainer.py:803] 2025-04-26 21:39:30,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6950 6073 [WARNING|trainer.py:803] 2025-04-26 21:39:30,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:31,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6951 [WARNING|trainer.py:803] 2025-04-26 21:39:31,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6037 [WARNING|trainer.py:803] 2025-04-26 21:39:32,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6952 6074 [WARNING|trainer.py:803] 2025-04-26 21:39:32,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:33,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:33,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6953 6038 [WARNING|trainer.py:803] 2025-04-26 21:39:34,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6954 [WARNING|trainer.py:803] 2025-04-26 21:39:34,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6075 [WARNING|trainer.py:803] 2025-04-26 21:39:35,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6955 [WARNING|trainer.py:803] 2025-04-26 21:39:35,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6039 [WARNING|trainer.py:803] 2025-04-26 21:39:36,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6956 [WARNING|trainer.py:803] 2025-04-26 21:39:36,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6076 [WARNING|trainer.py:803] 2025-04-26 21:39:37,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6957 [WARNING|trainer.py:803] 2025-04-26 21:39:37,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6040 [WARNING|trainer.py:803] 2025-04-26 21:39:38,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:38,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6958 6077 [WARNING|trainer.py:803] 2025-04-26 21:39:39,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:39,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6041 6959 [WARNING|trainer.py:803] 2025-04-26 21:39:40,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:40,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6078 6960 [WARNING|trainer.py:803] 2025-04-26 21:39:41,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:39:41,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6042 6961 6079 [WARNING|trainer.py:803] 2025-04-26 21:39:42,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:42,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6962 [WARNING|trainer.py:803] 2025-04-26 21:39:43,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:43,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6043 6963 6080 [WARNING|trainer.py:803] 2025-04-26 21:39:44,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:44,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6964 [WARNING|trainer.py:803] 2025-04-26 21:39:45,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:45,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6044 6965 6081 [WARNING|trainer.py:803] 2025-04-26 21:39:46,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:46,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6966 [WARNING|trainer.py:803] 2025-04-26 21:39:47,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6045 [WARNING|trainer.py:803] 2025-04-26 21:39:48,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6967 6082 [WARNING|trainer.py:803] 2025-04-26 21:39:48,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:49,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:49,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6968 6046 [WARNING|trainer.py:803] 2025-04-26 21:39:50,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6969 6083 [WARNING|trainer.py:803] 2025-04-26 21:39:50,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:39:51,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6970 [WARNING|trainer.py:803] 2025-04-26 21:39:51,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:52,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6047 6971 6084 [WARNING|trainer.py:803] 2025-04-26 21:39:53,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:53,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6972 [WARNING|trainer.py:803] 2025-04-26 21:39:53,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6048 [WARNING|trainer.py:803] 2025-04-26 21:39:54,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6973 6085 [WARNING|trainer.py:803] 2025-04-26 21:39:55,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:39:55,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6974 [WARNING|trainer.py:803] 2025-04-26 21:39:55,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6049 [WARNING|trainer.py:803] 2025-04-26 21:39:56,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6975 [WARNING|trainer.py:803] 2025-04-26 21:39:57,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6086 [WARNING|trainer.py:803] 2025-04-26 21:39:57,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6976 [WARNING|trainer.py:803] 2025-04-26 21:39:58,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:39:58,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6050 6977 [WARNING|trainer.py:803] 2025-04-26 21:39:59,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:39:59,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6087 6978 [WARNING|trainer.py:803] 2025-04-26 21:40:00,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:00,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6051 6979 6088 [WARNING|trainer.py:803] 2025-04-26 21:40:01,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:01,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6980 [WARNING|trainer.py:803] 2025-04-26 21:40:02,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:40:02,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6052 6981 6089 [WARNING|trainer.py:803] 2025-04-26 21:40:03,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:40:03,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6982 [WARNING|trainer.py:803] 2025-04-26 21:40:04,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6053 [WARNING|trainer.py:803] 2025-04-26 21:40:04,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6983 6090 [WARNING|trainer.py:803] 2025-04-26 21:40:05,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:05,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:05,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6984 6054 [WARNING|trainer.py:803] 2025-04-26 21:40:06,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6985 6091 [WARNING|trainer.py:803] 2025-04-26 21:40:07,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:40:07,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:07,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6986 6055 [WARNING|trainer.py:803] 2025-04-26 21:40:08,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6987 6092 [WARNING|trainer.py:803] 2025-04-26 21:40:09,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:09,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6988 [WARNING|trainer.py:803] 2025-04-26 21:40:10,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6056 [WARNING|trainer.py:803] 2025-04-26 21:40:10,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6989 6093 [WARNING|trainer.py:803] 2025-04-26 21:40:11,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:11,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:40:12,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6990 6057 6094 [WARNING|trainer.py:803] 2025-04-26 21:40:13,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:13,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6991 [WARNING|trainer.py:803] 2025-04-26 21:40:13,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:14,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6058 6992 [WARNING|trainer.py:803] 2025-04-26 21:40:15,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:40:15,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6095 6993 [WARNING|trainer.py:803] 2025-04-26 21:40:16,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6059 [WARNING|trainer.py:803] 2025-04-26 21:40:16,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6994 [WARNING|trainer.py:803] 2025-04-26 21:40:17,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6096 [WARNING|trainer.py:803] 2025-04-26 21:40:17,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6995 [WARNING|trainer.py:803] 2025-04-26 21:40:18,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6060 [WARNING|trainer.py:803] 2025-04-26 21:40:18,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6996 [WARNING|trainer.py:803] 2025-04-26 21:40:18,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6097 [WARNING|trainer.py:803] 2025-04-26 21:40:19,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6997 6061 [WARNING|trainer.py:803] 2025-04-26 21:40:19,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:20,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6998 [WARNING|trainer.py:803] 2025-04-26 21:40:20,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6098 [WARNING|trainer.py:803] 2025-04-26 21:40:21,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6999 6062 [WARNING|trainer.py:803] 2025-04-26 21:40:22,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:22,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:22,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7000 6099 6063 [WARNING|trainer.py:803] 2025-04-26 21:40:23,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7001 [WARNING|trainer.py:803] 2025-04-26 21:40:24,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:24,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:24,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7002 6100 6064 [WARNING|trainer.py:803] 2025-04-26 21:40:26,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:26,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7003 [WARNING|trainer.py:803] 2025-04-26 21:40:26,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:27,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7004 6101 6065 [WARNING|trainer.py:803] 2025-04-26 21:40:28,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7005 [WARNING|trainer.py:803] 2025-04-26 21:40:28,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:40:28,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:29,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7006 6102 6066 [WARNING|trainer.py:803] 2025-04-26 21:40:30,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7007 [WARNING|trainer.py:803] 2025-04-26 21:40:30,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:30,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:31,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7008 6103 6067 [WARNING|trainer.py:803] 2025-04-26 21:40:32,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:40:32,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:32,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7009 [WARNING|trainer.py:803] 2025-04-26 21:40:33,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6068 7010 6104 [WARNING|trainer.py:803] 2025-04-26 21:40:34,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:34,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:34,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7011 [WARNING|trainer.py:803] 2025-04-26 21:40:35,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6069 7012 6105 [WARNING|trainer.py:803] 2025-04-26 21:40:36,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:36,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:36,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7013 [WARNING|trainer.py:803] 2025-04-26 21:40:37,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7014 6070 6106 [WARNING|trainer.py:803] 2025-04-26 21:40:38,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:38,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:38,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7015 [WARNING|trainer.py:803] 2025-04-26 21:40:39,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6071 7016 6107 [WARNING|trainer.py:803] 2025-04-26 21:40:40,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:40:40,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:40:40,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7017 [WARNING|trainer.py:803] 2025-04-26 21:40:41,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 6072 7018 6108 [WARNING|trainer.py:803] 2025-04-26 21:40:42,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:42,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:42,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7019 6073 6109 [WARNING|trainer.py:803] 2025-04-26 21:40:43,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7020 [WARNING|trainer.py:803] 2025-04-26 21:40:44,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:44,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:44,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7021 6074 [WARNING|trainer.py:803] 2025-04-26 21:40:46,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6110 7022 [WARNING|trainer.py:803] 2025-04-26 21:40:46,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:46,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:47,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7023 6111 [WARNING|trainer.py:803] 2025-04-26 21:40:48,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6075 7024 [WARNING|trainer.py:803] 2025-04-26 21:40:48,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:40:49,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:49,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7025 6112 6076 [WARNING|trainer.py:803] 2025-04-26 21:40:50,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7026 [WARNING|trainer.py:803] 2025-04-26 21:40:50,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:40:50,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:51,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7027 6077 6113 [WARNING|trainer.py:803] 2025-04-26 21:40:52,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7028 [WARNING|trainer.py:803] 2025-04-26 21:40:52,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:52,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:53,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7029 6078 [WARNING|trainer.py:803] 2025-04-26 21:40:54,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6114 7030 [WARNING|trainer.py:803] 2025-04-26 21:40:54,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:55,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:40:55,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7031 6079 [WARNING|trainer.py:803] 2025-04-26 21:40:56,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6115 7032 [WARNING|trainer.py:803] 2025-04-26 21:40:56,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:57,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:40:57,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7033 6080 6116 [WARNING|trainer.py:803] 2025-04-26 21:40:58,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7034 [WARNING|trainer.py:803] 2025-04-26 21:40:58,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:59,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:40:59,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7035 6081 6117 [WARNING|trainer.py:803] 2025-04-26 21:41:00,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:00,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7036 [WARNING|trainer.py:803] 2025-04-26 21:41:01,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:01,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6082 7037 [WARNING|trainer.py:803] 2025-04-26 21:41:02,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6118 [WARNING|trainer.py:803] 2025-04-26 21:41:02,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7038 [WARNING|trainer.py:803] 2025-04-26 21:41:03,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:03,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7039 6083 [WARNING|trainer.py:803] 2025-04-26 21:41:04,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6119 7040 [WARNING|trainer.py:803] 2025-04-26 21:41:05,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:05,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:05,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7041 6084 [WARNING|trainer.py:803] 2025-04-26 21:41:06,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6120 7042 [WARNING|trainer.py:803] 2025-04-26 21:41:07,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:07,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:07,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7043 6085 [WARNING|trainer.py:803] 2025-04-26 21:41:09,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6121 7044 [WARNING|trainer.py:803] 2025-04-26 21:41:09,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:10,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:10,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7045 6086 [WARNING|trainer.py:803] 2025-04-26 21:41:11,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7046 [WARNING|trainer.py:803] 2025-04-26 21:41:11,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6122 [WARNING|trainer.py:803] 2025-04-26 21:41:12,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7047 [WARNING|trainer.py:803] 2025-04-26 21:41:12,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6087 [WARNING|trainer.py:803] 2025-04-26 21:41:13,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7048 6123 [WARNING|trainer.py:803] 2025-04-26 21:41:13,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:14,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:14,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7049 6088 [WARNING|trainer.py:803] 2025-04-26 21:41:15,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7050 [WARNING|trainer.py:803] 2025-04-26 21:41:15,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6124 [WARNING|trainer.py:803] 2025-04-26 21:41:16,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:16,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6089 7051 [WARNING|trainer.py:803] 2025-04-26 21:41:17,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:17,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7052 6125 6090 [WARNING|trainer.py:803] 2025-04-26 21:41:18,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7053 [WARNING|trainer.py:803] 2025-04-26 21:41:18,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:19,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:19,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7054 6126 6091 [WARNING|trainer.py:803] 2025-04-26 21:41:20,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7055 [WARNING|trainer.py:803] 2025-04-26 21:41:20,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:21,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:21,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7056 6127 [WARNING|trainer.py:803] 2025-04-26 21:41:22,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6092 7057 [WARNING|trainer.py:803] 2025-04-26 21:41:23,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:23,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:23,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7058 6128 6093 [WARNING|trainer.py:803] 2025-04-26 21:41:24,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7059 [WARNING|trainer.py:803] 2025-04-26 21:41:24,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:25,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:25,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7060 6129 6094 [WARNING|trainer.py:803] 2025-04-26 21:41:26,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7061 [WARNING|trainer.py:803] 2025-04-26 21:41:26,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:27,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:27,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7062 6130 6095 [WARNING|trainer.py:803] 2025-04-26 21:41:28,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7063 [WARNING|trainer.py:803] 2025-04-26 21:41:29,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:41:29,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:29,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7064 6131 6096 [WARNING|trainer.py:803] 2025-04-26 21:41:30,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:30,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7065 [WARNING|trainer.py:803] 2025-04-26 21:41:31,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:41:31,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7066 6132 6097 [WARNING|trainer.py:803] 2025-04-26 21:41:32,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:32,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7067 [WARNING|trainer.py:803] 2025-04-26 21:41:33,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:33,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7068 6133 6098 [WARNING|trainer.py:803] 2025-04-26 21:41:34,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:35,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7069 [WARNING|trainer.py:803] 2025-04-26 21:41:35,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:35,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6134 7070 6099 [WARNING|trainer.py:803] 2025-04-26 21:41:36,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:36,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7071 [WARNING|trainer.py:803] 2025-04-26 21:41:37,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:38,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7072 6135 6100 [WARNING|trainer.py:803] 2025-04-26 21:41:39,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7073 [WARNING|trainer.py:803] 2025-04-26 21:41:39,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:39,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:40,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7074 6136 6101 [WARNING|trainer.py:803] 2025-04-26 21:41:41,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7075 [WARNING|trainer.py:803] 2025-04-26 21:41:41,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:41,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:42,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7076 6137 6102 [WARNING|trainer.py:803] 2025-04-26 21:41:43,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:43,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7077 [WARNING|trainer.py:803] 2025-04-26 21:41:43,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:44,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7078 6138 6103 [WARNING|trainer.py:803] 2025-04-26 21:41:45,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:45,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:45,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7079 [WARNING|trainer.py:803] 2025-04-26 21:41:46,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7080 6139 6104 [WARNING|trainer.py:803] 2025-04-26 21:41:47,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:47,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7081 [WARNING|trainer.py:803] 2025-04-26 21:41:47,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:48,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7082 6105 6140 [WARNING|trainer.py:803] 2025-04-26 21:41:49,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:41:49,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7083 [WARNING|trainer.py:803] 2025-04-26 21:41:49,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:50,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7084 6106 6141 [WARNING|trainer.py:803] 2025-04-26 21:41:51,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 21:41:51,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7085 [WARNING|trainer.py:803] 2025-04-26 21:41:51,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:52,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7086 6107 6142 [WARNING|trainer.py:803] 2025-04-26 21:41:53,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:41:53,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7087 [WARNING|trainer.py:803] 2025-04-26 21:41:53,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:54,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7088 6108 6143 [WARNING|trainer.py:803] 2025-04-26 21:41:55,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7089 [WARNING|trainer.py:803] 2025-04-26 21:41:55,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:41:56,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:56,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6109 7090 6144 [WARNING|trainer.py:803] 2025-04-26 21:41:57,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:41:57,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7091 [WARNING|trainer.py:803] 2025-04-26 21:41:58,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:41:58,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7092 6110 6145 [WARNING|trainer.py:803] 2025-04-26 21:41:59,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:00,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7093 [WARNING|trainer.py:803] 2025-04-26 21:42:00,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6111 [WARNING|trainer.py:803] 2025-04-26 21:42:00,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7094 6146 [WARNING|trainer.py:803] 2025-04-26 21:42:01,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:02,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:02,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7095 6112 [WARNING|trainer.py:803] 2025-04-26 21:42:03,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7096 6147 [WARNING|trainer.py:803] 2025-04-26 21:42:03,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:42:04,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7097 [WARNING|trainer.py:803] 2025-04-26 21:42:04,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6113 [WARNING|trainer.py:803] 2025-04-26 21:42:05,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7098 6148 [WARNING|trainer.py:803] 2025-04-26 21:42:05,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:06,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7099 [WARNING|trainer.py:803] 2025-04-26 21:42:06,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:07,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6114 7100 6149 [WARNING|trainer.py:803] 2025-04-26 21:42:08,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:08,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7101 [WARNING|trainer.py:803] 2025-04-26 21:42:08,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:09,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6115 7102 6150 [WARNING|trainer.py:803] 2025-04-26 21:42:10,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:10,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7103 [WARNING|trainer.py:803] 2025-04-26 21:42:10,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6116 [WARNING|trainer.py:803] 2025-04-26 21:42:11,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7104 6151 [WARNING|trainer.py:803] 2025-04-26 21:42:12,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:12,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7105 [WARNING|trainer.py:803] 2025-04-26 21:42:12,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6117 [WARNING|trainer.py:803] 2025-04-26 21:42:13,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7106 6152 [WARNING|trainer.py:803] 2025-04-26 21:42:14,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:14,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7107 [WARNING|trainer.py:803] 2025-04-26 21:42:15,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:15,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7108 6118 6153 [WARNING|trainer.py:803] 2025-04-26 21:42:16,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:16,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7109 [WARNING|trainer.py:803] 2025-04-26 21:42:17,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:17,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7110 6119 6154 [WARNING|trainer.py:803] 2025-04-26 21:42:18,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7111 [WARNING|trainer.py:803] 2025-04-26 21:42:18,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:19,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:19,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7112 6120 6155 [WARNING|trainer.py:803] 2025-04-26 21:42:20,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7113 [WARNING|trainer.py:803] 2025-04-26 21:42:21,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:21,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:21,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7114 6121 6156 [WARNING|trainer.py:803] 2025-04-26 21:42:22,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7115 [WARNING|trainer.py:803] 2025-04-26 21:42:23,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:23,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:23,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7116 6157 [WARNING|trainer.py:803] 2025-04-26 21:42:24,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6122 7117 [WARNING|trainer.py:803] 2025-04-26 21:42:25,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:42:25,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:25,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7118 6123 [WARNING|trainer.py:803] 2025-04-26 21:42:26,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6158 7119 [WARNING|trainer.py:803] 2025-04-26 21:42:27,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:27,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:27,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7120 6124 6159 [WARNING|trainer.py:803] 2025-04-26 21:42:28,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7121 [WARNING|trainer.py:803] 2025-04-26 21:42:29,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:29,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 21:42:29,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7122 6160 [WARNING|trainer.py:803] 2025-04-26 21:42:30,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6125 7123 [WARNING|trainer.py:803] 2025-04-26 21:42:31,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:31,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:32,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7124 6161 [WARNING|trainer.py:803] 2025-04-26 21:42:33,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6126 7125 [WARNING|trainer.py:803] 2025-04-26 21:42:33,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:42:33,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:34,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7126 6162 [WARNING|trainer.py:803] 2025-04-26 21:42:35,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6127 7127 [WARNING|trainer.py:803] 2025-04-26 21:42:35,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:36,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:36,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7128 6163 6128 [WARNING|trainer.py:803] 2025-04-26 21:42:37,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7129 [WARNING|trainer.py:803] 2025-04-26 21:42:37,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:38,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:38,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7130 6164 6129 [WARNING|trainer.py:803] 2025-04-26 21:42:39,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7131 [WARNING|trainer.py:803] 2025-04-26 21:42:39,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:40,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:40,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7132 6165 6130 [WARNING|trainer.py:803] 2025-04-26 21:42:41,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7133 [WARNING|trainer.py:803] 2025-04-26 21:42:41,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:42,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:42,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7134 6166 6131 [WARNING|trainer.py:803] 2025-04-26 21:42:43,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:43,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7135 [WARNING|trainer.py:803] 2025-04-26 21:42:43,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:44,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7136 6167 6132 [WARNING|trainer.py:803] 2025-04-26 21:42:45,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:45,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7137 [WARNING|trainer.py:803] 2025-04-26 21:42:45,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:46,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7138 6168 6133 [WARNING|trainer.py:803] 2025-04-26 21:42:47,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:47,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7139 [WARNING|trainer.py:803] 2025-04-26 21:42:47,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:48,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6169 7140 6134 [WARNING|trainer.py:803] 2025-04-26 21:42:49,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:42:49,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:49,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7141 6170 [WARNING|trainer.py:803] 2025-04-26 21:42:50,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7142 6135 [WARNING|trainer.py:803] 2025-04-26 21:42:51,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:51,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7143 [WARNING|trainer.py:803] 2025-04-26 21:42:52,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6171 [WARNING|trainer.py:803] 2025-04-26 21:42:52,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7144 [WARNING|trainer.py:803] 2025-04-26 21:42:53,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6136 [WARNING|trainer.py:803] 2025-04-26 21:42:53,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7145 6172 [WARNING|trainer.py:803] 2025-04-26 21:42:54,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:42:54,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7146 [WARNING|trainer.py:803] 2025-04-26 21:42:55,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6137 [WARNING|trainer.py:803] 2025-04-26 21:42:55,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7147 6173 [WARNING|trainer.py:803] 2025-04-26 21:42:56,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:56,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:42:57,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7148 6138 [WARNING|trainer.py:803] 2025-04-26 21:42:58,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6174 7149 [WARNING|trainer.py:803] 2025-04-26 21:42:58,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:42:58,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:42:59,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7150 6139 6175 [WARNING|trainer.py:803] 2025-04-26 21:43:00,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7151 [WARNING|trainer.py:803] 2025-04-26 21:43:00,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:00,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:01,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7152 6140 [WARNING|trainer.py:803] 2025-04-26 21:43:02,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6176 7153 [WARNING|trainer.py:803] 2025-04-26 21:43:02,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:03,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:43:03,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7154 6177 6141 [WARNING|trainer.py:803] 2025-04-26 21:43:04,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7155 [WARNING|trainer.py:803] 2025-04-26 21:43:04,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:04,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:05,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7156 6178 6142 [WARNING|trainer.py:803] 2025-04-26 21:43:06,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7157 [WARNING|trainer.py:803] 2025-04-26 21:43:07,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 21:43:07,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:07,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7158 [WARNING|trainer.py:803] 2025-04-26 21:43:08,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6179 6143 7159 [WARNING|trainer.py:803] 2025-04-26 21:43:09,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:09,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:09,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7160 6144 [WARNING|trainer.py:803] 2025-04-26 21:43:10,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6180 7161 [WARNING|trainer.py:803] 2025-04-26 21:43:11,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:11,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:11,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7162 [WARNING|trainer.py:803] 2025-04-26 21:43:12,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6181 6145 7163 [WARNING|trainer.py:803] 2025-04-26 21:43:13,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:13,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:13,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7164 6146 [WARNING|trainer.py:803] 2025-04-26 21:43:14,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6182 7165 [WARNING|trainer.py:803] 2025-04-26 21:43:15,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:15,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:15,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7166 [WARNING|trainer.py:803] 2025-04-26 21:43:16,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6183 7167 6147 [WARNING|trainer.py:803] 2025-04-26 21:43:17,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:17,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:17,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7168 6184 [WARNING|trainer.py:803] 2025-04-26 21:43:18,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6148 7169 [WARNING|trainer.py:803] 2025-04-26 21:43:19,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:19,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:19,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7170 [WARNING|trainer.py:803] 2025-04-26 21:43:20,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6149 6185 7171 [WARNING|trainer.py:803] 2025-04-26 21:43:21,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:21,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:21,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7172 6186 [WARNING|trainer.py:803] 2025-04-26 21:43:23,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7173 6150 [WARNING|trainer.py:803] 2025-04-26 21:43:23,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:24,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:24,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7174 6187 [WARNING|trainer.py:803] 2025-04-26 21:43:25,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6151 7175 [WARNING|trainer.py:803] 2025-04-26 21:43:25,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:26,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:26,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7176 6188 [WARNING|trainer.py:803] 2025-04-26 21:43:27,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7177 6152 [WARNING|trainer.py:803] 2025-04-26 21:43:27,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:28,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7178 [WARNING|trainer.py:803] 2025-04-26 21:43:28,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6189 [WARNING|trainer.py:803] 2025-04-26 21:43:29,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7179 6153 [WARNING|trainer.py:803] 2025-04-26 21:43:29,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:30,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7180 [WARNING|trainer.py:803] 2025-04-26 21:43:30,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6190 [WARNING|trainer.py:803] 2025-04-26 21:43:31,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7181 6154 [WARNING|trainer.py:803] 2025-04-26 21:43:31,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:32,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7182 [WARNING|trainer.py:803] 2025-04-26 21:43:32,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6191 [WARNING|trainer.py:803] 2025-04-26 21:43:33,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7183 [WARNING|trainer.py:803] 2025-04-26 21:43:33,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6155 [WARNING|trainer.py:803] 2025-04-26 21:43:34,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7184 [WARNING|trainer.py:803] 2025-04-26 21:43:34,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6192 [WARNING|trainer.py:803] 2025-04-26 21:43:35,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7185 6156 [WARNING|trainer.py:803] 2025-04-26 21:43:36,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:36,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7186 [WARNING|trainer.py:803] 2025-04-26 21:43:36,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6193 [WARNING|trainer.py:803] 2025-04-26 21:43:37,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7187 6157 [WARNING|trainer.py:803] 2025-04-26 21:43:38,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:38,495 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:38,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7188 6194 [WARNING|trainer.py:803] 2025-04-26 21:43:39,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7189 6158 [WARNING|trainer.py:803] 2025-04-26 21:43:40,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:40,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7190 [WARNING|trainer.py:803] 2025-04-26 21:43:41,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6195 [WARNING|trainer.py:803] 2025-04-26 21:43:41,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7191 6159 [WARNING|trainer.py:803] 2025-04-26 21:43:42,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:42,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7192 [WARNING|trainer.py:803] 2025-04-26 21:43:42,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:43:43,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6196 7193 6160 [WARNING|trainer.py:803] 2025-04-26 21:43:44,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:44,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7194 [WARNING|trainer.py:803] 2025-04-26 21:43:45,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:45,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7195 6197 6161 [WARNING|trainer.py:803] 2025-04-26 21:43:46,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7196 [WARNING|trainer.py:803] 2025-04-26 21:43:47,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:47,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:43:47,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7197 6162 6198 [WARNING|trainer.py:803] 2025-04-26 21:43:48,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:48,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:49,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7198 [WARNING|trainer.py:803] 2025-04-26 21:43:49,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6199 7199 6163 [WARNING|trainer.py:803] 2025-04-26 21:43:50,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:43:50,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:51,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7200 [WARNING|trainer.py:803] 2025-04-26 21:43:52,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6200 7201 6164 [WARNING|trainer.py:803] 2025-04-26 21:43:52,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:53,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7202 [WARNING|trainer.py:803] 2025-04-26 21:43:53,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:54,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6201 7203 6165 [WARNING|trainer.py:803] 2025-04-26 21:43:55,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:55,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:43:55,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7204 [WARNING|trainer.py:803] 2025-04-26 21:43:56,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6166 7205 6202 [WARNING|trainer.py:803] 2025-04-26 21:43:56,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:57,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:43:57,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7206 [WARNING|trainer.py:803] 2025-04-26 21:43:58,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6167 6203 7207 [WARNING|trainer.py:803] 2025-04-26 21:43:59,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:43:59,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:43:59,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7208 6204 6168 [WARNING|trainer.py:803] 2025-04-26 21:44:00,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7209 [WARNING|trainer.py:803] 2025-04-26 21:44:01,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:01,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:01,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7210 6205 6169 [WARNING|trainer.py:803] 2025-04-26 21:44:02,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7211 [WARNING|trainer.py:803] 2025-04-26 21:44:02,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:03,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:03,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7212 6206 6170 [WARNING|trainer.py:803] 2025-04-26 21:44:04,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:44:04,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7213 [WARNING|trainer.py:803] 2025-04-26 21:44:04,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:05,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7214 6207 6171 [WARNING|trainer.py:803] 2025-04-26 21:44:06,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:06,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:06,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7215 [WARNING|trainer.py:803] 2025-04-26 21:44:07,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7216 6172 6208 [WARNING|trainer.py:803] 2025-04-26 21:44:08,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:08,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7217 [WARNING|trainer.py:803] 2025-04-26 21:44:08,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:09,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6173 7218 6209 [WARNING|trainer.py:803] 2025-04-26 21:44:10,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:10,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7219 [WARNING|trainer.py:803] 2025-04-26 21:44:11,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6174 [WARNING|trainer.py:803] 2025-04-26 21:44:11,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7220 6210 [WARNING|trainer.py:803] 2025-04-26 21:44:12,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:44:12,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:12,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7221 6175 [WARNING|trainer.py:803] 2025-04-26 21:44:13,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6211 7222 [WARNING|trainer.py:803] 2025-04-26 21:44:14,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:14,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:14,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7223 6176 6212 [WARNING|trainer.py:803] 2025-04-26 21:44:15,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7224 [WARNING|trainer.py:803] 2025-04-26 21:44:16,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:16,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:16,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7225 6213 6177 [WARNING|trainer.py:803] 2025-04-26 21:44:17,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7226 [WARNING|trainer.py:803] 2025-04-26 21:44:18,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:18,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:18,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7227 6214 6178 [WARNING|trainer.py:803] 2025-04-26 21:44:19,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7228 [WARNING|trainer.py:803] 2025-04-26 21:44:20,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:20,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 21:44:20,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7229 6215 6179 [WARNING|trainer.py:803] 2025-04-26 21:44:21,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7230 [WARNING|trainer.py:803] 2025-04-26 21:44:22,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:22,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:22,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7231 6216 [WARNING|trainer.py:803] 2025-04-26 21:44:23,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6180 7232 [WARNING|trainer.py:803] 2025-04-26 21:44:24,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:24,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:25,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6217 7233 [WARNING|trainer.py:803] 2025-04-26 21:44:26,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:26,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6181 7234 [WARNING|trainer.py:803] 2025-04-26 21:44:26,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:27,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6218 7235 [WARNING|trainer.py:803] 2025-04-26 21:44:27,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:28,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6182 7236 [WARNING|trainer.py:803] 2025-04-26 21:44:28,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6219 [WARNING|trainer.py:803] 2025-04-26 21:44:29,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7237 [WARNING|trainer.py:803] 2025-04-26 21:44:29,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:44:30,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6183 7238 6220 [WARNING|trainer.py:803] 2025-04-26 21:44:31,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:31,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7239 [WARNING|trainer.py:803] 2025-04-26 21:44:31,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6184 [WARNING|trainer.py:803] 2025-04-26 21:44:32,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7240 6221 [WARNING|trainer.py:803] 2025-04-26 21:44:32,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:33,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7241 [WARNING|trainer.py:803] 2025-04-26 21:44:33,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6185 [WARNING|trainer.py:803] 2025-04-26 21:44:34,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7242 6222 [WARNING|trainer.py:803] 2025-04-26 21:44:35,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:35,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:35,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7243 6186 [WARNING|trainer.py:803] 2025-04-26 21:44:36,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7244 6223 [WARNING|trainer.py:803] 2025-04-26 21:44:36,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:37,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:37,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7245 6187 6224 [WARNING|trainer.py:803] 2025-04-26 21:44:38,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7246 [WARNING|trainer.py:803] 2025-04-26 21:44:39,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:39,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:44:39,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7247 6225 6188 [WARNING|trainer.py:803] 2025-04-26 21:44:40,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7248 [WARNING|trainer.py:803] 2025-04-26 21:44:41,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:41,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:41,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7249 6226 6189 [WARNING|trainer.py:803] 2025-04-26 21:44:42,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7250 [WARNING|trainer.py:803] 2025-04-26 21:44:42,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:43,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:44:43,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7251 6227 6190 [WARNING|trainer.py:803] 2025-04-26 21:44:44,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:44,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7252 [WARNING|trainer.py:803] 2025-04-26 21:44:45,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:45,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6228 7253 6191 [WARNING|trainer.py:803] 2025-04-26 21:44:46,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:46,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7254 [WARNING|trainer.py:803] 2025-04-26 21:44:47,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6229 [WARNING|trainer.py:803] 2025-04-26 21:44:47,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7255 [WARNING|trainer.py:803] 2025-04-26 21:44:48,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6192 [WARNING|trainer.py:803] 2025-04-26 21:44:48,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7256 [WARNING|trainer.py:803] 2025-04-26 21:44:49,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6230 [WARNING|trainer.py:803] 2025-04-26 21:44:49,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7257 [WARNING|trainer.py:803] 2025-04-26 21:44:50,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6193 [WARNING|trainer.py:803] 2025-04-26 21:44:50,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7258 6231 [WARNING|trainer.py:803] 2025-04-26 21:44:51,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:51,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7259 [WARNING|trainer.py:803] 2025-04-26 21:44:52,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6194 [WARNING|trainer.py:803] 2025-04-26 21:44:52,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7260 6232 [WARNING|trainer.py:803] 2025-04-26 21:44:53,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:44:53,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7261 [WARNING|trainer.py:803] 2025-04-26 21:44:54,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6195 [WARNING|trainer.py:803] 2025-04-26 21:44:54,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6233 7262 [WARNING|trainer.py:803] 2025-04-26 21:44:55,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:55,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:56,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7263 6234 [WARNING|trainer.py:803] 2025-04-26 21:44:57,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7264 6196 [WARNING|trainer.py:803] 2025-04-26 21:44:57,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:44:58,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:44:58,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7265 6235 [WARNING|trainer.py:803] 2025-04-26 21:44:59,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7266 [WARNING|trainer.py:803] 2025-04-26 21:44:59,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6197 [WARNING|trainer.py:803] 2025-04-26 21:45:00,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7267 [WARNING|trainer.py:803] 2025-04-26 21:45:00,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6236 [WARNING|trainer.py:803] 2025-04-26 21:45:01,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:01,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7268 6198 [WARNING|trainer.py:803] 2025-04-26 21:45:02,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7269 [WARNING|trainer.py:803] 2025-04-26 21:45:02,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6237 [WARNING|trainer.py:803] 2025-04-26 21:45:03,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6199 [WARNING|trainer.py:803] 2025-04-26 21:45:03,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7270 [WARNING|trainer.py:803] 2025-04-26 21:45:04,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:04,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7271 6238 [WARNING|trainer.py:803] 2025-04-26 21:45:05,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:45:05,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6200 7272 [WARNING|trainer.py:803] 2025-04-26 21:45:06,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:06,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6239 7273 [WARNING|trainer.py:803] 2025-04-26 21:45:07,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:07,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6201 7274 6240 [WARNING|trainer.py:803] 2025-04-26 21:45:08,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:08,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7275 [WARNING|trainer.py:803] 2025-04-26 21:45:09,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:09,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7276 6202 6241 [WARNING|trainer.py:803] 2025-04-26 21:45:10,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:10,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7277 [WARNING|trainer.py:803] 2025-04-26 21:45:10,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:11,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6203 7278 6242 [WARNING|trainer.py:803] 2025-04-26 21:45:12,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:12,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:12,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7279 6204 [WARNING|trainer.py:803] 2025-04-26 21:45:13,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6243 7280 [WARNING|trainer.py:803] 2025-04-26 21:45:14,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:14,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:14,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7281 6205 6244 [WARNING|trainer.py:803] 2025-04-26 21:45:15,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7282 [WARNING|trainer.py:803] 2025-04-26 21:45:15,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:16,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:45:16,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7283 6206 6245 [WARNING|trainer.py:803] 2025-04-26 21:45:17,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:17,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7284 [WARNING|trainer.py:803] 2025-04-26 21:45:18,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:18,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7285 6207 6246 [WARNING|trainer.py:803] 2025-04-26 21:45:19,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:19,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:19,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7286 [WARNING|trainer.py:803] 2025-04-26 21:45:20,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6247 7287 6208 [WARNING|trainer.py:803] 2025-04-26 21:45:21,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:21,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7288 [WARNING|trainer.py:803] 2025-04-26 21:45:22,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6248 [WARNING|trainer.py:803] 2025-04-26 21:45:22,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7289 6209 [WARNING|trainer.py:803] 2025-04-26 21:45:23,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:23,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7290 [WARNING|trainer.py:803] 2025-04-26 21:45:24,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6249 [WARNING|trainer.py:803] 2025-04-26 21:45:24,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6210 7291 [WARNING|trainer.py:803] 2025-04-26 21:45:25,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:25,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:26,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7292 6250 6211 [WARNING|trainer.py:803] 2025-04-26 21:45:27,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7293 [WARNING|trainer.py:803] 2025-04-26 21:45:27,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:27,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:28,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7294 6251 6212 [WARNING|trainer.py:803] 2025-04-26 21:45:29,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:29,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7295 [WARNING|trainer.py:803] 2025-04-26 21:45:29,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:30,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6252 7296 6213 [WARNING|trainer.py:803] 2025-04-26 21:45:31,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:31,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7297 [WARNING|trainer.py:803] 2025-04-26 21:45:31,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6253 [WARNING|trainer.py:803] 2025-04-26 21:45:32,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7298 6214 [WARNING|trainer.py:803] 2025-04-26 21:45:32,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:33,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:33,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7299 6254 [WARNING|trainer.py:803] 2025-04-26 21:45:34,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7300 6215 [WARNING|trainer.py:803] 2025-04-26 21:45:34,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:35,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:35,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6255 7301 [WARNING|trainer.py:803] 2025-04-26 21:45:36,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:36,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6216 7302 [WARNING|trainer.py:803] 2025-04-26 21:45:37,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6256 [WARNING|trainer.py:803] 2025-04-26 21:45:37,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7303 [WARNING|trainer.py:803] 2025-04-26 21:45:38,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6217 [WARNING|trainer.py:803] 2025-04-26 21:45:38,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7304 6257 [WARNING|trainer.py:803] 2025-04-26 21:45:39,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:39,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7305 [WARNING|trainer.py:803] 2025-04-26 21:45:39,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6218 [WARNING|trainer.py:803] 2025-04-26 21:45:40,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7306 6258 [WARNING|trainer.py:803] 2025-04-26 21:45:41,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:45:41,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:41,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7307 6219 [WARNING|trainer.py:803] 2025-04-26 21:45:42,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7308 6259 [WARNING|trainer.py:803] 2025-04-26 21:45:42,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:43,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:43,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7309 6220 [WARNING|trainer.py:803] 2025-04-26 21:45:44,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6260 7310 [WARNING|trainer.py:803] 2025-04-26 21:45:45,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:45,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:45,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7311 6221 6261 [WARNING|trainer.py:803] 2025-04-26 21:45:46,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:45:46,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7312 [WARNING|trainer.py:803] 2025-04-26 21:45:47,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:47,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6222 7313 [WARNING|trainer.py:803] 2025-04-26 21:45:48,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6262 [WARNING|trainer.py:803] 2025-04-26 21:45:48,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7314 [WARNING|trainer.py:803] 2025-04-26 21:45:49,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:45:49,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6223 7315 6263 [WARNING|trainer.py:803] 2025-04-26 21:45:50,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:45:50,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7316 [WARNING|trainer.py:803] 2025-04-26 21:45:51,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6224 [WARNING|trainer.py:803] 2025-04-26 21:45:51,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7317 [WARNING|trainer.py:803] 2025-04-26 21:45:52,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6264 [WARNING|trainer.py:803] 2025-04-26 21:45:53,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7318 [WARNING|trainer.py:803] 2025-04-26 21:45:53,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6225 [WARNING|trainer.py:803] 2025-04-26 21:45:54,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7319 [WARNING|trainer.py:803] 2025-04-26 21:45:54,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6265 [WARNING|trainer.py:803] 2025-04-26 21:45:55,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7320 [WARNING|trainer.py:803] 2025-04-26 21:45:55,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6226 [WARNING|trainer.py:803] 2025-04-26 21:45:56,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:56,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7321 6266 6227 [WARNING|trainer.py:803] 2025-04-26 21:45:57,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:57,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7322 [WARNING|trainer.py:803] 2025-04-26 21:45:57,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:45:58,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6267 7323 6228 [WARNING|trainer.py:803] 2025-04-26 21:45:59,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:45:59,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7324 [WARNING|trainer.py:803] 2025-04-26 21:45:59,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6268 [WARNING|trainer.py:803] 2025-04-26 21:46:00,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7325 6229 [WARNING|trainer.py:803] 2025-04-26 21:46:01,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:01,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7326 [WARNING|trainer.py:803] 2025-04-26 21:46:01,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6269 [WARNING|trainer.py:803] 2025-04-26 21:46:02,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7327 6230 [WARNING|trainer.py:803] 2025-04-26 21:46:02,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:03,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:03,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7328 6270 [WARNING|trainer.py:803] 2025-04-26 21:46:04,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7329 6231 [WARNING|trainer.py:803] 2025-04-26 21:46:04,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:05,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:05,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7330 6271 6232 [WARNING|trainer.py:803] 2025-04-26 21:46:06,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7331 [WARNING|trainer.py:803] 2025-04-26 21:46:06,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:07,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:07,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7332 6233 6272 [WARNING|trainer.py:803] 2025-04-26 21:46:08,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7333 [WARNING|trainer.py:803] 2025-04-26 21:46:08,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:09,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:09,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7334 6234 6273 [WARNING|trainer.py:803] 2025-04-26 21:46:10,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:10,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7335 [WARNING|trainer.py:803] 2025-04-26 21:46:11,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:11,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6235 7336 6274 [WARNING|trainer.py:803] 2025-04-26 21:46:12,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:12,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7337 [WARNING|trainer.py:803] 2025-04-26 21:46:13,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6236 [WARNING|trainer.py:803] 2025-04-26 21:46:13,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7338 [WARNING|trainer.py:803] 2025-04-26 21:46:14,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6275 [WARNING|trainer.py:803] 2025-04-26 21:46:14,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7339 [WARNING|trainer.py:803] 2025-04-26 21:46:15,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6237 [WARNING|trainer.py:803] 2025-04-26 21:46:15,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7340 6276 [WARNING|trainer.py:803] 2025-04-26 21:46:16,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:16,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7341 [WARNING|trainer.py:803] 2025-04-26 21:46:17,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6238 [WARNING|trainer.py:803] 2025-04-26 21:46:17,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7342 6277 [WARNING|trainer.py:803] 2025-04-26 21:46:18,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:18,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7343 [WARNING|trainer.py:803] 2025-04-26 21:46:19,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6239 [WARNING|trainer.py:803] 2025-04-26 21:46:19,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7344 6278 [WARNING|trainer.py:803] 2025-04-26 21:46:20,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:20,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:21,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7345 6240 6279 [WARNING|trainer.py:803] 2025-04-26 21:46:22,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7346 [WARNING|trainer.py:803] 2025-04-26 21:46:22,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:22,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:23,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7347 6241 6280 [WARNING|trainer.py:803] 2025-04-26 21:46:24,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:24,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7348 [WARNING|trainer.py:803] 2025-04-26 21:46:24,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:25,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6242 7349 [WARNING|trainer.py:803] 2025-04-26 21:46:26,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:26,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6281 7350 [WARNING|trainer.py:803] 2025-04-26 21:46:27,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6243 [WARNING|trainer.py:803] 2025-04-26 21:46:27,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7351 [WARNING|trainer.py:803] 2025-04-26 21:46:27,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6282 [WARNING|trainer.py:803] 2025-04-26 21:46:28,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7352 6244 [WARNING|trainer.py:803] 2025-04-26 21:46:29,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:29,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7353 [WARNING|trainer.py:803] 2025-04-26 21:46:29,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6283 [WARNING|trainer.py:803] 2025-04-26 21:46:30,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7354 6245 [WARNING|trainer.py:803] 2025-04-26 21:46:30,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:31,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:31,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7355 6284 6246 [WARNING|trainer.py:803] 2025-04-26 21:46:32,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7356 [WARNING|trainer.py:803] 2025-04-26 21:46:32,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:33,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:33,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7357 6285 6247 [WARNING|trainer.py:803] 2025-04-26 21:46:34,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7358 [WARNING|trainer.py:803] 2025-04-26 21:46:34,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:34,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:35,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7359 6248 6286 [WARNING|trainer.py:803] 2025-04-26 21:46:36,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7360 [WARNING|trainer.py:803] 2025-04-26 21:46:36,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:37,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:37,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7361 6249 [WARNING|trainer.py:803] 2025-04-26 21:46:38,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7362 6287 [WARNING|trainer.py:803] 2025-04-26 21:46:38,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:39,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:39,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7363 6250 6288 [WARNING|trainer.py:803] 2025-04-26 21:46:40,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7364 [WARNING|trainer.py:803] 2025-04-26 21:46:40,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:41,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:41,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6251 7365 6289 [WARNING|trainer.py:803] 2025-04-26 21:46:42,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:42,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7366 [WARNING|trainer.py:803] 2025-04-26 21:46:43,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6252 [WARNING|trainer.py:803] 2025-04-26 21:46:43,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7367 [WARNING|trainer.py:803] 2025-04-26 21:46:44,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:44,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6290 7368 6253 [WARNING|trainer.py:803] 2025-04-26 21:46:45,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:45,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7369 [WARNING|trainer.py:803] 2025-04-26 21:46:46,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6291 [WARNING|trainer.py:803] 2025-04-26 21:46:46,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6254 7370 [WARNING|trainer.py:803] 2025-04-26 21:46:47,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:47,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:47,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7371 6255 6292 [WARNING|trainer.py:803] 2025-04-26 21:46:48,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7372 [WARNING|trainer.py:803] 2025-04-26 21:46:49,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:49,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:49,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7373 6256 6293 [WARNING|trainer.py:803] 2025-04-26 21:46:51,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7374 [WARNING|trainer.py:803] 2025-04-26 21:46:51,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:46:51,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:52,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7375 6257 6294 [WARNING|trainer.py:803] 2025-04-26 21:46:53,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:46:53,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7376 [WARNING|trainer.py:803] 2025-04-26 21:46:53,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:54,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6258 7377 6295 [WARNING|trainer.py:803] 2025-04-26 21:46:55,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:46:55,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:46:55,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7378 [WARNING|trainer.py:803] 2025-04-26 21:46:56,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6259 7379 6296 [WARNING|trainer.py:803] 2025-04-26 21:46:57,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:46:57,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:46:57,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7380 6260 [WARNING|trainer.py:803] 2025-04-26 21:46:58,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7381 6297 [WARNING|trainer.py:803] 2025-04-26 21:46:58,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:46:59,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:46:59,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7382 6261 [WARNING|trainer.py:803] 2025-04-26 21:47:00,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6298 7383 [WARNING|trainer.py:803] 2025-04-26 21:47:00,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:01,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:01,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7384 6262 [WARNING|trainer.py:803] 2025-04-26 21:47:02,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6299 7385 [WARNING|trainer.py:803] 2025-04-26 21:47:02,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:03,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:03,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7386 6263 6300 [WARNING|trainer.py:803] 2025-04-26 21:47:04,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7387 [WARNING|trainer.py:803] 2025-04-26 21:47:04,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:05,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:05,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6301 7388 6264 [WARNING|trainer.py:803] 2025-04-26 21:47:06,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:06,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:06,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7389 6302 [WARNING|trainer.py:803] 2025-04-26 21:47:07,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6265 [WARNING|trainer.py:803] 2025-04-26 21:47:07,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7390 6303 [WARNING|trainer.py:803] 2025-04-26 21:47:08,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:08,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7391 [WARNING|trainer.py:803] 2025-04-26 21:47:09,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6304 [WARNING|trainer.py:803] 2025-04-26 21:47:09,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6266 7392 [WARNING|trainer.py:803] 2025-04-26 21:47:10,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:10,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:10,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6305 7393 6267 [WARNING|trainer.py:803] 2025-04-26 21:47:11,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:11,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7394 6306 [WARNING|trainer.py:803] 2025-04-26 21:47:12,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:13,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:13,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7395 6307 6268 [WARNING|trainer.py:803] 2025-04-26 21:47:14,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7396 [WARNING|trainer.py:803] 2025-04-26 21:47:14,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:14,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6308 [WARNING|trainer.py:803] 2025-04-26 21:47:15,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7397 6269 [WARNING|trainer.py:803] 2025-04-26 21:47:15,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:16,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7398 6309 [WARNING|trainer.py:803] 2025-04-26 21:47:16,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:17,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:17,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7399 6310 6270 [WARNING|trainer.py:803] 2025-04-26 21:47:18,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7400 [WARNING|trainer.py:803] 2025-04-26 21:47:18,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:18,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6311 [WARNING|trainer.py:803] 2025-04-26 21:47:19,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7401 6271 [WARNING|trainer.py:803] 2025-04-26 21:47:19,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:20,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6312 7402 [WARNING|trainer.py:803] 2025-04-26 21:47:20,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:21,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:21,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7403 6313 6272 [WARNING|trainer.py:803] 2025-04-26 21:47:22,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:22,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7404 [WARNING|trainer.py:803] 2025-04-26 21:47:22,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6314 [WARNING|trainer.py:803] 2025-04-26 21:47:23,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7405 [WARNING|trainer.py:803] 2025-04-26 21:47:23,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6273 6315 [WARNING|trainer.py:803] 2025-04-26 21:47:24,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7406 [WARNING|trainer.py:803] 2025-04-26 21:47:24,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:24,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6316 [WARNING|trainer.py:803] 2025-04-26 21:47:25,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7407 6274 [WARNING|trainer.py:803] 2025-04-26 21:47:26,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:26,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6317 7408 [WARNING|trainer.py:803] 2025-04-26 21:47:26,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:27,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:27,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7409 6318 6275 [WARNING|trainer.py:803] 2025-04-26 21:47:28,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7410 [WARNING|trainer.py:803] 2025-04-26 21:47:28,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:28,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6319 [WARNING|trainer.py:803] 2025-04-26 21:47:29,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6276 7411 [WARNING|trainer.py:803] 2025-04-26 21:47:30,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6320 [WARNING|trainer.py:803] 2025-04-26 21:47:30,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:30,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7412 [WARNING|trainer.py:803] 2025-04-26 21:47:31,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6277 6321 [WARNING|trainer.py:803] 2025-04-26 21:47:31,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7413 [WARNING|trainer.py:803] 2025-04-26 21:47:32,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:32,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:33,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7414 6322 6278 [WARNING|trainer.py:803] 2025-04-26 21:47:34,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:34,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7415 [WARNING|trainer.py:803] 2025-04-26 21:47:34,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6323 [WARNING|trainer.py:803] 2025-04-26 21:47:35,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7416 6279 [WARNING|trainer.py:803] 2025-04-26 21:47:35,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6324 [WARNING|trainer.py:803] 2025-04-26 21:47:36,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:36,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7417 [WARNING|trainer.py:803] 2025-04-26 21:47:36,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:37,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6325 7418 6280 [WARNING|trainer.py:803] 2025-04-26 21:47:38,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:38,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo [WARNING|trainer.py:803] 2025-04-26 21:47:38,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7419 6326 [WARNING|trainer.py:803] 2025-04-26 21:47:39,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:39,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7420 6281 6327 [WARNING|trainer.py:803] 2025-04-26 21:47:40,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:40,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7421 [WARNING|trainer.py:803] 2025-04-26 21:47:40,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6328 [WARNING|trainer.py:803] 2025-04-26 21:47:41,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6282 7422 [WARNING|trainer.py:803] 2025-04-26 21:47:42,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:42,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:42,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6329 7423 [WARNING|trainer.py:803] 2025-04-26 21:47:43,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6283 [WARNING|trainer.py:803] 2025-04-26 21:47:43,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7424 6330 [WARNING|trainer.py:803] 2025-04-26 21:47:44,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:44,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:44,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7425 6331 6284 [WARNING|trainer.py:803] 2025-04-26 21:47:45,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7426 [WARNING|trainer.py:803] 2025-04-26 21:47:45,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:46,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6332 [WARNING|trainer.py:803] 2025-04-26 21:47:46,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7427 [WARNING|trainer.py:803] 2025-04-26 21:47:47,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6285 [WARNING|trainer.py:803] 2025-04-26 21:47:47,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7428 6333 [WARNING|trainer.py:803] 2025-04-26 21:47:48,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:48,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:48,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7429 6334 [WARNING|trainer.py:803] 2025-04-26 21:47:49,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6286 7430 [WARNING|trainer.py:803] 2025-04-26 21:47:50,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6335 [WARNING|trainer.py:803] 2025-04-26 21:47:50,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:50,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7431 [WARNING|trainer.py:803] 2025-04-26 21:47:51,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6336 7432 6287 [WARNING|trainer.py:803] 2025-04-26 21:47:52,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:52,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:47:52,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7433 6337 6288 [WARNING|trainer.py:803] 2025-04-26 21:47:53,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:53,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7434 6338 [WARNING|trainer.py:803] 2025-04-26 21:47:54,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:54,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7435 [WARNING|trainer.py:803] 2025-04-26 21:47:55,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6289 6339 [WARNING|trainer.py:803] 2025-04-26 21:47:55,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7436 [WARNING|trainer.py:803] 2025-04-26 21:47:56,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:56,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:56,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6340 7437 [WARNING|trainer.py:803] 2025-04-26 21:47:57,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:47:57,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7438 6290 6341 [WARNING|trainer.py:803] 2025-04-26 21:47:59,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:47:59,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:47:59,203 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7439 6342 [WARNING|trainer.py:803] 2025-04-26 21:48:00,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6291 7440 [WARNING|trainer.py:803] 2025-04-26 21:48:00,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:00,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6343 [WARNING|trainer.py:803] 2025-04-26 21:48:01,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7441 [WARNING|trainer.py:803] 2025-04-26 21:48:01,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6292 [WARNING|trainer.py:803] 2025-04-26 21:48:02,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7442 6344 [WARNING|trainer.py:803] 2025-04-26 21:48:02,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:03,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:03,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7443 6345 6293 [WARNING|trainer.py:803] 2025-04-26 21:48:04,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7444 [WARNING|trainer.py:803] 2025-04-26 21:48:04,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:04,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6346 [WARNING|trainer.py:803] 2025-04-26 21:48:05,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7445 6294 [WARNING|trainer.py:803] 2025-04-26 21:48:05,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:06,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6347 7446 [WARNING|trainer.py:803] 2025-04-26 21:48:06,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:07,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:07,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7447 6348 6295 [WARNING|trainer.py:803] 2025-04-26 21:48:08,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7448 [WARNING|trainer.py:803] 2025-04-26 21:48:08,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:08,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6349 [WARNING|trainer.py:803] 2025-04-26 21:48:09,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7449 6296 [WARNING|trainer.py:803] 2025-04-26 21:48:09,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6350 [WARNING|trainer.py:803] 2025-04-26 21:48:10,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:10,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7450 [WARNING|trainer.py:803] 2025-04-26 21:48:11,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:11,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7451 6351 6297 [WARNING|trainer.py:803] 2025-04-26 21:48:12,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:12,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:12,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7452 6352 [WARNING|trainer.py:803] 2025-04-26 21:48:13,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7453 6298 [WARNING|trainer.py:803] 2025-04-26 21:48:13,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6353 [WARNING|trainer.py:803] 2025-04-26 21:48:14,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:14,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7454 [WARNING|trainer.py:803] 2025-04-26 21:48:15,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:15,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7455 6299 6354 [WARNING|trainer.py:803] 2025-04-26 21:48:16,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:16,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:16,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7456 6355 [WARNING|trainer.py:803] 2025-04-26 21:48:17,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6300 7457 [WARNING|trainer.py:803] 2025-04-26 21:48:17,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6356 [WARNING|trainer.py:803] 2025-04-26 21:48:18,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:18,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7458 6301 [WARNING|trainer.py:803] 2025-04-26 21:48:19,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:19,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6357 [WARNING|trainer.py:803] 2025-04-26 21:48:19,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7459 6302 [WARNING|trainer.py:803] 2025-04-26 21:48:20,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:20,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7460 6358 [WARNING|trainer.py:803] 2025-04-26 21:48:21,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6303 [WARNING|trainer.py:803] 2025-04-26 21:48:21,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:21,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7461 6359 [WARNING|trainer.py:803] 2025-04-26 21:48:22,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:22,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7462 6304 [WARNING|trainer.py:803] 2025-04-26 21:48:23,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6360 [WARNING|trainer.py:803] 2025-04-26 21:48:23,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:23,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7463 6305 [WARNING|trainer.py:803] 2025-04-26 21:48:24,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:24,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6361 7464 [WARNING|trainer.py:803] 2025-04-26 21:48:25,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6306 [WARNING|trainer.py:803] 2025-04-26 21:48:25,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:25,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7465 6362 [WARNING|trainer.py:803] 2025-04-26 21:48:26,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:26,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6307 7466 [WARNING|trainer.py:803] 2025-04-26 21:48:27,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6363 [WARNING|trainer.py:803] 2025-04-26 21:48:27,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:27,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7467 6308 [WARNING|trainer.py:803] 2025-04-26 21:48:28,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6364 [WARNING|trainer.py:803] 2025-04-26 21:48:28,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7468 [WARNING|trainer.py:803] 2025-04-26 21:48:29,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:29,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6309 [WARNING|trainer.py:803] 2025-04-26 21:48:29,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7469 6365 [WARNING|trainer.py:803] 2025-04-26 21:48:30,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:31,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:31,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6310 7470 6366 [WARNING|trainer.py:803] 2025-04-26 21:48:31,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:32,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7471 [WARNING|trainer.py:803] 2025-04-26 21:48:32,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6311 6367 [WARNING|trainer.py:803] 2025-04-26 21:48:33,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:33,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7472 [WARNING|trainer.py:803] 2025-04-26 21:48:33,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6312 [WARNING|trainer.py:803] 2025-04-26 21:48:34,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6368 7473 [WARNING|trainer.py:803] 2025-04-26 21:48:34,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:35,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6313 [WARNING|trainer.py:803] 2025-04-26 21:48:35,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7474 6369 [WARNING|trainer.py:803] 2025-04-26 21:48:35,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:36,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7475 6314 [WARNING|trainer.py:803] 2025-04-26 21:48:36,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6370 [WARNING|trainer.py:803] 2025-04-26 21:48:37,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:37,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7476 6315 [WARNING|trainer.py:803] 2025-04-26 21:48:37,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:38,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6371 7477 [WARNING|trainer.py:803] 2025-04-26 21:48:38,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6316 [WARNING|trainer.py:803] 2025-04-26 21:48:39,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:39,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7478 6372 [WARNING|trainer.py:803] 2025-04-26 21:48:39,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:40,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6317 [WARNING|trainer.py:803] 2025-04-26 21:48:40,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7479 6373 [WARNING|trainer.py:803] 2025-04-26 21:48:41,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:41,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7480 6318 [WARNING|trainer.py:803] 2025-04-26 21:48:41,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6374 [WARNING|trainer.py:803] 2025-04-26 21:48:42,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:48:42,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7481 [WARNING|trainer.py:803] 2025-04-26 21:48:43,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6319 [WARNING|trainer.py:803] 2025-04-26 21:48:43,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6375 7482 [WARNING|trainer.py:803] 2025-04-26 21:48:43,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6320 [WARNING|trainer.py:803] 2025-04-26 21:48:44,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:44,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7483 6376 [WARNING|trainer.py:803] 2025-04-26 21:48:45,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:45,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7484 [WARNING|trainer.py:803] 2025-04-26 21:48:45,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6321 6377 [WARNING|trainer.py:803] 2025-04-26 21:48:46,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:46,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7485 [WARNING|trainer.py:803] 2025-04-26 21:48:47,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6322 6378 [WARNING|trainer.py:803] 2025-04-26 21:48:47,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7486 [WARNING|trainer.py:803] 2025-04-26 21:48:47,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:48,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6323 [WARNING|trainer.py:803] 2025-04-26 21:48:48,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7487 6379 [WARNING|trainer.py:803] 2025-04-26 21:48:49,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:49,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:49,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6324 7488 6380 [WARNING|trainer.py:803] 2025-04-26 21:48:50,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:50,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7489 [WARNING|trainer.py:803] 2025-04-26 21:48:51,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6325 6381 [WARNING|trainer.py:803] 2025-04-26 21:48:51,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7490 [WARNING|trainer.py:803] 2025-04-26 21:48:51,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:52,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6326 [WARNING|trainer.py:803] 2025-04-26 21:48:52,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6382 7491 [WARNING|trainer.py:803] 2025-04-26 21:48:53,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:53,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6327 [WARNING|trainer.py:803] 2025-04-26 21:48:53,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7492 6383 [WARNING|trainer.py:803] 2025-04-26 21:48:54,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:48:54,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7493 6328 [WARNING|trainer.py:803] 2025-04-26 21:48:55,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6384 [WARNING|trainer.py:803] 2025-04-26 21:48:55,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:48:55,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7494 6329 [WARNING|trainer.py:803] 2025-04-26 21:48:56,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:48:56,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6385 7495 [WARNING|trainer.py:803] 2025-04-26 21:48:57,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6330 [WARNING|trainer.py:803] 2025-04-26 21:48:57,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:57,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7496 6386 [WARNING|trainer.py:803] 2025-04-26 21:48:58,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:58,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6331 [WARNING|trainer.py:803] 2025-04-26 21:48:59,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7497 6387 [WARNING|trainer.py:803] 2025-04-26 21:48:59,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:48:59,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7498 6332 [WARNING|trainer.py:803] 2025-04-26 21:49:00,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6388 [WARNING|trainer.py:803] 2025-04-26 21:49:00,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7499 [WARNING|trainer.py:803] 2025-04-26 21:49:01,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6333 [WARNING|trainer.py:803] 2025-04-26 21:49:01,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:02,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7500 6389 [WARNING|trainer.py:803] 2025-04-26 21:49:02,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:03,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6334 [WARNING|trainer.py:803] 2025-04-26 21:49:03,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7501 6390 [WARNING|trainer.py:803] 2025-04-26 21:49:03,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:04,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:04,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6335 7502 6391 [WARNING|trainer.py:803] 2025-04-26 21:49:05,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:05,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6336 [WARNING|trainer.py:803] 2025-04-26 21:49:05,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7503 6392 [WARNING|trainer.py:803] 2025-04-26 21:49:06,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:06,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7504 6337 [WARNING|trainer.py:803] 2025-04-26 21:49:07,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6393 [WARNING|trainer.py:803] 2025-04-26 21:49:07,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:07,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7505 6338 [WARNING|trainer.py:803] 2025-04-26 21:49:08,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6394 [WARNING|trainer.py:803] 2025-04-26 21:49:08,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:09,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7506 6339 [WARNING|trainer.py:803] 2025-04-26 21:49:09,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:10,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6395 7507 [WARNING|trainer.py:803] 2025-04-26 21:49:10,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6340 [WARNING|trainer.py:803] 2025-04-26 21:49:11,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:11,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6396 7508 [WARNING|trainer.py:803] 2025-04-26 21:49:11,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6341 [WARNING|trainer.py:803] 2025-04-26 21:49:12,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:12,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6397 7509 [WARNING|trainer.py:803] 2025-04-26 21:49:13,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6342 [WARNING|trainer.py:803] 2025-04-26 21:49:13,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:13,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7510 6398 [WARNING|trainer.py:803] 2025-04-26 21:49:14,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:49:14,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6343 [WARNING|trainer.py:803] 2025-04-26 21:49:15,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7511 6399 [WARNING|trainer.py:803] 2025-04-26 21:49:15,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:16,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6344 [WARNING|trainer.py:803] 2025-04-26 21:49:16,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7512 6400 [WARNING|trainer.py:803] 2025-04-26 21:49:17,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:17,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6345 [WARNING|trainer.py:803] 2025-04-26 21:49:17,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7513 6401 [WARNING|trainer.py:803] 2025-04-26 21:49:18,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:18,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6346 7514 [WARNING|trainer.py:803] 2025-04-26 21:49:18,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6402 [WARNING|trainer.py:803] 2025-04-26 21:49:19,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:19,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7515 6347 [WARNING|trainer.py:803] 2025-04-26 21:49:20,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6403 [WARNING|trainer.py:803] 2025-04-26 21:49:20,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:21,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7516 6348 [WARNING|trainer.py:803] 2025-04-26 21:49:21,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:49:22,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6404 [WARNING|trainer.py:803] 2025-04-26 21:49:22,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7517 6349 [WARNING|trainer.py:803] 2025-04-26 21:49:23,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:23,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6405 [WARNING|trainer.py:803] 2025-04-26 21:49:23,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7518 6350 [WARNING|trainer.py:803] 2025-04-26 21:49:24,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:24,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6406 7519 [WARNING|trainer.py:803] 2025-04-26 21:49:25,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6351 [WARNING|trainer.py:803] 2025-04-26 21:49:25,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:49:25,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6407 7520 [WARNING|trainer.py:803] 2025-04-26 21:49:26,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6352 [WARNING|trainer.py:803] 2025-04-26 21:49:27,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:27,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7521 6408 [WARNING|trainer.py:803] 2025-04-26 21:49:27,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6353 [WARNING|trainer.py:803] 2025-04-26 21:49:28,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:28,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7522 6409 [WARNING|trainer.py:803] 2025-04-26 21:49:29,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6354 [WARNING|trainer.py:803] 2025-04-26 21:49:29,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:29,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7523 6410 [WARNING|trainer.py:803] 2025-04-26 21:49:30,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:30,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6355 [WARNING|trainer.py:803] 2025-04-26 21:49:31,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7524 6411 [WARNING|trainer.py:803] 2025-04-26 21:49:31,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:31,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6356 7525 [WARNING|trainer.py:803] 2025-04-26 21:49:32,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6412 [WARNING|trainer.py:803] 2025-04-26 21:49:32,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:33,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6357 7526 [WARNING|trainer.py:803] 2025-04-26 21:49:33,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6413 [WARNING|trainer.py:803] 2025-04-26 21:49:34,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:34,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7527 6358 [WARNING|trainer.py:803] 2025-04-26 21:49:34,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6414 [WARNING|trainer.py:803] 2025-04-26 21:49:35,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:35,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7528 6359 [WARNING|trainer.py:803] 2025-04-26 21:49:36,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6415 [WARNING|trainer.py:803] 2025-04-26 21:49:36,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:36,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7529 6360 [WARNING|trainer.py:803] 2025-04-26 21:49:37,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:37,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6416 [WARNING|trainer.py:803] 2025-04-26 21:49:38,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7530 6361 [WARNING|trainer.py:803] 2025-04-26 21:49:38,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:39,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6417 [WARNING|trainer.py:803] 2025-04-26 21:49:39,474 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7531 6362 [WARNING|trainer.py:803] 2025-04-26 21:49:40,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:49:40,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7532 6418 [WARNING|trainer.py:803] 2025-04-26 21:49:40,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6363 [WARNING|trainer.py:803] 2025-04-26 21:49:41,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:49:41,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7533 6419 [WARNING|trainer.py:803] 2025-04-26 21:49:42,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6364 [WARNING|trainer.py:803] 2025-04-26 21:49:42,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:42,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7534 [WARNING|trainer.py:803] 2025-04-26 21:49:43,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6420 [WARNING|trainer.py:803] 2025-04-26 21:49:43,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6365 7535 [WARNING|trainer.py:803] 2025-04-26 21:49:44,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:44,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6421 [WARNING|trainer.py:803] 2025-04-26 21:49:45,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6366 7536 [WARNING|trainer.py:803] 2025-04-26 21:49:45,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:46,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6422 [WARNING|trainer.py:803] 2025-04-26 21:49:46,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7537 6367 [WARNING|trainer.py:803] 2025-04-26 21:49:46,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:47,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:47,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6423 7538 6368 [WARNING|trainer.py:803] 2025-04-26 21:49:48,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:48,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:48,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6424 7539 6369 [WARNING|trainer.py:803] 2025-04-26 21:49:49,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:49:49,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7540 [WARNING|trainer.py:803] 2025-04-26 21:49:50,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6425 6370 [WARNING|trainer.py:803] 2025-04-26 21:49:50,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:50,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7541 6426 [WARNING|trainer.py:803] 2025-04-26 21:49:51,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6371 [WARNING|trainer.py:803] 2025-04-26 21:49:52,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:52,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7542 6427 [WARNING|trainer.py:803] 2025-04-26 21:49:52,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6372 [WARNING|trainer.py:803] 2025-04-26 21:49:53,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:49:53,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7543 6428 [WARNING|trainer.py:803] 2025-04-26 21:49:54,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:54,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6373 7544 [WARNING|trainer.py:803] 2025-04-26 21:49:54,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6429 [WARNING|trainer.py:803] 2025-04-26 21:49:55,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:49:55,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6374 7545 [WARNING|trainer.py:803] 2025-04-26 21:49:56,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6430 [WARNING|trainer.py:803] 2025-04-26 21:49:56,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:56,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6375 7546 [WARNING|trainer.py:803] 2025-04-26 21:49:57,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6431 [WARNING|trainer.py:803] 2025-04-26 21:49:58,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:58,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7547 6376 [WARNING|trainer.py:803] 2025-04-26 21:49:58,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:49:59,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6432 [WARNING|trainer.py:803] 2025-04-26 21:49:59,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7548 6377 [WARNING|trainer.py:803] 2025-04-26 21:50:00,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:50:00,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6433 [WARNING|trainer.py:803] 2025-04-26 21:50:00,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7549 6378 [WARNING|trainer.py:803] 2025-04-26 21:50:01,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:01,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6434 [WARNING|trainer.py:803] 2025-04-26 21:50:02,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7550 6379 [WARNING|trainer.py:803] 2025-04-26 21:50:02,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:02,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7551 6435 [WARNING|trainer.py:803] 2025-04-26 21:50:03,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6380 [WARNING|trainer.py:803] 2025-04-26 21:50:04,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:04,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7552 6436 [WARNING|trainer.py:803] 2025-04-26 21:50:04,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6381 [WARNING|trainer.py:803] 2025-04-26 21:50:05,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:05,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7553 6437 [WARNING|trainer.py:803] 2025-04-26 21:50:06,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:06,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6382 7554 [WARNING|trainer.py:803] 2025-04-26 21:50:06,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6438 [WARNING|trainer.py:803] 2025-04-26 21:50:07,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:50:07,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6383 7555 [WARNING|trainer.py:803] 2025-04-26 21:50:08,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6439 [WARNING|trainer.py:803] 2025-04-26 21:50:08,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:08,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7556 6384 [WARNING|trainer.py:803] 2025-04-26 21:50:09,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6440 [WARNING|trainer.py:803] 2025-04-26 21:50:10,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:10,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7557 6385 [WARNING|trainer.py:803] 2025-04-26 21:50:10,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:11,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6441 [WARNING|trainer.py:803] 2025-04-26 21:50:11,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7558 6386 [WARNING|trainer.py:803] 2025-04-26 21:50:12,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:12,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6442 7559 [WARNING|trainer.py:803] 2025-04-26 21:50:12,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6387 [WARNING|trainer.py:803] 2025-04-26 21:50:13,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:13,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6443 7560 [WARNING|trainer.py:803] 2025-04-26 21:50:14,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6388 [WARNING|trainer.py:803] 2025-04-26 21:50:14,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:14,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7561 6444 [WARNING|trainer.py:803] 2025-04-26 21:50:15,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6389 [WARNING|trainer.py:803] 2025-04-26 21:50:15,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:16,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7562 6445 [WARNING|trainer.py:803] 2025-04-26 21:50:16,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:17,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6390 7563 [WARNING|trainer.py:803] 2025-04-26 21:50:17,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6446 [WARNING|trainer.py:803] 2025-04-26 21:50:18,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:18,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6391 7564 [WARNING|trainer.py:803] 2025-04-26 21:50:18,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6447 [WARNING|trainer.py:803] 2025-04-26 21:50:19,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:19,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7565 6392 [WARNING|trainer.py:803] 2025-04-26 21:50:20,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6448 [WARNING|trainer.py:803] 2025-04-26 21:50:20,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:20,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7566 6393 [WARNING|trainer.py:803] 2025-04-26 21:50:21,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:21,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6449 [WARNING|trainer.py:803] 2025-04-26 21:50:22,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7567 6394 [WARNING|trainer.py:803] 2025-04-26 21:50:22,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:22,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6450 7568 [WARNING|trainer.py:803] 2025-04-26 21:50:23,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6395 [WARNING|trainer.py:803] 2025-04-26 21:50:23,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:24,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7569 6451 [WARNING|trainer.py:803] 2025-04-26 21:50:24,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6396 [WARNING|trainer.py:803] 2025-04-26 21:50:25,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:25,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7570 6452 [WARNING|trainer.py:803] 2025-04-26 21:50:25,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:26,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6397 7571 [WARNING|trainer.py:803] 2025-04-26 21:50:26,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6453 [WARNING|trainer.py:803] 2025-04-26 21:50:27,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:27,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6398 7572 [WARNING|trainer.py:803] 2025-04-26 21:50:28,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6454 [WARNING|trainer.py:803] 2025-04-26 21:50:28,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:28,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7573 6399 [WARNING|trainer.py:803] 2025-04-26 21:50:29,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6455 [WARNING|trainer.py:803] 2025-04-26 21:50:29,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:29,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7574 6400 [WARNING|trainer.py:803] 2025-04-26 21:50:30,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:31,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6456 [WARNING|trainer.py:803] 2025-04-26 21:50:31,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7575 6401 [WARNING|trainer.py:803] 2025-04-26 21:50:32,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:50:32,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6457 7576 [WARNING|trainer.py:803] 2025-04-26 21:50:32,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:33,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6402 [WARNING|trainer.py:803] 2025-04-26 21:50:33,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7577 6458 [WARNING|trainer.py:803] 2025-04-26 21:50:34,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6403 [WARNING|trainer.py:803] 2025-04-26 21:50:34,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:34,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7578 6459 [WARNING|trainer.py:803] 2025-04-26 21:50:35,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:50:35,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6404 7579 [WARNING|trainer.py:803] 2025-04-26 21:50:36,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6460 [WARNING|trainer.py:803] 2025-04-26 21:50:36,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:36,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6405 7580 [WARNING|trainer.py:803] 2025-04-26 21:50:37,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6461 [WARNING|trainer.py:803] 2025-04-26 21:50:38,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:38,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7581 6406 [WARNING|trainer.py:803] 2025-04-26 21:50:38,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:39,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6462 [WARNING|trainer.py:803] 2025-04-26 21:50:39,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7582 6407 [WARNING|trainer.py:803] 2025-04-26 21:50:40,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:40,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6463 [WARNING|trainer.py:803] 2025-04-26 21:50:40,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7583 6408 [WARNING|trainer.py:803] 2025-04-26 21:50:41,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:41,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7584 6464 [WARNING|trainer.py:803] 2025-04-26 21:50:42,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6409 [WARNING|trainer.py:803] 2025-04-26 21:50:42,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:42,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7585 6465 [WARNING|trainer.py:803] 2025-04-26 21:50:43,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6410 [WARNING|trainer.py:803] 2025-04-26 21:50:44,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:44,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7586 6466 [WARNING|trainer.py:803] 2025-04-26 21:50:44,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6411 [WARNING|trainer.py:803] 2025-04-26 21:50:45,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:45,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7587 6467 [WARNING|trainer.py:803] 2025-04-26 21:50:46,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:46,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6412 7588 [WARNING|trainer.py:803] 2025-04-26 21:50:46,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:47,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6468 [WARNING|trainer.py:803] 2025-04-26 21:50:47,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6413 7589 [WARNING|trainer.py:803] 2025-04-26 21:50:48,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:48,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6469 [WARNING|trainer.py:803] 2025-04-26 21:50:48,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7590 6414 [WARNING|trainer.py:803] 2025-04-26 21:50:49,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:50:50,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:50,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6470 7591 6415 [WARNING|trainer.py:803] 2025-04-26 21:50:50,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:51,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:51,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6471 7592 6416 [WARNING|trainer.py:803] 2025-04-26 21:50:52,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:52,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7593 [WARNING|trainer.py:803] 2025-04-26 21:50:52,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6472 6417 [WARNING|trainer.py:803] 2025-04-26 21:50:53,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:53,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7594 6473 [WARNING|trainer.py:803] 2025-04-26 21:50:54,110 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6418 [WARNING|trainer.py:803] 2025-04-26 21:50:54,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:54,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7595 6474 [WARNING|trainer.py:803] 2025-04-26 21:50:55,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:55,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6419 [WARNING|trainer.py:803] 2025-04-26 21:50:56,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7596 6475 [WARNING|trainer.py:803] 2025-04-26 21:50:56,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:50:57,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6420 7597 [WARNING|trainer.py:803] 2025-04-26 21:50:57,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6476 [WARNING|trainer.py:803] 2025-04-26 21:50:58,186 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:50:58,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7598 6421 [WARNING|trainer.py:803] 2025-04-26 21:50:58,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6477 [WARNING|trainer.py:803] 2025-04-26 21:50:59,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:50:59,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7599 6422 [WARNING|trainer.py:803] 2025-04-26 21:51:00,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:00,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6478 [WARNING|trainer.py:803] 2025-04-26 21:51:00,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7600 6423 [WARNING|trainer.py:803] 2025-04-26 21:51:01,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:01,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6479 [WARNING|trainer.py:803] 2025-04-26 21:51:02,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7601 6424 [WARNING|trainer.py:803] 2025-04-26 21:51:02,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:03,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6480 7602 [WARNING|trainer.py:803] 2025-04-26 21:51:03,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6425 [WARNING|trainer.py:803] 2025-04-26 21:51:04,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:04,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7603 6481 [WARNING|trainer.py:803] 2025-04-26 21:51:04,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6426 [WARNING|trainer.py:803] 2025-04-26 21:51:05,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:05,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7604 6482 [WARNING|trainer.py:803] 2025-04-26 21:51:06,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:06,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6427 [WARNING|trainer.py:803] 2025-04-26 21:51:06,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7605 6483 [WARNING|trainer.py:803] 2025-04-26 21:51:07,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:07,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6428 [WARNING|trainer.py:803] 2025-04-26 21:51:08,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7606 6484 [WARNING|trainer.py:803] 2025-04-26 21:51:08,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:09,084 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6429 7607 [WARNING|trainer.py:803] 2025-04-26 21:51:09,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6485 [WARNING|trainer.py:803] 2025-04-26 21:51:10,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:10,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7608 6430 [WARNING|trainer.py:803] 2025-04-26 21:51:10,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6486 [WARNING|trainer.py:803] 2025-04-26 21:51:11,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:51:11,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7609 6431 [WARNING|trainer.py:803] 2025-04-26 21:51:12,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6487 [WARNING|trainer.py:803] 2025-04-26 21:51:12,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:51:12,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7610 6432 [WARNING|trainer.py:803] 2025-04-26 21:51:13,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:13,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6488 [WARNING|trainer.py:803] 2025-04-26 21:51:14,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7611 6433 [WARNING|trainer.py:803] 2025-04-26 21:51:14,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:15,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6489 7612 [WARNING|trainer.py:803] 2025-04-26 21:51:15,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6434 [WARNING|trainer.py:803] 2025-04-26 21:51:16,143 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:16,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7613 6490 [WARNING|trainer.py:803] 2025-04-26 21:51:16,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6435 [WARNING|trainer.py:803] 2025-04-26 21:51:17,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:17,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7614 6491 [WARNING|trainer.py:803] 2025-04-26 21:51:18,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:18,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6436 [WARNING|trainer.py:803] 2025-04-26 21:51:18,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7615 6492 [WARNING|trainer.py:803] 2025-04-26 21:51:19,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:19,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6437 [WARNING|trainer.py:803] 2025-04-26 21:51:20,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7616 6493 [WARNING|trainer.py:803] 2025-04-26 21:51:20,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:21,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6438 [WARNING|trainer.py:803] 2025-04-26 21:51:21,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7617 6494 [WARNING|trainer.py:803] 2025-04-26 21:51:22,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:22,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6439 7618 [WARNING|trainer.py:803] 2025-04-26 21:51:22,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6495 [WARNING|trainer.py:803] 2025-04-26 21:51:23,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:23,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6440 7619 [WARNING|trainer.py:803] 2025-04-26 21:51:24,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6496 [WARNING|trainer.py:803] 2025-04-26 21:51:24,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:24,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7620 6441 [WARNING|trainer.py:803] 2025-04-26 21:51:25,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6497 [WARNING|trainer.py:803] 2025-04-26 21:51:26,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:26,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7621 6442 [WARNING|trainer.py:803] 2025-04-26 21:51:26,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:27,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6498 [WARNING|trainer.py:803] 2025-04-26 21:51:27,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7622 6443 [WARNING|trainer.py:803] 2025-04-26 21:51:28,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:51:28,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6499 [WARNING|trainer.py:803] 2025-04-26 21:51:28,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7623 6444 [WARNING|trainer.py:803] 2025-04-26 21:51:29,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:29,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6500 7624 [WARNING|trainer.py:803] 2025-04-26 21:51:30,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6445 [WARNING|trainer.py:803] 2025-04-26 21:51:30,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:30,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6501 7625 [WARNING|trainer.py:803] 2025-04-26 21:51:31,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6446 [WARNING|trainer.py:803] 2025-04-26 21:51:32,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:32,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7626 6502 [WARNING|trainer.py:803] 2025-04-26 21:51:32,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6447 [WARNING|trainer.py:803] 2025-04-26 21:51:33,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:51:33,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7627 6503 [WARNING|trainer.py:803] 2025-04-26 21:51:34,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6448 [WARNING|trainer.py:803] 2025-04-26 21:51:34,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:34,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7628 6504 [WARNING|trainer.py:803] 2025-04-26 21:51:35,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:35,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6449 [WARNING|trainer.py:803] 2025-04-26 21:51:36,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7629 6505 [WARNING|trainer.py:803] 2025-04-26 21:51:36,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:37,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6450 7630 [WARNING|trainer.py:803] 2025-04-26 21:51:37,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6506 [WARNING|trainer.py:803] 2025-04-26 21:51:38,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:38,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7631 6451 [WARNING|trainer.py:803] 2025-04-26 21:51:38,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6507 [WARNING|trainer.py:803] 2025-04-26 21:51:39,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:39,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7632 6452 [WARNING|trainer.py:803] 2025-04-26 21:51:40,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6508 [WARNING|trainer.py:803] 2025-04-26 21:51:40,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:51:40,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7633 [WARNING|trainer.py:803] 2025-04-26 21:51:41,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6453 [WARNING|trainer.py:803] 2025-04-26 21:51:41,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6509 [WARNING|trainer.py:803] 2025-04-26 21:51:42,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7634 6454 [WARNING|trainer.py:803] 2025-04-26 21:51:42,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:43,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6510 7635 [WARNING|trainer.py:803] 2025-04-26 21:51:43,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6455 [WARNING|trainer.py:803] 2025-04-26 21:51:44,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:44,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6511 7636 [WARNING|trainer.py:803] 2025-04-26 21:51:44,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:45,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6456 [WARNING|trainer.py:803] 2025-04-26 21:51:45,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6512 7637 [WARNING|trainer.py:803] 2025-04-26 21:51:46,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6457 [WARNING|trainer.py:803] 2025-04-26 21:51:46,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:46,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7638 6513 [WARNING|trainer.py:803] 2025-04-26 21:51:47,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:47,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:47,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6458 7639 6514 [WARNING|trainer.py:803] 2025-04-26 21:51:48,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:49,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:49,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6459 7640 6515 [WARNING|trainer.py:803] 2025-04-26 21:51:50,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:50,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:51:50,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7641 6460 6516 [WARNING|trainer.py:803] 2025-04-26 21:51:51,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:51:51,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7642 [WARNING|trainer.py:803] 2025-04-26 21:51:51,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6461 6517 [WARNING|trainer.py:803] 2025-04-26 21:51:52,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:52,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7643 [WARNING|trainer.py:803] 2025-04-26 21:51:53,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6462 6518 [WARNING|trainer.py:803] 2025-04-26 21:51:53,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:54,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7644 [WARNING|trainer.py:803] 2025-04-26 21:51:54,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6463 6519 [WARNING|trainer.py:803] 2025-04-26 21:51:55,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:55,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7645 [WARNING|trainer.py:803] 2025-04-26 21:51:55,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6464 6520 [WARNING|trainer.py:803] 2025-04-26 21:51:56,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7646 [WARNING|trainer.py:803] 2025-04-26 21:51:56,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:51:57,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6465 [WARNING|trainer.py:803] 2025-04-26 21:51:57,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6521 7647 [WARNING|trainer.py:803] 2025-04-26 21:51:58,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:51:58,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6466 [WARNING|trainer.py:803] 2025-04-26 21:51:58,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6522 7648 [WARNING|trainer.py:803] 2025-04-26 21:51:59,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:51:59,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6467 [WARNING|trainer.py:803] 2025-04-26 21:52:00,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6523 7649 [WARNING|trainer.py:803] 2025-04-26 21:52:00,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:01,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6468 [WARNING|trainer.py:803] 2025-04-26 21:52:01,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6524 7650 [WARNING|trainer.py:803] 2025-04-26 21:52:02,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:02,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:02,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6469 7651 6525 [WARNING|trainer.py:803] 2025-04-26 21:52:03,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:52:03,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:52:03,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6470 7652 6526 [WARNING|trainer.py:803] 2025-04-26 21:52:04,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:05,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:52:05,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6471 7653 6527 [WARNING|trainer.py:803] 2025-04-26 21:52:06,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:06,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:06,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6472 7654 6528 [WARNING|trainer.py:803] 2025-04-26 21:52:07,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:07,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6473 [WARNING|trainer.py:803] 2025-04-26 21:52:07,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7655 6529 [WARNING|trainer.py:803] 2025-04-26 21:52:08,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:08,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6474 [WARNING|trainer.py:803] 2025-04-26 21:52:09,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7656 6530 [WARNING|trainer.py:803] 2025-04-26 21:52:10,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:10,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7657 6475 [WARNING|trainer.py:803] 2025-04-26 21:52:10,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6531 [WARNING|trainer.py:803] 2025-04-26 21:52:11,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:11,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:11,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7658 6476 6532 [WARNING|trainer.py:803] 2025-04-26 21:52:12,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:12,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7659 [WARNING|trainer.py:803] 2025-04-26 21:52:13,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6477 6533 [WARNING|trainer.py:803] 2025-04-26 21:52:13,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:14,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7660 [WARNING|trainer.py:803] 2025-04-26 21:52:14,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6478 6534 [WARNING|trainer.py:803] 2025-04-26 21:52:15,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:15,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7661 [WARNING|trainer.py:803] 2025-04-26 21:52:15,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6479 6535 [WARNING|trainer.py:803] 2025-04-26 21:52:16,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:16,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7662 [WARNING|trainer.py:803] 2025-04-26 21:52:17,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6480 6536 [WARNING|trainer.py:803] 2025-04-26 21:52:17,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:18,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7663 [WARNING|trainer.py:803] 2025-04-26 21:52:18,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6481 6537 [WARNING|trainer.py:803] 2025-04-26 21:52:19,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7664 [WARNING|trainer.py:803] 2025-04-26 21:52:19,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:19,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6482 6538 [WARNING|trainer.py:803] 2025-04-26 21:52:20,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7665 [WARNING|trainer.py:803] 2025-04-26 21:52:20,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:21,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6483 [WARNING|trainer.py:803] 2025-04-26 21:52:21,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6539 7666 [WARNING|trainer.py:803] 2025-04-26 21:52:22,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:22,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6484 [WARNING|trainer.py:803] 2025-04-26 21:52:22,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6540 7667 [WARNING|trainer.py:803] 2025-04-26 21:52:23,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:23,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:52:23,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6485 6541 7668 [WARNING|trainer.py:803] 2025-04-26 21:52:24,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:25,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:25,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6486 6542 7669 [WARNING|trainer.py:803] 2025-04-26 21:52:26,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:26,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6487 [WARNING|trainer.py:803] 2025-04-26 21:52:26,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6543 7670 [WARNING|trainer.py:803] 2025-04-26 21:52:27,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:27,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:27,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6544 6488 7671 [WARNING|trainer.py:803] 2025-04-26 21:52:28,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:28,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:29,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6545 6489 7672 [WARNING|trainer.py:803] 2025-04-26 21:52:30,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:30,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:30,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6546 7673 6490 [WARNING|trainer.py:803] 2025-04-26 21:52:31,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:31,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:31,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7674 6547 6491 [WARNING|trainer.py:803] 2025-04-26 21:52:32,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:32,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:32,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7675 6548 6492 [WARNING|trainer.py:803] 2025-04-26 21:52:34,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:34,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:52:34,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7676 6549 6493 [WARNING|trainer.py:803] 2025-04-26 21:52:35,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:35,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:35,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7677 6550 6494 [WARNING|trainer.py:803] 2025-04-26 21:52:36,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:36,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7678 [WARNING|trainer.py:803] 2025-04-26 21:52:37,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6551 6495 [WARNING|trainer.py:803] 2025-04-26 21:52:37,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:38,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7679 [WARNING|trainer.py:803] 2025-04-26 21:52:38,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6552 6496 [WARNING|trainer.py:803] 2025-04-26 21:52:39,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:39,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7680 [WARNING|trainer.py:803] 2025-04-26 21:52:39,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6553 6497 [WARNING|trainer.py:803] 2025-04-26 21:52:40,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:40,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7681 [WARNING|trainer.py:803] 2025-04-26 21:52:41,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6554 [WARNING|trainer.py:803] 2025-04-26 21:52:41,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6498 [WARNING|trainer.py:803] 2025-04-26 21:52:41,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7682 6555 [WARNING|trainer.py:803] 2025-04-26 21:52:42,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:42,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6499 [WARNING|trainer.py:803] 2025-04-26 21:52:43,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7683 6556 [WARNING|trainer.py:803] 2025-04-26 21:52:43,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:44,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6500 [WARNING|trainer.py:803] 2025-04-26 21:52:44,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7684 6557 [WARNING|trainer.py:803] 2025-04-26 21:52:45,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:45,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6501 7685 [WARNING|trainer.py:803] 2025-04-26 21:52:45,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6558 [WARNING|trainer.py:803] 2025-04-26 21:52:46,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:46,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6502 7686 [WARNING|trainer.py:803] 2025-04-26 21:52:47,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6559 [WARNING|trainer.py:803] 2025-04-26 21:52:47,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:52:47,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6503 7687 [WARNING|trainer.py:803] 2025-04-26 21:52:48,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6560 [WARNING|trainer.py:803] 2025-04-26 21:52:49,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:49,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6504 7688 [WARNING|trainer.py:803] 2025-04-26 21:52:49,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6561 [WARNING|trainer.py:803] 2025-04-26 21:52:50,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:50,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7689 6505 [WARNING|trainer.py:803] 2025-04-26 21:52:51,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6562 [WARNING|trainer.py:803] 2025-04-26 21:52:51,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:51,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7690 6506 [WARNING|trainer.py:803] 2025-04-26 21:52:52,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6563 [WARNING|trainer.py:803] 2025-04-26 21:52:52,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:53,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7691 6507 [WARNING|trainer.py:803] 2025-04-26 21:52:53,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6564 [WARNING|trainer.py:803] 2025-04-26 21:52:54,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:54,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7692 [WARNING|trainer.py:803] 2025-04-26 21:52:54,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6508 6565 [WARNING|trainer.py:803] 2025-04-26 21:52:55,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:52:55,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7693 [WARNING|trainer.py:803] 2025-04-26 21:52:56,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6509 6566 [WARNING|trainer.py:803] 2025-04-26 21:52:56,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:52:57,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7694 [WARNING|trainer.py:803] 2025-04-26 21:52:57,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6510 6567 [WARNING|trainer.py:803] 2025-04-26 21:52:58,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:52:58,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7695 [WARNING|trainer.py:803] 2025-04-26 21:52:58,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6511 6568 [WARNING|trainer.py:803] 2025-04-26 21:52:59,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:52:59,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7696 [WARNING|trainer.py:803] 2025-04-26 21:53:00,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6512 6569 [WARNING|trainer.py:803] 2025-04-26 21:53:00,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:01,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7697 [WARNING|trainer.py:803] 2025-04-26 21:53:01,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6513 6570 [WARNING|trainer.py:803] 2025-04-26 21:53:02,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:02,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7698 [WARNING|trainer.py:803] 2025-04-26 21:53:02,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6514 6571 [WARNING|trainer.py:803] 2025-04-26 21:53:03,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7699 [WARNING|trainer.py:803] 2025-04-26 21:53:03,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:04,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6515 [WARNING|trainer.py:803] 2025-04-26 21:53:04,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6572 7700 [WARNING|trainer.py:803] 2025-04-26 21:53:05,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:05,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6516 [WARNING|trainer.py:803] 2025-04-26 21:53:05,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6573 7701 [WARNING|trainer.py:803] 2025-04-26 21:53:06,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:06,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6517 [WARNING|trainer.py:803] 2025-04-26 21:53:07,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6574 7702 [WARNING|trainer.py:803] 2025-04-26 21:53:07,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:07,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6518 [WARNING|trainer.py:803] 2025-04-26 21:53:08,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6575 7703 [WARNING|trainer.py:803] 2025-04-26 21:53:09,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:09,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:09,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6519 6576 7704 [WARNING|trainer.py:803] 2025-04-26 21:53:10,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:10,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:10,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6520 6577 7705 [WARNING|trainer.py:803] 2025-04-26 21:53:11,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:11,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:12,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6521 7706 6578 [WARNING|trainer.py:803] 2025-04-26 21:53:13,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:13,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:13,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7707 6522 6579 [WARNING|trainer.py:803] 2025-04-26 21:53:14,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:14,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:14,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7708 6523 6580 [WARNING|trainer.py:803] 2025-04-26 21:53:15,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:15,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:15,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7709 6524 6581 [WARNING|trainer.py:803] 2025-04-26 21:53:16,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:17,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:17,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7710 6525 6582 [WARNING|trainer.py:803] 2025-04-26 21:53:18,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:18,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7711 [WARNING|trainer.py:803] 2025-04-26 21:53:18,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6526 6583 [WARNING|trainer.py:803] 2025-04-26 21:53:19,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7712 [WARNING|trainer.py:803] 2025-04-26 21:53:19,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:19,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6527 6584 [WARNING|trainer.py:803] 2025-04-26 21:53:20,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7713 [WARNING|trainer.py:803] 2025-04-26 21:53:21,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:21,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6528 [WARNING|trainer.py:803] 2025-04-26 21:53:21,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6585 7714 [WARNING|trainer.py:803] 2025-04-26 21:53:22,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:22,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:22,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6529 6586 7715 [WARNING|trainer.py:803] 2025-04-26 21:53:23,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:23,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:24,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6530 7716 6587 [WARNING|trainer.py:803] 2025-04-26 21:53:25,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:25,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:25,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6531 7717 6588 [WARNING|trainer.py:803] 2025-04-26 21:53:26,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:26,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:26,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7718 6532 6589 [WARNING|trainer.py:803] 2025-04-26 21:53:27,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:27,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7719 [WARNING|trainer.py:803] 2025-04-26 21:53:27,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6533 6590 [WARNING|trainer.py:803] 2025-04-26 21:53:28,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:29,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7720 [WARNING|trainer.py:803] 2025-04-26 21:53:29,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6534 6591 [WARNING|trainer.py:803] 2025-04-26 21:53:30,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:30,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7721 [WARNING|trainer.py:803] 2025-04-26 21:53:30,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6535 6592 [WARNING|trainer.py:803] 2025-04-26 21:53:31,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7722 [WARNING|trainer.py:803] 2025-04-26 21:53:31,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:32,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6536 6593 [WARNING|trainer.py:803] 2025-04-26 21:53:32,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7723 [WARNING|trainer.py:803] 2025-04-26 21:53:33,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:33,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6537 [WARNING|trainer.py:803] 2025-04-26 21:53:33,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6594 7724 [WARNING|trainer.py:803] 2025-04-26 21:53:34,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:34,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6538 [WARNING|trainer.py:803] 2025-04-26 21:53:34,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6595 7725 [WARNING|trainer.py:803] 2025-04-26 21:53:35,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:35,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6539 [WARNING|trainer.py:803] 2025-04-26 21:53:36,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7726 6596 [WARNING|trainer.py:803] 2025-04-26 21:53:36,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:37,307 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:37,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6540 7727 6597 [WARNING|trainer.py:803] 2025-04-26 21:53:38,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:38,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:38,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6541 7728 6598 [WARNING|trainer.py:803] 2025-04-26 21:53:39,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:39,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6542 7729 [WARNING|trainer.py:803] 2025-04-26 21:53:39,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6599 [WARNING|trainer.py:803] 2025-04-26 21:53:40,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:40,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7730 6543 [WARNING|trainer.py:803] 2025-04-26 21:53:41,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6600 [WARNING|trainer.py:803] 2025-04-26 21:53:41,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:42,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7731 6544 [WARNING|trainer.py:803] 2025-04-26 21:53:42,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6601 [WARNING|trainer.py:803] 2025-04-26 21:53:43,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:43,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7732 6545 [WARNING|trainer.py:803] 2025-04-26 21:53:43,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:44,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6602 7733 [WARNING|trainer.py:803] 2025-04-26 21:53:44,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6546 [WARNING|trainer.py:803] 2025-04-26 21:53:45,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:45,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6603 7734 [WARNING|trainer.py:803] 2025-04-26 21:53:45,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6547 [WARNING|trainer.py:803] 2025-04-26 21:53:46,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:46,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7735 6604 [WARNING|trainer.py:803] 2025-04-26 21:53:47,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6548 [WARNING|trainer.py:803] 2025-04-26 21:53:47,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:47,885 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7736 6605 [WARNING|trainer.py:803] 2025-04-26 21:53:48,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:48,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6549 [WARNING|trainer.py:803] 2025-04-26 21:53:49,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7737 6606 [WARNING|trainer.py:803] 2025-04-26 21:53:49,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:50,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6550 7738 [WARNING|trainer.py:803] 2025-04-26 21:53:50,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6607 [WARNING|trainer.py:803] 2025-04-26 21:53:51,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:53:51,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7739 6551 [WARNING|trainer.py:803] 2025-04-26 21:53:51,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6608 [WARNING|trainer.py:803] 2025-04-26 21:53:52,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:53:52,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7740 6552 [WARNING|trainer.py:803] 2025-04-26 21:53:53,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:53:53,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6609 [WARNING|trainer.py:803] 2025-04-26 21:53:53,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7741 6553 [WARNING|trainer.py:803] 2025-04-26 21:53:54,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:54,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6610 [WARNING|trainer.py:803] 2025-04-26 21:53:55,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7742 6554 [WARNING|trainer.py:803] 2025-04-26 21:53:55,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:56,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6611 7743 [WARNING|trainer.py:803] 2025-04-26 21:53:56,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6555 [WARNING|trainer.py:803] 2025-04-26 21:53:57,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:57,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7744 6612 [WARNING|trainer.py:803] 2025-04-26 21:53:57,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6556 [WARNING|trainer.py:803] 2025-04-26 21:53:58,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:53:58,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7745 6613 [WARNING|trainer.py:803] 2025-04-26 21:53:59,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:53:59,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6557 [WARNING|trainer.py:803] 2025-04-26 21:53:59,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7746 6614 [WARNING|trainer.py:803] 2025-04-26 21:54:00,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:00,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6558 [WARNING|trainer.py:803] 2025-04-26 21:54:01,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7747 6615 [WARNING|trainer.py:803] 2025-04-26 21:54:01,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:54:01,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6559 7748 [WARNING|trainer.py:803] 2025-04-26 21:54:02,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6616 [WARNING|trainer.py:803] 2025-04-26 21:54:03,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:03,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7749 6560 [WARNING|trainer.py:803] 2025-04-26 21:54:03,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6617 [WARNING|trainer.py:803] 2025-04-26 21:54:04,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:04,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7750 [WARNING|trainer.py:803] 2025-04-26 21:54:05,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6561 6618 [WARNING|trainer.py:803] 2025-04-26 21:54:05,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:05,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7751 [WARNING|trainer.py:803] 2025-04-26 21:54:06,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6562 [WARNING|trainer.py:803] 2025-04-26 21:54:06,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6619 7752 [WARNING|trainer.py:803] 2025-04-26 21:54:07,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:07,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6563 [WARNING|trainer.py:803] 2025-04-26 21:54:07,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6620 7753 [WARNING|trainer.py:803] 2025-04-26 21:54:08,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:08,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6564 [WARNING|trainer.py:803] 2025-04-26 21:54:09,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7754 6621 [WARNING|trainer.py:803] 2025-04-26 21:54:09,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:10,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:10,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6565 7755 6622 [WARNING|trainer.py:803] 2025-04-26 21:54:11,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:11,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:11,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6566 7756 6623 [WARNING|trainer.py:803] 2025-04-26 21:54:12,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:12,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7757 [WARNING|trainer.py:803] 2025-04-26 21:54:12,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6567 6624 [WARNING|trainer.py:803] 2025-04-26 21:54:13,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:54:13,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7758 [WARNING|trainer.py:803] 2025-04-26 21:54:14,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6568 6625 [WARNING|trainer.py:803] 2025-04-26 21:54:14,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:54:15,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7759 [WARNING|trainer.py:803] 2025-04-26 21:54:15,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6569 6626 [WARNING|trainer.py:803] 2025-04-26 21:54:16,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7760 [WARNING|trainer.py:803] 2025-04-26 21:54:16,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:16,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6570 [WARNING|trainer.py:803] 2025-04-26 21:54:17,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6627 7761 [WARNING|trainer.py:803] 2025-04-26 21:54:17,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:18,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6571 [WARNING|trainer.py:803] 2025-04-26 21:54:18,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6628 7762 [WARNING|trainer.py:803] 2025-04-26 21:54:19,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:19,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:19,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6572 7763 6629 [WARNING|trainer.py:803] 2025-04-26 21:54:20,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:20,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:20,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6573 7764 6630 [WARNING|trainer.py:803] 2025-04-26 21:54:21,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:21,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7765 [WARNING|trainer.py:803] 2025-04-26 21:54:22,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6574 6631 [WARNING|trainer.py:803] 2025-04-26 21:54:23,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:23,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7766 [WARNING|trainer.py:803] 2025-04-26 21:54:23,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6575 6632 [WARNING|trainer.py:803] 2025-04-26 21:54:24,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7767 [WARNING|trainer.py:803] 2025-04-26 21:54:24,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:24,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6576 [WARNING|trainer.py:803] 2025-04-26 21:54:25,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6633 7768 [WARNING|trainer.py:803] 2025-04-26 21:54:25,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:26,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6577 [WARNING|trainer.py:803] 2025-04-26 21:54:26,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6634 7769 [WARNING|trainer.py:803] 2025-04-26 21:54:27,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:27,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:27,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6578 7770 6635 [WARNING|trainer.py:803] 2025-04-26 21:54:28,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:28,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:54:28,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6579 7771 6636 [WARNING|trainer.py:803] 2025-04-26 21:54:29,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:29,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:30,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6580 7772 6637 [WARNING|trainer.py:803] 2025-04-26 21:54:31,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:31,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:31,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7773 6581 6638 [WARNING|trainer.py:803] 2025-04-26 21:54:32,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:32,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7774 [WARNING|trainer.py:803] 2025-04-26 21:54:32,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6582 6639 [WARNING|trainer.py:803] 2025-04-26 21:54:33,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7775 [WARNING|trainer.py:803] 2025-04-26 21:54:33,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:34,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6583 6640 [WARNING|trainer.py:803] 2025-04-26 21:54:34,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7776 [WARNING|trainer.py:803] 2025-04-26 21:54:35,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:35,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6584 [WARNING|trainer.py:803] 2025-04-26 21:54:35,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6641 7777 [WARNING|trainer.py:803] 2025-04-26 21:54:36,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:36,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:37,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6585 6642 7778 [WARNING|trainer.py:803] 2025-04-26 21:54:37,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:38,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:38,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7779 6586 6643 [WARNING|trainer.py:803] 2025-04-26 21:54:39,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:39,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:39,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7780 6587 6644 [WARNING|trainer.py:803] 2025-04-26 21:54:40,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:40,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:40,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7781 6588 6645 [WARNING|trainer.py:803] 2025-04-26 21:54:41,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:41,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7782 [WARNING|trainer.py:803] 2025-04-26 21:54:42,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6589 6646 [WARNING|trainer.py:803] 2025-04-26 21:54:42,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7783 [WARNING|trainer.py:803] 2025-04-26 21:54:43,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:43,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6590 6647 [WARNING|trainer.py:803] 2025-04-26 21:54:44,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7784 [WARNING|trainer.py:803] 2025-04-26 21:54:44,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:44,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6591 [WARNING|trainer.py:803] 2025-04-26 21:54:45,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6648 7785 [WARNING|trainer.py:803] 2025-04-26 21:54:46,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:46,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:46,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6592 6649 7786 [WARNING|trainer.py:803] 2025-04-26 21:54:47,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:47,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:47,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6593 6650 7787 [WARNING|trainer.py:803] 2025-04-26 21:54:48,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:48,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:54:48,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7788 6594 6651 [WARNING|trainer.py:803] 2025-04-26 21:54:49,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:50,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:50,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7789 6595 6652 [WARNING|trainer.py:803] 2025-04-26 21:54:51,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:51,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:51,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7790 6596 6653 [WARNING|trainer.py:803] 2025-04-26 21:54:52,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:52,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7791 [WARNING|trainer.py:803] 2025-04-26 21:54:52,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6597 6654 [WARNING|trainer.py:803] 2025-04-26 21:54:53,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7792 [WARNING|trainer.py:803] 2025-04-26 21:54:54,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:54,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6598 6655 [WARNING|trainer.py:803] 2025-04-26 21:54:54,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7793 [WARNING|trainer.py:803] 2025-04-26 21:54:55,385 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:55,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6599 [WARNING|trainer.py:803] 2025-04-26 21:54:55,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6656 7794 [WARNING|trainer.py:803] 2025-04-26 21:54:56,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:56,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:57,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6600 6657 7795 [WARNING|trainer.py:803] 2025-04-26 21:54:58,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:58,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:58,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6601 6658 7796 [WARNING|trainer.py:803] 2025-04-26 21:54:59,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:54:59,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:54:59,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6602 7797 6659 [WARNING|trainer.py:803] 2025-04-26 21:55:00,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:00,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:00,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7798 6603 6660 [WARNING|trainer.py:803] 2025-04-26 21:55:01,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:02,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:02,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7799 6604 6661 [WARNING|trainer.py:803] 2025-04-26 21:55:03,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:03,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:03,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7800 6605 6662 [WARNING|trainer.py:803] 2025-04-26 21:55:04,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:55:04,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:04,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7801 6606 6663 [WARNING|trainer.py:803] 2025-04-26 21:55:05,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:06,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:06,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7802 6607 6664 [WARNING|trainer.py:803] 2025-04-26 21:55:07,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:07,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:07,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6608 6665 7803 [WARNING|trainer.py:803] 2025-04-26 21:55:08,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:08,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:09,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6609 6666 7804 [WARNING|trainer.py:803] 2025-04-26 21:55:10,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:10,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6610 [WARNING|trainer.py:803] 2025-04-26 21:55:10,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6667 7805 [WARNING|trainer.py:803] 2025-04-26 21:55:11,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:11,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6611 6668 [WARNING|trainer.py:803] 2025-04-26 21:55:12,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:12,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7806 [WARNING|trainer.py:803] 2025-04-26 21:55:12,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6612 6669 [WARNING|trainer.py:803] 2025-04-26 21:55:13,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:13,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:14,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7807 6613 6670 [WARNING|trainer.py:803] 2025-04-26 21:55:15,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:15,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:15,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7808 6614 6671 [WARNING|trainer.py:803] 2025-04-26 21:55:16,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:16,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:16,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6615 6672 7809 [WARNING|trainer.py:803] 2025-04-26 21:55:17,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:18,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:18,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6616 6673 7810 [WARNING|trainer.py:803] 2025-04-26 21:55:19,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:19,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:19,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6617 6674 7811 [WARNING|trainer.py:803] 2025-04-26 21:55:20,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:20,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6618 6675 [WARNING|trainer.py:803] 2025-04-26 21:55:21,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7812 [WARNING|trainer.py:803] 2025-04-26 21:55:21,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:22,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6619 6676 [WARNING|trainer.py:803] 2025-04-26 21:55:22,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:23,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:23,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7813 6620 6677 [WARNING|trainer.py:803] 2025-04-26 21:55:24,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:24,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:24,683 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7814 6621 6678 [WARNING|trainer.py:803] 2025-04-26 21:55:25,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:25,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:25,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7815 6679 6622 [WARNING|trainer.py:803] 2025-04-26 21:55:27,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:27,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:27,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6680 6623 7816 [WARNING|trainer.py:803] 2025-04-26 21:55:28,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:28,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:28,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6681 6624 7817 [WARNING|trainer.py:803] 2025-04-26 21:55:29,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:30,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:30,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6682 6625 7818 [WARNING|trainer.py:803] 2025-04-26 21:55:31,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:31,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:31,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6683 6626 7819 [WARNING|trainer.py:803] 2025-04-26 21:55:32,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:32,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6627 6684 [WARNING|trainer.py:803] 2025-04-26 21:55:33,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7820 [WARNING|trainer.py:803] 2025-04-26 21:55:34,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:34,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6685 6628 [WARNING|trainer.py:803] 2025-04-26 21:55:34,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:35,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:35,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7821 6686 6629 [WARNING|trainer.py:803] 2025-04-26 21:55:36,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:36,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:36,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7822 6687 6630 [WARNING|trainer.py:803] 2025-04-26 21:55:37,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:37,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:38,052 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6688 6631 7823 [WARNING|trainer.py:803] 2025-04-26 21:55:39,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:55:39,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:39,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6689 6632 7824 [WARNING|trainer.py:803] 2025-04-26 21:55:40,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:40,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6690 6633 [WARNING|trainer.py:803] 2025-04-26 21:55:41,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7825 [WARNING|trainer.py:803] 2025-04-26 21:55:41,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:42,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6691 6634 [WARNING|trainer.py:803] 2025-04-26 21:55:42,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7826 [WARNING|trainer.py:803] 2025-04-26 21:55:43,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6692 6635 [WARNING|trainer.py:803] 2025-04-26 21:55:43,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:44,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7827 [WARNING|trainer.py:803] 2025-04-26 21:55:44,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6693 6636 [WARNING|trainer.py:803] 2025-04-26 21:55:45,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:45,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:46,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6694 7828 6637 [WARNING|trainer.py:803] 2025-04-26 21:55:47,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:47,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:47,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6695 6638 7829 [WARNING|trainer.py:803] 2025-04-26 21:55:48,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:48,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6696 [WARNING|trainer.py:803] 2025-04-26 21:55:49,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6639 7830 [WARNING|trainer.py:803] 2025-04-26 21:55:49,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:55:50,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6697 [WARNING|trainer.py:803] 2025-04-26 21:55:50,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6640 [WARNING|trainer.py:803] 2025-04-26 21:55:51,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:51,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7831 6698 6641 [WARNING|trainer.py:803] 2025-04-26 21:55:52,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:52,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:55:52,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7832 6699 6642 [WARNING|trainer.py:803] 2025-04-26 21:55:53,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:53,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:54,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7833 6700 6643 [WARNING|trainer.py:803] 2025-04-26 21:55:55,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:55:55,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:55,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6701 7834 6644 [WARNING|trainer.py:803] 2025-04-26 21:55:56,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:55:56,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:55:56,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6702 6645 7835 [WARNING|trainer.py:803] 2025-04-26 21:55:57,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:55:58,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:58,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6703 6646 7836 [WARNING|trainer.py:803] 2025-04-26 21:55:59,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:55:59,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:55:59,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6704 6647 7837 [WARNING|trainer.py:803] 2025-04-26 21:56:00,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:00,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:00,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6705 6648 7838 [WARNING|trainer.py:803] 2025-04-26 21:56:01,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:56:02,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6706 [WARNING|trainer.py:803] 2025-04-26 21:56:02,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6649 7839 [WARNING|trainer.py:803] 2025-04-26 21:56:03,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:56:03,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6707 6650 [WARNING|trainer.py:803] 2025-04-26 21:56:03,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7840 [WARNING|trainer.py:803] 2025-04-26 21:56:04,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:04,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6708 6651 [WARNING|trainer.py:803] 2025-04-26 21:56:05,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7841 [WARNING|trainer.py:803] 2025-04-26 21:56:05,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:06,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6709 6652 [WARNING|trainer.py:803] 2025-04-26 21:56:06,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:07,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:56:07,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6710 6653 7842 [WARNING|trainer.py:803] 2025-04-26 21:56:08,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:08,774 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:08,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6711 6654 7843 [WARNING|trainer.py:803] 2025-04-26 21:56:09,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:10,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6712 [WARNING|trainer.py:803] 2025-04-26 21:56:10,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6655 [WARNING|trainer.py:803] 2025-04-26 21:56:11,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:11,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6713 7844 6656 [WARNING|trainer.py:803] 2025-04-26 21:56:12,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:12,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:12,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6714 7845 6657 [WARNING|trainer.py:803] 2025-04-26 21:56:13,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:13,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:14,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6715 6658 7846 [WARNING|trainer.py:803] 2025-04-26 21:56:15,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:15,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6716 [WARNING|trainer.py:803] 2025-04-26 21:56:15,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6659 [WARNING|trainer.py:803] 2025-04-26 21:56:16,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7847 [WARNING|trainer.py:803] 2025-04-26 21:56:16,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6717 6660 [WARNING|trainer.py:803] 2025-04-26 21:56:17,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:17,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7848 [WARNING|trainer.py:803] 2025-04-26 21:56:18,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6718 6661 [WARNING|trainer.py:803] 2025-04-26 21:56:18,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:56:19,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:56:19,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7849 6719 6662 [WARNING|trainer.py:803] 2025-04-26 21:56:20,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:20,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:20,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6720 6663 7850 [WARNING|trainer.py:803] 2025-04-26 21:56:21,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:22,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:22,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6721 6664 [WARNING|trainer.py:803] 2025-04-26 21:56:23,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7851 [WARNING|trainer.py:803] 2025-04-26 21:56:23,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6722 6665 [WARNING|trainer.py:803] 2025-04-26 21:56:24,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:24,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:24,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6723 7852 6666 [WARNING|trainer.py:803] 2025-04-26 21:56:25,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:26,147 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6724 [WARNING|trainer.py:803] 2025-04-26 21:56:26,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7853 6667 [WARNING|trainer.py:803] 2025-04-26 21:56:27,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:27,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6725 [WARNING|trainer.py:803] 2025-04-26 21:56:27,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6668 7854 [WARNING|trainer.py:803] 2025-04-26 21:56:28,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6726 [WARNING|trainer.py:803] 2025-04-26 21:56:29,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:29,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6669 7855 [WARNING|trainer.py:803] 2025-04-26 21:56:29,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6727 [WARNING|trainer.py:803] 2025-04-26 21:56:30,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:30,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6670 7856 [WARNING|trainer.py:803] 2025-04-26 21:56:31,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6728 [WARNING|trainer.py:803] 2025-04-26 21:56:31,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:32,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6671 [WARNING|trainer.py:803] 2025-04-26 21:56:32,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7857 6729 [WARNING|trainer.py:803] 2025-04-26 21:56:33,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:33,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6672 [WARNING|trainer.py:803] 2025-04-26 21:56:33,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7858 6730 [WARNING|trainer.py:803] 2025-04-26 21:56:34,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:34,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6673 [WARNING|trainer.py:803] 2025-04-26 21:56:35,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7859 6731 [WARNING|trainer.py:803] 2025-04-26 21:56:35,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6674 [WARNING|trainer.py:803] 2025-04-26 21:56:36,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:56:36,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7860 6732 [WARNING|trainer.py:803] 2025-04-26 21:56:36,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6675 [WARNING|trainer.py:803] 2025-04-26 21:56:37,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:56:37,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7861 6733 [WARNING|trainer.py:803] 2025-04-26 21:56:38,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6676 [WARNING|trainer.py:803] 2025-04-26 21:56:39,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:39,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6734 7862 [WARNING|trainer.py:803] 2025-04-26 21:56:39,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6677 [WARNING|trainer.py:803] 2025-04-26 21:56:40,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:40,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6735 7863 [WARNING|trainer.py:803] 2025-04-26 21:56:40,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6678 [WARNING|trainer.py:803] 2025-04-26 21:56:41,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:41,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6736 [WARNING|trainer.py:803] 2025-04-26 21:56:42,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7864 6679 [WARNING|trainer.py:803] 2025-04-26 21:56:43,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:43,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6737 [WARNING|trainer.py:803] 2025-04-26 21:56:43,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6680 7865 [WARNING|trainer.py:803] 2025-04-26 21:56:44,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6738 [WARNING|trainer.py:803] 2025-04-26 21:56:44,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:45,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6681 7866 [WARNING|trainer.py:803] 2025-04-26 21:56:45,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6739 [WARNING|trainer.py:803] 2025-04-26 21:56:46,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:46,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6682 7867 [WARNING|trainer.py:803] 2025-04-26 21:56:46,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6740 [WARNING|trainer.py:803] 2025-04-26 21:56:47,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:47,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6683 [WARNING|trainer.py:803] 2025-04-26 21:56:48,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6741 [WARNING|trainer.py:803] 2025-04-26 21:56:48,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7868 6684 [WARNING|trainer.py:803] 2025-04-26 21:56:49,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:49,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6742 [WARNING|trainer.py:803] 2025-04-26 21:56:50,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7869 6685 [WARNING|trainer.py:803] 2025-04-26 21:56:50,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:56:51,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6743 [WARNING|trainer.py:803] 2025-04-26 21:56:51,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6686 [WARNING|trainer.py:803] 2025-04-26 21:56:52,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7870 6744 [WARNING|trainer.py:803] 2025-04-26 21:56:52,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:56:53,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6687 [WARNING|trainer.py:803] 2025-04-26 21:56:53,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7871 6745 [WARNING|trainer.py:803] 2025-04-26 21:56:54,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:56:54,686 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6688 [WARNING|trainer.py:803] 2025-04-26 21:56:54,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7872 6746 [WARNING|trainer.py:803] 2025-04-26 21:56:55,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6689 [WARNING|trainer.py:803] 2025-04-26 21:56:56,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:56,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7873 6747 [WARNING|trainer.py:803] 2025-04-26 21:56:56,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6690 [WARNING|trainer.py:803] 2025-04-26 21:56:57,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:56:57,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7874 6748 [WARNING|trainer.py:803] 2025-04-26 21:56:58,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6691 [WARNING|trainer.py:803] 2025-04-26 21:56:58,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:56:58,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6749 [WARNING|trainer.py:803] 2025-04-26 21:56:59,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6692 [WARNING|trainer.py:803] 2025-04-26 21:57:00,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7875 6750 [WARNING|trainer.py:803] 2025-04-26 21:57:00,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:01,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6693 [WARNING|trainer.py:803] 2025-04-26 21:57:01,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7876 6751 [WARNING|trainer.py:803] 2025-04-26 21:57:02,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:02,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6694 [WARNING|trainer.py:803] 2025-04-26 21:57:02,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6752 7877 [WARNING|trainer.py:803] 2025-04-26 21:57:03,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6695 [WARNING|trainer.py:803] 2025-04-26 21:57:03,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:04,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6753 [WARNING|trainer.py:803] 2025-04-26 21:57:04,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7878 6696 [WARNING|trainer.py:803] 2025-04-26 21:57:05,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:05,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6754 [WARNING|trainer.py:803] 2025-04-26 21:57:06,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7879 6697 [WARNING|trainer.py:803] 2025-04-26 21:57:06,579 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6755 [WARNING|trainer.py:803] 2025-04-26 21:57:07,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:07,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6698 [WARNING|trainer.py:803] 2025-04-26 21:57:07,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7880 6756 [WARNING|trainer.py:803] 2025-04-26 21:57:08,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:08,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:09,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6699 7881 6757 [WARNING|trainer.py:803] 2025-04-26 21:57:10,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:10,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:10,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6700 6758 7882 [WARNING|trainer.py:803] 2025-04-26 21:57:11,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:11,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6701 [WARNING|trainer.py:803] 2025-04-26 21:57:11,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6759 [WARNING|trainer.py:803] 2025-04-26 21:57:12,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:57:13,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7883 6702 6760 [WARNING|trainer.py:803] 2025-04-26 21:57:13,943 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:13,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:14,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6703 6761 7884 [WARNING|trainer.py:803] 2025-04-26 21:57:15,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 21:57:15,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6704 [WARNING|trainer.py:803] 2025-04-26 21:57:16,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6762 [WARNING|trainer.py:803] 2025-04-26 21:57:16,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7885 [WARNING|trainer.py:803] 2025-04-26 21:57:17,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6705 6763 [WARNING|trainer.py:803] 2025-04-26 21:57:17,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:17,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:18,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7886 6706 6764 [WARNING|trainer.py:803] 2025-04-26 21:57:19,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:19,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:19,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6707 7887 6765 [WARNING|trainer.py:803] 2025-04-26 21:57:20,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:20,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:21,014 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6708 7888 6766 [WARNING|trainer.py:803] 2025-04-26 21:57:21,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:22,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:22,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6709 7889 6767 [WARNING|trainer.py:803] 2025-04-26 21:57:23,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:23,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:23,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6710 7890 6768 [WARNING|trainer.py:803] 2025-04-26 21:57:24,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:24,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:24,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6711 6769 7891 [WARNING|trainer.py:803] 2025-04-26 21:57:25,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:26,325 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6712 [WARNING|trainer.py:803] 2025-04-26 21:57:26,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6770 [WARNING|trainer.py:803] 2025-04-26 21:57:27,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7892 6713 [WARNING|trainer.py:803] 2025-04-26 21:57:27,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:28,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6771 [WARNING|trainer.py:803] 2025-04-26 21:57:28,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7893 6714 [WARNING|trainer.py:803] 2025-04-26 21:57:29,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6772 [WARNING|trainer.py:803] 2025-04-26 21:57:29,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:29,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6715 [WARNING|trainer.py:803] 2025-04-26 21:57:30,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6773 7894 [WARNING|trainer.py:803] 2025-04-26 21:57:31,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:31,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6716 [WARNING|trainer.py:803] 2025-04-26 21:57:31,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6774 [WARNING|trainer.py:803] 2025-04-26 21:57:32,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7895 [WARNING|trainer.py:803] 2025-04-26 21:57:33,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6717 [WARNING|trainer.py:803] 2025-04-26 21:57:33,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6775 [WARNING|trainer.py:803] 2025-04-26 21:57:33,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7896 [WARNING|trainer.py:803] 2025-04-26 21:57:34,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6718 6776 [WARNING|trainer.py:803] 2025-04-26 21:57:35,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:35,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:35,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7897 6719 6777 [WARNING|trainer.py:803] 2025-04-26 21:57:36,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:36,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:36,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6720 6778 7898 [WARNING|trainer.py:803] 2025-04-26 21:57:38,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:38,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:38,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6721 6779 7899 [WARNING|trainer.py:803] 2025-04-26 21:57:39,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:39,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6722 [WARNING|trainer.py:803] 2025-04-26 21:57:39,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6780 7900 [WARNING|trainer.py:803] 2025-04-26 21:57:40,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:40,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6723 6781 [WARNING|trainer.py:803] 2025-04-26 21:57:41,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:57:42,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:42,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7901 6724 6782 [WARNING|trainer.py:803] 2025-04-26 21:57:43,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:57:43,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:43,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6725 7902 6783 [WARNING|trainer.py:803] 2025-04-26 21:57:44,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:44,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:44,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6726 6784 7903 [WARNING|trainer.py:803] 2025-04-26 21:57:46,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:46,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:46,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6785 6727 7904 [WARNING|trainer.py:803] 2025-04-26 21:57:47,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:47,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6786 6728 [WARNING|trainer.py:803] 2025-04-26 21:57:48,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7905 [WARNING|trainer.py:803] 2025-04-26 21:57:48,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:48,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6787 6729 [WARNING|trainer.py:803] 2025-04-26 21:57:49,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:50,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:50,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7906 6788 6730 [WARNING|trainer.py:803] 2025-04-26 21:57:51,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:51,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:51,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7907 6789 6731 [WARNING|trainer.py:803] 2025-04-26 21:57:52,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:57:52,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:52,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7908 6790 6732 [WARNING|trainer.py:803] 2025-04-26 21:57:54,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:54,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:54,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7909 6791 6733 [WARNING|trainer.py:803] 2025-04-26 21:57:55,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 21:57:55,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:55,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6792 6734 7910 [WARNING|trainer.py:803] 2025-04-26 21:57:56,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:56,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:57,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6735 6793 7911 [WARNING|trainer.py:803] 2025-04-26 21:57:58,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:58,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:57:58,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6736 6794 7912 [WARNING|trainer.py:803] 2025-04-26 21:57:59,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:57:59,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:57:59,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6795 6737 7913 [WARNING|trainer.py:803] 2025-04-26 21:58:00,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:00,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:01,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6796 6738 7914 [WARNING|trainer.py:803] 2025-04-26 21:58:02,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:02,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6797 6739 [WARNING|trainer.py:803] 2025-04-26 21:58:02,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7915 [WARNING|trainer.py:803] 2025-04-26 21:58:03,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:03,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6798 6740 [WARNING|trainer.py:803] 2025-04-26 21:58:04,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:04,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:58:04,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7916 6799 6741 [WARNING|trainer.py:803] 2025-04-26 21:58:05,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:06,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:06,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6800 7917 6742 [WARNING|trainer.py:803] 2025-04-26 21:58:07,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:07,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:58:07,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6801 6743 7918 [WARNING|trainer.py:803] 2025-04-26 21:58:08,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:08,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6802 [WARNING|trainer.py:803] 2025-04-26 21:58:09,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6744 7919 [WARNING|trainer.py:803] 2025-04-26 21:58:10,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:10,282 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6803 6745 [WARNING|trainer.py:803] 2025-04-26 21:58:10,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:11,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:11,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6804 7920 6746 [WARNING|trainer.py:803] 2025-04-26 21:58:12,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:12,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:12,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6805 6747 7921 [WARNING|trainer.py:803] 2025-04-26 21:58:14,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:14,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6806 [WARNING|trainer.py:803] 2025-04-26 21:58:14,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6748 [WARNING|trainer.py:803] 2025-04-26 21:58:15,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:15,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6807 7922 6749 [WARNING|trainer.py:803] 2025-04-26 21:58:16,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:16,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:16,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6808 6750 7923 [WARNING|trainer.py:803] 2025-04-26 21:58:17,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:18,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:58:18,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6809 6751 [WARNING|trainer.py:803] 2025-04-26 21:58:19,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:19,600 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7924 6810 6752 [WARNING|trainer.py:803] 2025-04-26 21:58:20,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:20,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:20,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6811 7925 6753 [WARNING|trainer.py:803] 2025-04-26 21:58:21,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:21,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:22,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6812 7926 6754 [WARNING|trainer.py:803] 2025-04-26 21:58:23,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:23,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:58:23,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6813 6755 [WARNING|trainer.py:803] 2025-04-26 21:58:24,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7927 [WARNING|trainer.py:803] 2025-04-26 21:58:24,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6814 6756 [WARNING|trainer.py:803] 2025-04-26 21:58:25,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:58:25,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7928 [WARNING|trainer.py:803] 2025-04-26 21:58:26,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6815 6757 [WARNING|trainer.py:803] 2025-04-26 21:58:26,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:27,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:27,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7929 6816 6758 [WARNING|trainer.py:803] 2025-04-26 21:58:28,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:28,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6817 [WARNING|trainer.py:803] 2025-04-26 21:58:28,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7930 6759 [WARNING|trainer.py:803] 2025-04-26 21:58:29,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:29,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6818 [WARNING|trainer.py:803] 2025-04-26 21:58:30,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7931 6760 [WARNING|trainer.py:803] 2025-04-26 21:58:30,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:31,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6819 [WARNING|trainer.py:803] 2025-04-26 21:58:31,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6761 7932 [WARNING|trainer.py:803] 2025-04-26 21:58:32,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6820 [WARNING|trainer.py:803] 2025-04-26 21:58:32,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:33,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6762 [WARNING|trainer.py:803] 2025-04-26 21:58:33,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7933 6821 [WARNING|trainer.py:803] 2025-04-26 21:58:34,069 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6763 [WARNING|trainer.py:803] 2025-04-26 21:58:34,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:58:34,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6822 [WARNING|trainer.py:803] 2025-04-26 21:58:35,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7934 6764 [WARNING|trainer.py:803] 2025-04-26 21:58:36,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:36,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6823 [WARNING|trainer.py:803] 2025-04-26 21:58:36,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6765 7935 [WARNING|trainer.py:803] 2025-04-26 21:58:37,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6824 [WARNING|trainer.py:803] 2025-04-26 21:58:38,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:38,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6766 7936 [WARNING|trainer.py:803] 2025-04-26 21:58:38,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6825 [WARNING|trainer.py:803] 2025-04-26 21:58:39,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:39,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6767 [WARNING|trainer.py:803] 2025-04-26 21:58:40,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6826 7937 [WARNING|trainer.py:803] 2025-04-26 21:58:40,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6768 [WARNING|trainer.py:803] 2025-04-26 21:58:41,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:41,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6827 [WARNING|trainer.py:803] 2025-04-26 21:58:41,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7938 6769 [WARNING|trainer.py:803] 2025-04-26 21:58:42,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:42,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6828 [WARNING|trainer.py:803] 2025-04-26 21:58:43,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7939 6770 [WARNING|trainer.py:803] 2025-04-26 21:58:43,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6829 [WARNING|trainer.py:803] 2025-04-26 21:58:44,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:44,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6771 7940 [WARNING|trainer.py:803] 2025-04-26 21:58:45,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6830 [WARNING|trainer.py:803] 2025-04-26 21:58:45,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:46,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6772 [WARNING|trainer.py:803] 2025-04-26 21:58:46,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7941 6831 [WARNING|trainer.py:803] 2025-04-26 21:58:47,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:47,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6773 [WARNING|trainer.py:803] 2025-04-26 21:58:47,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6832 7942 [WARNING|trainer.py:803] 2025-04-26 21:58:48,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6774 [WARNING|trainer.py:803] 2025-04-26 21:58:49,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:49,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6833 [WARNING|trainer.py:803] 2025-04-26 21:58:49,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7943 6775 [WARNING|trainer.py:803] 2025-04-26 21:58:50,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:50,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6834 [WARNING|trainer.py:803] 2025-04-26 21:58:51,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7944 6776 [WARNING|trainer.py:803] 2025-04-26 21:58:51,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6835 [WARNING|trainer.py:803] 2025-04-26 21:58:52,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:52,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6777 [WARNING|trainer.py:803] 2025-04-26 21:58:52,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6836 7945 [WARNING|trainer.py:803] 2025-04-26 21:58:53,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:54,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6778 [WARNING|trainer.py:803] 2025-04-26 21:58:54,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6837 7946 [WARNING|trainer.py:803] 2025-04-26 21:58:55,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:58:55,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6779 [WARNING|trainer.py:803] 2025-04-26 21:58:55,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 6838 [WARNING|trainer.py:803] 2025-04-26 21:58:56,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7947 [WARNING|trainer.py:803] 2025-04-26 21:58:56,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6780 6839 [WARNING|trainer.py:803] 2025-04-26 21:58:57,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:57,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:58,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6781 7948 6840 [WARNING|trainer.py:803] 2025-04-26 21:58:59,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:58:59,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:58:59,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6782 7949 6841 [WARNING|trainer.py:803] 2025-04-26 21:59:00,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:00,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:00,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6783 6842 7950 [WARNING|trainer.py:803] 2025-04-26 21:59:01,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:02,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6784 [WARNING|trainer.py:803] 2025-04-26 21:59:02,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6843 [WARNING|trainer.py:803] 2025-04-26 21:59:02,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:03,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7951 6785 6844 [WARNING|trainer.py:803] 2025-04-26 21:59:04,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:04,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:04,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6786 6845 7952 [WARNING|trainer.py:803] 2025-04-26 21:59:05,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:05,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6787 [WARNING|trainer.py:803] 2025-04-26 21:59:06,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6846 [WARNING|trainer.py:803] 2025-04-26 21:59:06,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7953 [WARNING|trainer.py:803] 2025-04-26 21:59:07,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6788 6847 [WARNING|trainer.py:803] 2025-04-26 21:59:07,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:08,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7954 [WARNING|trainer.py:803] 2025-04-26 21:59:08,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6789 6848 [WARNING|trainer.py:803] 2025-04-26 21:59:09,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:09,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:09,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6790 7955 6849 [WARNING|trainer.py:803] 2025-04-26 21:59:10,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:11,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:11,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6791 7956 6850 [WARNING|trainer.py:803] 2025-04-26 21:59:12,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:12,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:12,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6792 6851 7957 [WARNING|trainer.py:803] 2025-04-26 21:59:13,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:13,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:13,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6793 6852 [WARNING|trainer.py:803] 2025-04-26 21:59:15,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:15,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7958 6794 6853 [WARNING|trainer.py:803] 2025-04-26 21:59:16,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:16,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:16,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7959 6795 6854 [WARNING|trainer.py:803] 2025-04-26 21:59:17,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:17,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:17,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6796 6855 7960 [WARNING|trainer.py:803] 2025-04-26 21:59:19,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:19,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:19,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6797 6856 7961 [WARNING|trainer.py:803] 2025-04-26 21:59:20,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:20,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:20,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6798 6857 7962 [WARNING|trainer.py:803] 2025-04-26 21:59:21,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:59:21,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6858 6799 [WARNING|trainer.py:803] 2025-04-26 21:59:22,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7963 [WARNING|trainer.py:803] 2025-04-26 21:59:23,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 21:59:23,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6800 6859 [WARNING|trainer.py:803] 2025-04-26 21:59:23,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:24,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:24,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6860 6801 7964 [WARNING|trainer.py:803] 2025-04-26 21:59:25,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:25,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:25,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6861 6802 7965 [WARNING|trainer.py:803] 2025-04-26 21:59:26,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:27,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:27,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6862 6803 7966 [WARNING|trainer.py:803] 2025-04-26 21:59:28,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:28,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6863 [WARNING|trainer.py:803] 2025-04-26 21:59:28,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6804 [WARNING|trainer.py:803] 2025-04-26 21:59:29,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:29,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7967 6864 6805 [WARNING|trainer.py:803] 2025-04-26 21:59:30,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:30,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:30,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6865 6806 7968 [WARNING|trainer.py:803] 2025-04-26 21:59:32,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:32,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:32,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 6866 6807 7969 [WARNING|trainer.py:803] 2025-04-26 21:59:33,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:33,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6867 [WARNING|trainer.py:803] 2025-04-26 21:59:33,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6808 7970 [WARNING|trainer.py:803] 2025-04-26 21:59:34,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:34,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6868 [WARNING|trainer.py:803] 2025-04-26 21:59:35,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6809 7971 [WARNING|trainer.py:803] 2025-04-26 21:59:36,006 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:36,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6869 6810 [WARNING|trainer.py:803] 2025-04-26 21:59:36,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:37,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7972 [WARNING|trainer.py:803] 2025-04-26 21:59:37,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6870 6811 [WARNING|trainer.py:803] 2025-04-26 21:59:38,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:38,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7973 [WARNING|trainer.py:803] 2025-04-26 21:59:38,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6871 6812 [WARNING|trainer.py:803] 2025-04-26 21:59:39,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:39,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7974 [WARNING|trainer.py:803] 2025-04-26 21:59:40,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6872 6813 [WARNING|trainer.py:803] 2025-04-26 21:59:40,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 21:59:41,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:41,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7975 6873 6814 [WARNING|trainer.py:803] 2025-04-26 21:59:42,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:42,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:42,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6874 6815 7976 [WARNING|trainer.py:803] 2025-04-26 21:59:43,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:44,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6875 [WARNING|trainer.py:803] 2025-04-26 21:59:44,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6816 7977 [WARNING|trainer.py:803] 2025-04-26 21:59:45,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:45,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6876 [WARNING|trainer.py:803] 2025-04-26 21:59:45,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6817 [WARNING|trainer.py:803] 2025-04-26 21:59:46,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7978 [WARNING|trainer.py:803] 2025-04-26 21:59:46,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6877 6818 [WARNING|trainer.py:803] 2025-04-26 21:59:47,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:47,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:48,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6878 7979 6819 [WARNING|trainer.py:803] 2025-04-26 21:59:49,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:49,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:49,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6879 6820 7980 [WARNING|trainer.py:803] 2025-04-26 21:59:50,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:50,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6880 [WARNING|trainer.py:803] 2025-04-26 21:59:51,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6821 [WARNING|trainer.py:803] 2025-04-26 21:59:51,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7981 [WARNING|trainer.py:803] 2025-04-26 21:59:52,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6881 6822 [WARNING|trainer.py:803] 2025-04-26 21:59:52,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:53,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7982 [WARNING|trainer.py:803] 2025-04-26 21:59:53,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6882 6823 [WARNING|trainer.py:803] 2025-04-26 21:59:54,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:54,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:54,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6883 7983 6824 [WARNING|trainer.py:803] 2025-04-26 21:59:55,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 21:59:55,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 21:59:56,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6884 7984 6825 [WARNING|trainer.py:803] 2025-04-26 21:59:57,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:57,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 21:59:57,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6885 6826 7985 [WARNING|trainer.py:803] 2025-04-26 21:59:58,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 21:59:58,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6886 [WARNING|trainer.py:803] 2025-04-26 21:59:58,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6827 7986 [WARNING|trainer.py:803] 2025-04-26 21:59:59,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:00,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6887 [WARNING|trainer.py:803] 2025-04-26 22:00:00,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6828 [WARNING|trainer.py:803] 2025-04-26 22:00:00,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:01,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6888 7987 6829 [WARNING|trainer.py:803] 2025-04-26 22:00:02,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:02,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:00:02,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6889 6830 7988 [WARNING|trainer.py:803] 2025-04-26 22:00:03,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:03,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6890 [WARNING|trainer.py:803] 2025-04-26 22:00:04,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6831 7989 [WARNING|trainer.py:803] 2025-04-26 22:00:04,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:05,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6891 [WARNING|trainer.py:803] 2025-04-26 22:00:05,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6832 [WARNING|trainer.py:803] 2025-04-26 22:00:06,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7990 [WARNING|trainer.py:803] 2025-04-26 22:00:06,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6892 6833 [WARNING|trainer.py:803] 2025-04-26 22:00:07,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:00:07,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:07,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6893 7991 6834 [WARNING|trainer.py:803] 2025-04-26 22:00:08,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:09,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:09,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6894 6835 [WARNING|trainer.py:803] 2025-04-26 22:00:10,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7992 [WARNING|trainer.py:803] 2025-04-26 22:00:10,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6895 6836 [WARNING|trainer.py:803] 2025-04-26 22:00:11,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:11,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:11,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6896 7993 6837 [WARNING|trainer.py:803] 2025-04-26 22:00:12,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:13,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:13,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6897 6838 7994 [WARNING|trainer.py:803] 2025-04-26 22:00:14,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:14,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6898 [WARNING|trainer.py:803] 2025-04-26 22:00:14,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6839 7995 [WARNING|trainer.py:803] 2025-04-26 22:00:15,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:15,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6899 6840 [WARNING|trainer.py:803] 2025-04-26 22:00:16,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:16,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:17,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6900 7996 6841 [WARNING|trainer.py:803] 2025-04-26 22:00:18,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:18,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:18,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6901 6842 7997 [WARNING|trainer.py:803] 2025-04-26 22:00:19,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:19,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6902 6843 [WARNING|trainer.py:803] 2025-04-26 22:00:20,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:20,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7998 [WARNING|trainer.py:803] 2025-04-26 22:00:20,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6903 6844 [WARNING|trainer.py:803] 2025-04-26 22:00:21,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:00:21,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6904 [WARNING|trainer.py:803] 2025-04-26 22:00:22,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7999 6845 [WARNING|trainer.py:803] 2025-04-26 22:00:23,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:00:23,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6905 [WARNING|trainer.py:803] 2025-04-26 22:00:23,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6846 [WARNING|trainer.py:803] 2025-04-26 22:00:24,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8000 6906 [WARNING|trainer.py:803] 2025-04-26 22:00:24,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:25,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6847 [WARNING|trainer.py:803] 2025-04-26 22:00:25,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6907 [WARNING|trainer.py:803] 2025-04-26 22:00:26,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8001 6848 [WARNING|trainer.py:803] 2025-04-26 22:00:26,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:27,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6908 [WARNING|trainer.py:803] 2025-04-26 22:00:27,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:28,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6849 8002 6909 [WARNING|trainer.py:803] 2025-04-26 22:00:28,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:28,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:29,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6850 8003 6910 [WARNING|trainer.py:803] 2025-04-26 22:00:30,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:00:30,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6851 [WARNING|trainer.py:803] 2025-04-26 22:00:30,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6911 8004 [WARNING|trainer.py:803] 2025-04-26 22:00:31,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:31,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6852 [WARNING|trainer.py:803] 2025-04-26 22:00:32,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6912 [WARNING|trainer.py:803] 2025-04-26 22:00:32,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8005 [WARNING|trainer.py:803] 2025-04-26 22:00:33,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6853 6913 [WARNING|trainer.py:803] 2025-04-26 22:00:33,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:34,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8006 [WARNING|trainer.py:803] 2025-04-26 22:00:34,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6854 6914 [WARNING|trainer.py:803] 2025-04-26 22:00:35,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:35,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:35,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6915 8007 6855 [WARNING|trainer.py:803] 2025-04-26 22:00:36,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:36,804 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:36,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6916 6856 8008 [WARNING|trainer.py:803] 2025-04-26 22:00:38,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:38,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6917 [WARNING|trainer.py:803] 2025-04-26 22:00:38,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6857 [WARNING|trainer.py:803] 2025-04-26 22:00:39,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8009 [WARNING|trainer.py:803] 2025-04-26 22:00:39,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6918 6858 [WARNING|trainer.py:803] 2025-04-26 22:00:40,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:40,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8010 [WARNING|trainer.py:803] 2025-04-26 22:00:40,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6919 6859 [WARNING|trainer.py:803] 2025-04-26 22:00:41,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:41,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8011 [WARNING|trainer.py:803] 2025-04-26 22:00:42,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6920 6860 [WARNING|trainer.py:803] 2025-04-26 22:00:42,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:43,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:43,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6921 8012 6861 [WARNING|trainer.py:803] 2025-04-26 22:00:44,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:44,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6922 [WARNING|trainer.py:803] 2025-04-26 22:00:44,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8013 6862 [WARNING|trainer.py:803] 2025-04-26 22:00:45,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:45,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6923 [WARNING|trainer.py:803] 2025-04-26 22:00:46,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8014 6863 [WARNING|trainer.py:803] 2025-04-26 22:00:46,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:47,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6924 [WARNING|trainer.py:803] 2025-04-26 22:00:47,340 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8015 6864 [WARNING|trainer.py:803] 2025-04-26 22:00:48,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:48,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6925 [WARNING|trainer.py:803] 2025-04-26 22:00:48,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8016 6865 [WARNING|trainer.py:803] 2025-04-26 22:00:49,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6926 [WARNING|trainer.py:803] 2025-04-26 22:00:49,823 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:49,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8017 6866 [WARNING|trainer.py:803] 2025-04-26 22:00:50,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6927 [WARNING|trainer.py:803] 2025-04-26 22:00:51,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:51,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6867 [WARNING|trainer.py:803] 2025-04-26 22:00:51,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8018 6928 [WARNING|trainer.py:803] 2025-04-26 22:00:52,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:52,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6868 [WARNING|trainer.py:803] 2025-04-26 22:00:53,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8019 6929 [WARNING|trainer.py:803] 2025-04-26 22:00:53,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6869 [WARNING|trainer.py:803] 2025-04-26 22:00:54,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:00:54,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6930 [WARNING|trainer.py:803] 2025-04-26 22:00:55,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8020 6870 [WARNING|trainer.py:803] 2025-04-26 22:00:55,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6931 [WARNING|trainer.py:803] 2025-04-26 22:00:56,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:00:56,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8021 6871 [WARNING|trainer.py:803] 2025-04-26 22:00:56,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6932 [WARNING|trainer.py:803] 2025-04-26 22:00:57,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:57,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:58,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6872 8022 6933 [WARNING|trainer.py:803] 2025-04-26 22:00:59,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:00:59,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:00:59,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6873 6934 8023 [WARNING|trainer.py:803] 2025-04-26 22:01:00,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:01:00,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6874 [WARNING|trainer.py:803] 2025-04-26 22:01:00,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6935 8024 [WARNING|trainer.py:803] 2025-04-26 22:01:01,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:01,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6875 6936 [WARNING|trainer.py:803] 2025-04-26 22:01:02,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:03,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8025 [WARNING|trainer.py:803] 2025-04-26 22:01:03,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6876 6937 [WARNING|trainer.py:803] 2025-04-26 22:01:04,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:04,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:04,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8026 6877 6938 [WARNING|trainer.py:803] 2025-04-26 22:01:05,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:05,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:05,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6939 6878 8027 [WARNING|trainer.py:803] 2025-04-26 22:01:07,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:01:07,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:07,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6940 6879 8028 [WARNING|trainer.py:803] 2025-04-26 22:01:08,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:08,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:08,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6941 6880 8029 [WARNING|trainer.py:803] 2025-04-26 22:01:09,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:09,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:09,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6942 6881 8030 [WARNING|trainer.py:803] 2025-04-26 22:01:10,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:10,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6943 [WARNING|trainer.py:803] 2025-04-26 22:01:11,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6882 8031 [WARNING|trainer.py:803] 2025-04-26 22:01:12,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:12,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6944 [WARNING|trainer.py:803] 2025-04-26 22:01:12,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6883 [WARNING|trainer.py:803] 2025-04-26 22:01:13,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8032 [WARNING|trainer.py:803] 2025-04-26 22:01:13,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6945 6884 [WARNING|trainer.py:803] 2025-04-26 22:01:14,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:14,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8033 6946 [WARNING|trainer.py:803] 2025-04-26 22:01:14,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6885 [WARNING|trainer.py:803] 2025-04-26 22:01:15,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:15,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6947 [WARNING|trainer.py:803] 2025-04-26 22:01:16,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8034 6886 [WARNING|trainer.py:803] 2025-04-26 22:01:17,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:17,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6948 [WARNING|trainer.py:803] 2025-04-26 22:01:17,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8035 6887 [WARNING|trainer.py:803] 2025-04-26 22:01:18,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:18,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6949 [WARNING|trainer.py:803] 2025-04-26 22:01:18,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8036 6888 [WARNING|trainer.py:803] 2025-04-26 22:01:19,493 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:19,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6950 [WARNING|trainer.py:803] 2025-04-26 22:01:20,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8037 6889 [WARNING|trainer.py:803] 2025-04-26 22:01:20,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6951 [WARNING|trainer.py:803] 2025-04-26 22:01:21,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:21,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:21,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8038 6890 6952 [WARNING|trainer.py:803] 2025-04-26 22:01:22,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:01:22,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:23,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8039 6891 6953 [WARNING|trainer.py:803] 2025-04-26 22:01:24,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:01:24,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:24,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6892 6954 8040 [WARNING|trainer.py:803] 2025-04-26 22:01:25,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:25,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:25,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 6893 6955 8041 [WARNING|trainer.py:803] 2025-04-26 22:01:26,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:26,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:27,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6956 6894 8042 [WARNING|trainer.py:803] 2025-04-26 22:01:28,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:28,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6957 [WARNING|trainer.py:803] 2025-04-26 22:01:28,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6895 8043 [WARNING|trainer.py:803] 2025-04-26 22:01:29,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:29,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6958 [WARNING|trainer.py:803] 2025-04-26 22:01:30,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6896 8044 [WARNING|trainer.py:803] 2025-04-26 22:01:30,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:31,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6959 [WARNING|trainer.py:803] 2025-04-26 22:01:31,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6897 [WARNING|trainer.py:803] 2025-04-26 22:01:31,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8045 [WARNING|trainer.py:803] 2025-04-26 22:01:32,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6960 6898 [WARNING|trainer.py:803] 2025-04-26 22:01:32,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:33,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6961 [WARNING|trainer.py:803] 2025-04-26 22:01:33,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8046 6899 [WARNING|trainer.py:803] 2025-04-26 22:01:34,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6962 [WARNING|trainer.py:803] 2025-04-26 22:01:34,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:34,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6900 8047 [WARNING|trainer.py:803] 2025-04-26 22:01:35,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6963 [WARNING|trainer.py:803] 2025-04-26 22:01:36,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:36,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6901 [WARNING|trainer.py:803] 2025-04-26 22:01:36,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8048 6964 [WARNING|trainer.py:803] 2025-04-26 22:01:37,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:37,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6902 [WARNING|trainer.py:803] 2025-04-26 22:01:38,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6965 8049 [WARNING|trainer.py:803] 2025-04-26 22:01:38,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6903 [WARNING|trainer.py:803] 2025-04-26 22:01:39,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:39,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6966 [WARNING|trainer.py:803] 2025-04-26 22:01:40,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8050 6904 [WARNING|trainer.py:803] 2025-04-26 22:01:40,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:40,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6967 [WARNING|trainer.py:803] 2025-04-26 22:01:41,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8051 6905 [WARNING|trainer.py:803] 2025-04-26 22:01:41,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6968 [WARNING|trainer.py:803] 2025-04-26 22:01:42,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:42,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6906 8052 [WARNING|trainer.py:803] 2025-04-26 22:01:43,151 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6969 [WARNING|trainer.py:803] 2025-04-26 22:01:43,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:43,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6907 8053 [WARNING|trainer.py:803] 2025-04-26 22:01:44,388 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6970 [WARNING|trainer.py:803] 2025-04-26 22:01:45,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:45,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6908 [WARNING|trainer.py:803] 2025-04-26 22:01:45,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8054 6971 [WARNING|trainer.py:803] 2025-04-26 22:01:46,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:46,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6909 [WARNING|trainer.py:803] 2025-04-26 22:01:46,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8055 6972 [WARNING|trainer.py:803] 2025-04-26 22:01:47,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:47,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6910 [WARNING|trainer.py:803] 2025-04-26 22:01:48,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8056 6973 [WARNING|trainer.py:803] 2025-04-26 22:01:48,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:49,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 6911 [WARNING|trainer.py:803] 2025-04-26 22:01:49,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8057 6974 [WARNING|trainer.py:803] 2025-04-26 22:01:50,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6912 [WARNING|trainer.py:803] 2025-04-26 22:01:50,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:01:50,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6975 8058 [WARNING|trainer.py:803] 2025-04-26 22:01:51,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6913 [WARNING|trainer.py:803] 2025-04-26 22:01:51,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:52,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6976 8059 [WARNING|trainer.py:803] 2025-04-26 22:01:52,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6914 [WARNING|trainer.py:803] 2025-04-26 22:01:53,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:01:53,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6977 8060 [WARNING|trainer.py:803] 2025-04-26 22:01:53,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6915 [WARNING|trainer.py:803] 2025-04-26 22:01:54,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:54,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6978 [WARNING|trainer.py:803] 2025-04-26 22:01:55,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8061 6916 [WARNING|trainer.py:803] 2025-04-26 22:01:55,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6979 [WARNING|trainer.py:803] 2025-04-26 22:01:56,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:01:56,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6917 8062 [WARNING|trainer.py:803] 2025-04-26 22:01:56,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6980 [WARNING|trainer.py:803] 2025-04-26 22:01:57,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:01:57,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6918 [WARNING|trainer.py:803] 2025-04-26 22:01:58,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8063 6981 [WARNING|trainer.py:803] 2025-04-26 22:01:58,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6919 [WARNING|trainer.py:803] 2025-04-26 22:01:59,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:01:59,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6982 8064 [WARNING|trainer.py:803] 2025-04-26 22:02:00,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6920 [WARNING|trainer.py:803] 2025-04-26 22:02:00,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:00,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6983 [WARNING|trainer.py:803] 2025-04-26 22:02:01,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8065 6921 [WARNING|trainer.py:803] 2025-04-26 22:02:01,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:02,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6984 [WARNING|trainer.py:803] 2025-04-26 22:02:02,485 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8066 6922 [WARNING|trainer.py:803] 2025-04-26 22:02:03,035 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6985 [WARNING|trainer.py:803] 2025-04-26 22:02:03,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:02:03,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6923 8067 [WARNING|trainer.py:803] 2025-04-26 22:02:04,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6986 [WARNING|trainer.py:803] 2025-04-26 22:02:04,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:05,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6924 [WARNING|trainer.py:803] 2025-04-26 22:02:05,507 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8068 6987 [WARNING|trainer.py:803] 2025-04-26 22:02:06,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:06,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6925 [WARNING|trainer.py:803] 2025-04-26 22:02:06,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6988 8069 [WARNING|trainer.py:803] 2025-04-26 22:02:07,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6926 [WARNING|trainer.py:803] 2025-04-26 22:02:07,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:08,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6989 8070 [WARNING|trainer.py:803] 2025-04-26 22:02:08,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6927 [WARNING|trainer.py:803] 2025-04-26 22:02:09,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:09,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6990 [WARNING|trainer.py:803] 2025-04-26 22:02:10,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8071 6928 [WARNING|trainer.py:803] 2025-04-26 22:02:10,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6991 [WARNING|trainer.py:803] 2025-04-26 22:02:11,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:11,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8072 6929 [WARNING|trainer.py:803] 2025-04-26 22:02:11,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 6992 [WARNING|trainer.py:803] 2025-04-26 22:02:12,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:02:12,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:02:12,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6930 8073 6993 [WARNING|trainer.py:803] 2025-04-26 22:02:13,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:14,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:14,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6931 6994 [WARNING|trainer.py:803] 2025-04-26 22:02:15,057 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8074 [WARNING|trainer.py:803] 2025-04-26 22:02:15,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6932 6995 [WARNING|trainer.py:803] 2025-04-26 22:02:16,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:16,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:16,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6933 8075 6996 [WARNING|trainer.py:803] 2025-04-26 22:02:17,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:17,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:17,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6934 6997 8076 [WARNING|trainer.py:803] 2025-04-26 22:02:18,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:19,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6935 [WARNING|trainer.py:803] 2025-04-26 22:02:19,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6998 8077 [WARNING|trainer.py:803] 2025-04-26 22:02:20,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:20,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6936 [WARNING|trainer.py:803] 2025-04-26 22:02:20,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6999 8078 [WARNING|trainer.py:803] 2025-04-26 22:02:21,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:21,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6937 [WARNING|trainer.py:803] 2025-04-26 22:02:22,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7000 8079 [WARNING|trainer.py:803] 2025-04-26 22:02:22,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:22,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6938 7001 [WARNING|trainer.py:803] 2025-04-26 22:02:23,346 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:23,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8080 [WARNING|trainer.py:803] 2025-04-26 22:02:24,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6939 7002 [WARNING|trainer.py:803] 2025-04-26 22:02:24,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:25,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:02:25,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6940 8081 7003 [WARNING|trainer.py:803] 2025-04-26 22:02:26,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:26,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:02:26,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6941 7004 8082 [WARNING|trainer.py:803] 2025-04-26 22:02:27,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:27,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6942 [WARNING|trainer.py:803] 2025-04-26 22:02:28,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7005 8083 [WARNING|trainer.py:803] 2025-04-26 22:02:28,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6943 [WARNING|trainer.py:803] 2025-04-26 22:02:29,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:29,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7006 [WARNING|trainer.py:803] 2025-04-26 22:02:30,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8084 6944 [WARNING|trainer.py:803] 2025-04-26 22:02:30,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7007 [WARNING|trainer.py:803] 2025-04-26 22:02:30,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:31,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6945 8085 [WARNING|trainer.py:803] 2025-04-26 22:02:31,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7008 [WARNING|trainer.py:803] 2025-04-26 22:02:32,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:32,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6946 [WARNING|trainer.py:803] 2025-04-26 22:02:32,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7009 8086 [WARNING|trainer.py:803] 2025-04-26 22:02:33,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6947 [WARNING|trainer.py:803] 2025-04-26 22:02:34,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:02:34,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7010 [WARNING|trainer.py:803] 2025-04-26 22:02:35,028 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8087 6948 [WARNING|trainer.py:803] 2025-04-26 22:02:35,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7011 [WARNING|trainer.py:803] 2025-04-26 22:02:36,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:36,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8088 6949 [WARNING|trainer.py:803] 2025-04-26 22:02:36,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7012 [WARNING|trainer.py:803] 2025-04-26 22:02:37,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:37,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8089 6950 [WARNING|trainer.py:803] 2025-04-26 22:02:37,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7013 [WARNING|trainer.py:803] 2025-04-26 22:02:38,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:38,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:39,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6951 7014 [WARNING|trainer.py:803] 2025-04-26 22:02:40,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8090 [WARNING|trainer.py:803] 2025-04-26 22:02:40,419 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6952 7015 [WARNING|trainer.py:803] 2025-04-26 22:02:41,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:41,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8091 [WARNING|trainer.py:803] 2025-04-26 22:02:41,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6953 7016 [WARNING|trainer.py:803] 2025-04-26 22:02:42,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:42,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:42,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6954 8092 7017 [WARNING|trainer.py:803] 2025-04-26 22:02:43,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:43,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:44,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 6955 8093 7018 [WARNING|trainer.py:803] 2025-04-26 22:02:45,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:45,412 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6956 [WARNING|trainer.py:803] 2025-04-26 22:02:45,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7019 8094 [WARNING|trainer.py:803] 2025-04-26 22:02:46,301 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6957 [WARNING|trainer.py:803] 2025-04-26 22:02:46,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:46,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7020 8095 [WARNING|trainer.py:803] 2025-04-26 22:02:47,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:47,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6958 [WARNING|trainer.py:803] 2025-04-26 22:02:48,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7021 8096 [WARNING|trainer.py:803] 2025-04-26 22:02:48,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:49,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6959 [WARNING|trainer.py:803] 2025-04-26 22:02:49,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7022 [WARNING|trainer.py:803] 2025-04-26 22:02:50,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8097 [WARNING|trainer.py:803] 2025-04-26 22:02:50,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6960 7023 [WARNING|trainer.py:803] 2025-04-26 22:02:50,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:02:51,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:02:51,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6961 8098 7024 [WARNING|trainer.py:803] 2025-04-26 22:02:52,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:52,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:02:52,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6962 7025 8099 [WARNING|trainer.py:803] 2025-04-26 22:02:53,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:54,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:54,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6963 7026 8100 [WARNING|trainer.py:803] 2025-04-26 22:02:55,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:55,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6964 [WARNING|trainer.py:803] 2025-04-26 22:02:55,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7027 [WARNING|trainer.py:803] 2025-04-26 22:02:56,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8101 [WARNING|trainer.py:803] 2025-04-26 22:02:56,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6965 7028 [WARNING|trainer.py:803] 2025-04-26 22:02:57,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:02:57,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:57,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6966 7029 8102 [WARNING|trainer.py:803] 2025-04-26 22:02:58,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:59,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:02:59,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6967 7030 [WARNING|trainer.py:803] 2025-04-26 22:03:00,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8103 [WARNING|trainer.py:803] 2025-04-26 22:03:00,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 6968 7031 [WARNING|trainer.py:803] 2025-04-26 22:03:00,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:01,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:01,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6969 8104 7032 [WARNING|trainer.py:803] 2025-04-26 22:03:02,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:02,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:02,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6970 7033 8105 [WARNING|trainer.py:803] 2025-04-26 22:03:03,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:04,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6971 [WARNING|trainer.py:803] 2025-04-26 22:03:04,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7034 [WARNING|trainer.py:803] 2025-04-26 22:03:05,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8106 [WARNING|trainer.py:803] 2025-04-26 22:03:05,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6972 7035 [WARNING|trainer.py:803] 2025-04-26 22:03:06,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:06,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6973 [WARNING|trainer.py:803] 2025-04-26 22:03:06,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8107 7036 [WARNING|trainer.py:803] 2025-04-26 22:03:07,575 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:07,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6974 [WARNING|trainer.py:803] 2025-04-26 22:03:08,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7037 [WARNING|trainer.py:803] 2025-04-26 22:03:08,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8108 6975 [WARNING|trainer.py:803] 2025-04-26 22:03:09,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:09,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7038 [WARNING|trainer.py:803] 2025-04-26 22:03:10,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 6976 [WARNING|trainer.py:803] 2025-04-26 22:03:10,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8109 7039 [WARNING|trainer.py:803] 2025-04-26 22:03:11,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:11,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:11,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6977 7040 8110 [WARNING|trainer.py:803] 2025-04-26 22:03:12,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:13,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6978 [WARNING|trainer.py:803] 2025-04-26 22:03:13,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7041 [WARNING|trainer.py:803] 2025-04-26 22:03:14,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8111 [WARNING|trainer.py:803] 2025-04-26 22:03:14,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 6979 7042 [WARNING|trainer.py:803] 2025-04-26 22:03:14,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:15,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:15,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6980 8112 7043 [WARNING|trainer.py:803] 2025-04-26 22:03:16,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:16,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:16,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6981 7044 8113 [WARNING|trainer.py:803] 2025-04-26 22:03:17,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:18,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6982 [WARNING|trainer.py:803] 2025-04-26 22:03:18,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7045 [WARNING|trainer.py:803] 2025-04-26 22:03:19,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6983 [WARNING|trainer.py:803] 2025-04-26 22:03:19,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8114 7046 [WARNING|trainer.py:803] 2025-04-26 22:03:20,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:20,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 6984 [WARNING|trainer.py:803] 2025-04-26 22:03:20,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7047 8115 [WARNING|trainer.py:803] 2025-04-26 22:03:21,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6985 [WARNING|trainer.py:803] 2025-04-26 22:03:21,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:22,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7048 [WARNING|trainer.py:803] 2025-04-26 22:03:22,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6986 [WARNING|trainer.py:803] 2025-04-26 22:03:23,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8116 7049 [WARNING|trainer.py:803] 2025-04-26 22:03:23,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:24,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6987 [WARNING|trainer.py:803] 2025-04-26 22:03:24,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7050 8117 [WARNING|trainer.py:803] 2025-04-26 22:03:25,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6988 [WARNING|trainer.py:803] 2025-04-26 22:03:25,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:25,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7051 [WARNING|trainer.py:803] 2025-04-26 22:03:26,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6989 [WARNING|trainer.py:803] 2025-04-26 22:03:26,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8118 7052 [WARNING|trainer.py:803] 2025-04-26 22:03:27,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:27,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 6990 [WARNING|trainer.py:803] 2025-04-26 22:03:28,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7053 8119 [WARNING|trainer.py:803] 2025-04-26 22:03:28,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6991 [WARNING|trainer.py:803] 2025-04-26 22:03:29,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:29,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7054 [WARNING|trainer.py:803] 2025-04-26 22:03:30,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8120 6992 [WARNING|trainer.py:803] 2025-04-26 22:03:30,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7055 [WARNING|trainer.py:803] 2025-04-26 22:03:31,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:31,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6993 [WARNING|trainer.py:803] 2025-04-26 22:03:31,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8121 7056 [WARNING|trainer.py:803] 2025-04-26 22:03:32,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6994 [WARNING|trainer.py:803] 2025-04-26 22:03:33,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:33,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7057 [WARNING|trainer.py:803] 2025-04-26 22:03:33,904 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8122 6995 [WARNING|trainer.py:803] 2025-04-26 22:03:34,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7058 [WARNING|trainer.py:803] 2025-04-26 22:03:34,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:35,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6996 [WARNING|trainer.py:803] 2025-04-26 22:03:35,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8123 7059 [WARNING|trainer.py:803] 2025-04-26 22:03:36,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:36,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 6997 [WARNING|trainer.py:803] 2025-04-26 22:03:36,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7060 [WARNING|trainer.py:803] 2025-04-26 22:03:37,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8124 6998 [WARNING|trainer.py:803] 2025-04-26 22:03:38,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7061 [WARNING|trainer.py:803] 2025-04-26 22:03:38,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:38,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 6999 [WARNING|trainer.py:803] 2025-04-26 22:03:39,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8125 7062 [WARNING|trainer.py:803] 2025-04-26 22:03:40,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:40,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7000 [WARNING|trainer.py:803] 2025-04-26 22:03:40,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7063 8126 [WARNING|trainer.py:803] 2025-04-26 22:03:41,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7001 [WARNING|trainer.py:803] 2025-04-26 22:03:41,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:03:42,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7064 [WARNING|trainer.py:803] 2025-04-26 22:03:42,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8127 7002 [WARNING|trainer.py:803] 2025-04-26 22:03:43,159 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7065 [WARNING|trainer.py:803] 2025-04-26 22:03:43,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:43,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7003 [WARNING|trainer.py:803] 2025-04-26 22:03:44,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7066 8128 [WARNING|trainer.py:803] 2025-04-26 22:03:45,128 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7004 [WARNING|trainer.py:803] 2025-04-26 22:03:45,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:45,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7067 [WARNING|trainer.py:803] 2025-04-26 22:03:46,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8129 7005 [WARNING|trainer.py:803] 2025-04-26 22:03:46,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7068 [WARNING|trainer.py:803] 2025-04-26 22:03:47,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:47,709 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:48,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7006 7069 8130 [WARNING|trainer.py:803] 2025-04-26 22:03:48,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:49,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7007 [WARNING|trainer.py:803] 2025-04-26 22:03:49,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7070 [WARNING|trainer.py:803] 2025-04-26 22:03:50,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8131 [WARNING|trainer.py:803] 2025-04-26 22:03:50,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7008 7071 [WARNING|trainer.py:803] 2025-04-26 22:03:51,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:03:51,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:51,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7009 7072 8132 [WARNING|trainer.py:803] 2025-04-26 22:03:52,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:03:53,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:53,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7010 7073 [WARNING|trainer.py:803] 2025-04-26 22:03:54,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:54,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8133 7011 7074 [WARNING|trainer.py:803] 2025-04-26 22:03:55,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:03:55,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:55,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7012 7075 8134 [WARNING|trainer.py:803] 2025-04-26 22:03:56,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:56,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:03:56,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7013 7076 [WARNING|trainer.py:803] 2025-04-26 22:03:57,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8135 [WARNING|trainer.py:803] 2025-04-26 22:03:58,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7014 7077 [WARNING|trainer.py:803] 2025-04-26 22:03:58,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:03:59,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:03:59,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7015 8136 7078 [WARNING|trainer.py:803] 2025-04-26 22:04:00,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:00,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:04:00,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7016 7079 8137 [WARNING|trainer.py:803] 2025-04-26 22:04:01,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:01,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7017 7080 [WARNING|trainer.py:803] 2025-04-26 22:04:02,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:02,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 22:04:03,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8138 7018 7081 [WARNING|trainer.py:803] 2025-04-26 22:04:04,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:04,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:04,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7019 7082 8139 [WARNING|trainer.py:803] 2025-04-26 22:04:05,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:05,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:05,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7020 7083 8140 [WARNING|trainer.py:803] 2025-04-26 22:04:06,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:04:06,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7021 7084 [WARNING|trainer.py:803] 2025-04-26 22:04:07,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:07,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:08,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes 7022 7085 8141 [WARNING|trainer.py:803] 2025-04-26 22:04:09,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:09,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:09,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7023 7086 8142 [WARNING|trainer.py:803] 2025-04-26 22:04:10,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:10,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7024 7087 [WARNING|trainer.py:803] 2025-04-26 22:04:11,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:11,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:11,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7025 7088 8143 [WARNING|trainer.py:803] 2025-04-26 22:04:12,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:13,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:13,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7026 7089 8144 [WARNING|trainer.py:803] 2025-04-26 22:04:14,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:04:14,290 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7027 7090 [WARNING|trainer.py:803] 2025-04-26 22:04:14,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:15,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:15,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8145 7091 7028 [WARNING|trainer.py:803] 2025-04-26 22:04:16,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:16,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:16,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7092 7029 8146 [WARNING|trainer.py:803] 2025-04-26 22:04:18,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:18,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7093 [WARNING|trainer.py:803] 2025-04-26 22:04:18,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7030 [WARNING|trainer.py:803] 2025-04-26 22:04:19,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:19,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8147 7094 7031 [WARNING|trainer.py:803] 2025-04-26 22:04:20,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:20,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:20,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7095 7032 8148 [WARNING|trainer.py:803] 2025-04-26 22:04:21,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:21,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:22,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7096 7033 [WARNING|trainer.py:803] 2025-04-26 22:04:22,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8149 [WARNING|trainer.py:803] 2025-04-26 22:04:23,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7097 7034 [WARNING|trainer.py:803] 2025-04-26 22:04:23,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:24,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:24,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7098 7035 8150 [WARNING|trainer.py:803] 2025-04-26 22:04:25,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:25,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:25,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7099 7036 [WARNING|trainer.py:803] 2025-04-26 22:04:26,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8151 [WARNING|trainer.py:803] 2025-04-26 22:04:26,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7100 7037 [WARNING|trainer.py:803] 2025-04-26 22:04:27,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:27,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:28,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7101 8152 7038 [WARNING|trainer.py:803] 2025-04-26 22:04:29,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:29,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:29,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7102 7039 8153 [WARNING|trainer.py:803] 2025-04-26 22:04:30,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:30,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7103 7040 [WARNING|trainer.py:803] 2025-04-26 22:04:31,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:31,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7104 [WARNING|trainer.py:803] 2025-04-26 22:04:32,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8154 7041 [WARNING|trainer.py:803] 2025-04-26 22:04:32,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:32,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7105 [WARNING|trainer.py:803] 2025-04-26 22:04:33,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7042 8155 [WARNING|trainer.py:803] 2025-04-26 22:04:34,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7106 [WARNING|trainer.py:803] 2025-04-26 22:04:34,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:04:34,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7043 [WARNING|trainer.py:803] 2025-04-26 22:04:35,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8156 7107 [WARNING|trainer.py:803] 2025-04-26 22:04:35,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7044 [WARNING|trainer.py:803] 2025-04-26 22:04:36,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:36,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7108 [WARNING|trainer.py:803] 2025-04-26 22:04:37,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8157 7045 [WARNING|trainer.py:803] 2025-04-26 22:04:37,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7109 [WARNING|trainer.py:803] 2025-04-26 22:04:38,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:38,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7046 [WARNING|trainer.py:803] 2025-04-26 22:04:38,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8158 7110 [WARNING|trainer.py:803] 2025-04-26 22:04:39,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:39,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7047 [WARNING|trainer.py:803] 2025-04-26 22:04:40,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7111 [WARNING|trainer.py:803] 2025-04-26 22:04:40,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8159 7048 [WARNING|trainer.py:803] 2025-04-26 22:04:41,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:41,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7112 [WARNING|trainer.py:803] 2025-04-26 22:04:42,064 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7049 8160 [WARNING|trainer.py:803] 2025-04-26 22:04:42,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7113 [WARNING|trainer.py:803] 2025-04-26 22:04:43,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:43,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7050 [WARNING|trainer.py:803] 2025-04-26 22:04:43,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7114 8161 [WARNING|trainer.py:803] 2025-04-26 22:04:44,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7051 [WARNING|trainer.py:803] 2025-04-26 22:04:45,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:45,189 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7115 [WARNING|trainer.py:803] 2025-04-26 22:04:45,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7052 8162 [WARNING|trainer.py:803] 2025-04-26 22:04:46,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7116 [WARNING|trainer.py:803] 2025-04-26 22:04:47,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:47,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7053 [WARNING|trainer.py:803] 2025-04-26 22:04:47,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8163 7117 [WARNING|trainer.py:803] 2025-04-26 22:04:48,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7054 [WARNING|trainer.py:803] 2025-04-26 22:04:48,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:48,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7118 [WARNING|trainer.py:803] 2025-04-26 22:04:49,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8164 7055 [WARNING|trainer.py:803] 2025-04-26 22:04:50,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7119 [WARNING|trainer.py:803] 2025-04-26 22:04:50,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:04:50,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7056 [WARNING|trainer.py:803] 2025-04-26 22:04:51,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8165 7120 [WARNING|trainer.py:803] 2025-04-26 22:04:52,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:04:52,424 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7057 [WARNING|trainer.py:803] 2025-04-26 22:04:52,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7121 8166 [WARNING|trainer.py:803] 2025-04-26 22:04:53,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7058 [WARNING|trainer.py:803] 2025-04-26 22:04:53,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:54,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7122 [WARNING|trainer.py:803] 2025-04-26 22:04:54,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8167 7059 [WARNING|trainer.py:803] 2025-04-26 22:04:54,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7123 [WARNING|trainer.py:803] 2025-04-26 22:04:55,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:55,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:56,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7060 7124 8168 [WARNING|trainer.py:803] 2025-04-26 22:04:57,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:57,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7061 [WARNING|trainer.py:803] 2025-04-26 22:04:57,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7125 [WARNING|trainer.py:803] 2025-04-26 22:04:58,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8169 [WARNING|trainer.py:803] 2025-04-26 22:04:58,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7062 7126 [WARNING|trainer.py:803] 2025-04-26 22:04:59,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:04:59,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:04:59,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7063 7127 8170 [WARNING|trainer.py:803] 2025-04-26 22:05:00,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:01,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7064 [WARNING|trainer.py:803] 2025-04-26 22:05:01,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7128 [WARNING|trainer.py:803] 2025-04-26 22:05:02,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8171 [WARNING|trainer.py:803] 2025-04-26 22:05:02,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7065 7129 [WARNING|trainer.py:803] 2025-04-26 22:05:03,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:05:03,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:03,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7066 7130 8172 [WARNING|trainer.py:803] 2025-04-26 22:05:04,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:04,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7067 [WARNING|trainer.py:803] 2025-04-26 22:05:05,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7131 [WARNING|trainer.py:803] 2025-04-26 22:05:05,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8173 [WARNING|trainer.py:803] 2025-04-26 22:05:06,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7068 7132 [WARNING|trainer.py:803] 2025-04-26 22:05:06,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:07,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:07,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7069 8174 7133 [WARNING|trainer.py:803] 2025-04-26 22:05:08,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:08,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:08,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7070 7134 [WARNING|trainer.py:803] 2025-04-26 22:05:09,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8175 [WARNING|trainer.py:803] 2025-04-26 22:05:09,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7071 7135 [WARNING|trainer.py:803] 2025-04-26 22:05:10,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:10,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:11,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7072 8176 7136 [WARNING|trainer.py:803] 2025-04-26 22:05:12,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:12,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:12,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7073 7137 8177 [WARNING|trainer.py:803] 2025-04-26 22:05:13,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:13,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7074 7138 [WARNING|trainer.py:803] 2025-04-26 22:05:14,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:14,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:14,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8178 7075 7139 [WARNING|trainer.py:803] 2025-04-26 22:05:15,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:15,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:15,996 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7076 7140 8179 [WARNING|trainer.py:803] 2025-04-26 22:05:17,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:17,236 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:17,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7077 7141 [WARNING|trainer.py:803] 2025-04-26 22:05:18,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8180 [WARNING|trainer.py:803] 2025-04-26 22:05:18,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7078 7142 [WARNING|trainer.py:803] 2025-04-26 22:05:19,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:19,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:19,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7079 7143 8181 [WARNING|trainer.py:803] 2025-04-26 22:05:20,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:20,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:21,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7080 7144 8182 [WARNING|trainer.py:803] 2025-04-26 22:05:22,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:22,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7081 7145 [WARNING|trainer.py:803] 2025-04-26 22:05:22,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:05:23,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:23,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8183 7082 7146 [WARNING|trainer.py:803] 2025-04-26 22:05:24,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:24,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:24,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7083 7147 8184 [WARNING|trainer.py:803] 2025-04-26 22:05:26,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:26,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7084 7148 [WARNING|trainer.py:803] 2025-04-26 22:05:26,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:27,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerYes [WARNING|trainer.py:803] 2025-04-26 22:05:27,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8185 7085 7149 [WARNING|trainer.py:803] 2025-04-26 22:05:28,362 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:28,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:28,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7086 7150 8186 [WARNING|trainer.py:803] 2025-04-26 22:05:29,793 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:29,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7087 7151 [WARNING|trainer.py:803] 2025-04-26 22:05:30,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:31,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:31,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7088 8187 7152 [WARNING|trainer.py:803] 2025-04-26 22:05:32,296 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:32,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:32,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7089 7153 8188 [WARNING|trainer.py:803] 2025-04-26 22:05:33,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:33,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7090 [WARNING|trainer.py:803] 2025-04-26 22:05:34,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7154 8189 [WARNING|trainer.py:803] 2025-04-26 22:05:34,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:34,886 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7091 7155 [WARNING|trainer.py:803] 2025-04-26 22:05:35,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:36,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:36,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8190 7092 7156 [WARNING|trainer.py:803] 2025-04-26 22:05:37,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:37,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:37,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7093 7157 8191 [WARNING|trainer.py:803] 2025-04-26 22:05:38,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:38,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7094 7158 [WARNING|trainer.py:803] 2025-04-26 22:05:39,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:39,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:39,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7095 8192 7159 [WARNING|trainer.py:803] 2025-04-26 22:05:41,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:41,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:05:41,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7096 7160 8193 [WARNING|trainer.py:803] 2025-04-26 22:05:42,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:42,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7097 7161 [WARNING|trainer.py:803] 2025-04-26 22:05:42,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:43,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:43,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8194 7098 7162 [WARNING|trainer.py:803] 2025-04-26 22:05:44,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:44,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:45,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7099 7163 8195 [WARNING|trainer.py:803] 2025-04-26 22:05:46,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:46,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:46,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7100 7164 [WARNING|trainer.py:803] 2025-04-26 22:05:47,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8196 [WARNING|trainer.py:803] 2025-04-26 22:05:47,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7101 7165 [WARNING|trainer.py:803] 2025-04-26 22:05:48,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:05:48,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:48,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7102 8197 7166 [WARNING|trainer.py:803] 2025-04-26 22:05:49,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:50,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:50,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7103 7167 [WARNING|trainer.py:803] 2025-04-26 22:05:51,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:51,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8198 7104 7168 [WARNING|trainer.py:803] 2025-04-26 22:05:52,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:52,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:52,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7105 7169 8199 [WARNING|trainer.py:803] 2025-04-26 22:05:53,763 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:53,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:54,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7170 7106 8200 [WARNING|trainer.py:803] 2025-04-26 22:05:55,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:55,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7171 7107 [WARNING|trainer.py:803] 2025-04-26 22:05:55,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:56,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:05:56,324 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7172 7108 8201 [WARNING|trainer.py:803] 2025-04-26 22:05:57,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:05:57,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:05:57,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7173 7109 [WARNING|trainer.py:803] 2025-04-26 22:05:58,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8202 [WARNING|trainer.py:803] 2025-04-26 22:05:58,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7174 7110 [WARNING|trainer.py:803] 2025-04-26 22:05:59,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:05:59,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:00,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7175 7111 8203 [WARNING|trainer.py:803] 2025-04-26 22:06:01,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:01,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:01,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7176 7112 [WARNING|trainer.py:803] 2025-04-26 22:06:02,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8204 [WARNING|trainer.py:803] 2025-04-26 22:06:02,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7177 7113 [WARNING|trainer.py:803] 2025-04-26 22:06:03,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:03,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:03,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7178 8205 7114 [WARNING|trainer.py:803] 2025-04-26 22:06:04,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:05,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:05,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7179 7115 8206 [WARNING|trainer.py:803] 2025-04-26 22:06:06,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:06,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7180 7116 [WARNING|trainer.py:803] 2025-04-26 22:06:06,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:06:07,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:07,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7181 8207 7117 [WARNING|trainer.py:803] 2025-04-26 22:06:08,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:08,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:06:08,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7182 7118 8208 [WARNING|trainer.py:803] 2025-04-26 22:06:09,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7183 [WARNING|trainer.py:803] 2025-04-26 22:06:10,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:10,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7119 [WARNING|trainer.py:803] 2025-04-26 22:06:11,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7184 [WARNING|trainer.py:803] 2025-04-26 22:06:11,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8209 7120 [WARNING|trainer.py:803] 2025-04-26 22:06:12,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:12,335 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7185 [WARNING|trainer.py:803] 2025-04-26 22:06:12,737 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7121 8210 [WARNING|trainer.py:803] 2025-04-26 22:06:13,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7186 [WARNING|trainer.py:803] 2025-04-26 22:06:13,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:14,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7122 [WARNING|trainer.py:803] 2025-04-26 22:06:14,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8211 7187 [WARNING|trainer.py:803] 2025-04-26 22:06:15,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7123 [WARNING|trainer.py:803] 2025-04-26 22:06:15,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:15,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7188 [WARNING|trainer.py:803] 2025-04-26 22:06:16,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8212 7124 [WARNING|trainer.py:803] 2025-04-26 22:06:17,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:17,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7189 [WARNING|trainer.py:803] 2025-04-26 22:06:17,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7125 [WARNING|trainer.py:803] 2025-04-26 22:06:18,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8213 7190 [WARNING|trainer.py:803] 2025-04-26 22:06:19,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:19,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7126 [WARNING|trainer.py:803] 2025-04-26 22:06:19,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7191 [WARNING|trainer.py:803] 2025-04-26 22:06:20,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8214 7127 [WARNING|trainer.py:803] 2025-04-26 22:06:20,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:21,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7192 [WARNING|trainer.py:803] 2025-04-26 22:06:21,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7128 [WARNING|trainer.py:803] 2025-04-26 22:06:22,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8215 7193 [WARNING|trainer.py:803] 2025-04-26 22:06:22,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:22,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7129 [WARNING|trainer.py:803] 2025-04-26 22:06:23,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7194 8216 [WARNING|trainer.py:803] 2025-04-26 22:06:24,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7130 [WARNING|trainer.py:803] 2025-04-26 22:06:24,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:24,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7195 [WARNING|trainer.py:803] 2025-04-26 22:06:25,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7131 [WARNING|trainer.py:803] 2025-04-26 22:06:25,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8217 7196 [WARNING|trainer.py:803] 2025-04-26 22:06:26,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:26,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7132 [WARNING|trainer.py:803] 2025-04-26 22:06:27,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7197 [WARNING|trainer.py:803] 2025-04-26 22:06:27,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8218 7133 [WARNING|trainer.py:803] 2025-04-26 22:06:28,256 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7198 [WARNING|trainer.py:803] 2025-04-26 22:06:28,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:29,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7134 [WARNING|trainer.py:803] 2025-04-26 22:06:29,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8219 7199 [WARNING|trainer.py:803] 2025-04-26 22:06:30,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:30,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:30,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7135 7200 [WARNING|trainer.py:803] 2025-04-26 22:06:31,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8220 [WARNING|trainer.py:803] 2025-04-26 22:06:31,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7136 7201 [WARNING|trainer.py:803] 2025-04-26 22:06:32,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:32,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:33,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7137 8221 7202 [WARNING|trainer.py:803] 2025-04-26 22:06:34,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:34,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:34,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7138 7203 [WARNING|trainer.py:803] 2025-04-26 22:06:35,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8222 [WARNING|trainer.py:803] 2025-04-26 22:06:35,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7139 7204 [WARNING|trainer.py:803] 2025-04-26 22:06:36,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:06:36,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:36,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7140 8223 7205 [WARNING|trainer.py:803] 2025-04-26 22:06:37,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:38,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:06:38,246 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7141 7206 8224 [WARNING|trainer.py:803] 2025-04-26 22:06:39,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:39,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7142 7207 [WARNING|trainer.py:803] 2025-04-26 22:06:39,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:40,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:40,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7143 8225 7208 [WARNING|trainer.py:803] 2025-04-26 22:06:41,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:41,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:42,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7144 7209 8226 [WARNING|trainer.py:803] 2025-04-26 22:06:42,915 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:43,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7145 7210 [WARNING|trainer.py:803] 2025-04-26 22:06:43,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:44,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:44,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7146 8227 7211 [WARNING|trainer.py:803] 2025-04-26 22:06:45,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:45,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:45,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7147 7212 8228 [WARNING|trainer.py:803] 2025-04-26 22:06:46,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:46,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7148 [WARNING|trainer.py:803] 2025-04-26 22:06:47,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7213 [WARNING|trainer.py:803] 2025-04-26 22:06:47,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:48,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7149 8229 7214 [WARNING|trainer.py:803] 2025-04-26 22:06:49,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:49,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:49,497 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7150 7215 8230 [WARNING|trainer.py:803] 2025-04-26 22:06:50,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:50,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7151 [WARNING|trainer.py:803] 2025-04-26 22:06:51,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7216 [WARNING|trainer.py:803] 2025-04-26 22:06:51,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8231 [WARNING|trainer.py:803] 2025-04-26 22:06:51,981 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7152 7217 [WARNING|trainer.py:803] 2025-04-26 22:06:52,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:06:52,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:53,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7153 7218 8232 [WARNING|trainer.py:803] 2025-04-26 22:06:54,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:06:54,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7154 [WARNING|trainer.py:803] 2025-04-26 22:06:54,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7219 [WARNING|trainer.py:803] 2025-04-26 22:06:55,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8233 7155 [WARNING|trainer.py:803] 2025-04-26 22:06:55,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7220 [WARNING|trainer.py:803] 2025-04-26 22:06:56,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:06:56,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7156 [WARNING|trainer.py:803] 2025-04-26 22:06:56,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8234 7221 [WARNING|trainer.py:803] 2025-04-26 22:06:57,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7157 [WARNING|trainer.py:803] 2025-04-26 22:06:58,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:06:58,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7222 [WARNING|trainer.py:803] 2025-04-26 22:06:59,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8235 7158 [WARNING|trainer.py:803] 2025-04-26 22:06:59,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7223 [WARNING|trainer.py:803] 2025-04-26 22:07:00,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:07:00,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7159 [WARNING|trainer.py:803] 2025-04-26 22:07:00,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7224 8236 [WARNING|trainer.py:803] 2025-04-26 22:07:01,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7160 [WARNING|trainer.py:803] 2025-04-26 22:07:01,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:02,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7225 [WARNING|trainer.py:803] 2025-04-26 22:07:02,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8237 7161 [WARNING|trainer.py:803] 2025-04-26 22:07:03,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7226 [WARNING|trainer.py:803] 2025-04-26 22:07:03,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:04,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7162 [WARNING|trainer.py:803] 2025-04-26 22:07:04,470 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8238 7227 [WARNING|trainer.py:803] 2025-04-26 22:07:05,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7163 [WARNING|trainer.py:803] 2025-04-26 22:07:05,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:05,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7228 [WARNING|trainer.py:803] 2025-04-26 22:07:06,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8239 7164 [WARNING|trainer.py:803] 2025-04-26 22:07:06,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7229 [WARNING|trainer.py:803] 2025-04-26 22:07:07,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:07:07,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7165 [WARNING|trainer.py:803] 2025-04-26 22:07:08,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8240 7230 [WARNING|trainer.py:803] 2025-04-26 22:07:08,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:09,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7166 [WARNING|trainer.py:803] 2025-04-26 22:07:09,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7231 [WARNING|trainer.py:803] 2025-04-26 22:07:10,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8241 7167 [WARNING|trainer.py:803] 2025-04-26 22:07:10,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7232 [WARNING|trainer.py:803] 2025-04-26 22:07:11,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:11,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7168 [WARNING|trainer.py:803] 2025-04-26 22:07:11,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8242 7233 [WARNING|trainer.py:803] 2025-04-26 22:07:12,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:12,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7169 [WARNING|trainer.py:803] 2025-04-26 22:07:13,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7234 8243 [WARNING|trainer.py:803] 2025-04-26 22:07:13,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7170 [WARNING|trainer.py:803] 2025-04-26 22:07:14,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:07:14,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7235 [WARNING|trainer.py:803] 2025-04-26 22:07:15,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7171 8244 [WARNING|trainer.py:803] 2025-04-26 22:07:15,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7236 [WARNING|trainer.py:803] 2025-04-26 22:07:16,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:16,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7172 [WARNING|trainer.py:803] 2025-04-26 22:07:16,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7237 8245 [WARNING|trainer.py:803] 2025-04-26 22:07:17,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7173 [WARNING|trainer.py:803] 2025-04-26 22:07:18,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:18,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7238 [WARNING|trainer.py:803] 2025-04-26 22:07:18,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8246 7174 [WARNING|trainer.py:803] 2025-04-26 22:07:19,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:19,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7239 [WARNING|trainer.py:803] 2025-04-26 22:07:20,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7175 [WARNING|trainer.py:803] 2025-04-26 22:07:20,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8247 7240 [WARNING|trainer.py:803] 2025-04-26 22:07:21,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7176 [WARNING|trainer.py:803] 2025-04-26 22:07:21,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:21,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7241 [WARNING|trainer.py:803] 2025-04-26 22:07:22,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8248 7177 [WARNING|trainer.py:803] 2025-04-26 22:07:23,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7242 [WARNING|trainer.py:803] 2025-04-26 22:07:23,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:07:23,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7178 [WARNING|trainer.py:803] 2025-04-26 22:07:24,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8249 7243 [WARNING|trainer.py:803] 2025-04-26 22:07:25,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:25,358 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7179 [WARNING|trainer.py:803] 2025-04-26 22:07:25,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7244 [WARNING|trainer.py:803] 2025-04-26 22:07:26,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8250 7180 [WARNING|trainer.py:803] 2025-04-26 22:07:26,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:27,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7245 [WARNING|trainer.py:803] 2025-04-26 22:07:27,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7181 [WARNING|trainer.py:803] 2025-04-26 22:07:28,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8251 7246 [WARNING|trainer.py:803] 2025-04-26 22:07:28,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:29,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7182 [WARNING|trainer.py:803] 2025-04-26 22:07:29,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7247 [WARNING|trainer.py:803] 2025-04-26 22:07:29,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8252 7183 [WARNING|trainer.py:803] 2025-04-26 22:07:30,596 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7248 [WARNING|trainer.py:803] 2025-04-26 22:07:31,061 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:31,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7184 [WARNING|trainer.py:803] 2025-04-26 22:07:31,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8253 7249 [WARNING|trainer.py:803] 2025-04-26 22:07:32,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7185 [WARNING|trainer.py:803] 2025-04-26 22:07:32,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:33,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7250 [WARNING|trainer.py:803] 2025-04-26 22:07:33,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8254 7186 [WARNING|trainer.py:803] 2025-04-26 22:07:34,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:34,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7251 [WARNING|trainer.py:803] 2025-04-26 22:07:34,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7187 [WARNING|trainer.py:803] 2025-04-26 22:07:35,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8255 7252 [WARNING|trainer.py:803] 2025-04-26 22:07:36,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7188 [WARNING|trainer.py:803] 2025-04-26 22:07:36,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:36,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7253 [WARNING|trainer.py:803] 2025-04-26 22:07:37,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8256 7189 [WARNING|trainer.py:803] 2025-04-26 22:07:38,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:38,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7254 [WARNING|trainer.py:803] 2025-04-26 22:07:38,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7190 [WARNING|trainer.py:803] 2025-04-26 22:07:39,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8257 7255 [WARNING|trainer.py:803] 2025-04-26 22:07:39,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:40,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7191 [WARNING|trainer.py:803] 2025-04-26 22:07:40,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7256 [WARNING|trainer.py:803] 2025-04-26 22:07:41,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8258 7192 [WARNING|trainer.py:803] 2025-04-26 22:07:41,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7257 [WARNING|trainer.py:803] 2025-04-26 22:07:42,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:07:42,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7193 [WARNING|trainer.py:803] 2025-04-26 22:07:43,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8259 7258 [WARNING|trainer.py:803] 2025-04-26 22:07:43,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:43,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7194 [WARNING|trainer.py:803] 2025-04-26 22:07:44,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7259 [WARNING|trainer.py:803] 2025-04-26 22:07:44,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8260 7195 [WARNING|trainer.py:803] 2025-04-26 22:07:45,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7260 [WARNING|trainer.py:803] 2025-04-26 22:07:45,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:07:46,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7196 [WARNING|trainer.py:803] 2025-04-26 22:07:46,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8261 7261 [WARNING|trainer.py:803] 2025-04-26 22:07:47,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:47,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7197 [WARNING|trainer.py:803] 2025-04-26 22:07:47,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7262 8262 [WARNING|trainer.py:803] 2025-04-26 22:07:48,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7198 [WARNING|trainer.py:803] 2025-04-26 22:07:49,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:49,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7263 [WARNING|trainer.py:803] 2025-04-26 22:07:49,920 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7199 [WARNING|trainer.py:803] 2025-04-26 22:07:50,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8263 7264 [WARNING|trainer.py:803] 2025-04-26 22:07:51,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:07:51,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7200 [WARNING|trainer.py:803] 2025-04-26 22:07:51,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7265 [WARNING|trainer.py:803] 2025-04-26 22:07:52,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8264 7201 [WARNING|trainer.py:803] 2025-04-26 22:07:52,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:07:53,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7266 [WARNING|trainer.py:803] 2025-04-26 22:07:53,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7202 [WARNING|trainer.py:803] 2025-04-26 22:07:54,108 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8265 7267 [WARNING|trainer.py:803] 2025-04-26 22:07:54,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:07:55,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7203 [WARNING|trainer.py:803] 2025-04-26 22:07:55,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7268 8266 [WARNING|trainer.py:803] 2025-04-26 22:07:56,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7204 [WARNING|trainer.py:803] 2025-04-26 22:07:56,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:56,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7269 [WARNING|trainer.py:803] 2025-04-26 22:07:57,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7205 [WARNING|trainer.py:803] 2025-04-26 22:07:57,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8267 7270 [WARNING|trainer.py:803] 2025-04-26 22:07:58,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:07:58,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7206 [WARNING|trainer.py:803] 2025-04-26 22:07:58,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7271 8268 [WARNING|trainer.py:803] 2025-04-26 22:07:59,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7207 [WARNING|trainer.py:803] 2025-04-26 22:08:00,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:08:00,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7272 [WARNING|trainer.py:803] 2025-04-26 22:08:00,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7208 [WARNING|trainer.py:803] 2025-04-26 22:08:01,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8269 7273 [WARNING|trainer.py:803] 2025-04-26 22:08:02,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:02,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7209 [WARNING|trainer.py:803] 2025-04-26 22:08:02,667 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7274 8270 [WARNING|trainer.py:803] 2025-04-26 22:08:03,512 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:03,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7210 [WARNING|trainer.py:803] 2025-04-26 22:08:04,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7275 [WARNING|trainer.py:803] 2025-04-26 22:08:04,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:05,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8271 7211 7276 [WARNING|trainer.py:803] 2025-04-26 22:08:05,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. [WARNING|trainer.py:803] 2025-04-26 22:08:05,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes NoNo [WARNING|trainer.py:803] 2025-04-26 22:08:06,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7212 7277 8272 [WARNING|trainer.py:803] 2025-04-26 22:08:07,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:08:07,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7213 [WARNING|trainer.py:803] 2025-04-26 22:08:07,723 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7278 [WARNING|trainer.py:803] 2025-04-26 22:08:08,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8273 [WARNING|trainer.py:803] 2025-04-26 22:08:08,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7214 7279 [WARNING|trainer.py:803] 2025-04-26 22:08:09,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:09,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:10,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7215 8274 7280 [WARNING|trainer.py:803] 2025-04-26 22:08:10,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:11,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:08:11,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7216 7281 8275 [WARNING|trainer.py:803] 2025-04-26 22:08:12,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:12,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7217 [WARNING|trainer.py:803] 2025-04-26 22:08:12,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7282 [WARNING|trainer.py:803] 2025-04-26 22:08:13,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:13,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7218 8276 7283 [WARNING|trainer.py:803] 2025-04-26 22:08:14,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:14,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:15,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7219 7284 8277 [WARNING|trainer.py:803] 2025-04-26 22:08:15,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:16,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7220 7285 [WARNING|trainer.py:803] 2025-04-26 22:08:16,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:17,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:17,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7221 8278 7286 [WARNING|trainer.py:803] 2025-04-26 22:08:18,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:08:18,624 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:18,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7222 7287 8279 [WARNING|trainer.py:803] 2025-04-26 22:08:19,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:19,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7223 7288 [WARNING|trainer.py:803] 2025-04-26 22:08:20,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:20,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:08:21,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7224 8280 7289 [WARNING|trainer.py:803] 2025-04-26 22:08:22,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:22,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:22,456 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7225 7290 8281 [WARNING|trainer.py:803] 2025-04-26 22:08:23,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:23,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7226 7291 [WARNING|trainer.py:803] 2025-04-26 22:08:24,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:24,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:24,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7227 8282 7292 [WARNING|trainer.py:803] 2025-04-26 22:08:25,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:25,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:26,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7228 7293 8283 [WARNING|trainer.py:803] 2025-04-26 22:08:27,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:08:27,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7229 7294 [WARNING|trainer.py:803] 2025-04-26 22:08:27,919 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:08:28,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:28,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7230 8284 7295 [WARNING|trainer.py:803] 2025-04-26 22:08:29,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:29,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7231 [WARNING|trainer.py:803] 2025-04-26 22:08:29,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7296 8285 [WARNING|trainer.py:803] 2025-04-26 22:08:30,796 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7232 [WARNING|trainer.py:803] 2025-04-26 22:08:31,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:08:31,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7297 [WARNING|trainer.py:803] 2025-04-26 22:08:32,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7233 [WARNING|trainer.py:803] 2025-04-26 22:08:32,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8286 7298 [WARNING|trainer.py:803] 2025-04-26 22:08:33,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:08:33,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7234 [WARNING|trainer.py:803] 2025-04-26 22:08:33,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7299 8287 [WARNING|trainer.py:803] 2025-04-26 22:08:34,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7235 [WARNING|trainer.py:803] 2025-04-26 22:08:34,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:35,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7300 [WARNING|trainer.py:803] 2025-04-26 22:08:35,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7236 [WARNING|trainer.py:803] 2025-04-26 22:08:36,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8288 7301 [WARNING|trainer.py:803] 2025-04-26 22:08:36,951 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:37,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7237 [WARNING|trainer.py:803] 2025-04-26 22:08:37,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7302 8289 [WARNING|trainer.py:803] 2025-04-26 22:08:38,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7238 [WARNING|trainer.py:803] 2025-04-26 22:08:38,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:39,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7303 [WARNING|trainer.py:803] 2025-04-26 22:08:39,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7239 8290 [WARNING|trainer.py:803] 2025-04-26 22:08:40,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 7304 [WARNING|trainer.py:803] 2025-04-26 22:08:40,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:40,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7240 [WARNING|trainer.py:803] 2025-04-26 22:08:41,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8291 7305 [WARNING|trainer.py:803] 2025-04-26 22:08:41,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7241 [WARNING|trainer.py:803] 2025-04-26 22:08:42,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:08:42,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7306 [WARNING|trainer.py:803] 2025-04-26 22:08:43,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7242 8292 [WARNING|trainer.py:803] 2025-04-26 22:08:43,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7307 [WARNING|trainer.py:803] 2025-04-26 22:08:44,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:08:44,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7243 [WARNING|trainer.py:803] 2025-04-26 22:08:45,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7308 [WARNING|trainer.py:803] 2025-04-26 22:08:45,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8293 7244 [WARNING|trainer.py:803] 2025-04-26 22:08:46,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:46,638 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7309 [WARNING|trainer.py:803] 2025-04-26 22:08:46,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7245 [WARNING|trainer.py:803] 2025-04-26 22:08:47,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8294 7310 [WARNING|trainer.py:803] 2025-04-26 22:08:48,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:48,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7246 [WARNING|trainer.py:803] 2025-04-26 22:08:48,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7311 [WARNING|trainer.py:803] 2025-04-26 22:08:49,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8295 7247 [WARNING|trainer.py:803] 2025-04-26 22:08:50,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:08:50,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7312 [WARNING|trainer.py:803] 2025-04-26 22:08:50,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7248 [WARNING|trainer.py:803] 2025-04-26 22:08:51,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8296 7313 [WARNING|trainer.py:803] 2025-04-26 22:08:51,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7249 [WARNING|trainer.py:803] 2025-04-26 22:08:52,371 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:52,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7314 [WARNING|trainer.py:803] 2025-04-26 22:08:53,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7250 8297 [WARNING|trainer.py:803] 2025-04-26 22:08:53,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7315 [WARNING|trainer.py:803] 2025-04-26 22:08:54,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:54,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7251 [WARNING|trainer.py:803] 2025-04-26 22:08:55,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8298 7316 [WARNING|trainer.py:803] 2025-04-26 22:08:55,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7252 [WARNING|trainer.py:803] 2025-04-26 22:08:56,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:08:56,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7317 [WARNING|trainer.py:803] 2025-04-26 22:08:56,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8299 7253 [WARNING|trainer.py:803] 2025-04-26 22:08:57,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:08:57,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7318 [WARNING|trainer.py:803] 2025-04-26 22:08:58,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7254 [WARNING|trainer.py:803] 2025-04-26 22:08:58,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8300 7319 [WARNING|trainer.py:803] 2025-04-26 22:08:59,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:08:59,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7255 [WARNING|trainer.py:803] 2025-04-26 22:09:00,161 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7320 [WARNING|trainer.py:803] 2025-04-26 22:09:00,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8301 7256 [WARNING|trainer.py:803] 2025-04-26 22:09:01,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:01,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7321 [WARNING|trainer.py:803] 2025-04-26 22:09:01,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7257 [WARNING|trainer.py:803] 2025-04-26 22:09:02,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8302 7322 [WARNING|trainer.py:803] 2025-04-26 22:09:03,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:03,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7258 [WARNING|trainer.py:803] 2025-04-26 22:09:03,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7323 8303 [WARNING|trainer.py:803] 2025-04-26 22:09:04,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7259 [WARNING|trainer.py:803] 2025-04-26 22:09:05,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:09:05,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7324 [WARNING|trainer.py:803] 2025-04-26 22:09:05,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7260 8304 [WARNING|trainer.py:803] 2025-04-26 22:09:06,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7325 [WARNING|trainer.py:803] 2025-04-26 22:09:06,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:06,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7261 [WARNING|trainer.py:803] 2025-04-26 22:09:07,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8305 7326 [WARNING|trainer.py:803] 2025-04-26 22:09:08,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7262 [WARNING|trainer.py:803] 2025-04-26 22:09:08,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:08,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7327 [WARNING|trainer.py:803] 2025-04-26 22:09:09,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8306 7263 [WARNING|trainer.py:803] 2025-04-26 22:09:10,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:10,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7328 [WARNING|trainer.py:803] 2025-04-26 22:09:10,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7264 8307 [WARNING|trainer.py:803] 2025-04-26 22:09:11,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7329 [WARNING|trainer.py:803] 2025-04-26 22:09:11,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:12,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7265 [WARNING|trainer.py:803] 2025-04-26 22:09:12,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7330 8308 [WARNING|trainer.py:803] 2025-04-26 22:09:13,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7266 [WARNING|trainer.py:803] 2025-04-26 22:09:13,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:09:13,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7331 [WARNING|trainer.py:803] 2025-04-26 22:09:14,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8309 7267 [WARNING|trainer.py:803] 2025-04-26 22:09:15,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7332 [WARNING|trainer.py:803] 2025-04-26 22:09:15,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:15,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7268 [WARNING|trainer.py:803] 2025-04-26 22:09:16,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8310 7333 [WARNING|trainer.py:803] 2025-04-26 22:09:16,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:17,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7269 [WARNING|trainer.py:803] 2025-04-26 22:09:17,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7334 8311 [WARNING|trainer.py:803] 2025-04-26 22:09:18,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7270 [WARNING|trainer.py:803] 2025-04-26 22:09:18,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:09:18,984 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7335 [WARNING|trainer.py:803] 2025-04-26 22:09:19,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8312 7271 [WARNING|trainer.py:803] 2025-04-26 22:09:20,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7336 [WARNING|trainer.py:803] 2025-04-26 22:09:20,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:20,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7272 [WARNING|trainer.py:803] 2025-04-26 22:09:21,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8313 7337 [WARNING|trainer.py:803] 2025-04-26 22:09:22,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:22,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7273 [WARNING|trainer.py:803] 2025-04-26 22:09:22,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7338 [WARNING|trainer.py:803] 2025-04-26 22:09:23,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8314 7274 [WARNING|trainer.py:803] 2025-04-26 22:09:23,798 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7339 [WARNING|trainer.py:803] 2025-04-26 22:09:24,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:24,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7275 [WARNING|trainer.py:803] 2025-04-26 22:09:25,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8315 7340 [WARNING|trainer.py:803] 2025-04-26 22:09:25,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:25,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7276 [WARNING|trainer.py:803] 2025-04-26 22:09:26,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7341 8316 [WARNING|trainer.py:803] 2025-04-26 22:09:27,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7277 [WARNING|trainer.py:803] 2025-04-26 22:09:27,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:27,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7342 [WARNING|trainer.py:803] 2025-04-26 22:09:28,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8317 7278 [WARNING|trainer.py:803] 2025-04-26 22:09:28,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7343 [WARNING|trainer.py:803] 2025-04-26 22:09:29,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:29,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7279 [WARNING|trainer.py:803] 2025-04-26 22:09:29,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8318 7344 [WARNING|trainer.py:803] 2025-04-26 22:09:30,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:31,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7280 [WARNING|trainer.py:803] 2025-04-26 22:09:31,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7345 8319 [WARNING|trainer.py:803] 2025-04-26 22:09:32,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7281 [WARNING|trainer.py:803] 2025-04-26 22:09:32,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:32,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7346 [WARNING|trainer.py:803] 2025-04-26 22:09:33,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8320 7282 [WARNING|trainer.py:803] 2025-04-26 22:09:33,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7347 [WARNING|trainer.py:803] 2025-04-26 22:09:34,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:34,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7283 [WARNING|trainer.py:803] 2025-04-26 22:09:35,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8321 7348 [WARNING|trainer.py:803] 2025-04-26 22:09:35,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:36,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7284 [WARNING|trainer.py:803] 2025-04-26 22:09:36,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7349 [WARNING|trainer.py:803] 2025-04-26 22:09:36,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8322 7285 [WARNING|trainer.py:803] 2025-04-26 22:09:37,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:37,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7350 [WARNING|trainer.py:803] 2025-04-26 22:09:38,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7286 8323 [WARNING|trainer.py:803] 2025-04-26 22:09:38,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7351 [WARNING|trainer.py:803] 2025-04-26 22:09:39,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:39,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7287 [WARNING|trainer.py:803] 2025-04-26 22:09:40,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8324 7352 [WARNING|trainer.py:803] 2025-04-26 22:09:40,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7288 [WARNING|trainer.py:803] 2025-04-26 22:09:41,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:09:41,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7353 [WARNING|trainer.py:803] 2025-04-26 22:09:41,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8325 7289 [WARNING|trainer.py:803] 2025-04-26 22:09:42,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:43,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7354 [WARNING|trainer.py:803] 2025-04-26 22:09:43,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7290 8326 [WARNING|trainer.py:803] 2025-04-26 22:09:43,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7355 [WARNING|trainer.py:803] 2025-04-26 22:09:44,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:44,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7291 [WARNING|trainer.py:803] 2025-04-26 22:09:45,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8327 7356 [WARNING|trainer.py:803] 2025-04-26 22:09:45,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7292 [WARNING|trainer.py:803] 2025-04-26 22:09:46,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:46,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7357 [WARNING|trainer.py:803] 2025-04-26 22:09:46,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8328 7293 [WARNING|trainer.py:803] 2025-04-26 22:09:47,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:09:47,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7358 [WARNING|trainer.py:803] 2025-04-26 22:09:48,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7294 8329 [WARNING|trainer.py:803] 2025-04-26 22:09:48,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7359 [WARNING|trainer.py:803] 2025-04-26 22:09:49,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:09:49,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7295 [WARNING|trainer.py:803] 2025-04-26 22:09:50,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8330 7360 [WARNING|trainer.py:803] 2025-04-26 22:09:50,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7296 [WARNING|trainer.py:803] 2025-04-26 22:09:51,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:51,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7361 [WARNING|trainer.py:803] 2025-04-26 22:09:51,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8331 7297 [WARNING|trainer.py:803] 2025-04-26 22:09:52,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7362 [WARNING|trainer.py:803] 2025-04-26 22:09:53,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:53,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7298 [WARNING|trainer.py:803] 2025-04-26 22:09:53,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8332 7363 [WARNING|trainer.py:803] 2025-04-26 22:09:54,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:09:54,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7299 [WARNING|trainer.py:803] 2025-04-26 22:09:55,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7364 8333 [WARNING|trainer.py:803] 2025-04-26 22:09:55,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7300 [WARNING|trainer.py:803] 2025-04-26 22:09:56,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:09:56,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7365 [WARNING|trainer.py:803] 2025-04-26 22:09:56,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7301 8334 [WARNING|trainer.py:803] 2025-04-26 22:09:57,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7366 [WARNING|trainer.py:803] 2025-04-26 22:09:58,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:09:58,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7302 [WARNING|trainer.py:803] 2025-04-26 22:09:58,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8335 7367 [WARNING|trainer.py:803] 2025-04-26 22:09:59,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7303 [WARNING|trainer.py:803] 2025-04-26 22:10:00,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:00,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7368 [WARNING|trainer.py:803] 2025-04-26 22:10:00,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8336 7304 [WARNING|trainer.py:803] 2025-04-26 22:10:01,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:01,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7369 [WARNING|trainer.py:803] 2025-04-26 22:10:01,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7305 8337 [WARNING|trainer.py:803] 2025-04-26 22:10:02,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7370 [WARNING|trainer.py:803] 2025-04-26 22:10:03,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:03,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7306 [WARNING|trainer.py:803] 2025-04-26 22:10:03,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7371 8338 [WARNING|trainer.py:803] 2025-04-26 22:10:04,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7307 [WARNING|trainer.py:803] 2025-04-26 22:10:05,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:05,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7372 [WARNING|trainer.py:803] 2025-04-26 22:10:05,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8339 7308 [WARNING|trainer.py:803] 2025-04-26 22:10:06,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7373 [WARNING|trainer.py:803] 2025-04-26 22:10:06,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:06,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7309 [WARNING|trainer.py:803] 2025-04-26 22:10:07,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8340 7374 [WARNING|trainer.py:803] 2025-04-26 22:10:08,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7310 [WARNING|trainer.py:803] 2025-04-26 22:10:08,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:08,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7375 [WARNING|trainer.py:803] 2025-04-26 22:10:09,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8341 7311 [WARNING|trainer.py:803] 2025-04-26 22:10:10,051 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7376 [WARNING|trainer.py:803] 2025-04-26 22:10:10,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:10,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7312 [WARNING|trainer.py:803] 2025-04-26 22:10:11,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8342 7377 [WARNING|trainer.py:803] 2025-04-26 22:10:11,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:12,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7313 [WARNING|trainer.py:803] 2025-04-26 22:10:12,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7378 [WARNING|trainer.py:803] 2025-04-26 22:10:13,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8343 7314 [WARNING|trainer.py:803] 2025-04-26 22:10:13,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:14,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7379 [WARNING|trainer.py:803] 2025-04-26 22:10:14,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7315 8344 [WARNING|trainer.py:803] 2025-04-26 22:10:14,974 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7380 [WARNING|trainer.py:803] 2025-04-26 22:10:15,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:15,673 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7316 [WARNING|trainer.py:803] 2025-04-26 22:10:16,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8345 7381 [WARNING|trainer.py:803] 2025-04-26 22:10:16,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7317 [WARNING|trainer.py:803] 2025-04-26 22:10:17,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:17,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7382 [WARNING|trainer.py:803] 2025-04-26 22:10:18,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8346 7318 [WARNING|trainer.py:803] 2025-04-26 22:10:18,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7383 [WARNING|trainer.py:803] 2025-04-26 22:10:19,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:19,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7319 [WARNING|trainer.py:803] 2025-04-26 22:10:19,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8347 7384 [WARNING|trainer.py:803] 2025-04-26 22:10:20,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:20,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7320 [WARNING|trainer.py:803] 2025-04-26 22:10:21,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7385 8348 [WARNING|trainer.py:803] 2025-04-26 22:10:21,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7321 [WARNING|trainer.py:803] 2025-04-26 22:10:22,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:22,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7386 [WARNING|trainer.py:803] 2025-04-26 22:10:23,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8349 7322 [WARNING|trainer.py:803] 2025-04-26 22:10:23,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7387 [WARNING|trainer.py:803] 2025-04-26 22:10:24,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:24,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7323 [WARNING|trainer.py:803] 2025-04-26 22:10:24,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7388 8350 [WARNING|trainer.py:803] 2025-04-26 22:10:25,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:26,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7324 [WARNING|trainer.py:803] 2025-04-26 22:10:26,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7389 [WARNING|trainer.py:803] 2025-04-26 22:10:26,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8351 [WARNING|trainer.py:803] 2025-04-26 22:10:27,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7325 7390 [WARNING|trainer.py:803] 2025-04-26 22:10:27,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:28,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:28,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7326 8352 7391 [WARNING|trainer.py:803] 2025-04-26 22:10:29,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:29,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:29,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 7327 7392 8353 [WARNING|trainer.py:803] 2025-04-26 22:10:30,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:31,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7328 [WARNING|trainer.py:803] 2025-04-26 22:10:31,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7393 [WARNING|trainer.py:803] 2025-04-26 22:10:31,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:32,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8354 7329 7394 [WARNING|trainer.py:803] 2025-04-26 22:10:33,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:33,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:33,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7330 7395 8355 [WARNING|trainer.py:803] 2025-04-26 22:10:34,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:10:34,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:34,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7331 7396 [WARNING|trainer.py:803] 2025-04-26 22:10:35,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8356 [WARNING|trainer.py:803] 2025-04-26 22:10:36,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7332 7397 [WARNING|trainer.py:803] 2025-04-26 22:10:36,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:10:36,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:37,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8357 7333 7398 [WARNING|trainer.py:803] 2025-04-26 22:10:38,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:38,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:10:38,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7334 7399 8358 [WARNING|trainer.py:803] 2025-04-26 22:10:39,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:39,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7335 [WARNING|trainer.py:803] 2025-04-26 22:10:40,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7400 [WARNING|trainer.py:803] 2025-04-26 22:10:40,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8359 [WARNING|trainer.py:803] 2025-04-26 22:10:40,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7336 7401 [WARNING|trainer.py:803] 2025-04-26 22:10:41,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:10:41,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:42,162 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7337 8360 7402 [WARNING|trainer.py:803] 2025-04-26 22:10:43,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:43,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:43,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7338 7403 8361 [WARNING|trainer.py:803] 2025-04-26 22:10:44,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:44,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7339 [WARNING|trainer.py:803] 2025-04-26 22:10:45,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7404 [WARNING|trainer.py:803] 2025-04-26 22:10:45,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8362 [WARNING|trainer.py:803] 2025-04-26 22:10:45,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7340 7405 [WARNING|trainer.py:803] 2025-04-26 22:10:46,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:46,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:47,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7341 7406 8363 [WARNING|trainer.py:803] 2025-04-26 22:10:48,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:10:48,446 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:48,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7342 7407 8364 [WARNING|trainer.py:803] 2025-04-26 22:10:49,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:49,700 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7343 7408 [WARNING|trainer.py:803] 2025-04-26 22:10:50,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:50,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:50,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7344 8365 7409 [WARNING|trainer.py:803] 2025-04-26 22:10:52,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:52,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:52,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7345 7410 8366 [WARNING|trainer.py:803] 2025-04-26 22:10:53,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:53,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7346 [WARNING|trainer.py:803] 2025-04-26 22:10:53,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7411 8367 [WARNING|trainer.py:803] 2025-04-26 22:10:54,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:10:54,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7347 7412 [WARNING|trainer.py:803] 2025-04-26 22:10:55,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:55,772 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:10:55,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7348 8368 7413 [WARNING|trainer.py:803] 2025-04-26 22:10:57,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:10:57,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:57,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7349 7414 8369 [WARNING|trainer.py:803] 2025-04-26 22:10:58,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:10:58,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7350 [WARNING|trainer.py:803] 2025-04-26 22:10:58,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7415 [WARNING|trainer.py:803] 2025-04-26 22:10:59,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8370 [WARNING|trainer.py:803] 2025-04-26 22:10:59,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7351 7416 [WARNING|trainer.py:803] 2025-04-26 22:11:00,571 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:00,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:01,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7352 8371 7417 [WARNING|trainer.py:803] 2025-04-26 22:11:02,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:02,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:02,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7353 7418 8372 [WARNING|trainer.py:803] 2025-04-26 22:11:03,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:03,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo 7354 7419 [WARNING|trainer.py:803] 2025-04-26 22:11:04,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:04,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:04,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7355 8373 7420 [WARNING|trainer.py:803] 2025-04-26 22:11:05,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:05,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:06,055 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7356 7421 8374 [WARNING|trainer.py:803] 2025-04-26 22:11:06,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:07,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7357 [WARNING|trainer.py:803] 2025-04-26 22:11:07,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7422 [WARNING|trainer.py:803] 2025-04-26 22:11:08,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:11:08,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7358 8375 7423 [WARNING|trainer.py:803] 2025-04-26 22:11:09,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:09,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:09,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7359 7424 8376 [WARNING|trainer.py:803] 2025-04-26 22:11:10,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:11,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7360 7425 [WARNING|trainer.py:803] 2025-04-26 22:11:11,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:11,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8377 [WARNING|trainer.py:803] 2025-04-26 22:11:12,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7361 7426 [WARNING|trainer.py:803] 2025-04-26 22:11:13,025 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:13,185 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:13,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7362 8378 7427 [WARNING|trainer.py:803] 2025-04-26 22:11:14,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:14,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:14,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7363 7428 8379 [WARNING|trainer.py:803] 2025-04-26 22:11:15,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:15,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7364 7429 [WARNING|trainer.py:803] 2025-04-26 22:11:16,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:16,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:17,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8380 7365 7430 [WARNING|trainer.py:803] 2025-04-26 22:11:18,097 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:18,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:18,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7366 7431 8381 [WARNING|trainer.py:803] 2025-04-26 22:11:19,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:19,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7367 [WARNING|trainer.py:803] 2025-04-26 22:11:19,906 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7432 [WARNING|trainer.py:803] 2025-04-26 22:11:20,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8382 [WARNING|trainer.py:803] 2025-04-26 22:11:20,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7368 7433 [WARNING|trainer.py:803] 2025-04-26 22:11:21,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:21,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:22,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7369 8383 7434 [WARNING|trainer.py:803] 2025-04-26 22:11:23,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:23,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:23,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7370 7435 8384 [WARNING|trainer.py:803] 2025-04-26 22:11:24,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:24,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7371 7436 [WARNING|trainer.py:803] 2025-04-26 22:11:25,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:25,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:25,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7372 8385 7437 [WARNING|trainer.py:803] 2025-04-26 22:11:26,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:26,952 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:27,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7373 7438 8386 [WARNING|trainer.py:803] 2025-04-26 22:11:28,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:28,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7374 7439 [WARNING|trainer.py:803] 2025-04-26 22:11:28,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:29,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:11:29,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7375 7440 8387 [WARNING|trainer.py:803] 2025-04-26 22:11:30,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:30,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:31,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7376 7441 8388 [WARNING|trainer.py:803] 2025-04-26 22:11:31,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:32,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7377 7442 [WARNING|trainer.py:803] 2025-04-26 22:11:32,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:33,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:33,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7378 8389 7443 [WARNING|trainer.py:803] 2025-04-26 22:11:34,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:34,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:34,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7379 7444 8390 [WARNING|trainer.py:803] 2025-04-26 22:11:35,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:35,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7380 7445 [WARNING|trainer.py:803] 2025-04-26 22:11:36,427 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:36,903 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:37,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7381 8391 7446 [WARNING|trainer.py:803] 2025-04-26 22:11:38,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:38,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:38,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7382 7447 8392 [WARNING|trainer.py:803] 2025-04-26 22:11:39,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:39,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7383 7448 [WARNING|trainer.py:803] 2025-04-26 22:11:40,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:40,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:40,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8393 7384 7449 [WARNING|trainer.py:803] 2025-04-26 22:11:41,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:41,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:11:41,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7385 7450 8394 [WARNING|trainer.py:803] 2025-04-26 22:11:43,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:43,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:43,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7386 7451 8395 [WARNING|trainer.py:803] 2025-04-26 22:11:44,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:44,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7452 7387 [WARNING|trainer.py:803] 2025-04-26 22:11:45,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:45,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:45,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8396 7453 7388 [WARNING|trainer.py:803] 2025-04-26 22:11:46,902 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:11:46,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:47,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7454 7389 8397 [WARNING|trainer.py:803] 2025-04-26 22:11:48,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:48,297 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7455 [WARNING|trainer.py:803] 2025-04-26 22:11:48,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7390 [WARNING|trainer.py:803] 2025-04-26 22:11:49,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8398 [WARNING|trainer.py:803] 2025-04-26 22:11:49,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7456 7391 [WARNING|trainer.py:803] 2025-04-26 22:11:50,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:50,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:11:50,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7457 7392 8399 [WARNING|trainer.py:803] 2025-04-26 22:11:51,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:52,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:11:52,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7458 7393 8400 [WARNING|trainer.py:803] 2025-04-26 22:11:53,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:11:53,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7459 7394 [WARNING|trainer.py:803] 2025-04-26 22:11:53,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:11:54,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8401 [WARNING|trainer.py:803] 2025-04-26 22:11:54,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7460 7395 [WARNING|trainer.py:803] 2025-04-26 22:11:55,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:55,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:55,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8402 7461 7396 [WARNING|trainer.py:803] 2025-04-26 22:11:56,818 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:56,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:57,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7462 7397 8403 [WARNING|trainer.py:803] 2025-04-26 22:11:58,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:11:58,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:11:58,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7463 7398 8404 [WARNING|trainer.py:803] 2025-04-26 22:11:59,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:11:59,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7464 [WARNING|trainer.py:803] 2025-04-26 22:11:59,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7399 8405 [WARNING|trainer.py:803] 2025-04-26 22:12:00,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:00,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7465 7400 [WARNING|trainer.py:803] 2025-04-26 22:12:01,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:01,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8406 7466 [WARNING|trainer.py:803] 2025-04-26 22:12:02,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7401 [WARNING|trainer.py:803] 2025-04-26 22:12:02,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:02,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7467 [WARNING|trainer.py:803] 2025-04-26 22:12:03,393 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8407 7402 [WARNING|trainer.py:803] 2025-04-26 22:12:04,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:04,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7468 [WARNING|trainer.py:803] 2025-04-26 22:12:04,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8408 7403 [WARNING|trainer.py:803] 2025-04-26 22:12:05,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:12:05,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7469 [WARNING|trainer.py:803] 2025-04-26 22:12:05,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7404 8409 [WARNING|trainer.py:803] 2025-04-26 22:12:06,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7470 [WARNING|trainer.py:803] 2025-04-26 22:12:07,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:07,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7405 [WARNING|trainer.py:803] 2025-04-26 22:12:07,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8410 7471 [WARNING|trainer.py:803] 2025-04-26 22:12:08,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7406 [WARNING|trainer.py:803] 2025-04-26 22:12:08,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:09,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8411 7472 [WARNING|trainer.py:803] 2025-04-26 22:12:09,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7407 [WARNING|trainer.py:803] 2025-04-26 22:12:10,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:10,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7473 [WARNING|trainer.py:803] 2025-04-26 22:12:10,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8412 7408 [WARNING|trainer.py:803] 2025-04-26 22:12:11,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:11,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7474 [WARNING|trainer.py:803] 2025-04-26 22:12:12,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7409 8413 [WARNING|trainer.py:803] 2025-04-26 22:12:12,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7475 [WARNING|trainer.py:803] 2025-04-26 22:12:13,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:13,292 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7410 8414 [WARNING|trainer.py:803] 2025-04-26 22:12:13,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7476 [WARNING|trainer.py:803] 2025-04-26 22:12:14,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:14,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7411 [WARNING|trainer.py:803] 2025-04-26 22:12:15,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8415 7477 [WARNING|trainer.py:803] 2025-04-26 22:12:15,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:16,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7412 [WARNING|trainer.py:803] 2025-04-26 22:12:16,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7478 8416 [WARNING|trainer.py:803] 2025-04-26 22:12:17,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7413 [WARNING|trainer.py:803] 2025-04-26 22:12:17,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:17,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7479 [WARNING|trainer.py:803] 2025-04-26 22:12:18,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8417 7414 [WARNING|trainer.py:803] 2025-04-26 22:12:18,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:19,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7480 [WARNING|trainer.py:803] 2025-04-26 22:12:19,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7415 8418 [WARNING|trainer.py:803] 2025-04-26 22:12:20,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7481 [WARNING|trainer.py:803] 2025-04-26 22:12:20,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:20,743 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7416 [WARNING|trainer.py:803] 2025-04-26 22:12:21,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8419 7482 [WARNING|trainer.py:803] 2025-04-26 22:12:21,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:22,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7417 [WARNING|trainer.py:803] 2025-04-26 22:12:22,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8420 7483 [WARNING|trainer.py:803] 2025-04-26 22:12:23,224 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:23,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7418 [WARNING|trainer.py:803] 2025-04-26 22:12:23,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7484 8421 [WARNING|trainer.py:803] 2025-04-26 22:12:24,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. AnswerNo 7419 [WARNING|trainer.py:803] 2025-04-26 22:12:24,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:25,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7485 [WARNING|trainer.py:803] 2025-04-26 22:12:25,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8422 7420 [WARNING|trainer.py:803] 2025-04-26 22:12:26,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7486 [WARNING|trainer.py:803] 2025-04-26 22:12:26,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:26,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7421 [WARNING|trainer.py:803] 2025-04-26 22:12:27,413 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8423 7487 [WARNING|trainer.py:803] 2025-04-26 22:12:28,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:28,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7422 [WARNING|trainer.py:803] 2025-04-26 22:12:28,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7488 8424 [WARNING|trainer.py:803] 2025-04-26 22:12:29,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7423 [WARNING|trainer.py:803] 2025-04-26 22:12:29,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:29,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7489 8425 [WARNING|trainer.py:803] 2025-04-26 22:12:30,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7424 [WARNING|trainer.py:803] 2025-04-26 22:12:31,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:31,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7490 [WARNING|trainer.py:803] 2025-04-26 22:12:31,884 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8426 7425 [WARNING|trainer.py:803] 2025-04-26 22:12:32,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7491 [WARNING|trainer.py:803] 2025-04-26 22:12:32,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:33,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8427 7426 [WARNING|trainer.py:803] 2025-04-26 22:12:33,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7492 [WARNING|trainer.py:803] 2025-04-26 22:12:34,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:34,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7427 [WARNING|trainer.py:803] 2025-04-26 22:12:34,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8428 7493 [WARNING|trainer.py:803] 2025-04-26 22:12:35,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:35,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:35,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7428 7494 8429 [WARNING|trainer.py:803] 2025-04-26 22:12:36,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:37,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7429 [WARNING|trainer.py:803] 2025-04-26 22:12:37,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7495 [WARNING|trainer.py:803] 2025-04-26 22:12:38,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8430 [WARNING|trainer.py:803] 2025-04-26 22:12:38,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7430 7496 [WARNING|trainer.py:803] 2025-04-26 22:12:38,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:39,291 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:39,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7431 8431 7497 [WARNING|trainer.py:803] 2025-04-26 22:12:40,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:40,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:40,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7432 8432 7498 [WARNING|trainer.py:803] 2025-04-26 22:12:41,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:41,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:42,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7433 7499 8433 [WARNING|trainer.py:803] 2025-04-26 22:12:43,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:43,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:43,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7434 7500 8434 [WARNING|trainer.py:803] 2025-04-26 22:12:44,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:44,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7435 [WARNING|trainer.py:803] 2025-04-26 22:12:44,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7501 8435 [WARNING|trainer.py:803] 2025-04-26 22:12:45,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7436 [WARNING|trainer.py:803] 2025-04-26 22:12:46,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:46,341 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7502 [WARNING|trainer.py:803] 2025-04-26 22:12:46,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8436 7437 [WARNING|trainer.py:803] 2025-04-26 22:12:47,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:47,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:48,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7503 7438 8437 [WARNING|trainer.py:803] 2025-04-26 22:12:48,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:49,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:49,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7504 7439 8438 [WARNING|trainer.py:803] 2025-04-26 22:12:50,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:50,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:12:50,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7505 7440 8439 [WARNING|trainer.py:803] 2025-04-26 22:12:51,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:51,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7441 [WARNING|trainer.py:803] 2025-04-26 22:12:52,261 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7506 8440 [WARNING|trainer.py:803] 2025-04-26 22:12:53,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:53,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7442 7507 [WARNING|trainer.py:803] 2025-04-26 22:12:53,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:54,288 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8441 [WARNING|trainer.py:803] 2025-04-26 22:12:54,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7443 7508 [WARNING|trainer.py:803] 2025-04-26 22:12:55,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:55,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:12:55,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7444 8442 7509 [WARNING|trainer.py:803] 2025-04-26 22:12:56,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:12:56,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7445 [WARNING|trainer.py:803] 2025-04-26 22:12:57,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8443 7510 [WARNING|trainer.py:803] 2025-04-26 22:12:58,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:12:58,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7446 [WARNING|trainer.py:803] 2025-04-26 22:12:58,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8444 7511 [WARNING|trainer.py:803] 2025-04-26 22:12:59,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7447 [WARNING|trainer.py:803] 2025-04-26 22:12:59,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:00,071 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:00,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7512 8445 7448 [WARNING|trainer.py:803] 2025-04-26 22:13:01,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:01,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:01,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7513 7449 8446 [WARNING|trainer.py:803] 2025-04-26 22:13:02,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:03,016 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:03,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7450 7514 8447 [WARNING|trainer.py:803] 2025-04-26 22:13:04,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:04,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:04,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7451 7515 8448 [WARNING|trainer.py:803] 2025-04-26 22:13:05,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:05,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7452 [WARNING|trainer.py:803] 2025-04-26 22:13:05,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7516 8449 [WARNING|trainer.py:803] 2025-04-26 22:13:06,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:07,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7453 [WARNING|trainer.py:803] 2025-04-26 22:13:07,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7517 [WARNING|trainer.py:803] 2025-04-26 22:13:08,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8450 7454 [WARNING|trainer.py:803] 2025-04-26 22:13:08,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:09,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7518 [WARNING|trainer.py:803] 2025-04-26 22:13:09,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7455 8451 [WARNING|trainer.py:803] 2025-04-26 22:13:09,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7519 [WARNING|trainer.py:803] 2025-04-26 22:13:10,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:10,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7456 8452 [WARNING|trainer.py:803] 2025-04-26 22:13:11,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:13:11,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7520 [WARNING|trainer.py:803] 2025-04-26 22:13:12,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7457 8453 [WARNING|trainer.py:803] 2025-04-26 22:13:12,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:13,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7521 [WARNING|trainer.py:803] 2025-04-26 22:13:13,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7458 8454 [WARNING|trainer.py:803] 2025-04-26 22:13:14,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:14,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7522 7459 [WARNING|trainer.py:803] 2025-04-26 22:13:14,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:15,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8455 [WARNING|trainer.py:803] 2025-04-26 22:13:15,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7460 7523 [WARNING|trainer.py:803] 2025-04-26 22:13:16,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:16,928 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:17,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8456 7461 7524 [WARNING|trainer.py:803] 2025-04-26 22:13:17,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:18,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8457 [WARNING|trainer.py:803] 2025-04-26 22:13:18,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7462 7525 [WARNING|trainer.py:803] 2025-04-26 22:13:19,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:19,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7463 [WARNING|trainer.py:803] 2025-04-26 22:13:19,969 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8458 7526 [WARNING|trainer.py:803] 2025-04-26 22:13:20,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:20,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7464 [WARNING|trainer.py:803] 2025-04-26 22:13:21,378 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8459 7527 [WARNING|trainer.py:803] 2025-04-26 22:13:21,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:22,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7465 [WARNING|trainer.py:803] 2025-04-26 22:13:22,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8460 [WARNING|trainer.py:803] 2025-04-26 22:13:23,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7528 7466 [WARNING|trainer.py:803] 2025-04-26 22:13:23,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:24,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8461 [WARNING|trainer.py:803] 2025-04-26 22:13:24,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7529 7467 [WARNING|trainer.py:803] 2025-04-26 22:13:25,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:25,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:25,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8462 7468 7530 [WARNING|trainer.py:803] 2025-04-26 22:13:26,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:26,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:13:27,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7469 8463 7531 [WARNING|trainer.py:803] 2025-04-26 22:13:28,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:28,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:28,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7470 8464 7532 [WARNING|trainer.py:803] 2025-04-26 22:13:29,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:29,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7471 [WARNING|trainer.py:803] 2025-04-26 22:13:30,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8465 7533 [WARNING|trainer.py:803] 2025-04-26 22:13:30,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7472 [WARNING|trainer.py:803] 2025-04-26 22:13:31,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:31,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8466 [WARNING|trainer.py:803] 2025-04-26 22:13:31,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7534 7473 [WARNING|trainer.py:803] 2025-04-26 22:13:32,758 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:32,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:13:33,210 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7535 8467 7474 [WARNING|trainer.py:803] 2025-04-26 22:13:34,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:34,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:34,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7536 8468 7475 [WARNING|trainer.py:803] 2025-04-26 22:13:35,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:35,633 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:35,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7476 7537 8469 [WARNING|trainer.py:803] 2025-04-26 22:13:36,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:37,021 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:37,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7477 7538 8470 [WARNING|trainer.py:803] 2025-04-26 22:13:38,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:38,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:38,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7478 7539 8471 [WARNING|trainer.py:803] 2025-04-26 22:13:39,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:39,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7479 [WARNING|trainer.py:803] 2025-04-26 22:13:40,124 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7540 [WARNING|trainer.py:803] 2025-04-26 22:13:40,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8472 7480 [WARNING|trainer.py:803] 2025-04-26 22:13:41,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:41,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7541 [WARNING|trainer.py:803] 2025-04-26 22:13:42,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8473 7481 [WARNING|trainer.py:803] 2025-04-26 22:13:42,732 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:43,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:43,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7542 7482 8474 [WARNING|trainer.py:803] 2025-04-26 22:13:44,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:44,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:44,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7543 7483 8475 [WARNING|trainer.py:803] 2025-04-26 22:13:45,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:45,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:46,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7484 7544 8476 [WARNING|trainer.py:803] 2025-04-26 22:13:47,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:47,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7485 [WARNING|trainer.py:803] 2025-04-26 22:13:47,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7545 8477 [WARNING|trainer.py:803] 2025-04-26 22:13:48,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:48,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7486 [WARNING|trainer.py:803] 2025-04-26 22:13:49,041 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7546 [WARNING|trainer.py:803] 2025-04-26 22:13:49,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8478 [WARNING|trainer.py:803] 2025-04-26 22:13:49,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7487 [WARNING|trainer.py:803] 2025-04-26 22:13:50,498 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7547 [WARNING|trainer.py:803] 2025-04-26 22:13:50,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8479 7488 [WARNING|trainer.py:803] 2025-04-26 22:13:51,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7548 [WARNING|trainer.py:803] 2025-04-26 22:13:52,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:52,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7489 8480 [WARNING|trainer.py:803] 2025-04-26 22:13:52,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:13:53,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7549 [WARNING|trainer.py:803] 2025-04-26 22:13:53,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7490 [WARNING|trainer.py:803] 2025-04-26 22:13:54,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8481 [WARNING|trainer.py:803] 2025-04-26 22:13:54,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7550 7491 [WARNING|trainer.py:803] 2025-04-26 22:13:55,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:13:55,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8482 [WARNING|trainer.py:803] 2025-04-26 22:13:55,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7551 7492 [WARNING|trainer.py:803] 2025-04-26 22:13:56,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:13:56,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:13:57,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8483 7493 7552 [WARNING|trainer.py:803] 2025-04-26 22:13:57,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:58,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:58,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8484 7494 7553 [WARNING|trainer.py:803] 2025-04-26 22:13:59,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:13:59,605 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:13:59,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7495 8485 7554 [WARNING|trainer.py:803] 2025-04-26 22:14:00,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:00,887 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:01,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7496 8486 7555 [WARNING|trainer.py:803] 2025-04-26 22:14:02,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:02,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7497 [WARNING|trainer.py:803] 2025-04-26 22:14:02,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8487 7556 [WARNING|trainer.py:803] 2025-04-26 22:14:03,377 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7498 [WARNING|trainer.py:803] 2025-04-26 22:14:04,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:04,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8488 7557 [WARNING|trainer.py:803] 2025-04-26 22:14:04,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7499 [WARNING|trainer.py:803] 2025-04-26 22:14:05,402 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:14:05,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:05,860 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7558 8489 7500 [WARNING|trainer.py:803] 2025-04-26 22:14:06,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:06,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:14:07,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7559 8490 7501 [WARNING|trainer.py:803] 2025-04-26 22:14:08,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:08,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:08,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7560 8491 7502 [WARNING|trainer.py:803] 2025-04-26 22:14:09,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:09,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:10,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7561 8492 7503 [WARNING|trainer.py:803] 2025-04-26 22:14:10,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:11,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7562 [WARNING|trainer.py:803] 2025-04-26 22:14:11,534 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8493 7504 [WARNING|trainer.py:803] 2025-04-26 22:14:12,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:12,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7563 [WARNING|trainer.py:803] 2025-04-26 22:14:12,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8494 7505 [WARNING|trainer.py:803] 2025-04-26 22:14:13,642 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7564 [WARNING|trainer.py:803] 2025-04-26 22:14:14,206 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:14,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8495 7506 [WARNING|trainer.py:803] 2025-04-26 22:14:15,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7565 [WARNING|trainer.py:803] 2025-04-26 22:14:15,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:15,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7507 8496 [WARNING|trainer.py:803] 2025-04-26 22:14:16,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7566 [WARNING|trainer.py:803] 2025-04-26 22:14:17,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:17,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7508 8497 [WARNING|trainer.py:803] 2025-04-26 22:14:17,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7567 [WARNING|trainer.py:803] 2025-04-26 22:14:18,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:18,620 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7509 [WARNING|trainer.py:803] 2025-04-26 22:14:19,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8498 7568 [WARNING|trainer.py:803] 2025-04-26 22:14:19,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:20,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7510 [WARNING|trainer.py:803] 2025-04-26 22:14:20,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8499 7569 [WARNING|trainer.py:803] 2025-04-26 22:14:21,391 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:14:21,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7511 [WARNING|trainer.py:803] 2025-04-26 22:14:22,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8500 7570 [WARNING|trainer.py:803] 2025-04-26 22:14:22,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:23,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:23,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7512 8501 7571 [WARNING|trainer.py:803] 2025-04-26 22:14:24,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:24,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7513 [WARNING|trainer.py:803] 2025-04-26 22:14:24,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7572 8502 [WARNING|trainer.py:803] 2025-04-26 22:14:25,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:26,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7514 [WARNING|trainer.py:803] 2025-04-26 22:14:26,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7573 8503 [WARNING|trainer.py:803] 2025-04-26 22:14:27,107 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:27,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7515 [WARNING|trainer.py:803] 2025-04-26 22:14:27,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7574 [WARNING|trainer.py:803] 2025-04-26 22:14:28,513 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8504 [WARNING|trainer.py:803] 2025-04-26 22:14:28,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7516 [WARNING|trainer.py:803] 2025-04-26 22:14:29,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7575 [WARNING|trainer.py:803] 2025-04-26 22:14:29,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8505 [WARNING|trainer.py:803] 2025-04-26 22:14:30,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7517 7576 [WARNING|trainer.py:803] 2025-04-26 22:14:31,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:14:31,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:31,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8506 7518 7577 [WARNING|trainer.py:803] 2025-04-26 22:14:32,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:32,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:33,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7519 8507 7578 [WARNING|trainer.py:803] 2025-04-26 22:14:34,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:14:34,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:14:34,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7520 8508 7579 [WARNING|trainer.py:803] 2025-04-26 22:14:35,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:35,698 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:35,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7521 8509 7580 [WARNING|trainer.py:803] 2025-04-26 22:14:37,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:37,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:37,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7522 8510 7581 [WARNING|trainer.py:803] 2025-04-26 22:14:38,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:38,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:38,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7523 8511 7582 [WARNING|trainer.py:803] 2025-04-26 22:14:40,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:40,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:14:40,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7524 8512 7583 [WARNING|trainer.py:803] 2025-04-26 22:14:41,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:41,593 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:41,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7525 7584 8513 [WARNING|trainer.py:803] 2025-04-26 22:14:42,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:43,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:43,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7585 7526 8514 [WARNING|trainer.py:803] 2025-04-26 22:14:44,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:44,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:44,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7527 7586 8515 [WARNING|trainer.py:803] 2025-04-26 22:14:45,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:45,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:46,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:No 7528 7587 8516 [WARNING|trainer.py:803] 2025-04-26 22:14:47,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:47,348 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:47,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7588 7529 8517 [WARNING|trainer.py:803] 2025-04-26 22:14:48,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:14:48,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7589 7530 [WARNING|trainer.py:803] 2025-04-26 22:14:49,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:50,087 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8518 [WARNING|trainer.py:803] 2025-04-26 22:14:50,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7590 7531 [WARNING|trainer.py:803] 2025-04-26 22:14:50,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:51,488 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:51,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8519 7591 7532 [WARNING|trainer.py:803] 2025-04-26 22:14:52,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:52,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:53,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8520 7592 7533 [WARNING|trainer.py:803] 2025-04-26 22:14:54,181 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:54,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:14:54,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7593 8521 7534 [WARNING|trainer.py:803] 2025-04-26 22:14:55,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:55,781 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:55,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7594 7535 8522 [WARNING|trainer.py:803] 2025-04-26 22:14:57,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:57,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:14:57,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7595 7536 8523 [WARNING|trainer.py:803] 2025-04-26 22:14:58,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:14:58,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:14:58,819 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7596 7537 8524 [WARNING|trainer.py:803] 2025-04-26 22:14:59,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:00,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:00,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7597 7538 8525 [WARNING|trainer.py:803] 2025-04-26 22:15:01,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:01,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7598 [WARNING|trainer.py:803] 2025-04-26 22:15:02,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7539 8526 [WARNING|trainer.py:803] 2025-04-26 22:15:02,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:03,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7599 [WARNING|trainer.py:803] 2025-04-26 22:15:03,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7540 [WARNING|trainer.py:803] 2025-04-26 22:15:04,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8527 [WARNING|trainer.py:803] 2025-04-26 22:15:04,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7600 7541 [WARNING|trainer.py:803] 2025-04-26 22:15:05,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:05,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8528 [WARNING|trainer.py:803] 2025-04-26 22:15:05,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7601 [WARNING|trainer.py:803] 2025-04-26 22:15:06,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7542 [WARNING|trainer.py:803] 2025-04-26 22:15:07,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8529 [WARNING|trainer.py:803] 2025-04-26 22:15:07,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7602 7543 [WARNING|trainer.py:803] 2025-04-26 22:15:08,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:08,486 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8530 [WARNING|trainer.py:803] 2025-04-26 22:15:08,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7603 7544 [WARNING|trainer.py:803] 2025-04-26 22:15:09,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:09,893 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8531 [WARNING|trainer.py:803] 2025-04-26 22:15:10,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7604 7545 [WARNING|trainer.py:803] 2025-04-26 22:15:11,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:11,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:11,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8532 7605 7546 [WARNING|trainer.py:803] 2025-04-26 22:15:12,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:12,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:13,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8533 7606 7547 [WARNING|trainer.py:803] 2025-04-26 22:15:14,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:14,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:14,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8534 7607 7548 [WARNING|trainer.py:803] 2025-04-26 22:15:15,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:15,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:15,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7608 8535 7549 [WARNING|trainer.py:803] 2025-04-26 22:15:17,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:17,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:17,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7609 7550 8536 [WARNING|trainer.py:803] 2025-04-26 22:15:18,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:15:18,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:18,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7610 7551 8537 [WARNING|trainer.py:803] 2025-04-26 22:15:20,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:20,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:20,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7611 7552 8538 [WARNING|trainer.py:803] 2025-04-26 22:15:21,519 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:21,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:21,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7612 7553 8539 [WARNING|trainer.py:803] 2025-04-26 22:15:22,891 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:23,111 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:23,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7613 7554 8540 [WARNING|trainer.py:803] 2025-04-26 22:15:24,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:24,496 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:24,929 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7614 7555 8541 [WARNING|trainer.py:803] 2025-04-26 22:15:26,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:26,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:26,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7615 7556 8542 [WARNING|trainer.py:803] 2025-04-26 22:15:27,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:27,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:28,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7557 7616 8543 [WARNING|trainer.py:803] 2025-04-26 22:15:29,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:29,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:29,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7558 7617 8544 [WARNING|trainer.py:803] 2025-04-26 22:15:30,356 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:30,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7559 [WARNING|trainer.py:803] 2025-04-26 22:15:31,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7618 [WARNING|trainer.py:803] 2025-04-26 22:15:31,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8545 [WARNING|trainer.py:803] 2025-04-26 22:15:32,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7560 [WARNING|trainer.py:803] 2025-04-26 22:15:32,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7619 [WARNING|trainer.py:803] 2025-04-26 22:15:33,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8546 [WARNING|trainer.py:803] 2025-04-26 22:15:33,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7561 [WARNING|trainer.py:803] 2025-04-26 22:15:33,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7620 [WARNING|trainer.py:803] 2025-04-26 22:15:34,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8547 [WARNING|trainer.py:803] 2025-04-26 22:15:34,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7562 7621 [WARNING|trainer.py:803] 2025-04-26 22:15:35,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:35,815 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8548 7563 [WARNING|trainer.py:803] 2025-04-26 22:15:36,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7622 [WARNING|trainer.py:803] 2025-04-26 22:15:37,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:37,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7564 8549 [WARNING|trainer.py:803] 2025-04-26 22:15:37,856 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7623 [WARNING|trainer.py:803] 2025-04-26 22:15:38,544 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:38,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7565 [WARNING|trainer.py:803] 2025-04-26 22:15:39,272 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8550 7624 [WARNING|trainer.py:803] 2025-04-26 22:15:39,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:40,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7566 8551 [WARNING|trainer.py:803] 2025-04-26 22:15:40,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:41,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7625 [WARNING|trainer.py:803] 2025-04-26 22:15:41,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7567 [WARNING|trainer.py:803] 2025-04-26 22:15:42,243 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8552 [WARNING|trainer.py:803] 2025-04-26 22:15:42,718 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7626 [WARNING|trainer.py:803] 2025-04-26 22:15:43,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7568 [WARNING|trainer.py:803] 2025-04-26 22:15:43,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8553 [WARNING|trainer.py:803] 2025-04-26 22:15:44,122 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7627 7569 [WARNING|trainer.py:803] 2025-04-26 22:15:44,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:45,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8554 [WARNING|trainer.py:803] 2025-04-26 22:15:45,481 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7628 7570 [WARNING|trainer.py:803] 2025-04-26 22:15:46,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:46,619 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:46,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8555 7629 7571 [WARNING|trainer.py:803] 2025-04-26 22:15:47,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:48,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:48,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8556 7630 7572 [WARNING|trainer.py:803] 2025-04-26 22:15:49,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:49,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:49,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8557 7631 7573 [WARNING|trainer.py:803] 2025-04-26 22:15:50,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:50,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:51,062 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7632 8558 7574 [WARNING|trainer.py:803] 2025-04-26 22:15:52,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:15:52,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:52,482 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8559 7633 7575 [WARNING|trainer.py:803] 2025-04-26 22:15:53,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:53,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:15:53,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8560 7634 7576 [WARNING|trainer.py:803] 2025-04-26 22:15:55,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:55,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:55,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7577 8561 7635 [WARNING|trainer.py:803] 2025-04-26 22:15:56,808 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:56,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:56,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7578 7636 8562 [WARNING|trainer.py:803] 2025-04-26 22:15:58,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:15:58,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:15:58,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7637 7579 8563 [WARNING|trainer.py:803] 2025-04-26 22:15:59,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:59,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:15:59,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7638 7580 8564 [WARNING|trainer.py:803] 2025-04-26 22:16:01,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:01,135 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:01,510 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7581 7639 8565 [WARNING|trainer.py:803] 2025-04-26 22:16:02,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:02,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:03,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7582 7640 8566 [WARNING|trainer.py:803] 2025-04-26 22:16:03,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:04,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7583 7641 [WARNING|trainer.py:803] 2025-04-26 22:16:04,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8567 [WARNING|trainer.py:803] 2025-04-26 22:16:05,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:05,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7584 [WARNING|trainer.py:803] 2025-04-26 22:16:06,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7642 8568 [WARNING|trainer.py:803] 2025-04-26 22:16:06,789 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:06,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7585 7643 [WARNING|trainer.py:803] 2025-04-26 22:16:07,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8569 [WARNING|trainer.py:803] 2025-04-26 22:16:08,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:08,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7644 7586 [WARNING|trainer.py:803] 2025-04-26 22:16:09,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8570 [WARNING|trainer.py:803] 2025-04-26 22:16:09,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:09,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7587 7645 [WARNING|trainer.py:803] 2025-04-26 22:16:10,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8571 [WARNING|trainer.py:803] 2025-04-26 22:16:11,188 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:11,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7588 7646 [WARNING|trainer.py:803] 2025-04-26 22:16:12,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:12,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:12,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8572 7589 7647 [WARNING|trainer.py:803] 2025-04-26 22:16:13,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:13,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:14,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8573 7590 7648 [WARNING|trainer.py:803] 2025-04-26 22:16:14,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:15,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:15,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8574 7591 7649 [WARNING|trainer.py:803] 2025-04-26 22:16:16,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:16,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:16,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8575 7592 7650 [WARNING|trainer.py:803] 2025-04-26 22:16:17,852 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:18,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:18,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8576 7593 7651 [WARNING|trainer.py:803] 2025-04-26 22:16:19,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:19,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:19,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8577 7594 7652 [WARNING|trainer.py:803] 2025-04-26 22:16:20,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:21,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:21,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8578 7595 7653 [WARNING|trainer.py:803] 2025-04-26 22:16:22,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:22,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:22,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8579 7596 7654 [WARNING|trainer.py:803] 2025-04-26 22:16:23,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:23,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:24,363 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7597 8580 7655 [WARNING|trainer.py:803] 2025-04-26 22:16:25,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:25,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:25,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7598 8581 7656 [WARNING|trainer.py:803] 2025-04-26 22:16:26,681 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:27,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:27,254 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7599 8582 7657 [WARNING|trainer.py:803] 2025-04-26 22:16:28,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:28,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:28,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7600 8583 7658 [WARNING|trainer.py:803] 2025-04-26 22:16:29,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7601 [WARNING|trainer.py:803] 2025-04-26 22:16:30,240 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:30,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7659 8584 [WARNING|trainer.py:803] 2025-04-26 22:16:31,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7602 [WARNING|trainer.py:803] 2025-04-26 22:16:31,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:31,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:32,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7660 8585 7603 [WARNING|trainer.py:803] 2025-04-26 22:16:33,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:33,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:33,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7661 8586 7604 [WARNING|trainer.py:803] 2025-04-26 22:16:34,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:16:34,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:35,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7662 8587 7605 [WARNING|trainer.py:803] 2025-04-26 22:16:36,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:36,352 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:36,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7663 8588 7606 [WARNING|trainer.py:803] 2025-04-26 22:16:37,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:37,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:38,253 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7664 8589 7607 [WARNING|trainer.py:803] 2025-04-26 22:16:39,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:39,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:39,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7665 8590 7608 [WARNING|trainer.py:803] 2025-04-26 22:16:40,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:40,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:16:41,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7666 8591 7609 [WARNING|trainer.py:803] 2025-04-26 22:16:42,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:42,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:16:42,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7667 7610 8592 [WARNING|trainer.py:803] 2025-04-26 22:16:43,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:43,914 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:43,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7668 7611 8593 [WARNING|trainer.py:803] 2025-04-26 22:16:45,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:45,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:45,454 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7612 7669 8594 [WARNING|trainer.py:803] 2025-04-26 22:16:46,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:16:46,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:16:47,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7613 7670 8595 [WARNING|trainer.py:803] 2025-04-26 22:16:48,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:48,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:48,545 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7614 7671 8596 [WARNING|trainer.py:803] 2025-04-26 22:16:49,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:49,778 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:50,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7615 7672 8597 [WARNING|trainer.py:803] 2025-04-26 22:16:51,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:51,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7616 [WARNING|trainer.py:803] 2025-04-26 22:16:51,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7673 8598 [WARNING|trainer.py:803] 2025-04-26 22:16:52,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:52,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:16:53,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7674 7617 8599 [WARNING|trainer.py:803] 2025-04-26 22:16:54,068 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:16:54,137 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:54,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7675 7618 8600 [WARNING|trainer.py:803] 2025-04-26 22:16:55,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:16:55,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7676 [WARNING|trainer.py:803] 2025-04-26 22:16:56,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7619 8601 [WARNING|trainer.py:803] 2025-04-26 22:16:56,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:57,121 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7677 [WARNING|trainer.py:803] 2025-04-26 22:16:57,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7620 8602 [WARNING|trainer.py:803] 2025-04-26 22:16:58,447 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:16:58,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7678 [WARNING|trainer.py:803] 2025-04-26 22:16:59,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7621 8603 [WARNING|trainer.py:803] 2025-04-26 22:16:59,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:00,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:00,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7679 7622 8604 [WARNING|trainer.py:803] 2025-04-26 22:17:01,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:01,543 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:01,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7680 7623 8605 [WARNING|trainer.py:803] 2025-04-26 22:17:02,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:02,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:03,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7681 7624 8606 [WARNING|trainer.py:803] 2025-04-26 22:17:04,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:04,428 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:04,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7682 7625 8607 [WARNING|trainer.py:803] 2025-04-26 22:17:05,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:05,940 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:06,221 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7626 7683 8608 [WARNING|trainer.py:803] 2025-04-26 22:17:07,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:07,381 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:07,749 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7684 7627 8609 [WARNING|trainer.py:803] 2025-04-26 22:17:08,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:08,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:09,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7628 7685 8610 [WARNING|trainer.py:803] 2025-04-26 22:17:10,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:10,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:10,589 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7629 7686 8611 [WARNING|trainer.py:803] 2025-04-26 22:17:11,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:11,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:12,140 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7630 7687 8612 [WARNING|trainer.py:803] 2025-04-26 22:17:13,219 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:17:13,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:13,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7631 7688 8613 [WARNING|trainer.py:803] 2025-04-26 22:17:14,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:14,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:15,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7632 7689 8614 [WARNING|trainer.py:803] 2025-04-26 22:17:16,048 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:16,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7633 7690 [WARNING|trainer.py:803] 2025-04-26 22:17:16,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:17,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8615 [WARNING|trainer.py:803] 2025-04-26 22:17:17,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7634 7691 [WARNING|trainer.py:803] 2025-04-26 22:17:18,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8616 [WARNING|trainer.py:803] 2025-04-26 22:17:18,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:19,149 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7635 7692 [WARNING|trainer.py:803] 2025-04-26 22:17:19,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8617 [WARNING|trainer.py:803] 2025-04-26 22:17:20,459 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:20,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7636 [WARNING|trainer.py:803] 2025-04-26 22:17:21,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7693 8618 [WARNING|trainer.py:803] 2025-04-26 22:17:21,918 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:22,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7637 [WARNING|trainer.py:803] 2025-04-26 22:17:22,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7694 [WARNING|trainer.py:803] 2025-04-26 22:17:23,323 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8619 [WARNING|trainer.py:803] 2025-04-26 22:17:23,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7638 [WARNING|trainer.py:803] 2025-04-26 22:17:24,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7695 [WARNING|trainer.py:803] 2025-04-26 22:17:24,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8620 [WARNING|trainer.py:803] 2025-04-26 22:17:25,215 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7639 [WARNING|trainer.py:803] 2025-04-26 22:17:25,621 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7696 [WARNING|trainer.py:803] 2025-04-26 22:17:26,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8621 [WARNING|trainer.py:803] 2025-04-26 22:17:26,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7640 [WARNING|trainer.py:803] 2025-04-26 22:17:27,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7697 [WARNING|trainer.py:803] 2025-04-26 22:17:27,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8622 7641 [WARNING|trainer.py:803] 2025-04-26 22:17:28,298 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:28,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7698 [WARNING|trainer.py:803] 2025-04-26 22:17:29,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8623 7642 [WARNING|trainer.py:803] 2025-04-26 22:17:29,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:30,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7699 [WARNING|trainer.py:803] 2025-04-26 22:17:30,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8624 7643 [WARNING|trainer.py:803] 2025-04-26 22:17:31,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:31,484 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7700 [WARNING|trainer.py:803] 2025-04-26 22:17:31,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8625 7644 [WARNING|trainer.py:803] 2025-04-26 22:17:32,609 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:32,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7701 [WARNING|trainer.py:803] 2025-04-26 22:17:33,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8626 7645 [WARNING|trainer.py:803] 2025-04-26 22:17:34,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:34,527 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7702 [WARNING|trainer.py:803] 2025-04-26 22:17:34,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8627 7646 [WARNING|trainer.py:803] 2025-04-26 22:17:35,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:35,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7703 [WARNING|trainer.py:803] 2025-04-26 22:17:36,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8628 7647 [WARNING|trainer.py:803] 2025-04-26 22:17:37,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:37,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7704 [WARNING|trainer.py:803] 2025-04-26 22:17:37,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8629 7648 [WARNING|trainer.py:803] 2025-04-26 22:17:38,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:38,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7705 8630 [WARNING|trainer.py:803] 2025-04-26 22:17:39,265 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7649 [WARNING|trainer.py:803] 2025-04-26 22:17:40,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:40,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8631 7706 [WARNING|trainer.py:803] 2025-04-26 22:17:40,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7650 [WARNING|trainer.py:803] 2025-04-26 22:17:41,405 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:41,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8632 7707 [WARNING|trainer.py:803] 2025-04-26 22:17:42,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7651 [WARNING|trainer.py:803] 2025-04-26 22:17:42,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:42,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8633 7708 [WARNING|trainer.py:803] 2025-04-26 22:17:43,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:44,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7652 [WARNING|trainer.py:803] 2025-04-26 22:17:44,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8634 7709 [WARNING|trainer.py:803] 2025-04-26 22:17:45,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:45,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7653 [WARNING|trainer.py:803] 2025-04-26 22:17:45,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8635 7710 [WARNING|trainer.py:803] 2025-04-26 22:17:46,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:47,002 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:47,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7654 8636 7711 [WARNING|trainer.py:803] 2025-04-26 22:17:48,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:48,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:48,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7655 8637 7712 [WARNING|trainer.py:803] 2025-04-26 22:17:49,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:49,846 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:50,232 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7656 8638 7713 [WARNING|trainer.py:803] 2025-04-26 22:17:51,201 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:17:51,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:51,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7657 8639 7714 [WARNING|trainer.py:803] 2025-04-26 22:17:52,607 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:52,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:53,037 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8640 7658 7715 [WARNING|trainer.py:803] 2025-04-26 22:17:54,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:54,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:17:54,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8641 7659 7716 [WARNING|trainer.py:803] 2025-04-26 22:17:55,584 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:55,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:17:55,769 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8642 7717 7660 [WARNING|trainer.py:803] 2025-04-26 22:17:57,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:57,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:17:57,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7718 8643 7661 [WARNING|trainer.py:803] 2025-04-26 22:17:58,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:17:58,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:17:58,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7719 8644 7662 [WARNING|trainer.py:803] 2025-04-26 22:17:59,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:00,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:00,241 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7720 8645 7663 [WARNING|trainer.py:803] 2025-04-26 22:18:01,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:01,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:01,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7721 8646 7664 [WARNING|trainer.py:803] 2025-04-26 22:18:02,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:03,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:03,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7722 8647 7665 [WARNING|trainer.py:803] 2025-04-26 22:18:04,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:04,440 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:04,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7723 8648 7666 [WARNING|trainer.py:803] 2025-04-26 22:18:05,632 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:05,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7724 [WARNING|trainer.py:803] 2025-04-26 22:18:06,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8649 7667 [WARNING|trainer.py:803] 2025-04-26 22:18:06,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:07,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7725 [WARNING|trainer.py:803] 2025-04-26 22:18:07,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8650 [WARNING|trainer.py:803] 2025-04-26 22:18:08,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7668 [WARNING|trainer.py:803] 2025-04-26 22:18:08,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7726 8651 [WARNING|trainer.py:803] 2025-04-26 22:18:09,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:09,786 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:10,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7669 7727 8652 [WARNING|trainer.py:803] 2025-04-26 22:18:10,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:11,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7670 [WARNING|trainer.py:803] 2025-04-26 22:18:11,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7728 8653 [WARNING|trainer.py:803] 2025-04-26 22:18:12,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:12,570 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7671 [WARNING|trainer.py:803] 2025-04-26 22:18:13,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7729 8654 [WARNING|trainer.py:803] 2025-04-26 22:18:13,844 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:13,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7672 [WARNING|trainer.py:803] 2025-04-26 22:18:14,453 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7730 8655 [WARNING|trainer.py:803] 2025-04-26 22:18:15,276 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:15,344 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7673 7731 [WARNING|trainer.py:803] 2025-04-26 22:18:15,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8656 [WARNING|trainer.py:803] 2025-04-26 22:18:16,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:16,745 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7674 [WARNING|trainer.py:803] 2025-04-26 22:18:17,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7732 8657 [WARNING|trainer.py:803] 2025-04-26 22:18:18,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:18,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:18,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7733 7675 8658 [WARNING|trainer.py:803] 2025-04-26 22:18:19,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:19,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7734 [WARNING|trainer.py:803] 2025-04-26 22:18:20,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7676 8659 [WARNING|trainer.py:803] 2025-04-26 22:18:20,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:21,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7735 [WARNING|trainer.py:803] 2025-04-26 22:18:21,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7677 8660 [WARNING|trainer.py:803] 2025-04-26 22:18:22,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:22,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:22,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7736 7678 8661 [WARNING|trainer.py:803] 2025-04-26 22:18:23,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:24,091 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:24,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7737 7679 8662 [WARNING|trainer.py:803] 2025-04-26 22:18:25,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:25,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7738 [WARNING|trainer.py:803] 2025-04-26 22:18:25,807 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7680 8663 [WARNING|trainer.py:803] 2025-04-26 22:18:26,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:27,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7739 [WARNING|trainer.py:803] 2025-04-26 22:18:27,220 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7681 8664 [WARNING|trainer.py:803] 2025-04-26 22:18:27,916 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7740 [WARNING|trainer.py:803] 2025-04-26 22:18:28,562 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:28,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7682 [WARNING|trainer.py:803] 2025-04-26 22:18:29,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8665 7741 [WARNING|trainer.py:803] 2025-04-26 22:18:30,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:30,163 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7683 [WARNING|trainer.py:803] 2025-04-26 22:18:30,665 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8666 7742 [WARNING|trainer.py:803] 2025-04-26 22:18:31,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:31,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:32,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7684 8667 7743 [WARNING|trainer.py:803] 2025-04-26 22:18:33,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:33,132 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:33,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7685 8668 7744 [WARNING|trainer.py:803] 2025-04-26 22:18:34,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:34,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:34,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7686 8669 7745 [WARNING|trainer.py:803] 2025-04-26 22:18:35,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:36,120 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:36,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7687 8670 7746 [WARNING|trainer.py:803] 2025-04-26 22:18:37,435 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:37,653 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:37,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7688 8671 7747 [WARNING|trainer.py:803] 2025-04-26 22:18:38,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:39,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:39,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7689 7748 8672 [WARNING|trainer.py:803] 2025-04-26 22:18:40,397 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:40,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:40,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7690 7749 8673 [WARNING|trainer.py:803] 2025-04-26 22:18:41,907 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:41,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:42,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7691 7750 8674 [WARNING|trainer.py:803] 2025-04-26 22:18:43,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:43,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:18:43,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7751 7692 8675 [WARNING|trainer.py:803] 2025-04-26 22:18:44,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:44,896 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:45,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7752 7693 8676 [WARNING|trainer.py:803] 2025-04-26 22:18:46,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:46,394 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:46,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7753 8677 7694 [WARNING|trainer.py:803] 2025-04-26 22:18:47,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:47,963 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:48,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7754 8678 7695 [WARNING|trainer.py:803] 2025-04-26 22:18:48,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:49,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:49,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7755 8679 7696 [WARNING|trainer.py:803] 2025-04-26 22:18:50,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:18:50,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7756 [WARNING|trainer.py:803] 2025-04-26 22:18:51,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8680 7697 [WARNING|trainer.py:803] 2025-04-26 22:18:51,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:18:52,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7757 [WARNING|trainer.py:803] 2025-04-26 22:18:52,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8681 [WARNING|trainer.py:803] 2025-04-26 22:18:53,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7698 [WARNING|trainer.py:803] 2025-04-26 22:18:53,728 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7758 [WARNING|trainer.py:803] 2025-04-26 22:18:54,086 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8682 [WARNING|trainer.py:803] 2025-04-26 22:18:54,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7699 7759 [WARNING|trainer.py:803] 2025-04-26 22:18:55,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:55,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8683 [WARNING|trainer.py:803] 2025-04-26 22:18:56,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7700 7760 [WARNING|trainer.py:803] 2025-04-26 22:18:56,729 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:18:57,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8684 [WARNING|trainer.py:803] 2025-04-26 22:18:57,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7701 7761 [WARNING|trainer.py:803] 2025-04-26 22:18:58,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:58,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:18:58,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8685 7702 7762 [WARNING|trainer.py:803] 2025-04-26 22:18:59,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:00,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:00,222 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8686 7703 7763 [WARNING|trainer.py:803] 2025-04-26 22:19:01,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:01,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:01,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8687 7764 7704 [WARNING|trainer.py:803] 2025-04-26 22:19:02,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:03,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:03,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8688 7765 7705 [WARNING|trainer.py:803] 2025-04-26 22:19:04,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:04,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:04,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8689 7766 7706 [WARNING|trainer.py:803] 2025-04-26 22:19:05,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:05,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:05,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8690 7767 7707 [WARNING|trainer.py:803] 2025-04-26 22:19:07,180 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:07,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:07,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8691 7768 7708 [WARNING|trainer.py:803] 2025-04-26 22:19:08,608 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:08,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:08,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7769 8692 7709 [WARNING|trainer.py:803] 2025-04-26 22:19:10,039 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:10,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:10,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7770 8693 7710 [WARNING|trainer.py:803] 2025-04-26 22:19:11,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:11,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:11,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7771 8694 7711 [WARNING|trainer.py:803] 2025-04-26 22:19:12,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:13,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:13,278 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7772 7712 8695 [WARNING|trainer.py:803] 2025-04-26 22:19:14,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:14,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:14,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7773 7713 8696 [WARNING|trainer.py:803] 2025-04-26 22:19:15,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:16,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:16,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7774 7714 8697 [WARNING|trainer.py:803] 2025-04-26 22:19:17,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:17,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:17,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7775 7715 8698 [WARNING|trainer.py:803] 2025-04-26 22:19:18,465 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:19,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
No 7776 [WARNING|trainer.py:803] 2025-04-26 22:19:19,038 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8699 7716 [WARNING|trainer.py:803] 2025-04-26 22:19:19,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7777 [WARNING|trainer.py:803] 2025-04-26 22:19:20,418 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:20,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7717 8700 [WARNING|trainer.py:803] 2025-04-26 22:19:21,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7778 [WARNING|trainer.py:803] 2025-04-26 22:19:21,869 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:22,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8701 [WARNING|trainer.py:803] 2025-04-26 22:19:22,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7718 [WARNING|trainer.py:803] 2025-04-26 22:19:22,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7779 8702 [WARNING|trainer.py:803] 2025-04-26 22:19:23,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:23,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:24,050 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7719 8703 7780 [WARNING|trainer.py:803] 2025-04-26 22:19:24,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:25,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:25,343 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8704 7720 7781 [WARNING|trainer.py:803] 2025-04-26 22:19:26,275 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:26,387 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8705 [WARNING|trainer.py:803] 2025-04-26 22:19:26,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7721 [WARNING|trainer.py:803] 2025-04-26 22:19:27,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7782 8706 [WARNING|trainer.py:803] 2025-04-26 22:19:27,838 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:28,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:28,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7722 8707 7783 [WARNING|trainer.py:803] 2025-04-26 22:19:29,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:29,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:29,616 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8708 7723 7784 [WARNING|trainer.py:803] 2025-04-26 22:19:30,693 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:30,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8709 [WARNING|trainer.py:803] 2025-04-26 22:19:31,029 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7724 7785 [WARNING|trainer.py:803] 2025-04-26 22:19:31,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8710 [WARNING|trainer.py:803] 2025-04-26 22:19:32,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:19:32,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:32,799 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7725 8711 7786 [WARNING|trainer.py:803] 2025-04-26 22:19:33,688 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:33,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:33,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8712 7726 7787 [WARNING|trainer.py:803] 2025-04-26 22:19:35,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:35,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8713 [WARNING|trainer.py:803] 2025-04-26 22:19:35,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7727 7788 [WARNING|trainer.py:803] 2025-04-26 22:19:36,127 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8714 [WARNING|trainer.py:803] 2025-04-26 22:19:36,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:36,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7728 [WARNING|trainer.py:803] 2025-04-26 22:19:37,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7789 8715 [WARNING|trainer.py:803] 2025-04-26 22:19:37,898 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:38,141 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:38,379 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7729 8716 7790 [WARNING|trainer.py:803] 2025-04-26 22:19:39,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:39,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:39,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8717 7730 7791 [WARNING|trainer.py:803] 2025-04-26 22:19:40,602 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:40,610 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8718 [WARNING|trainer.py:803] 2025-04-26 22:19:41,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7731 7792 [WARNING|trainer.py:803] 2025-04-26 22:19:41,711 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8719 [WARNING|trainer.py:803] 2025-04-26 22:19:42,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:42,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7732 [WARNING|trainer.py:803] 2025-04-26 22:19:42,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7793 8720 [WARNING|trainer.py:803] 2025-04-26 22:19:43,407 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:43,850 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:43,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7733 8721 7794 [WARNING|trainer.py:803] 2025-04-26 22:19:44,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:45,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:45,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8722 7734 7795 [WARNING|trainer.py:803] 2025-04-26 22:19:46,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:46,194 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8723 [WARNING|trainer.py:803] 2025-04-26 22:19:46,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7735 7796 [WARNING|trainer.py:803] 2025-04-26 22:19:47,347 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:47,597 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8724 [WARNING|trainer.py:803] 2025-04-26 22:19:48,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7736 [WARNING|trainer.py:803] 2025-04-26 22:19:48,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7797 8725 [WARNING|trainer.py:803] 2025-04-26 22:19:48,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:49,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:49,601 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7737 8726 7798 [WARNING|trainer.py:803] 2025-04-26 22:19:50,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:50,740 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:19:50,990 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8727 7738 7799 [WARNING|trainer.py:803] 2025-04-26 22:19:51,830 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:51,881 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8728 [WARNING|trainer.py:803] 2025-04-26 22:19:52,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7739 [WARNING|trainer.py:803] 2025-04-26 22:19:52,938 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7800 8729 [WARNING|trainer.py:803] 2025-04-26 22:19:53,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:53,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7740 [WARNING|trainer.py:803] 2025-04-26 22:19:54,003 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8730 [WARNING|trainer.py:803] 2025-04-26 22:19:54,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7801 [WARNING|trainer.py:803] 2025-04-26 22:19:55,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7741 8731 [WARNING|trainer.py:803] 2025-04-26 22:19:55,598 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:19:56,109 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:56,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8732 7802 7742 [WARNING|trainer.py:803] 2025-04-26 22:19:57,376 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:19:57,411 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:57,531 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8733 7743 [WARNING|trainer.py:803] 2025-04-26 22:19:58,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7803 8734 [WARNING|trainer.py:803] 2025-04-26 22:19:58,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:19:59,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7744 [WARNING|trainer.py:803] 2025-04-26 22:19:59,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8735 7804 [WARNING|trainer.py:803] 2025-04-26 22:20:00,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:00,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7745 8736 [WARNING|trainer.py:803] 2025-04-26 22:20:01,134 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:01,716 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:01,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7805 8737 7746 [WARNING|trainer.py:803] 2025-04-26 22:20:02,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:02,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:03,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8738 7747 7806 [WARNING|trainer.py:803] 2025-04-26 22:20:04,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8739 [WARNING|trainer.py:803] 2025-04-26 22:20:04,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:04,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7748 [WARNING|trainer.py:803] 2025-04-26 22:20:05,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8740 7807 [WARNING|trainer.py:803] 2025-04-26 22:20:05,985 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:06,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:06,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8741 7749 7808 [WARNING|trainer.py:803] 2025-04-26 22:20:07,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:07,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8742 7750 [WARNING|trainer.py:803] 2025-04-26 22:20:08,152 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:08,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8743 [WARNING|trainer.py:803] 2025-04-26 22:20:08,864 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7809 7751 [WARNING|trainer.py:803] 2025-04-26 22:20:09,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8744 [WARNING|trainer.py:803] 2025-04-26 22:20:10,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:10,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:10,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 7752 8745 7810 [WARNING|trainer.py:803] 2025-04-26 22:20:11,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:11,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:11,868 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8746 7753 7811 [WARNING|trainer.py:803] 2025-04-26 22:20:12,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:13,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8747 [WARNING|trainer.py:803] 2025-04-26 22:20:13,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7754 [WARNING|trainer.py:803] 2025-04-26 22:20:14,139 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8748 7812 [WARNING|trainer.py:803] 2025-04-26 22:20:14,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7755 [WARNING|trainer.py:803] 2025-04-26 22:20:15,274 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:15,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8749 [WARNING|trainer.py:803] 2025-04-26 22:20:16,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7813 [WARNING|trainer.py:803] 2025-04-26 22:20:16,386 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7756 8750 [WARNING|trainer.py:803] 2025-04-26 22:20:17,095 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:20:17,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:20:17,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8751 7814 7757 [WARNING|trainer.py:803] 2025-04-26 22:20:18,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:18,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:18,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8752 7758 7815 [WARNING|trainer.py:803] 2025-04-26 22:20:19,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8753 [WARNING|trainer.py:803] 2025-04-26 22:20:20,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:20:20,564 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:20,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7759 8754 7816 [WARNING|trainer.py:803] 2025-04-26 22:20:21,768 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:20:22,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8755 [WARNING|trainer.py:803] 2025-04-26 22:20:22,342 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7760 [WARNING|trainer.py:803] 2025-04-26 22:20:23,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:23,182 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7817 8756 7761 [WARNING|trainer.py:803] 2025-04-26 22:20:24,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:24,218 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8757 [WARNING|trainer.py:803] 2025-04-26 22:20:24,558 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7818 7762 [WARNING|trainer.py:803] 2025-04-26 22:20:25,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8758 [WARNING|trainer.py:803] 2025-04-26 22:20:25,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:20:25,968 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:26,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7763 8759 7819 [WARNING|trainer.py:803] 2025-04-26 22:20:27,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:27,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:27,557 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8760 7764 7820 [WARNING|trainer.py:803] 2025-04-26 22:20:28,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:28,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8761 7765 [WARNING|trainer.py:803] 2025-04-26 22:20:29,366 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:29,765 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8762 [WARNING|trainer.py:803] 2025-04-26 22:20:30,146 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7821 7766 [WARNING|trainer.py:803] 2025-04-26 22:20:30,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:31,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8763 [WARNING|trainer.py:803] 2025-04-26 22:20:31,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:31,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7767 7822 8764 [WARNING|trainer.py:803] 2025-04-26 22:20:32,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:32,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:33,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8765 7768 7823 [WARNING|trainer.py:803] 2025-04-26 22:20:34,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:34,270 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8766 7769 [WARNING|trainer.py:803] 2025-04-26 22:20:34,992 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:35,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8767 [WARNING|trainer.py:803] 2025-04-26 22:20:35,721 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7824 7770 [WARNING|trainer.py:803] 2025-04-26 22:20:36,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8768 [WARNING|trainer.py:803] 2025-04-26 22:20:36,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:37,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:37,541 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7825 7771 8769 [WARNING|trainer.py:803] 2025-04-26 22:20:38,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:38,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:38,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8770 7772 7826 [WARNING|trainer.py:803] 2025-04-26 22:20:39,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:39,982 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8771 [WARNING|trainer.py:803] 2025-04-26 22:20:40,022 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7773 [WARNING|trainer.py:803] 2025-04-26 22:20:40,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7827 8772 [WARNING|trainer.py:803] 2025-04-26 22:20:41,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:41,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7774 [WARNING|trainer.py:803] 2025-04-26 22:20:41,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8773 [WARNING|trainer.py:803] 2025-04-26 22:20:42,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:43,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7828 8774 7775 [WARNING|trainer.py:803] 2025-04-26 22:20:43,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:44,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:44,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8775 7776 7829 [WARNING|trainer.py:803] 2025-04-26 22:20:45,277 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8776 [WARNING|trainer.py:803] 2025-04-26 22:20:45,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:45,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7777 [WARNING|trainer.py:803] 2025-04-26 22:20:46,396 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7830 8777 [WARNING|trainer.py:803] 2025-04-26 22:20:47,030 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:47,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:47,523 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7778 8778 [WARNING|trainer.py:803] 2025-04-26 22:20:48,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:48,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7831 8779 7779 [WARNING|trainer.py:803] 2025-04-26 22:20:49,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:49,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:49,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8780 7780 7832 [WARNING|trainer.py:803] 2025-04-26 22:20:50,993 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8781 [WARNING|trainer.py:803] 2025-04-26 22:20:51,216 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:51,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7781 [WARNING|trainer.py:803] 2025-04-26 22:20:52,056 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7833 8782 [WARNING|trainer.py:803] 2025-04-26 22:20:52,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:20:52,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:53,150 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7782 8783 7834 [WARNING|trainer.py:803] 2025-04-26 22:20:54,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:54,349 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:20:54,645 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8784 7783 [WARNING|trainer.py:803] 2025-04-26 22:20:55,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:55,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7835 8785 7784 [WARNING|trainer.py:803] 2025-04-26 22:20:56,374 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:56,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:56,917 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8786 7836 7785 [WARNING|trainer.py:803] 2025-04-26 22:20:57,773 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:20:58,026 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8787 [WARNING|trainer.py:803] 2025-04-26 22:20:58,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7837 [WARNING|trainer.py:803] 2025-04-26 22:20:58,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7786 8788 [WARNING|trainer.py:803] 2025-04-26 22:20:59,606 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:59,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:20:59,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8789 7787 7838 [WARNING|trainer.py:803] 2025-04-26 22:21:01,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:01,199 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8790 [WARNING|trainer.py:803] 2025-04-26 22:21:01,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7788 [WARNING|trainer.py:803] 2025-04-26 22:21:02,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7839 8791 [WARNING|trainer.py:803] 2025-04-26 22:21:02,592 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7789 [WARNING|trainer.py:803] 2025-04-26 22:21:03,286 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:03,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8792 [WARNING|trainer.py:803] 2025-04-26 22:21:03,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7840 [WARNING|trainer.py:803] 2025-04-26 22:21:04,529 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7790 8793 [WARNING|trainer.py:803] 2025-04-26 22:21:04,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:05,475 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:05,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7841 8794 7791 [WARNING|trainer.py:803] 2025-04-26 22:21:06,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:06,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:06,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8795 7792 [WARNING|trainer.py:803] 2025-04-26 22:21:07,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7842 8796 [WARNING|trainer.py:803] 2025-04-26 22:21:08,365 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:08,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7793 [WARNING|trainer.py:803] 2025-04-26 22:21:08,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8797 [WARNING|trainer.py:803] 2025-04-26 22:21:09,760 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7843 [WARNING|trainer.py:803] 2025-04-26 22:21:10,102 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7794 8798 [WARNING|trainer.py:803] 2025-04-26 22:21:10,588 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:11,202 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:11,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8799 7795 7844 [WARNING|trainer.py:803] 2025-04-26 22:21:12,368 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:12,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8800 [WARNING|trainer.py:803] 2025-04-26 22:21:12,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7796 [WARNING|trainer.py:803] 2025-04-26 22:21:13,506 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7845 8801 [WARNING|trainer.py:803] 2025-04-26 22:21:14,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:14,469 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:14,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7797 8802 [WARNING|trainer.py:803] 2025-04-26 22:21:15,540 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7846 [WARNING|trainer.py:803] 2025-04-26 22:21:15,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8803 7798 [WARNING|trainer.py:803] 2025-04-26 22:21:16,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:16,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:16,977 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8804 7799 7847 [WARNING|trainer.py:803] 2025-04-26 22:21:18,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8805 [WARNING|trainer.py:803] 2025-04-26 22:21:18,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:18,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7800 [WARNING|trainer.py:803] 2025-04-26 22:21:19,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7848 8806 [WARNING|trainer.py:803] 2025-04-26 22:21:19,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:20,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:20,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8807 7801 7849 [WARNING|trainer.py:803] 2025-04-26 22:21:21,538 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:21,636 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8808 [WARNING|trainer.py:803] 2025-04-26 22:21:22,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7802 [WARNING|trainer.py:803] 2025-04-26 22:21:22,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8809 7850 [WARNING|trainer.py:803] 2025-04-26 22:21:23,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:23,812 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8810 [WARNING|trainer.py:803] 2025-04-26 22:21:24,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7803 [WARNING|trainer.py:803] 2025-04-26 22:21:24,932 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8811 7851 [WARNING|trainer.py:803] 2025-04-26 22:21:25,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:26,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:26,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7804 8812 [WARNING|trainer.py:803] 2025-04-26 22:21:27,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:27,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8813 7852 7805 [WARNING|trainer.py:803] 2025-04-26 22:21:28,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:28,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8814 [WARNING|trainer.py:803] 2025-04-26 22:21:28,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7853 [WARNING|trainer.py:803] 2025-04-26 22:21:29,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8815 7806 [WARNING|trainer.py:803] 2025-04-26 22:21:30,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:30,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:30,680 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8816 7854 7807 [WARNING|trainer.py:803] 2025-04-26 22:21:31,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:31,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8817 [WARNING|trainer.py:803] 2025-04-26 22:21:32,392 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7855 [WARNING|trainer.py:803] 2025-04-26 22:21:32,858 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8818 7808 [WARNING|trainer.py:803] 2025-04-26 22:21:33,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:34,099 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:34,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8819 7856 7809 [WARNING|trainer.py:803] 2025-04-26 22:21:35,208 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:35,287 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8820 [WARNING|trainer.py:803] 2025-04-26 22:21:36,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7857 [WARNING|trainer.py:803] 2025-04-26 22:21:36,375 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8821 7810 [WARNING|trainer.py:803] 2025-04-26 22:21:36,964 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:37,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7858 [WARNING|trainer.py:803] 2025-04-26 22:21:37,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8822 [WARNING|trainer.py:803] 2025-04-26 22:21:38,518 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7811 [WARNING|trainer.py:803] 2025-04-26 22:21:38,692 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8823 [WARNING|trainer.py:803] 2025-04-26 22:21:39,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7859 [WARNING|trainer.py:803] 2025-04-26 22:21:39,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8824 [WARNING|trainer.py:803] 2025-04-26 22:21:40,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7812 [WARNING|trainer.py:803] 2025-04-26 22:21:40,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7860 8825 [WARNING|trainer.py:803] 2025-04-26 22:21:41,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:41,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:42,019 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7813 8826 7861 [WARNING|trainer.py:803] 2025-04-26 22:21:42,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:43,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:43,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8827 7814 7862 [WARNING|trainer.py:803] 2025-04-26 22:21:44,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8828 [WARNING|trainer.py:803] 2025-04-26 22:21:44,766 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:45,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:45,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7815 8829 7863 [WARNING|trainer.py:803] 2025-04-26 22:21:46,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:46,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:46,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8830 7816 7864 [WARNING|trainer.py:803] 2025-04-26 22:21:47,731 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8831 [WARNING|trainer.py:803] 2025-04-26 22:21:48,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:48,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:48,811 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8832 7817 7865 [WARNING|trainer.py:803] 2025-04-26 22:21:49,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:50,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8833 [WARNING|trainer.py:803] 2025-04-26 22:21:50,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7818 7866 [WARNING|trainer.py:803] 2025-04-26 22:21:51,067 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8834 [WARNING|trainer.py:803] 2025-04-26 22:21:51,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:21:51,866 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:21:52,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8835 7819 7867 [WARNING|trainer.py:803] 2025-04-26 22:21:53,313 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:53,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8836 [WARNING|trainer.py:803] 2025-04-26 22:21:53,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:54,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7820 8837 7868 [WARNING|trainer.py:803] 2025-04-26 22:21:55,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:21:55,511 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8838 [WARNING|trainer.py:803] 2025-04-26 22:21:55,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7821 [WARNING|trainer.py:803] 2025-04-26 22:21:56,591 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7869 8839 [WARNING|trainer.py:803] 2025-04-26 22:21:57,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:57,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:21:57,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8840 7822 [WARNING|trainer.py:803] 2025-04-26 22:21:58,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7870 [WARNING|trainer.py:803] 2025-04-26 22:21:58,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8841 [WARNING|trainer.py:803] 2025-04-26 22:21:59,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:21:59,905 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8842 7823 7871 [WARNING|trainer.py:803] 2025-04-26 22:22:00,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:00,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8843 [WARNING|trainer.py:803] 2025-04-26 22:22:01,520 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7824 [WARNING|trainer.py:803] 2025-04-26 22:22:02,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7872 8844 [WARNING|trainer.py:803] 2025-04-26 22:22:02,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:22:03,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:03,304 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8845 7825 7873 [WARNING|trainer.py:803] 2025-04-26 22:22:04,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:04,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8846 [WARNING|trainer.py:803] 2025-04-26 22:22:04,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7826 [WARNING|trainer.py:803] 2025-04-26 22:22:05,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7874 8847 [WARNING|trainer.py:803] 2025-04-26 22:22:06,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:06,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:06,655 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8848 7827 [WARNING|trainer.py:803] 2025-04-26 22:22:07,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:07,961 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7875 8849 [WARNING|trainer.py:803] 2025-04-26 22:22:08,895 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:22:08,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8850 7828 7876 [WARNING|trainer.py:803] 2025-04-26 22:22:10,023 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:10,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8851 [WARNING|trainer.py:803] 2025-04-26 22:22:10,639 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:11,131 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7829 8852 7877 [WARNING|trainer.py:803] 2025-04-26 22:22:11,994 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:12,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:12,398 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8853 7830 7878 [WARNING|trainer.py:803] 2025-04-26 22:22:13,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:13,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8854 [WARNING|trainer.py:803] 2025-04-26 22:22:14,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:14,568 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8855 7831 7879 [WARNING|trainer.py:803] 2025-04-26 22:22:15,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:15,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8856 [WARNING|trainer.py:803] 2025-04-26 22:22:15,931 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7832 [WARNING|trainer.py:803] 2025-04-26 22:22:16,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8857 7880 [WARNING|trainer.py:803] 2025-04-26 22:22:17,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:17,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:18,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8858 7833 7881 [WARNING|trainer.py:803] 2025-04-26 22:22:18,991 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:19,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8859 [WARNING|trainer.py:803] 2025-04-26 22:22:19,684 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7834 [WARNING|trainer.py:803] 2025-04-26 22:22:20,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8860 7882 [WARNING|trainer.py:803] 2025-04-26 22:22:20,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:22:21,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8861 [WARNING|trainer.py:803] 2025-04-26 22:22:21,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7835 [WARNING|trainer.py:803] 2025-04-26 22:22:22,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8862 [WARNING|trainer.py:803] 2025-04-26 22:22:22,585 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7883 [WARNING|trainer.py:803] 2025-04-26 22:22:23,383 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7836 8863 [WARNING|trainer.py:803] 2025-04-26 22:22:23,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:24,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:24,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8864 7837 7884 [WARNING|trainer.py:803] 2025-04-26 22:22:25,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:25,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8865 [WARNING|trainer.py:803] 2025-04-26 22:22:26,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:26,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7838 7885 8866 [WARNING|trainer.py:803] 2025-04-26 22:22:27,805 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:27,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:27,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8867 7839 7886 [WARNING|trainer.py:803] 2025-04-26 22:22:29,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8868 [WARNING|trainer.py:803] 2025-04-26 22:22:29,436 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:22:29,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:30,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7840 8869 7887 [WARNING|trainer.py:803] 2025-04-26 22:22:31,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:31,285 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8870 7841 [WARNING|trainer.py:803] 2025-04-26 22:22:31,848 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:32,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:32,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7888 8871 [WARNING|trainer.py:803] 2025-04-26 22:22:33,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:22:33,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8872 7842 7889 [WARNING|trainer.py:803] 2025-04-26 22:22:34,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:34,832 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8873 [WARNING|trainer.py:803] 2025-04-26 22:22:34,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:22:35,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7890 7843 8874 [WARNING|trainer.py:803] 2025-04-26 22:22:36,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:22:36,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:36,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8875 7891 [WARNING|trainer.py:803] 2025-04-26 22:22:37,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8876 7844 [WARNING|trainer.py:803] 2025-04-26 22:22:38,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:39,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:39,058 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8877 7892 7845 [WARNING|trainer.py:803] 2025-04-26 22:22:40,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8878 [WARNING|trainer.py:803] 2025-04-26 22:22:40,514 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:40,635 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:41,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7893 8879 7846 [WARNING|trainer.py:803] 2025-04-26 22:22:42,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:42,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8880 [WARNING|trainer.py:803] 2025-04-26 22:22:42,761 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:43,473 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8881 7894 7847 [WARNING|trainer.py:803] 2025-04-26 22:22:44,576 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:44,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:44,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8882 7848 [WARNING|trainer.py:803] 2025-04-26 22:22:45,690 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7895 8883 [WARNING|trainer.py:803] 2025-04-26 22:22:46,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:22:46,537 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:46,810 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8884 7849 7896 [WARNING|trainer.py:803] 2025-04-26 22:22:47,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8885 [WARNING|trainer.py:803] 2025-04-26 22:22:48,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:48,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:22:48,983 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8886 7897 7850 [WARNING|trainer.py:803] 2025-04-26 22:22:50,089 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:50,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8887 [WARNING|trainer.py:803] 2025-04-26 22:22:50,471 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:51,211 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7898 8888 7851 [WARNING|trainer.py:803] 2025-04-26 22:22:52,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:52,336 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:52,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8889 7899 [WARNING|trainer.py:803] 2025-04-26 22:22:53,480 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8890 7852 [WARNING|trainer.py:803] 2025-04-26 22:22:54,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:22:54,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:22:54,734 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8891 7900 7853 [WARNING|trainer.py:803] 2025-04-26 22:22:55,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:22:55,901 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8892 [WARNING|trainer.py:803] 2025-04-26 22:22:56,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:22:56,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7901 8893 7854 [WARNING|trainer.py:803] 2025-04-26 22:22:57,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:22:57,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8894 [WARNING|trainer.py:803] 2025-04-26 22:22:58,268 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7902 [WARNING|trainer.py:803] 2025-04-26 22:22:59,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7855 8895 [WARNING|trainer.py:803] 2025-04-26 22:22:59,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:22:59,922 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:00,138 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8896 7903 7856 [WARNING|trainer.py:803] 2025-04-26 22:23:01,217 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:01,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:23:01,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8897 [WARNING|trainer.py:803] 2025-04-26 22:23:02,426 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7857 8898 7904 [WARNING|trainer.py:803] 2025-04-26 22:23:03,279 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:03,533 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:23:03,755 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8899 7858 7905 [WARNING|trainer.py:803] 2025-04-26 22:23:04,685 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:04,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8900 [WARNING|trainer.py:803] 2025-04-26 22:23:05,422 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7859 [WARNING|trainer.py:803] 2025-04-26 22:23:05,814 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8901 7906 [WARNING|trainer.py:803] 2025-04-26 22:23:06,617 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:06,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:07,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8902 7860 [WARNING|trainer.py:803] 2025-04-26 22:23:08,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:08,305 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7907 8903 7861 [WARNING|trainer.py:803] 2025-04-26 22:23:09,175 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:23:09,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8904 [WARNING|trainer.py:803] 2025-04-26 22:23:09,783 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7908 [WARNING|trainer.py:803] 2025-04-26 22:23:10,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8905 7862 [WARNING|trainer.py:803] 2025-04-26 22:23:10,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:11,449 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:11,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7909 8906 7863 [WARNING|trainer.py:803] 2025-04-26 22:23:12,476 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:12,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8907 [WARNING|trainer.py:803] 2025-04-26 22:23:13,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7910 [WARNING|trainer.py:803] 2025-04-26 22:23:13,666 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7864 8908 [WARNING|trainer.py:803] 2025-04-26 22:23:14,314 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:14,795 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:14,825 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8909 7911 [WARNING|trainer.py:803] 2025-04-26 22:23:15,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:16,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7865 8910 [WARNING|trainer.py:803] 2025-04-26 22:23:16,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7912 [WARNING|trainer.py:803] 2025-04-26 22:23:17,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8911 7866 [WARNING|trainer.py:803] 2025-04-26 22:23:17,746 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:18,174 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:18,359 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8912 7913 7867 [WARNING|trainer.py:803] 2025-04-26 22:23:19,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:19,389 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8913 [WARNING|trainer.py:803] 2025-04-26 22:23:20,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7914 [WARNING|trainer.py:803] 2025-04-26 22:23:20,441 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8914 [WARNING|trainer.py:803] 2025-04-26 22:23:21,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:21,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7868 8915 7915 [WARNING|trainer.py:803] 2025-04-26 22:23:22,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:22,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:22,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8916 7869 [WARNING|trainer.py:803] 2025-04-26 22:23:23,733 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8917 7916 [WARNING|trainer.py:803] 2025-04-26 22:23:24,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:24,826 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:24,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8918 7870 7917 [WARNING|trainer.py:803] 2025-04-26 22:23:25,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8919 [WARNING|trainer.py:803] 2025-04-26 22:23:26,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:26,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:27,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7871 8920 7918 [WARNING|trainer.py:803] 2025-04-26 22:23:28,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:28,226 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8921 [WARNING|trainer.py:803] 2025-04-26 22:23:28,836 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7872 [WARNING|trainer.py:803] 2025-04-26 22:23:29,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7919 8922 [WARNING|trainer.py:803] 2025-04-26 22:23:29,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:30,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7873 [WARNING|trainer.py:803] 2025-04-26 22:23:30,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8923 [WARNING|trainer.py:803] 2025-04-26 22:23:31,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:31,775 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7920 8924 7874 [WARNING|trainer.py:803] 2025-04-26 22:23:32,618 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:32,942 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:32,949 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8925 7921 [WARNING|trainer.py:803] 2025-04-26 22:23:34,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8926 7875 [WARNING|trainer.py:803] 2025-04-26 22:23:34,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:35,115 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:35,452 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8927 7922 [WARNING|trainer.py:803] 2025-04-26 22:23:36,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7876 8928 [WARNING|trainer.py:803] 2025-04-26 22:23:37,093 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:37,213 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:37,515 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8929 7923 7877 [WARNING|trainer.py:803] 2025-04-26 22:23:38,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8930 [WARNING|trainer.py:803] 2025-04-26 22:23:38,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:38,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:39,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7878 8931 7924 [WARNING|trainer.py:803] 2025-04-26 22:23:40,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:40,853 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8932 [WARNING|trainer.py:803] 2025-04-26 22:23:41,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 7879 [WARNING|trainer.py:803] 2025-04-26 22:23:42,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7925 8933 [WARNING|trainer.py:803] 2025-04-26 22:23:42,508 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:42,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:43,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8934 7880 7926 [WARNING|trainer.py:803] 2025-04-26 22:23:44,309 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:44,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8935 [WARNING|trainer.py:803] 2025-04-26 22:23:44,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7881 [WARNING|trainer.py:803] 2025-04-26 22:23:45,434 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8936 [WARNING|trainer.py:803] 2025-04-26 22:23:46,231 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7927 [WARNING|trainer.py:803] 2025-04-26 22:23:46,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8937 [WARNING|trainer.py:803] 2025-04-26 22:23:47,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7882 [WARNING|trainer.py:803] 2025-04-26 22:23:47,719 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7928 8938 [WARNING|trainer.py:803] 2025-04-26 22:23:48,013 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:48,720 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:48,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8939 7883 7929 [WARNING|trainer.py:803] 2025-04-26 22:23:49,960 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8940 [WARNING|trainer.py:803] 2025-04-26 22:23:50,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:50,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:51,117 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7930 8941 7884 [WARNING|trainer.py:803] 2025-04-26 22:23:52,299 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:52,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8942 [WARNING|trainer.py:803] 2025-04-26 22:23:52,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7931 [WARNING|trainer.py:803] 2025-04-26 22:23:53,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7885 8943 [WARNING|trainer.py:803] 2025-04-26 22:23:53,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:23:54,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:54,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8944 7932 7886 [WARNING|trainer.py:803] 2025-04-26 22:23:55,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8945 [WARNING|trainer.py:803] 2025-04-26 22:23:56,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:23:56,338 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:23:56,689 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8946 7933 7887 [WARNING|trainer.py:803] 2025-04-26 22:23:57,782 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:57,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8947 [WARNING|trainer.py:803] 2025-04-26 22:23:58,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7934 [WARNING|trainer.py:803] 2025-04-26 22:23:58,890 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8948 7888 [WARNING|trainer.py:803] 2025-04-26 22:23:59,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:23:59,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:00,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8949 7889 7935 [WARNING|trainer.py:803] 2025-04-26 22:24:01,083 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8950 [WARNING|trainer.py:803] 2025-04-26 22:24:01,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:24:01,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:02,235 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7890 8951 7936 [WARNING|trainer.py:803] 2025-04-26 22:24:03,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:03,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:03,502 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8952 7891 [WARNING|trainer.py:803] 2025-04-26 22:24:04,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7937 8953 [WARNING|trainer.py:803] 2025-04-26 22:24:05,076 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:05,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:05,580 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8954 7892 7938 [WARNING|trainer.py:803] 2025-04-26 22:24:06,663 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8955 [WARNING|trainer.py:803] 2025-04-26 22:24:07,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:07,293 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:07,764 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 7893 8956 7939 [WARNING|trainer.py:803] 2025-04-26 22:24:08,822 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:08,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8957 [WARNING|trainer.py:803] 2025-04-26 22:24:09,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:09,995 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7940 8958 7894 [WARNING|trainer.py:803] 2025-04-26 22:24:11,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:11,114 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:11,247 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8959 [WARNING|trainer.py:803] 2025-04-26 22:24:12,166 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7895 7941 8960 [WARNING|trainer.py:803] 2025-04-26 22:24:13,042 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:13,112 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:13,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8961 7896 7942 [WARNING|trainer.py:803] 2025-04-26 22:24:14,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8962 [WARNING|trainer.py:803] 2025-04-26 22:24:14,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:24:14,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:15,499 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8963 7897 7943 [WARNING|trainer.py:803] 2025-04-26 22:24:16,586 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:16,676 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8964 [WARNING|trainer.py:803] 2025-04-26 22:24:16,899 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:17,702 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7944 7898 8965 [WARNING|trainer.py:803] 2025-04-26 22:24:18,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:18,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:18,809 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8966 7899 [WARNING|trainer.py:803] 2025-04-26 22:24:19,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8967 7945 [WARNING|trainer.py:803] 2025-04-26 22:24:20,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:21,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:21,142 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8968 7900 7946 [WARNING|trainer.py:803] 2025-04-26 22:24:22,178 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:22,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8969 [WARNING|trainer.py:803] 2025-04-26 22:24:22,784 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:24:23,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7901 8970 7947 [WARNING|trainer.py:803] 2025-04-26 22:24:24,311 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:24,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:24,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8971 7902 [WARNING|trainer.py:803] 2025-04-26 22:24:25,547 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8972 7948 [WARNING|trainer.py:803] 2025-04-26 22:24:26,197 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:24:26,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:26,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8973 7903 7949 [WARNING|trainer.py:803] 2025-04-26 22:24:27,747 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:27,875 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8974 [WARNING|trainer.py:803] 2025-04-26 22:24:28,406 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:28,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7904 8975 7950 [WARNING|trainer.py:803] 2025-04-26 22:24:30,007 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:30,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:30,438 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8976 7905 [WARNING|trainer.py:803] 2025-04-26 22:24:31,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7951 [WARNING|trainer.py:803] 2025-04-26 22:24:31,674 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8977 [WARNING|trainer.py:803] 2025-04-26 22:24:32,466 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:32,555 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7906 8978 [WARNING|trainer.py:803] 2025-04-26 22:24:33,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7952 [WARNING|trainer.py:803] 2025-04-26 22:24:33,790 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8979 7907 [WARNING|trainer.py:803] 2025-04-26 22:24:34,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:34,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8980 [WARNING|trainer.py:803] 2025-04-26 22:24:35,430 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7953 [WARNING|trainer.py:803] 2025-04-26 22:24:35,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7908 8981 [WARNING|trainer.py:803] 2025-04-26 22:24:36,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:37,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:37,148 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8982 7954 7909 [WARNING|trainer.py:803] 2025-04-26 22:24:38,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:38,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8983 [WARNING|trainer.py:803] 2025-04-26 22:24:38,707 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7955 [WARNING|trainer.py:803] 2025-04-26 22:24:39,450 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8984 7910 [WARNING|trainer.py:803] 2025-04-26 22:24:40,257 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:40,521 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:40,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8985 7956 7911 [WARNING|trainer.py:803] 2025-04-26 22:24:41,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:41,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8986 [WARNING|trainer.py:803] 2025-04-26 22:24:42,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:42,828 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7957 8987 7912 [WARNING|trainer.py:803] 2025-04-26 22:24:43,724 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:24:43,965 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:44,010 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8988 7913 [WARNING|trainer.py:803] 2025-04-26 22:24:45,074 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8989 7958 [WARNING|trainer.py:803] 2025-04-26 22:24:45,682 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:46,177 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:46,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8990 7914 7959 [WARNING|trainer.py:803] 2025-04-26 22:24:47,354 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:47,423 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8991 [WARNING|trainer.py:803] 2025-04-26 22:24:47,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7915 [WARNING|trainer.py:803] 2025-04-26 22:24:48,439 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8992 7960 [WARNING|trainer.py:803] 2025-04-26 22:24:49,160 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:49,504 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8993 [WARNING|trainer.py:803] 2025-04-26 22:24:49,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7916 [WARNING|trainer.py:803] 2025-04-26 22:24:50,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7961 8994 [WARNING|trainer.py:803] 2025-04-26 22:24:51,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:24:51,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:51,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8995 7917 7962 [WARNING|trainer.py:803] 2025-04-26 22:24:52,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:52,953 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8996 [WARNING|trainer.py:803] 2025-04-26 22:24:53,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:24:53,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8997 7918 7963 [WARNING|trainer.py:803] 2025-04-26 22:24:54,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:55,125 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:24:55,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8998 7919 [WARNING|trainer.py:803] 2025-04-26 22:24:56,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8999 7964 [WARNING|trainer.py:803] 2025-04-26 22:24:56,841 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:57,260 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:24:57,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 9000 7920 [WARNING|trainer.py:803] 2025-04-26 22:24:58,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7965 Results saved to /home/wangjiarui/InternVL/internvl_chat/AIGV60K_weights/qa_fast/qa_results.csv Accuracy: 0.787 [WARNING|trainer.py:803] 2025-04-26 22:24:58,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:24:59,204 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7921 7966 [2025-04-26 22:25:00,538] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) [WARNING|trainer.py:803] 2025-04-26 22:25:01,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:01,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7967 7922 [WARNING|trainer.py:803] 2025-04-26 22:25:03,327 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:03,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) 7968 7923 [WARNING|trainer.py:803] 2025-04-26 22:25:05,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:05,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7969 [2025-04-26 22:25:06,450] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) 7924 [WARNING|trainer.py:803] 2025-04-26 22:25:07,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:25:07,670 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7970 7925 [WARNING|trainer.py:803] 2025-04-26 22:25:08,696 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [WARNING|trainer.py:803] 2025-04-26 22:25:09,332 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7971 7926 [WARNING|trainer.py:803] 2025-04-26 22:25:10,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:10,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7972 [WARNING|trainer.py:803] 2025-04-26 22:25:11,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [2025-04-26 22:25:12,145] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) 7927 7973 [WARNING|trainer.py:803] 2025-04-26 22:25:13,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:13,656 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7928 7974 /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [WARNING|trainer.py:803] 2025-04-26 22:25:15,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:15,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 7929 7975 [WARNING|trainer.py:803] 2025-04-26 22:25:16,725 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:17,072 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7930 [2025-04-26 22:25:17,667] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect) 7976 [WARNING|trainer.py:803] 2025-04-26 22:25:18,539 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:19,138 >> Trainer.tokenizer is now deprecated. 
You should use Trainer.processing_class instead. NoNo 7931 [WARNING|trainer.py:803] 2025-04-26 22:25:19,987 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7977 /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning) [WARNING|trainer.py:803] 2025-04-26 22:25:20,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7932 7978 [WARNING|trainer.py:803] 2025-04-26 22:25:22,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:22,859 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7933 7979 [WARNING|trainer.py:803] 2025-04-26 22:25:24,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:24,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7934 [WARNING|trainer.py:803] 2025-04-26 22:25:25,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7980 [WARNING|trainer.py:803] 2025-04-26 22:25:26,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7935 7981 [WARNING|trainer.py:803] 2025-04-26 22:25:28,054 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:28,770 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7936 7982 [WARNING|trainer.py:803] 2025-04-26 22:25:29,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:30,399 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7937 7983 [WARNING|trainer.py:803] 2025-04-26 22:25:31,787 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:32,408 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7938 7984 [WARNING|trainer.py:803] 2025-04-26 22:25:33,478 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:25:34,040 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7939 7985 [WARNING|trainer.py:803] 2025-04-26 22:25:35,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:25:36,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7940 [WARNING|trainer.py:803] 2025-04-26 22:25:37,096 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7986 [WARNING|trainer.py:803] 2025-04-26 22:25:37,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7941 [WARNING|trainer.py:803] 2025-04-26 22:25:38,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7987 7942 [WARNING|trainer.py:803] 2025-04-26 22:25:39,954 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:40,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7988 7943 [WARNING|trainer.py:803] 2025-04-26 22:25:42,009 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:42,587 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7989 7944 [WARNING|trainer.py:803] 2025-04-26 22:25:43,658 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:44,345 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7990 [WARNING|trainer.py:803] 2025-04-26 22:25:45,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7945 [WARNING|trainer.py:803] 2025-04-26 22:25:46,751 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7991 7946 [WARNING|trainer.py:803] 2025-04-26 22:25:47,694 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:48,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 7992 7947 [WARNING|trainer.py:803] 2025-04-26 22:25:49,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:50,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7993 7948 [WARNING|trainer.py:803] 2025-04-26 22:25:51,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:52,264 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7949 7994 [WARNING|trainer.py:803] 2025-04-26 22:25:53,910 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:25:53,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 7995 7950 [WARNING|trainer.py:803] 2025-04-26 22:25:55,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:25:55,913 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7996 7951 [WARNING|trainer.py:803] 2025-04-26 22:25:57,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:25:57,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7997 7952 [WARNING|trainer.py:803] 2025-04-26 22:25:59,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:00,020 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7998 7953 [WARNING|trainer.py:803] 2025-04-26 22:26:01,652 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:26:01,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7999 7954 [WARNING|trainer.py:803] 2025-04-26 22:26:03,445 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:03,726 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7955 8000 [WARNING|trainer.py:803] 2025-04-26 22:26:05,679 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:26:05,739 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7956 8001 [WARNING|trainer.py:803] 2025-04-26 22:26:07,312 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:26:07,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7957 8002 [WARNING|trainer.py:803] 2025-04-26 22:26:09,046 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:26:09,816 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7958 8003 [WARNING|trainer.py:803] 2025-04-26 22:26:11,489 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:26:11,634 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7959 8004 [WARNING|trainer.py:803] 2025-04-26 22:26:13,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:26:13,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8005 7960 [WARNING|trainer.py:803] 2025-04-26 22:26:14,933 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:26:15,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8006 7961 [WARNING|trainer.py:803] 2025-04-26 22:26:16,505 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:16,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8007 7962 [WARNING|trainer.py:803] 2025-04-26 22:26:18,432 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:18,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 7963 8008 [WARNING|trainer.py:803] 2025-04-26 22:26:20,322 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:26:20,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8009 7964 [WARNING|trainer.py:803] 2025-04-26 22:26:22,281 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:22,561 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8010 7965 [WARNING|trainer.py:803] 2025-04-26 22:26:23,873 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:24,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8011 7966 [WARNING|trainer.py:803] 2025-04-26 22:26:25,487 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:26,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8012 [WARNING|trainer.py:803] 2025-04-26 22:26:27,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7967 8013 [WARNING|trainer.py:803] 2025-04-26 22:26:28,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:26:28,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7968 8014 [WARNING|trainer.py:803] 2025-04-26 22:26:30,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:26:30,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8015 7969 [WARNING|trainer.py:803] 2025-04-26 22:26:31,870 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:31,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8016 7970 [WARNING|trainer.py:803] 2025-04-26 22:26:33,468 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:26:33,524 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8017 7971 [WARNING|trainer.py:803] 2025-04-26 22:26:35,106 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:35,238 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7972 8018 [WARNING|trainer.py:803] 2025-04-26 22:26:36,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:26:37,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7973 8019 [WARNING|trainer.py:803] 2025-04-26 22:26:38,566 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:38,736 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 7974 8020 [WARNING|trainer.py:803] 2025-04-26 22:26:40,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:26:40,803 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7975 8021 [WARNING|trainer.py:803] 2025-04-26 22:26:41,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:26:42,326 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7976 8022 [WARNING|trainer.py:803] 2025-04-26 22:26:44,005 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:44,350 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7977 8023 [WARNING|trainer.py:803] 2025-04-26 22:26:45,730 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:26:46,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7978 8024 [WARNING|trainer.py:803] 2025-04-26 22:26:47,703 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:47,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7979 8025 [WARNING|trainer.py:803] 2025-04-26 22:26:49,741 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:49,742 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8026 7980 [WARNING|trainer.py:803] 2025-04-26 22:26:51,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:51,771 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8027 7981 [WARNING|trainer.py:803] 2025-04-26 22:26:53,266 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:26:53,556 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8028 7982 [WARNING|trainer.py:803] 2025-04-26 22:26:54,854 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:55,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8029 7983 [WARNING|trainer.py:803] 2025-04-26 22:26:56,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:57,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8030 7984 [WARNING|trainer.py:803] 2025-04-26 22:26:58,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:26:58,756 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8031 [WARNING|trainer.py:803] 2025-04-26 22:26:59,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7985 [WARNING|trainer.py:803] 2025-04-26 22:27:00,776 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8032 [WARNING|trainer.py:803] 2025-04-26 22:27:01,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7986 8033 [WARNING|trainer.py:803] 2025-04-26 22:27:02,640 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:03,327 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 7987 8034 [WARNING|trainer.py:803] 2025-04-26 22:27:04,672 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:27:05,090 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8035 7988 [WARNING|trainer.py:803] 2025-04-26 22:27:06,650 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:06,727 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8036 7989 [WARNING|trainer.py:803] 2025-04-26 22:27:08,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:08,409 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8037 7990 [WARNING|trainer.py:803] 2025-04-26 22:27:10,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:10,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8038 7991 [WARNING|trainer.py:803] 2025-04-26 22:27:11,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:27:12,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8039 [WARNING|trainer.py:803] 2025-04-26 22:27:13,337 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 7992 8040 [WARNING|trainer.py:803] 2025-04-26 22:27:14,748 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:15,100 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8041 7993 [WARNING|trainer.py:803] 2025-04-26 22:27:16,706 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:16,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8042 7994 [WARNING|trainer.py:803] 2025-04-26 22:27:18,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:18,613 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8043 7995 [WARNING|trainer.py:803] 2025-04-26 22:27:19,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:20,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8044 7996 [WARNING|trainer.py:803] 2025-04-26 22:27:21,697 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:22,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8045 [WARNING|trainer.py:803] 2025-04-26 22:27:23,351 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 7997 [WARNING|trainer.py:803] 2025-04-26 22:27:24,699 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8046 7998 [WARNING|trainer.py:803] 2025-04-26 22:27:25,713 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:26,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8047 7999 [WARNING|trainer.py:803] 2025-04-26 22:27:27,443 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:28,233 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8048 [WARNING|trainer.py:803] 2025-04-26 22:27:29,165 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8000 8049 [WARNING|trainer.py:803] 2025-04-26 22:27:30,526 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:27:31,027 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8001 8050 [WARNING|trainer.py:803] 2025-04-26 22:27:32,599 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:32,647 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8051 8002 [WARNING|trainer.py:803] 2025-04-26 22:27:34,400 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:34,627 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8052 8003 [WARNING|trainer.py:803] 2025-04-26 22:27:36,081 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:27:36,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8053 8004 [WARNING|trainer.py:803] 2025-04-26 22:27:37,578 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:27:37,998 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8054 8005 [WARNING|trainer.py:803] 2025-04-26 22:27:39,158 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:39,851 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8055 8006 [WARNING|trainer.py:803] 2025-04-26 22:27:40,710 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8056 [WARNING|trainer.py:803] 2025-04-26 22:27:41,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:42,239 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8007 8057 [WARNING|trainer.py:803] 2025-04-26 22:27:43,417 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:43,847 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8008 8058 [WARNING|trainer.py:803] 2025-04-26 22:27:45,361 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:27:45,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8059 8009 [WARNING|trainer.py:803] 2025-04-26 22:27:47,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:47,271 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8060 8010 [WARNING|trainer.py:803] 2025-04-26 22:27:48,516 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:48,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8061 8011 [WARNING|trainer.py:803] 2025-04-26 22:27:50,214 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:27:50,457 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8062 8012 [WARNING|trainer.py:803] 2025-04-26 22:27:51,946 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:27:52,088 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8013 8063 [WARNING|trainer.py:803] 2025-04-26 22:27:53,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:53,762 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8064 8014 [WARNING|trainer.py:803] 2025-04-26 22:27:55,382 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:27:55,395 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8015 8065 [WARNING|trainer.py:803] 2025-04-26 22:27:56,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:27:57,176 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8016 8066 [WARNING|trainer.py:803] 2025-04-26 22:27:58,582 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:27:58,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8017 8067 [WARNING|trainer.py:803] 2025-04-26 22:28:00,200 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:00,490 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8068 8018 [WARNING|trainer.py:803] 2025-04-26 22:28:02,153 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:28:02,167 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8019 8069 [WARNING|trainer.py:803] 2025-04-26 22:28:03,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:28:03,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8070 8020 [WARNING|trainer.py:803] 2025-04-26 22:28:05,671 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:28:05,861 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8071 8021 [WARNING|trainer.py:803] 2025-04-26 22:28:07,318 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:07,401 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8072 8022 [WARNING|trainer.py:803] 2025-04-26 22:28:09,092 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:28:09,464 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8073 8023 [WARNING|trainer.py:803] 2025-04-26 22:28:10,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:11,328 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8024 8074 [WARNING|trainer.py:803] 2025-04-26 22:28:12,980 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:12,997 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8025 8075 [WARNING|trainer.py:803] 2025-04-26 22:28:14,802 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:15,133 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8026 8076 [WARNING|trainer.py:803] 2025-04-26 22:28:16,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:16,800 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8027 8077 [WARNING|trainer.py:803] 2025-04-26 22:28:18,258 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:18,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8028 8078 [WARNING|trainer.py:803] 2025-04-26 22:28:19,820 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:19,927 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8029 8079 [WARNING|trainer.py:803] 2025-04-26 22:28:21,410 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:21,535 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8030 8080 [WARNING|trainer.py:803] 2025-04-26 22:28:23,154 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:23,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8031 8081 [WARNING|trainer.py:803] 2025-04-26 22:28:24,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:25,280 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8032 8082 [WARNING|trainer.py:803] 2025-04-26 22:28:26,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:27,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8033 8083 [WARNING|trainer.py:803] 2025-04-26 22:28:28,172 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:28,722 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8034 8084 [WARNING|trainer.py:803] 2025-04-26 22:28:29,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:30,308 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8035 8085 [WARNING|trainer.py:803] 2025-04-26 22:28:31,491 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:32,145 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8036 [WARNING|trainer.py:803] 2025-04-26 22:28:33,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8086 [WARNING|trainer.py:803] 2025-04-26 22:28:34,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8037 [WARNING|trainer.py:803] 2025-04-26 22:28:35,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8087 8038 [WARNING|trainer.py:803] 2025-04-26 22:28:36,085 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:36,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8088 8039 [WARNING|trainer.py:803] 2025-04-26 22:28:37,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:38,230 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8089 8040 [WARNING|trainer.py:803] 2025-04-26 22:28:39,283 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:40,012 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8041 8090 [WARNING|trainer.py:803] 2025-04-26 22:28:41,668 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:41,801 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8042 8091 [WARNING|trainer.py:803] 2025-04-26 22:28:43,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:43,437 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8043 8092 [WARNING|trainer.py:803] 2025-04-26 22:28:44,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:45,187 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8044 8093 [WARNING|trainer.py:803] 2025-04-26 22:28:46,705 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:46,813 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8045 8094 [WARNING|trainer.py:803] 2025-04-26 22:28:48,331 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:28:48,603 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8095 8046 [WARNING|trainer.py:803] 2025-04-26 22:28:50,065 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:50,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8096 8047 [WARNING|trainer.py:803] 2025-04-26 22:28:51,614 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:52,372 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8097 8048 [WARNING|trainer.py:803] 2025-04-26 22:28:53,269 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:28:54,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8098 8049 [WARNING|trainer.py:803] 2025-04-26 22:28:55,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:55,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8099 8050 [WARNING|trainer.py:803] 2025-04-26 22:28:56,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:28:57,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8100 8051 [WARNING|trainer.py:803] 2025-04-26 22:28:58,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No [WARNING|trainer.py:803] 2025-04-26 22:28:59,273 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8101 8052 [WARNING|trainer.py:803] 2025-04-26 22:29:00,695 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:00,900 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8053 8102 [WARNING|trainer.py:803] 2025-04-26 22:29:02,414 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:29:03,073 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8054 [WARNING|trainer.py:803] 2025-04-26 22:29:04,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8103 8055 [WARNING|trainer.py:803] 2025-04-26 22:29:05,123 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:29:05,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8104 8056 [WARNING|trainer.py:803] 2025-04-26 22:29:07,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:07,191 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8057 8105 [WARNING|trainer.py:803] 2025-04-26 22:29:08,834 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:29:09,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8058 8106 [WARNING|trainer.py:803] 2025-04-26 22:29:10,528 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No [WARNING|trainer.py:803] 2025-04-26 22:29:11,171 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8059 [WARNING|trainer.py:803] 2025-04-26 22:29:12,104 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8107 8060 [WARNING|trainer.py:803] 2025-04-26 22:29:13,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:29:13,573 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8108 8061 [WARNING|trainer.py:803] 2025-04-26 22:29:15,205 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:15,259 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8062 8109 [WARNING|trainer.py:803] 2025-04-26 22:29:17,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:17,321 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8063 8110 [WARNING|trainer.py:803] 2025-04-26 22:29:18,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:19,284 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8064 8111 [WARNING|trainer.py:803] 2025-04-26 22:29:20,501 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:21,267 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8065 [WARNING|trainer.py:803] 2025-04-26 22:29:22,315 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8112 8066 [WARNING|trainer.py:803] 2025-04-26 22:29:23,330 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:29:24,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8113 8067 [WARNING|trainer.py:803] 2025-04-26 22:29:25,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:25,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8068 8114 [WARNING|trainer.py:803] 2025-04-26 22:29:27,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:27,455 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8069 8115 [WARNING|trainer.py:803] 2025-04-26 22:29:29,196 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:29,552 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8070 8116 [WARNING|trainer.py:803] 2025-04-26 22:29:30,934 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:31,623 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8071 [WARNING|trainer.py:803] 2025-04-26 22:29:32,615 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8117 8072 [WARNING|trainer.py:803] 2025-04-26 22:29:33,622 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:29:34,364 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8118 8073 [WARNING|trainer.py:803] 2025-04-26 22:29:35,714 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:29:36,207 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8119 8074 [WARNING|trainer.py:803] 2025-04-26 22:29:37,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:29:38,317 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8120 8075 [WARNING|trainer.py:803] 2025-04-26 22:29:39,821 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:29:40,467 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8121 8076 [WARNING|trainer.py:803] 2025-04-26 22:29:41,845 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:42,168 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8077 8122 [WARNING|trainer.py:803] 2025-04-26 22:29:43,738 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:29:43,924 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8078 8123 [WARNING|trainer.py:803] 2025-04-26 22:29:45,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:29:46,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8079 [WARNING|trainer.py:803] 2025-04-26 22:29:46,894 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8124 8080 [WARNING|trainer.py:803] 2025-04-26 22:29:48,060 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:29:48,595 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8125 8081 [WARNING|trainer.py:803] 2025-04-26 22:29:50,032 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:50,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8126 8082 [WARNING|trainer.py:803] 2025-04-26 22:29:52,044 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:52,431 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8127 8083 [WARNING|trainer.py:803] 2025-04-26 22:29:53,936 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:29:54,082 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8084 8128 [WARNING|trainer.py:803] 2025-04-26 22:29:55,659 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:29:56,193 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8085 8129 [WARNING|trainer.py:803] 2025-04-26 22:29:57,530 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:29:58,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8086 [WARNING|trainer.py:803] 2025-04-26 22:29:59,503 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8130 [WARNING|trainer.py:803] 2025-04-26 22:30:00,517 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8087 [WARNING|trainer.py:803] 2025-04-26 22:30:01,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8131 8088 [WARNING|trainer.py:803] 2025-04-26 22:30:02,628 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:30:03,017 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8132 8089 [WARNING|trainer.py:803] 2025-04-26 22:30:04,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:30:04,662 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8133 8090 [WARNING|trainer.py:803] 2025-04-26 22:30:06,882 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:30:07,209 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8091 8134 [WARNING|trainer.py:803] 2025-04-26 22:30:08,872 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:08,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8092 8135 [WARNING|trainer.py:803] 2025-04-26 22:30:10,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:10,863 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8093 8136 [WARNING|trainer.py:803] 2025-04-26 22:30:12,306 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:12,878 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8094 8137 [WARNING|trainer.py:803] 2025-04-26 22:30:14,078 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8095 [WARNING|trainer.py:803] 2025-04-26 22:30:14,865 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:30:15,553 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8138 8096 [WARNING|trainer.py:803] 2025-04-26 22:30:17,001 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:30:17,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8097 8139 [WARNING|trainer.py:803] 2025-04-26 22:30:18,753 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:30:18,923 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8098 8140 [WARNING|trainer.py:803] 2025-04-26 22:30:20,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:20,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8099 8141 [WARNING|trainer.py:803] 2025-04-26 22:30:22,477 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:22,956 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8100 [WARNING|trainer.py:803] 2025-04-26 22:30:24,130 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8142 [WARNING|trainer.py:803] 2025-04-26 22:30:25,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8101 [WARNING|trainer.py:803] 2025-04-26 22:30:26,255 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8143 [WARNING|trainer.py:803] 2025-04-26 22:30:27,234 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8102 [WARNING|trainer.py:803] 2025-04-26 22:30:28,251 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8144 [WARNING|trainer.py:803] 2025-04-26 22:30:29,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8103 [WARNING|trainer.py:803] 2025-04-26 22:30:30,369 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8145 [WARNING|trainer.py:803] 2025-04-26 22:30:31,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8104 [WARNING|trainer.py:803] 2025-04-26 22:30:32,390 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8146 [WARNING|trainer.py:803] 2025-04-26 22:30:33,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8105 [WARNING|trainer.py:803] 2025-04-26 22:30:34,403 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8147 [WARNING|trainer.py:803] 2025-04-26 22:30:35,444 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8106 [WARNING|trainer.py:803] 2025-04-26 22:30:36,433 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8148 [WARNING|trainer.py:803] 2025-04-26 22:30:37,549 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8107 [WARNING|trainer.py:803] 2025-04-26 22:30:38,416 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8149 [WARNING|trainer.py:803] 2025-04-26 22:30:39,548 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8108 [WARNING|trainer.py:803] 2025-04-26 22:30:40,532 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8150 [WARNING|trainer.py:803] 2025-04-26 22:30:41,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8109 [WARNING|trainer.py:803] 2025-04-26 22:30:42,669 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8151 [WARNING|trainer.py:803] 2025-04-26 22:30:43,797 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8110 [WARNING|trainer.py:803] 2025-04-26 22:30:44,708 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8152 8111 [WARNING|trainer.py:803] 2025-04-26 22:30:45,959 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:30:46,677 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8153 8112 [WARNING|trainer.py:803] 2025-04-26 22:30:47,925 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:30:48,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8154 [WARNING|trainer.py:803] 2025-04-26 22:30:49,909 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8113 [WARNING|trainer.py:803] 2025-04-26 22:30:50,791 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8155 [WARNING|trainer.py:803] 2025-04-26 22:30:51,945 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8114 [WARNING|trainer.py:803] 2025-04-26 22:30:52,827 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8156 [WARNING|trainer.py:803] 2025-04-26 22:30:53,908 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8115 [WARNING|trainer.py:803] 2025-04-26 22:30:54,880 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8157 [WARNING|trainer.py:803] 2025-04-26 22:30:55,824 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8116 [WARNING|trainer.py:803] 2025-04-26 22:30:56,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8158 [WARNING|trainer.py:803] 2025-04-26 22:30:57,785 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8117 8159 [WARNING|trainer.py:803] 2025-04-26 22:30:59,063 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:30:59,857 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8118 8160 [WARNING|trainer.py:803] 2025-04-26 22:31:01,126 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:31:01,829 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8119 8161 [WARNING|trainer.py:803] 2025-04-26 22:31:03,169 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:31:03,757 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8120 8162 [WARNING|trainer.py:803] 2025-04-26 22:31:05,198 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:31:05,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8121 8163 [WARNING|trainer.py:803] 2025-04-26 22:31:07,250 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:07,939 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8122 8164 [WARNING|trainer.py:803] 2025-04-26 22:31:09,310 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:09,973 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8123 8165 [WARNING|trainer.py:803] 2025-04-26 22:31:11,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:31:12,079 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8124 8166 [WARNING|trainer.py:803] 2025-04-26 22:31:13,630 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:13,976 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8125 8167 [WARNING|trainer.py:803] 2025-04-26 22:31:15,625 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:15,912 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8126 8168 [WARNING|trainer.py:803] 2025-04-26 22:31:17,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:17,957 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8127 8169 [WARNING|trainer.py:803] 2025-04-26 22:31:19,649 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:20,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8128 8170 [WARNING|trainer.py:803] 2025-04-26 22:31:21,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:22,045 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8129 8171 [WARNING|trainer.py:803] 2025-04-26 22:31:24,008 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:24,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8172 8130 [WARNING|trainer.py:803] 2025-04-26 22:31:26,195 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:31:26,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8173 8131 [WARNING|trainer.py:803] 2025-04-26 22:31:28,229 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:31:28,451 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8174 8132 [WARNING|trainer.py:803] 2025-04-26 22:31:30,225 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:31:30,461 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8175 8133 [WARNING|trainer.py:803] 2025-04-26 22:31:32,546 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:31:32,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8176 8134 [WARNING|trainer.py:803] 2025-04-26 22:31:34,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:31:34,839 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8177 8135 [WARNING|trainer.py:803] 2025-04-26 22:31:36,612 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:36,889 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8178 8136 [WARNING|trainer.py:803] 2025-04-26 22:31:38,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:38,975 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8179 8137 [WARNING|trainer.py:803] 2025-04-26 22:31:40,744 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:41,015 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8180 8138 [WARNING|trainer.py:803] 2025-04-26 22:31:42,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:31:43,170 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8181 8139 [WARNING|trainer.py:803] 2025-04-26 22:31:44,871 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:45,223 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8182 8140 [WARNING|trainer.py:803] 2025-04-26 22:31:46,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:31:47,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8183 8141 [WARNING|trainer.py:803] 2025-04-26 22:31:49,118 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:49,329 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8184 8142 [WARNING|trainer.py:803] 2025-04-26 22:31:51,353 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:51,494 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8185 8143 [WARNING|trainer.py:803] 2025-04-26 22:31:53,339 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:53,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8186 8144 [WARNING|trainer.py:803] 2025-04-26 22:31:55,643 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:31:55,792 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8145 8187 [WARNING|trainer.py:803] 2025-04-26 22:31:57,837 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:31:57,874 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8188 8146 [WARNING|trainer.py:803] 2025-04-26 22:31:59,777 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:31:59,840 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8189 8147 [WARNING|trainer.py:803] 2025-04-26 22:32:01,675 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:32:01,950 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo 8190 8148 [WARNING|trainer.py:803] 2025-04-26 22:32:03,550 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:04,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8191 8149 [WARNING|trainer.py:803] 2025-04-26 22:32:05,712 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:32:06,101 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8192 8150 [WARNING|trainer.py:803] 2025-04-26 22:32:08,036 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:32:08,458 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8193 8151 [WARNING|trainer.py:803] 2025-04-26 22:32:10,024 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:10,604 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8194 8152 [WARNING|trainer.py:803] 2025-04-26 22:32:11,986 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:12,759 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoNo 8195 8153 [WARNING|trainer.py:803] 2025-04-26 22:32:14,103 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:14,806 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8196 8154 [WARNING|trainer.py:803] 2025-04-26 22:32:16,173 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:16,897 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8197 8155 [WARNING|trainer.py:803] 2025-04-26 22:32:18,157 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:32:18,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8198 8156 [WARNING|trainer.py:803] 2025-04-26 22:32:20,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:32:21,031 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8199 8157 [WARNING|trainer.py:803] 2025-04-26 22:32:22,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:22,971 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8200 8158 [WARNING|trainer.py:803] 2025-04-26 22:32:24,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:32:24,962 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8201 8159 [WARNING|trainer.py:803] 2025-04-26 22:32:26,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:32:27,049 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8160 8202 [WARNING|trainer.py:803] 2025-04-26 22:32:28,999 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:29,075 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8161 8203 [WARNING|trainer.py:803] 2025-04-26 22:32:30,883 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:31,059 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8162 8204 [WARNING|trainer.py:803] 2025-04-26 22:32:33,053 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:32:33,252 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8163 8205 [WARNING|trainer.py:803] 2025-04-26 22:32:35,119 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:32:35,263 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8164 8206 [WARNING|trainer.py:803] 2025-04-26 22:32:37,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:32:37,442 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8165 8207 [WARNING|trainer.py:803] 2025-04-26 22:32:39,262 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:39,626 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes 8166 8208 [WARNING|trainer.py:803] 2025-04-26 22:32:41,227 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:41,657 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8167 8209 [WARNING|trainer.py:803] 2025-04-26 22:32:43,183 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:43,780 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8168 8210 [WARNING|trainer.py:803] 2025-04-26 22:32:45,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:32:45,958 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8169 8211 [WARNING|trainer.py:803] 2025-04-26 22:32:47,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:47,926 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8170 8212 [WARNING|trainer.py:803] 2025-04-26 22:32:49,320 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:32:49,921 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8171 8213 [WARNING|trainer.py:803] 2025-04-26 22:32:51,242 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:32:51,944 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8172 8214 [WARNING|trainer.py:803] 2025-04-26 22:32:53,462 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:32:54,129 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8173 8215 [WARNING|trainer.py:803] 2025-04-26 22:32:55,460 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:32:56,179 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8174 [WARNING|trainer.py:803] 2025-04-26 22:32:57,370 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8216 [WARNING|trainer.py:803] 2025-04-26 22:32:58,334 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8175 8217 [WARNING|trainer.py:803] 2025-04-26 22:32:59,661 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:00,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8176 [WARNING|trainer.py:803] 2025-04-26 22:33:01,717 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8218 [WARNING|trainer.py:803] 2025-04-26 22:33:02,767 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8177 [WARNING|trainer.py:803] 2025-04-26 22:33:03,678 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8219 8178 [WARNING|trainer.py:803] 2025-04-26 22:33:04,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:05,637 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8220 8179 [WARNING|trainer.py:803] 2025-04-26 22:33:07,080 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:07,687 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8221 8180 [WARNING|trainer.py:803] 2025-04-26 22:33:09,248 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:33:09,754 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8222 8181 [WARNING|trainer.py:803] 2025-04-26 22:33:11,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:33:11,704 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8182 8223 [WARNING|trainer.py:803] 2025-04-26 22:33:13,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:33:13,664 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8224 8183 [WARNING|trainer.py:803] 2025-04-26 22:33:15,788 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:15,970 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8225 8184 [WARNING|trainer.py:803] 2025-04-26 22:33:18,066 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:18,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8226 8185 [WARNING|trainer.py:803] 2025-04-26 22:33:20,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:20,156 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8227 8186 [WARNING|trainer.py:803] 2025-04-26 22:33:22,294 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:22,425 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8228 8187 [WARNING|trainer.py:803] 2025-04-26 22:33:24,303 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:33:24,648 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8188 8229 [WARNING|trainer.py:803] 2025-04-26 22:33:26,551 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:26,559 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8189 8230 [WARNING|trainer.py:803] 2025-04-26 22:33:28,522 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:33:28,641 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8190 8231 [WARNING|trainer.py:803] 2025-04-26 22:33:30,448 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:30,560 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8191 8232 [WARNING|trainer.py:803] 2025-04-26 22:33:32,574 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:33:32,660 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8233 8192 [WARNING|trainer.py:803] 2025-04-26 22:33:34,752 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:34,972 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8234 8193 [WARNING|trainer.py:803] 2025-04-26 22:33:36,831 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:33:36,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8194 8235 [WARNING|trainer.py:803] 2025-04-26 22:33:38,948 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:38,988 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8195 8236 [WARNING|trainer.py:803] 2025-04-26 22:33:41,043 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:41,164 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8196 8237 [WARNING|trainer.py:803] 2025-04-26 22:33:43,018 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:43,245 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8197 8238 [WARNING|trainer.py:803] 2025-04-26 22:33:44,967 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:45,249 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8239 8198 [WARNING|trainer.py:803] 2025-04-26 22:33:47,295 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:47,421 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8240 8199 [WARNING|trainer.py:803] 2025-04-26 22:33:49,384 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:49,483 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8241 8200 [WARNING|trainer.py:803] 2025-04-26 22:33:51,463 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:33:51,509 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8242 8201 [WARNING|trainer.py:803] 2025-04-26 22:33:53,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:33:53,701 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8243 8202 [WARNING|trainer.py:803] 2025-04-26 22:33:55,581 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:55,849 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8244 8203 [WARNING|trainer.py:803] 2025-04-26 22:33:57,563 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:33:57,888 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. No 8245 8204 [WARNING|trainer.py:803] 2025-04-26 22:33:59,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:34:00,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8246 8205 [WARNING|trainer.py:803] 2025-04-26 22:34:01,594 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:34:02,116 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8247 8206 [WARNING|trainer.py:803] 2025-04-26 22:34:03,855 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:34:04,237 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8248 8207 [WARNING|trainer.py:803] 2025-04-26 22:34:06,034 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:34:06,357 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8249 8208 [WARNING|trainer.py:803] 2025-04-26 22:34:08,047 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:08,429 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8250 8209 [WARNING|trainer.py:803] 2025-04-26 22:34:10,190 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:10,525 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :No 8251 8210 [WARNING|trainer.py:803] 2025-04-26 22:34:12,289 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:34:12,654 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8252 8211 [WARNING|trainer.py:803] 2025-04-26 22:34:14,492 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:14,691 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes 8253 8212 [WARNING|trainer.py:803] 2025-04-26 22:34:16,577 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:34:16,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8254 8213 [WARNING|trainer.py:803] 2025-04-26 22:34:18,554 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:18,735 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8255 8214 [WARNING|trainer.py:803] 2025-04-26 22:34:20,876 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:20,935 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8256 8215 [WARNING|trainer.py:803] 2025-04-26 22:34:22,843 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:23,000 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8257 8216 [WARNING|trainer.py:803] 2025-04-26 22:34:24,955 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:34:25,144 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8217 8258 [WARNING|trainer.py:803] 2025-04-26 22:34:27,316 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:27,319 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8259 8218 [WARNING|trainer.py:803] 2025-04-26 22:34:29,244 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
Yes [WARNING|trainer.py:803] 2025-04-26 22:34:29,651 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8260 8219 [WARNING|trainer.py:803] 2025-04-26 22:34:31,536 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:34:31,794 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8261 8220 [WARNING|trainer.py:803] 2025-04-26 22:34:33,631 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:33,937 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8262 8221 [WARNING|trainer.py:803] 2025-04-26 22:34:35,611 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:36,077 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8263 8222 [WARNING|trainer.py:803] 2025-04-26 22:34:37,879 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:34:38,360 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8264 8223 [WARNING|trainer.py:803] 2025-04-26 22:34:40,136 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:34:40,367 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes 8265 8224 [WARNING|trainer.py:803] 2025-04-26 22:34:42,192 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:42,472 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
YesYes 8266 8225 [WARNING|trainer.py:803] 2025-04-26 22:34:44,302 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:44,779 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8267 8226 [WARNING|trainer.py:803] 2025-04-26 22:34:46,380 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:46,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8268 8227 [WARNING|trainer.py:803] 2025-04-26 22:34:48,750 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:34:48,979 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8228 8269 [WARNING|trainer.py:803] 2025-04-26 22:34:50,947 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:34:51,004 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8270 8229 [WARNING|trainer.py:803] 2025-04-26 22:34:52,877 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:34:53,212 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8271 8230 [WARNING|trainer.py:803] 2025-04-26 22:34:54,835 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:34:55,355 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8272 8231 [WARNING|trainer.py:803] 2025-04-26 22:34:56,862 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes [WARNING|trainer.py:803] 2025-04-26 22:34:57,228 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8273 8232 [WARNING|trainer.py:803] 2025-04-26 22:34:58,966 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:34:59,415 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8274 8233 [WARNING|trainer.py:803] 2025-04-26 22:35:00,911 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. Yes [WARNING|trainer.py:803] 2025-04-26 22:35:01,542 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8275 8234 [WARNING|trainer.py:803] 2025-04-26 22:35:02,842 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:03,644 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8276 8235 [WARNING|trainer.py:803] 2025-04-26 22:35:05,070 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:05,817 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8277 [WARNING|trainer.py:803] 2025-04-26 22:35:07,155 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8236 [WARNING|trainer.py:803] 2025-04-26 22:35:08,011 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8278 8237 [WARNING|trainer.py:803] 2025-04-26 22:35:09,333 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:10,105 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes 8279 8238 [WARNING|trainer.py:803] 2025-04-26 22:35:11,420 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:35:12,113 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8280 8239 [WARNING|trainer.py:803] 2025-04-26 22:35:13,590 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:14,184 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8281 8240 [WARNING|trainer.py:803] 2025-04-26 22:35:15,646 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes [WARNING|trainer.py:803] 2025-04-26 22:35:16,300 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8282 8241 [WARNING|trainer.py:803] 2025-04-26 22:35:17,715 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:35:18,404 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8283 8242 [WARNING|trainer.py:803] 2025-04-26 22:35:19,833 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:35:20,500 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8284 8243 [WARNING|trainer.py:803] 2025-04-26 22:35:21,892 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:22,565 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8285 8244 [WARNING|trainer.py:803] 2025-04-26 22:35:24,033 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
NoYes [WARNING|trainer.py:803] 2025-04-26 22:35:24,572 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8286 8245 [WARNING|trainer.py:803] 2025-04-26 22:35:25,989 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:35:26,629 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8287 8246 [WARNING|trainer.py:803] 2025-04-26 22:35:28,098 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes [WARNING|trainer.py:803] 2025-04-26 22:35:28,583 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. :Yes 8288 8247 [WARNING|trainer.py:803] 2025-04-26 22:35:30,373 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes [WARNING|trainer.py:803] 2025-04-26 22:35:30,867 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoYes 8289 8248 [WARNING|trainer.py:803] 2025-04-26 22:35:32,479 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. NoNo [WARNING|trainer.py:803] 2025-04-26 22:35:32,941 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. YesYes 8290 8249 [rank0]:[E426 22:35:34.569973427 ProcessGroupNCCL.cpp:616] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=13920, OpType=ALLREDUCE, NumelIn=52297728, NumelOut=52297728, Timeout(ms)=600000) ran for 600060 milliseconds before timing out. [rank0]:[E426 22:35:34.570961530 ProcessGroupNCCL.cpp:1785] [PG ID 1 PG GUID 1 Rank 0] Exception (either an error or timeout) detected by watchdog at work: 13920, last enqueued NCCL work: 13920, last completed NCCL work: 13919. [WARNING|trainer.py:803] 2025-04-26 22:35:34,567 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead. 
:Yes
[rank0]:[E426 22:35:34.004621339 ProcessGroupNCCL.cpp:1834] [PG ID 1 PG GUID 1 Rank 0] Timeout at NCCL work: 13920, last enqueued NCCL work: 13920, last completed NCCL work: 13919.
[rank0]:[E426 22:35:34.004648571 ProcessGroupNCCL.cpp:630] [Rank 0] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank0]:[E426 22:35:34.004653621 ProcessGroupNCCL.cpp:636] [Rank 0] To avoid data inconsistency, we are taking the entire process down.
[rank0]:[E426 22:35:34.005814513 ProcessGroupNCCL.cpp:1595] [PG ID 1 PG GUID 1 Rank 0] Process group watchdog thread terminated with exception: [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=13920, OpType=ALLREDUCE, NumelIn=52297728, NumelOut=52297728, Timeout(ms)=600000) ran for 600060 milliseconds before timing out.
Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f052e96c446 in /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f04e41c4a92 in /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f04e41cbed3 in /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f04e41cd93d in /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0x145c0 (0x7f052ee085c0 in /home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/lib/libtorch.so)
frame #5: + 0x94ac3 (0x7f053338bac3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: + 0x126850 (0x7f053341d850 in /lib/x86_64-linux-gnu/libc.so.6)

[WARNING|trainer.py:803] 2025-04-26 22:35:34,930 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes 8291 8250
[WARNING|trainer.py:803] 2025-04-26 22:35:36,569 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
[WARNING|trainer.py:803] 2025-04-26 22:35:37,094 >> Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
:Yes
W0426 22:35:37.608392 784474 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 784563 closing signal SIGTERM
W0426 22:35:37.611305 784474 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 784564 closing signal SIGTERM
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
petrel_client is not installed. If you read data locally instead of from ceph, ignore it.
Replace train sampler!!
petrel_client is not installed. Using PIL to load images.
E0426 22:35:41.451914 784474 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 0 (pid: 784562) of binary: /home/wangjiarui/.conda/envs/intern25/bin/python
Traceback (most recent call last):
  File "/home/wangjiarui/.conda/envs/intern25/bin/torchrun", line 8, in
    sys.exit(main())
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/wangjiarui/.conda/envs/intern25/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
=======================================================
train/train_qa.py FAILED
-------------------------------------------------------
Failures:
-------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-04-26_22:35:37
  host : amax
  rank : 0 (local_rank: 0)
  exitcode : -6 (pid: 784562)
  error_file:
  traceback : Signal 6 (SIGABRT) received by PID 784562
=======================================================
/home/wangjiarui/.conda/envs/intern25/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 21 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '